* [PATCH 00/48] percpu: Consistent per cpu operations V4
@ 2014-02-14 20:18 Christoph Lameter
  2014-02-14 20:18 ` [PATCH 01/48] percpu: Add raw_cpu_ops Christoph Lameter
                   ` (48 more replies)
  0 siblings, 49 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

Can we please get this merged? The first patch alone would at least define
the functions required to enable the merging of the rest in any order and
through any tree.

There is a git tree that can be pulled at 

https://git.kernel.org/cgit/linux/kernel/git/christoph/percpu.git/

It has the following branches:

minimal		Just the first 5 patches to get the preemption checks going
most		All the conversion but not the removal of functionality in 47/48
all		All these patches 

V3->V4:
- Rediff patches
- Put the patches first that define the new API.
- Add patches to convert the __get_cpu_var stuff added in 3.14

V2->V3:
- Rediff patches
- Fix breakage caused by mips patches. Add a mips patch to convert from local_t.
- Update some descriptions.

V1->V2:
- Move legacy definition for __this_cpu_ptr into include/asm-generic/percpu.h
  so that users bypassing include/linux/percpu.h do not break (affects
  tile and s390)
- Merge raw_cpu_ops core and the patch to rename x86 __this_cpu primitives
  into one. Otherwise breakage will occur since x86 __this_cpu ops will fall
  back to generic ops which is not tolerated well by the preempt hackery
  in x86.
- Add notes to each patch that depends on another to avoid mismerges.
  Add acks etc.
- Use quilt-0.61 with the bug fix that ensures all mailing lists
  receive the postings intended for them.


The kernel has never been audited to ensure that this_cpu operations
are used consistently throughout. The code generated in many places
can be improved through the use of this_cpu operations (which, on x86,
use a segment register for relocation of per cpu offsets instead of
performing explicit address calculations).

The patch set also addresses various consistency issues in general with
the per cpu macros.

A. The semantics of __this_cpu_ptr() differ from this_cpu_ptr() only
   in that checks are skipped. Skipped checks are conventionally
   indicated by a raw_ prefix, so this patch set changes the places
   where __this_cpu_ptr() is used to raw_cpu_ptr().

B. There has long been a wish by some that __this_cpu operations
   would check for preemption. However, there are cases where
   preemption checks need to be skipped. This patch set adds raw_cpu
   operations that do not check for preemption and then adds
   preemption checks to the __this_cpu operations.

C. The use of __get_cpu_var is always a reference to a percpu variable
   that can also be handled via a this_cpu operation. This patch set
   replaces all uses of __get_cpu_var with this_cpu operations.

D. We can then use this_cpu RMW operations in various places,
   replacing sequences of instructions with a single one (see the
   sketch after this list).

E. The use of this_cpu operations throughout will allow architectures
   other than x86 to implement optimized references and RMW
   operations on per cpu local data.

F. The use of this_cpu operations opens up the possibility to
   further optimize code that relies on synchronization through
   per cpu data.
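
To illustrate the kind of conversion meant in D, here is a minimal
sketch using a made-up per cpu counter (not code from this series):

DEFINE_PER_CPU(unsigned long, hits);

/* Before: explicit address calculation plus a separate
 * load/modify/store; preemption must stay disabled around the
 * whole sequence. */
static void count_hit_old(void)
{
	unsigned long *p;

	preempt_disable();
	p = this_cpu_ptr(&hits);
	*p += 1;
	preempt_enable();
}

/* After: a single RMW per cpu operation. On x86 this emits one
 * segment-prefixed inc instruction, so no explicit
 * preempt_disable() is needed. */
static void count_hit_new(void)
{
	this_cpu_inc(hits);
}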


The patch set works in a couple of stages:

I. Patch 1 adds the additional raw_cpu operations and raw_cpu_ptr().
    It also converts the existing __this_cpu_xx_# primitives in the
    x86 code to raw_cpu_xx_#.

II. Patches 2-4 use the raw_cpu operations in places that would give
     us false positives once the preemption checks are enabled.

III. Patch 5 adds preemption checks to __this_cpu operations to allow
    checking if preemption is properly disabled when these functions
    are used.

IV. Patches 6-20 simply replace uses of __get_cpu_var with
   this_cpu_ptr. They do not depend on any changes to the percpu
   code, and applying them does not skip any preemption checks.

V. Patches 21-46 are conversion patches that use this_cpu operations
   in various kernel subsystems/drivers or arch code.

VI. Patches 47 and 48 remove the no longer used functions
    (__this_cpu_ptr and __get_cpu_var). These should only be applied
    after all the conversion patches have been merged and after
    additional passes through the kernel have ensured that no uses of
    these functions remain.



* [PATCH 01/48] percpu: Add raw_cpu_ops
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
@ 2014-02-14 20:18 ` Christoph Lameter
  2014-02-14 20:18   ` Christoph Lameter
                   ` (47 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: raw_cpu_ops --]
[-- Type: text/plain, Size: 30324 bytes --]

The patches following this one will add preemption checks to
__this_cpu ops, so we need an alternative way to use this_cpu
operations without preemption checks.

raw_cpu_ops will be the basis for all other ops, since these will be
the operations that do not implement any checks.

Primitive operations are renamed by this patch from __this_cpu_xxx to
raw_cpu_xxx.

Also change the uses of the x86 percpu primitives in preempt.h.
These depend directly on asm/percpu.h (header #include nesting issue).
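
For orientation, a sketch of the intended semantics of the three
flavors after this series (made-up counter, not code from this patch):

DEFINE_PER_CPU(int, cnt);

static void flavors(void)
{
	this_cpu_inc(cnt);	/* always safe: arch instruction or a
				 * generic fallback that disables
				 * interrupts around the RMW */
	__this_cpu_inc(cnt);	/* caller must already have preemption
				 * disabled; will warn under
				 * CONFIG_DEBUG_PREEMPT once the checks
				 * are added */
	raw_cpu_inc(cnt);	/* no checks at all; for callers that
				 * knowingly tolerate races */
}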

V1->V2:
- move __this_cpu_ptr legacy definition to include/asm-generic/percpu.h
- Merge formerly separate x86 pieces because some x86 code in preempt.h
  relies on __this_cpu_ops/raw_cpu_ops to not fall back.

Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/include/linux/percpu.h
===================================================================
--- linux.orig/include/linux/percpu.h	2014-01-30 14:40:36.186810863 -0600
+++ linux/include/linux/percpu.h	2014-01-30 14:40:36.186810863 -0600
@@ -243,6 +243,8 @@
 } while (0)
 
 /*
+ * this_cpu operations (C) 2008-2013 Christoph Lameter <cl@linux.com>
+ *
  * Optimized manipulation for memory allocated through the per cpu
  * allocator or for addresses of per cpu variables.
  *
@@ -296,7 +298,7 @@
 do {									\
 	unsigned long flags;						\
 	raw_local_irq_save(flags);					\
-	*__this_cpu_ptr(&(pcp)) op val;					\
+	*raw_cpu_ptr(&(pcp)) op val;					\
 	raw_local_irq_restore(flags);					\
 } while (0)
 
@@ -381,8 +383,8 @@
 	typeof(pcp) ret__;						\
 	unsigned long flags;						\
 	raw_local_irq_save(flags);					\
-	__this_cpu_add(pcp, val);					\
-	ret__ = __this_cpu_read(pcp);					\
+	raw_cpu_add(pcp, val);					\
+	ret__ = raw_cpu_read(pcp);					\
 	raw_local_irq_restore(flags);					\
 	ret__;								\
 })
@@ -411,8 +413,8 @@
 ({	typeof(pcp) ret__;						\
 	unsigned long flags;						\
 	raw_local_irq_save(flags);					\
-	ret__ = __this_cpu_read(pcp);					\
-	__this_cpu_write(pcp, nval);					\
+	ret__ = raw_cpu_read(pcp);					\
+	raw_cpu_write(pcp, nval);					\
 	raw_local_irq_restore(flags);					\
 	ret__;								\
 })
@@ -439,9 +441,9 @@
 	typeof(pcp) ret__;						\
 	unsigned long flags;						\
 	raw_local_irq_save(flags);					\
-	ret__ = __this_cpu_read(pcp);					\
+	ret__ = raw_cpu_read(pcp);					\
 	if (ret__ == (oval))						\
-		__this_cpu_write(pcp, nval);				\
+		raw_cpu_write(pcp, nval);				\
 	raw_local_irq_restore(flags);					\
 	ret__;								\
 })
@@ -476,7 +478,7 @@
 	int ret__;							\
 	unsigned long flags;						\
 	raw_local_irq_save(flags);					\
-	ret__ = __this_cpu_generic_cmpxchg_double(pcp1, pcp2,		\
+	ret__ = raw_cpu_generic_cmpxchg_double(pcp1, pcp2,		\
 			oval1, oval2, nval1, nval2);			\
 	raw_local_irq_restore(flags);					\
 	ret__;								\
@@ -504,12 +506,8 @@
 #endif
 
 /*
- * Generic percpu operations for context that are safe from preemption/interrupts.
- * Either we do not care about races or the caller has the
- * responsibility of handling preemption/interrupt issues. Arch code can still
- * override these instructions since the arch per cpu code may be more
- * efficient and may actually get race freeness for free (that is the
- * case for x86 for example).
+ * Generic percpu operations for contexts where we do not want to do
+ * any checks for preemption.
  *
  * If there is no other protection through preempt disable and/or
  * disabling interupts then one of these RMW operations can show unexpected
@@ -517,211 +515,272 @@
  * or an interrupt occurred and the same percpu variable was modified from
  * the interrupt context.
  */
-#ifndef __this_cpu_read
-# ifndef __this_cpu_read_1
-#  define __this_cpu_read_1(pcp)	(*__this_cpu_ptr(&(pcp)))
+#ifndef raw_cpu_read
+# ifndef raw_cpu_read_1
+#  define raw_cpu_read_1(pcp)	(*raw_cpu_ptr(&(pcp)))
 # endif
-# ifndef __this_cpu_read_2
-#  define __this_cpu_read_2(pcp)	(*__this_cpu_ptr(&(pcp)))
+# ifndef raw_cpu_read_2
+#  define raw_cpu_read_2(pcp)	(*raw_cpu_ptr(&(pcp)))
 # endif
-# ifndef __this_cpu_read_4
-#  define __this_cpu_read_4(pcp)	(*__this_cpu_ptr(&(pcp)))
+# ifndef raw_cpu_read_4
+#  define raw_cpu_read_4(pcp)	(*raw_cpu_ptr(&(pcp)))
 # endif
-# ifndef __this_cpu_read_8
-#  define __this_cpu_read_8(pcp)	(*__this_cpu_ptr(&(pcp)))
+# ifndef raw_cpu_read_8
+#  define raw_cpu_read_8(pcp)	(*raw_cpu_ptr(&(pcp)))
 # endif
-# define __this_cpu_read(pcp)	__pcpu_size_call_return(__this_cpu_read_, (pcp))
+# define raw_cpu_read(pcp)	__pcpu_size_call_return(raw_cpu_read_, (pcp))
 #endif
 
-#define __this_cpu_generic_to_op(pcp, val, op)				\
+#define raw_cpu_generic_to_op(pcp, val, op)				\
 do {									\
-	*__this_cpu_ptr(&(pcp)) op val;					\
+	*raw_cpu_ptr(&(pcp)) op val;					\
 } while (0)
 
-#ifndef __this_cpu_write
-# ifndef __this_cpu_write_1
-#  define __this_cpu_write_1(pcp, val)	__this_cpu_generic_to_op((pcp), (val), =)
+
+#ifndef raw_cpu_write
+# ifndef raw_cpu_write_1
+#  define raw_cpu_write_1(pcp, val)	raw_cpu_generic_to_op((pcp), (val), =)
 # endif
-# ifndef __this_cpu_write_2
-#  define __this_cpu_write_2(pcp, val)	__this_cpu_generic_to_op((pcp), (val), =)
+# ifndef raw_cpu_write_2
+#  define raw_cpu_write_2(pcp, val)	raw_cpu_generic_to_op((pcp), (val), =)
 # endif
-# ifndef __this_cpu_write_4
-#  define __this_cpu_write_4(pcp, val)	__this_cpu_generic_to_op((pcp), (val), =)
+# ifndef raw_cpu_write_4
+#  define raw_cpu_write_4(pcp, val)	raw_cpu_generic_to_op((pcp), (val), =)
 # endif
-# ifndef __this_cpu_write_8
-#  define __this_cpu_write_8(pcp, val)	__this_cpu_generic_to_op((pcp), (val), =)
+# ifndef raw_cpu_write_8
+#  define raw_cpu_write_8(pcp, val)	raw_cpu_generic_to_op((pcp), (val), =)
 # endif
-# define __this_cpu_write(pcp, val)	__pcpu_size_call(__this_cpu_write_, (pcp), (val))
+# define raw_cpu_write(pcp, val)	__pcpu_size_call(raw_cpu_write_, (pcp), (val))
 #endif
 
-#ifndef __this_cpu_add
-# ifndef __this_cpu_add_1
-#  define __this_cpu_add_1(pcp, val)	__this_cpu_generic_to_op((pcp), (val), +=)
+#ifndef raw_cpu_add
+# ifndef raw_cpu_add_1
+#  define raw_cpu_add_1(pcp, val)	raw_cpu_generic_to_op((pcp), (val), +=)
 # endif
-# ifndef __this_cpu_add_2
-#  define __this_cpu_add_2(pcp, val)	__this_cpu_generic_to_op((pcp), (val), +=)
+# ifndef raw_cpu_add_2
+#  define raw_cpu_add_2(pcp, val)	raw_cpu_generic_to_op((pcp), (val), +=)
 # endif
-# ifndef __this_cpu_add_4
-#  define __this_cpu_add_4(pcp, val)	__this_cpu_generic_to_op((pcp), (val), +=)
+# ifndef raw_cpu_add_4
+#  define raw_cpu_add_4(pcp, val)	raw_cpu_generic_to_op((pcp), (val), +=)
 # endif
-# ifndef __this_cpu_add_8
-#  define __this_cpu_add_8(pcp, val)	__this_cpu_generic_to_op((pcp), (val), +=)
+# ifndef raw_cpu_add_8
+#  define raw_cpu_add_8(pcp, val)	raw_cpu_generic_to_op((pcp), (val), +=)
 # endif
-# define __this_cpu_add(pcp, val)	__pcpu_size_call(__this_cpu_add_, (pcp), (val))
+# define raw_cpu_add(pcp, val)	__pcpu_size_call(raw_cpu_add_, (pcp), (val))
 #endif
 
-#ifndef __this_cpu_sub
-# define __this_cpu_sub(pcp, val)	__this_cpu_add((pcp), -(typeof(pcp))(val))
+#ifndef raw_cpu_sub
+# define raw_cpu_sub(pcp, val)	raw_cpu_add((pcp), -(val))
 #endif
 
-#ifndef __this_cpu_inc
-# define __this_cpu_inc(pcp)		__this_cpu_add((pcp), 1)
+#ifndef raw_cpu_inc
+# define raw_cpu_inc(pcp)		raw_cpu_add((pcp), 1)
 #endif
 
-#ifndef __this_cpu_dec
-# define __this_cpu_dec(pcp)		__this_cpu_sub((pcp), 1)
+#ifndef raw_cpu_dec
+# define raw_cpu_dec(pcp)		raw_cpu_sub((pcp), 1)
 #endif
 
-#ifndef __this_cpu_and
-# ifndef __this_cpu_and_1
-#  define __this_cpu_and_1(pcp, val)	__this_cpu_generic_to_op((pcp), (val), &=)
+#ifndef raw_cpu_and
+# ifndef raw_cpu_and_1
+#  define raw_cpu_and_1(pcp, val)	raw_cpu_generic_to_op((pcp), (val), &=)
 # endif
-# ifndef __this_cpu_and_2
-#  define __this_cpu_and_2(pcp, val)	__this_cpu_generic_to_op((pcp), (val), &=)
+# ifndef raw_cpu_and_2
+#  define raw_cpu_and_2(pcp, val)	raw_cpu_generic_to_op((pcp), (val), &=)
 # endif
-# ifndef __this_cpu_and_4
-#  define __this_cpu_and_4(pcp, val)	__this_cpu_generic_to_op((pcp), (val), &=)
+# ifndef raw_cpu_and_4
+#  define raw_cpu_and_4(pcp, val)	raw_cpu_generic_to_op((pcp), (val), &=)
 # endif
-# ifndef __this_cpu_and_8
-#  define __this_cpu_and_8(pcp, val)	__this_cpu_generic_to_op((pcp), (val), &=)
+# ifndef raw_cpu_and_8
+#  define raw_cpu_and_8(pcp, val)	raw_cpu_generic_to_op((pcp), (val), &=)
 # endif
-# define __this_cpu_and(pcp, val)	__pcpu_size_call(__this_cpu_and_, (pcp), (val))
+# define raw_cpu_and(pcp, val)	__pcpu_size_call(raw_cpu_and_, (pcp), (val))
 #endif
 
-#ifndef __this_cpu_or
-# ifndef __this_cpu_or_1
-#  define __this_cpu_or_1(pcp, val)	__this_cpu_generic_to_op((pcp), (val), |=)
+#ifndef raw_cpu_or
+# ifndef raw_cpu_or_1
+#  define raw_cpu_or_1(pcp, val)	raw_cpu_generic_to_op((pcp), (val), |=)
 # endif
-# ifndef __this_cpu_or_2
-#  define __this_cpu_or_2(pcp, val)	__this_cpu_generic_to_op((pcp), (val), |=)
+# ifndef raw_cpu_or_2
+#  define raw_cpu_or_2(pcp, val)	raw_cpu_generic_to_op((pcp), (val), |=)
 # endif
-# ifndef __this_cpu_or_4
-#  define __this_cpu_or_4(pcp, val)	__this_cpu_generic_to_op((pcp), (val), |=)
+# ifndef raw_cpu_or_4
+#  define raw_cpu_or_4(pcp, val)	raw_cpu_generic_to_op((pcp), (val), |=)
 # endif
-# ifndef __this_cpu_or_8
-#  define __this_cpu_or_8(pcp, val)	__this_cpu_generic_to_op((pcp), (val), |=)
+# ifndef raw_cpu_or_8
+#  define raw_cpu_or_8(pcp, val)	raw_cpu_generic_to_op((pcp), (val), |=)
 # endif
-# define __this_cpu_or(pcp, val)	__pcpu_size_call(__this_cpu_or_, (pcp), (val))
+# define raw_cpu_or(pcp, val)	__pcpu_size_call(raw_cpu_or_, (pcp), (val))
 #endif
 
-#define __this_cpu_generic_add_return(pcp, val)				\
+#define raw_cpu_generic_add_return(pcp, val)				\
 ({									\
-	__this_cpu_add(pcp, val);					\
-	__this_cpu_read(pcp);						\
+	raw_cpu_add(pcp, val);						\
+	raw_cpu_read(pcp);						\
 })
 
-#ifndef __this_cpu_add_return
-# ifndef __this_cpu_add_return_1
-#  define __this_cpu_add_return_1(pcp, val)	__this_cpu_generic_add_return(pcp, val)
+#ifndef raw_cpu_add_return
+# ifndef raw_cpu_add_return_1
+#  define raw_cpu_add_return_1(pcp, val)	raw_cpu_generic_add_return(pcp, val)
 # endif
-# ifndef __this_cpu_add_return_2
-#  define __this_cpu_add_return_2(pcp, val)	__this_cpu_generic_add_return(pcp, val)
+# ifndef raw_cpu_add_return_2
+#  define raw_cpu_add_return_2(pcp, val)	raw_cpu_generic_add_return(pcp, val)
 # endif
-# ifndef __this_cpu_add_return_4
-#  define __this_cpu_add_return_4(pcp, val)	__this_cpu_generic_add_return(pcp, val)
+# ifndef raw_cpu_add_return_4
+#  define raw_cpu_add_return_4(pcp, val)	raw_cpu_generic_add_return(pcp, val)
 # endif
-# ifndef __this_cpu_add_return_8
-#  define __this_cpu_add_return_8(pcp, val)	__this_cpu_generic_add_return(pcp, val)
+# ifndef raw_cpu_add_return_8
+#  define raw_cpu_add_return_8(pcp, val)	raw_cpu_generic_add_return(pcp, val)
 # endif
-# define __this_cpu_add_return(pcp, val)	\
-	__pcpu_size_call_return2(__this_cpu_add_return_, pcp, val)
+# define raw_cpu_add_return(pcp, val)	\
+	__pcpu_size_call_return2(raw_cpu_add_return_, pcp, val)
 #endif
 
-#define __this_cpu_sub_return(pcp, val)	__this_cpu_add_return(pcp, -(typeof(pcp))(val))
-#define __this_cpu_inc_return(pcp)	__this_cpu_add_return(pcp, 1)
-#define __this_cpu_dec_return(pcp)	__this_cpu_add_return(pcp, -1)
+#define raw_cpu_sub_return(pcp, val)	raw_cpu_add_return(pcp, -(typeof(pcp))(val))
+#define raw_cpu_inc_return(pcp)	raw_cpu_add_return(pcp, 1)
+#define raw_cpu_dec_return(pcp)	raw_cpu_add_return(pcp, -1)
 
-#define __this_cpu_generic_xchg(pcp, nval)				\
+#define raw_cpu_generic_xchg(pcp, nval)					\
 ({	typeof(pcp) ret__;						\
-	ret__ = __this_cpu_read(pcp);					\
-	__this_cpu_write(pcp, nval);					\
+	ret__ = raw_cpu_read(pcp);					\
+	raw_cpu_write(pcp, nval);					\
 	ret__;								\
 })
 
-#ifndef __this_cpu_xchg
-# ifndef __this_cpu_xchg_1
-#  define __this_cpu_xchg_1(pcp, nval)	__this_cpu_generic_xchg(pcp, nval)
+#ifndef raw_cpu_xchg
+# ifndef raw_cpu_xchg_1
+#  define raw_cpu_xchg_1(pcp, nval)	raw_cpu_generic_xchg(pcp, nval)
 # endif
-# ifndef __this_cpu_xchg_2
-#  define __this_cpu_xchg_2(pcp, nval)	__this_cpu_generic_xchg(pcp, nval)
+# ifndef raw_cpu_xchg_2
+#  define raw_cpu_xchg_2(pcp, nval)	raw_cpu_generic_xchg(pcp, nval)
 # endif
-# ifndef __this_cpu_xchg_4
-#  define __this_cpu_xchg_4(pcp, nval)	__this_cpu_generic_xchg(pcp, nval)
+# ifndef raw_cpu_xchg_4
+#  define raw_cpu_xchg_4(pcp, nval)	raw_cpu_generic_xchg(pcp, nval)
 # endif
-# ifndef __this_cpu_xchg_8
-#  define __this_cpu_xchg_8(pcp, nval)	__this_cpu_generic_xchg(pcp, nval)
+# ifndef raw_cpu_xchg_8
+#  define raw_cpu_xchg_8(pcp, nval)	raw_cpu_generic_xchg(pcp, nval)
 # endif
-# define __this_cpu_xchg(pcp, nval)	\
-	__pcpu_size_call_return2(__this_cpu_xchg_, (pcp), nval)
+# define raw_cpu_xchg(pcp, nval)	\
+	__pcpu_size_call_return2(raw_cpu_xchg_, (pcp), nval)
 #endif
 
-#define __this_cpu_generic_cmpxchg(pcp, oval, nval)			\
+#define raw_cpu_generic_cmpxchg(pcp, oval, nval)			\
 ({									\
 	typeof(pcp) ret__;						\
-	ret__ = __this_cpu_read(pcp);					\
+	ret__ = raw_cpu_read(pcp);					\
 	if (ret__ == (oval))						\
-		__this_cpu_write(pcp, nval);				\
+		raw_cpu_write(pcp, nval);				\
 	ret__;								\
 })
 
-#ifndef __this_cpu_cmpxchg
-# ifndef __this_cpu_cmpxchg_1
-#  define __this_cpu_cmpxchg_1(pcp, oval, nval)	__this_cpu_generic_cmpxchg(pcp, oval, nval)
+#ifndef raw_cpu_cmpxchg
+# ifndef raw_cpu_cmpxchg_1
+#  define raw_cpu_cmpxchg_1(pcp, oval, nval)	raw_cpu_generic_cmpxchg(pcp, oval, nval)
 # endif
-# ifndef __this_cpu_cmpxchg_2
-#  define __this_cpu_cmpxchg_2(pcp, oval, nval)	__this_cpu_generic_cmpxchg(pcp, oval, nval)
+# ifndef raw_cpu_cmpxchg_2
+#  define raw_cpu_cmpxchg_2(pcp, oval, nval)	raw_cpu_generic_cmpxchg(pcp, oval, nval)
 # endif
-# ifndef __this_cpu_cmpxchg_4
-#  define __this_cpu_cmpxchg_4(pcp, oval, nval)	__this_cpu_generic_cmpxchg(pcp, oval, nval)
+# ifndef raw_cpu_cmpxchg_4
+#  define raw_cpu_cmpxchg_4(pcp, oval, nval)	raw_cpu_generic_cmpxchg(pcp, oval, nval)
 # endif
-# ifndef __this_cpu_cmpxchg_8
-#  define __this_cpu_cmpxchg_8(pcp, oval, nval)	__this_cpu_generic_cmpxchg(pcp, oval, nval)
+# ifndef raw_cpu_cmpxchg_8
+#  define raw_cpu_cmpxchg_8(pcp, oval, nval)	raw_cpu_generic_cmpxchg(pcp, oval, nval)
 # endif
-# define __this_cpu_cmpxchg(pcp, oval, nval)	\
-	__pcpu_size_call_return2(__this_cpu_cmpxchg_, pcp, oval, nval)
+# define raw_cpu_cmpxchg(pcp, oval, nval)	\
+	__pcpu_size_call_return2(raw_cpu_cmpxchg_, pcp, oval, nval)
 #endif
 
-#define __this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2)	\
+#define raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2)	\
 ({									\
 	int __ret = 0;							\
-	if (__this_cpu_read(pcp1) == (oval1) &&				\
-			 __this_cpu_read(pcp2)  == (oval2)) {		\
-		__this_cpu_write(pcp1, (nval1));			\
-		__this_cpu_write(pcp2, (nval2));			\
+	if (raw_cpu_read(pcp1) == (oval1) &&				\
+			 raw_cpu_read(pcp2)  == (oval2)) {		\
+		raw_cpu_write(pcp1, (nval1));				\
+		raw_cpu_write(pcp2, (nval2));				\
 		__ret = 1;						\
 	}								\
 	(__ret);							\
 })
 
-#ifndef __this_cpu_cmpxchg_double
-# ifndef __this_cpu_cmpxchg_double_1
-#  define __this_cpu_cmpxchg_double_1(pcp1, pcp2, oval1, oval2, nval1, nval2)	\
-	__this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2)
-# endif
-# ifndef __this_cpu_cmpxchg_double_2
-#  define __this_cpu_cmpxchg_double_2(pcp1, pcp2, oval1, oval2, nval1, nval2)	\
-	__this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2)
-# endif
-# ifndef __this_cpu_cmpxchg_double_4
-#  define __this_cpu_cmpxchg_double_4(pcp1, pcp2, oval1, oval2, nval1, nval2)	\
-	__this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2)
-# endif
-# ifndef __this_cpu_cmpxchg_double_8
-#  define __this_cpu_cmpxchg_double_8(pcp1, pcp2, oval1, oval2, nval1, nval2)	\
-	__this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2)
+#ifndef raw_cpu_cmpxchg_double
+# ifndef raw_cpu_cmpxchg_double_1
+#  define raw_cpu_cmpxchg_double_1(pcp1, pcp2, oval1, oval2, nval1, nval2)	\
+	raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2)
+# endif
+# ifndef raw_cpu_cmpxchg_double_2
+#  define raw_cpu_cmpxchg_double_2(pcp1, pcp2, oval1, oval2, nval1, nval2)	\
+	raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2)
+# endif
+# ifndef raw_cpu_cmpxchg_double_4
+#  define raw_cpu_cmpxchg_double_4(pcp1, pcp2, oval1, oval2, nval1, nval2)	\
+	raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2)
+# endif
+# ifndef raw_cpu_cmpxchg_double_8
+#  define raw_cpu_cmpxchg_double_8(pcp1, pcp2, oval1, oval2, nval1, nval2)	\
+	raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2)
 # endif
+# define raw_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2)	\
+	__pcpu_double_call_return_bool(raw_cpu_cmpxchg_double_, (pcp1), (pcp2), (oval1), (oval2), (nval1), (nval2))
+#endif
+
+/*
+ * Generic percpu operations for contexts that are safe from preemption/interrupts.
+ * Checks will be added here soon.
+ */
+#ifndef __this_cpu_read
+# define __this_cpu_read(pcp)	__pcpu_size_call_return(raw_cpu_read_, (pcp))
+#endif
+
+#ifndef __this_cpu_write
+# define __this_cpu_write(pcp, val)	__pcpu_size_call(raw_cpu_write_, (pcp), (val))
+#endif
+
+#ifndef __this_cpu_add
+# define __this_cpu_add(pcp, val)	__pcpu_size_call(raw_cpu_add_, (pcp), (val))
+#endif
+
+#ifndef __this_cpu_sub
+# define __this_cpu_sub(pcp, val)	__this_cpu_add((pcp), -(typeof(pcp))(val))
+#endif
+
+#ifndef __this_cpu_inc
+# define __this_cpu_inc(pcp)		__this_cpu_add((pcp), 1)
+#endif
+
+#ifndef __this_cpu_dec
+# define __this_cpu_dec(pcp)		__this_cpu_sub((pcp), 1)
+#endif
+
+#ifndef __this_cpu_and
+# define __this_cpu_and(pcp, val)	__pcpu_size_call(raw_cpu_and_, (pcp), (val))
+#endif
+
+#ifndef __this_cpu_or
+# define __this_cpu_or(pcp, val)	__pcpu_size_call(raw_cpu_or_, (pcp), (val))
+#endif
+
+#ifndef __this_cpu_add_return
+# define __this_cpu_add_return(pcp, val)	\
+	__pcpu_size_call_return2(raw_cpu_add_return_, pcp, val)
+#endif
+
+#define __this_cpu_sub_return(pcp, val)	__this_cpu_add_return(pcp, -(typeof(pcp))(val))
+#define __this_cpu_inc_return(pcp)	__this_cpu_add_return(pcp, 1)
+#define __this_cpu_dec_return(pcp)	__this_cpu_add_return(pcp, -1)
+
+#ifndef __this_cpu_xchg
+# define __this_cpu_xchg(pcp, nval)	\
+	__pcpu_size_call_return2(raw_cpu_xchg_, (pcp), nval)
+#endif
+
+#ifndef __this_cpu_cmpxchg
+# define __this_cpu_cmpxchg(pcp, oval, nval)	\
+	__pcpu_size_call_return2(raw_cpu_cmpxchg_, pcp, oval, nval)
+#endif
+
+#ifndef __this_cpu_cmpxchg_double
 # define __this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2)	\
-	__pcpu_double_call_return_bool(__this_cpu_cmpxchg_double_, (pcp1), (pcp2), (oval1), (oval2), (nval1), (nval2))
+	__pcpu_double_call_return_bool(raw_cpu_cmpxchg_double_, (pcp1), (pcp2), (oval1), (oval2), (nval1), (nval2))
 #endif
 
 #endif /* __LINUX_PERCPU_H */
Index: linux/arch/x86/include/asm/percpu.h
===================================================================
--- linux.orig/arch/x86/include/asm/percpu.h	2014-01-30 14:40:36.186810863 -0600
+++ linux/arch/x86/include/asm/percpu.h	2014-01-30 14:40:36.186810863 -0600
@@ -52,7 +52,7 @@
  * Compared to the generic __my_cpu_offset version, the following
  * saves one instruction and avoids clobbering a temp register.
  */
-#define __this_cpu_ptr(ptr)				\
+#define raw_cpu_ptr(ptr)				\
 ({							\
 	unsigned long tcp_ptr__;			\
 	__verify_pcpu_ptr(ptr);				\
@@ -362,25 +362,25 @@
  */
 #define this_cpu_read_stable(var)	percpu_from_op("mov", var, "p" (&(var)))
 
-#define __this_cpu_read_1(pcp)		percpu_from_op("mov", (pcp), "m"(pcp))
-#define __this_cpu_read_2(pcp)		percpu_from_op("mov", (pcp), "m"(pcp))
-#define __this_cpu_read_4(pcp)		percpu_from_op("mov", (pcp), "m"(pcp))
-
-#define __this_cpu_write_1(pcp, val)	percpu_to_op("mov", (pcp), val)
-#define __this_cpu_write_2(pcp, val)	percpu_to_op("mov", (pcp), val)
-#define __this_cpu_write_4(pcp, val)	percpu_to_op("mov", (pcp), val)
-#define __this_cpu_add_1(pcp, val)	percpu_add_op((pcp), val)
-#define __this_cpu_add_2(pcp, val)	percpu_add_op((pcp), val)
-#define __this_cpu_add_4(pcp, val)	percpu_add_op((pcp), val)
-#define __this_cpu_and_1(pcp, val)	percpu_to_op("and", (pcp), val)
-#define __this_cpu_and_2(pcp, val)	percpu_to_op("and", (pcp), val)
-#define __this_cpu_and_4(pcp, val)	percpu_to_op("and", (pcp), val)
-#define __this_cpu_or_1(pcp, val)	percpu_to_op("or", (pcp), val)
-#define __this_cpu_or_2(pcp, val)	percpu_to_op("or", (pcp), val)
-#define __this_cpu_or_4(pcp, val)	percpu_to_op("or", (pcp), val)
-#define __this_cpu_xchg_1(pcp, val)	percpu_xchg_op(pcp, val)
-#define __this_cpu_xchg_2(pcp, val)	percpu_xchg_op(pcp, val)
-#define __this_cpu_xchg_4(pcp, val)	percpu_xchg_op(pcp, val)
+#define raw_cpu_read_1(pcp)		percpu_from_op("mov", (pcp), "m"(pcp))
+#define raw_cpu_read_2(pcp)		percpu_from_op("mov", (pcp), "m"(pcp))
+#define raw_cpu_read_4(pcp)		percpu_from_op("mov", (pcp), "m"(pcp))
+
+#define raw_cpu_write_1(pcp, val)	percpu_to_op("mov", (pcp), val)
+#define raw_cpu_write_2(pcp, val)	percpu_to_op("mov", (pcp), val)
+#define raw_cpu_write_4(pcp, val)	percpu_to_op("mov", (pcp), val)
+#define raw_cpu_add_1(pcp, val)		percpu_add_op((pcp), val)
+#define raw_cpu_add_2(pcp, val)		percpu_add_op((pcp), val)
+#define raw_cpu_add_4(pcp, val)		percpu_add_op((pcp), val)
+#define raw_cpu_and_1(pcp, val)		percpu_to_op("and", (pcp), val)
+#define raw_cpu_and_2(pcp, val)		percpu_to_op("and", (pcp), val)
+#define raw_cpu_and_4(pcp, val)		percpu_to_op("and", (pcp), val)
+#define raw_cpu_or_1(pcp, val)		percpu_to_op("or", (pcp), val)
+#define raw_cpu_or_2(pcp, val)		percpu_to_op("or", (pcp), val)
+#define raw_cpu_or_4(pcp, val)		percpu_to_op("or", (pcp), val)
+#define raw_cpu_xchg_1(pcp, val)	percpu_xchg_op(pcp, val)
+#define raw_cpu_xchg_2(pcp, val)	percpu_xchg_op(pcp, val)
+#define raw_cpu_xchg_4(pcp, val)	percpu_xchg_op(pcp, val)
 
 #define this_cpu_read_1(pcp)		percpu_from_op("mov", (pcp), "m"(pcp))
 #define this_cpu_read_2(pcp)		percpu_from_op("mov", (pcp), "m"(pcp))
@@ -401,16 +401,16 @@
 #define this_cpu_xchg_2(pcp, nval)	percpu_xchg_op(pcp, nval)
 #define this_cpu_xchg_4(pcp, nval)	percpu_xchg_op(pcp, nval)
 
-#define __this_cpu_add_return_1(pcp, val) percpu_add_return_op(pcp, val)
-#define __this_cpu_add_return_2(pcp, val) percpu_add_return_op(pcp, val)
-#define __this_cpu_add_return_4(pcp, val) percpu_add_return_op(pcp, val)
-#define __this_cpu_cmpxchg_1(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
-#define __this_cpu_cmpxchg_2(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
-#define __this_cpu_cmpxchg_4(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
-
-#define this_cpu_add_return_1(pcp, val)	percpu_add_return_op(pcp, val)
-#define this_cpu_add_return_2(pcp, val)	percpu_add_return_op(pcp, val)
-#define this_cpu_add_return_4(pcp, val)	percpu_add_return_op(pcp, val)
+#define raw_cpu_add_return_1(pcp, val)		percpu_add_return_op(pcp, val)
+#define raw_cpu_add_return_2(pcp, val)		percpu_add_return_op(pcp, val)
+#define raw_cpu_add_return_4(pcp, val)		percpu_add_return_op(pcp, val)
+#define raw_cpu_cmpxchg_1(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
+#define raw_cpu_cmpxchg_2(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
+#define raw_cpu_cmpxchg_4(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
+
+#define this_cpu_add_return_1(pcp, val)		percpu_add_return_op(pcp, val)
+#define this_cpu_add_return_2(pcp, val)		percpu_add_return_op(pcp, val)
+#define this_cpu_add_return_4(pcp, val)		percpu_add_return_op(pcp, val)
 #define this_cpu_cmpxchg_1(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
 #define this_cpu_cmpxchg_2(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
 #define this_cpu_cmpxchg_4(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
@@ -427,7 +427,7 @@
 	__ret;								\
 })
 
-#define __this_cpu_cmpxchg_double_4	percpu_cmpxchg8b_double
+#define raw_cpu_cmpxchg_double_4	percpu_cmpxchg8b_double
 #define this_cpu_cmpxchg_double_4	percpu_cmpxchg8b_double
 #endif /* CONFIG_X86_CMPXCHG64 */
 
@@ -436,22 +436,22 @@
  * 32 bit must fall back to generic operations.
  */
 #ifdef CONFIG_X86_64
-#define __this_cpu_read_8(pcp)		percpu_from_op("mov", (pcp), "m"(pcp))
-#define __this_cpu_write_8(pcp, val)	percpu_to_op("mov", (pcp), val)
-#define __this_cpu_add_8(pcp, val)	percpu_add_op((pcp), val)
-#define __this_cpu_and_8(pcp, val)	percpu_to_op("and", (pcp), val)
-#define __this_cpu_or_8(pcp, val)	percpu_to_op("or", (pcp), val)
-#define __this_cpu_add_return_8(pcp, val) percpu_add_return_op(pcp, val)
-#define __this_cpu_xchg_8(pcp, nval)	percpu_xchg_op(pcp, nval)
-#define __this_cpu_cmpxchg_8(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
-
-#define this_cpu_read_8(pcp)		percpu_from_op("mov", (pcp), "m"(pcp))
-#define this_cpu_write_8(pcp, val)	percpu_to_op("mov", (pcp), val)
-#define this_cpu_add_8(pcp, val)	percpu_add_op((pcp), val)
-#define this_cpu_and_8(pcp, val)	percpu_to_op("and", (pcp), val)
-#define this_cpu_or_8(pcp, val)		percpu_to_op("or", (pcp), val)
-#define this_cpu_add_return_8(pcp, val)	percpu_add_return_op(pcp, val)
-#define this_cpu_xchg_8(pcp, nval)	percpu_xchg_op(pcp, nval)
+#define raw_cpu_read_8(pcp)			percpu_from_op("mov", (pcp), "m"(pcp))
+#define raw_cpu_write_8(pcp, val)		percpu_to_op("mov", (pcp), val)
+#define raw_cpu_add_8(pcp, val)			percpu_add_op((pcp), val)
+#define raw_cpu_and_8(pcp, val)			percpu_to_op("and", (pcp), val)
+#define raw_cpu_or_8(pcp, val)			percpu_to_op("or", (pcp), val)
+#define raw_cpu_add_return_8(pcp, val)		percpu_add_return_op(pcp, val)
+#define raw_cpu_xchg_8(pcp, nval)		percpu_xchg_op(pcp, nval)
+#define raw_cpu_cmpxchg_8(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
+
+#define this_cpu_read_8(pcp)			percpu_from_op("mov", (pcp), "m"(pcp))
+#define this_cpu_write_8(pcp, val)		percpu_to_op("mov", (pcp), val)
+#define this_cpu_add_8(pcp, val)		percpu_add_op((pcp), val)
+#define this_cpu_and_8(pcp, val)		percpu_to_op("and", (pcp), val)
+#define this_cpu_or_8(pcp, val)			percpu_to_op("or", (pcp), val)
+#define this_cpu_add_return_8(pcp, val)		percpu_add_return_op(pcp, val)
+#define this_cpu_xchg_8(pcp, nval)		percpu_xchg_op(pcp, nval)
 #define this_cpu_cmpxchg_8(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
 
 /*
@@ -474,7 +474,7 @@
 	__ret;								\
 })
 
-#define __this_cpu_cmpxchg_double_8	percpu_cmpxchg16b_double
+#define raw_cpu_cmpxchg_double_8	percpu_cmpxchg16b_double
 #define this_cpu_cmpxchg_double_8	percpu_cmpxchg16b_double
 
 #endif
@@ -495,9 +495,9 @@
 	unsigned long __percpu *a = (unsigned long *)addr + nr / BITS_PER_LONG;
 
 #ifdef CONFIG_X86_64
-	return ((1UL << (nr % BITS_PER_LONG)) & __this_cpu_read_8(*a)) != 0;
+	return ((1UL << (nr % BITS_PER_LONG)) & raw_cpu_read_8(*a)) != 0;
 #else
-	return ((1UL << (nr % BITS_PER_LONG)) & __this_cpu_read_4(*a)) != 0;
+	return ((1UL << (nr % BITS_PER_LONG)) & raw_cpu_read_4(*a)) != 0;
 #endif
 }
 
Index: linux/include/asm-generic/percpu.h
===================================================================
--- linux.orig/include/asm-generic/percpu.h	2014-01-30 14:40:36.186810863 -0600
+++ linux/include/asm-generic/percpu.h	2014-01-30 14:40:36.186810863 -0600
@@ -56,17 +56,17 @@
 #define per_cpu(var, cpu) \
 	(*SHIFT_PERCPU_PTR(&(var), per_cpu_offset(cpu)))
 
-#ifndef __this_cpu_ptr
-#define __this_cpu_ptr(ptr) SHIFT_PERCPU_PTR(ptr, __my_cpu_offset)
+#ifndef raw_cpu_ptr
+#define raw_cpu_ptr(ptr) SHIFT_PERCPU_PTR(ptr, __my_cpu_offset)
 #endif
 #ifdef CONFIG_DEBUG_PREEMPT
 #define this_cpu_ptr(ptr) SHIFT_PERCPU_PTR(ptr, my_cpu_offset)
 #else
-#define this_cpu_ptr(ptr) __this_cpu_ptr(ptr)
+#define this_cpu_ptr(ptr) raw_cpu_ptr(ptr)
 #endif
 
 #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
-#define __raw_get_cpu_var(var) (*__this_cpu_ptr(&(var)))
+#define __raw_get_cpu_var(var) (*raw_cpu_ptr(&(var)))
 
 #ifdef CONFIG_HAVE_SETUP_PER_CPU_AREA
 extern void setup_per_cpu_areas(void);
@@ -83,7 +83,7 @@
 #define __get_cpu_var(var)	(*VERIFY_PERCPU_PTR(&(var)))
 #define __raw_get_cpu_var(var)	(*VERIFY_PERCPU_PTR(&(var)))
 #define this_cpu_ptr(ptr)	per_cpu_ptr(ptr, 0)
-#define __this_cpu_ptr(ptr)	this_cpu_ptr(ptr)
+#define raw_cpu_ptr(ptr)	this_cpu_ptr(ptr)
 
 #endif	/* SMP */
 
@@ -122,4 +122,7 @@
 #define PER_CPU_DEF_ATTRIBUTES
 #endif
 
+/* Keep until we have removed all uses of __this_cpu_ptr */
+#define __this_cpu_ptr raw_cpu_ptr
+
 #endif /* _ASM_GENERIC_PERCPU_H_ */
Index: linux/arch/x86/include/asm/preempt.h
===================================================================
--- linux.orig/arch/x86/include/asm/preempt.h	2014-01-30 14:40:36.186810863 -0600
+++ linux/arch/x86/include/asm/preempt.h	2014-01-30 14:40:36.186810863 -0600
@@ -19,12 +19,12 @@
  */
 static __always_inline int preempt_count(void)
 {
-	return __this_cpu_read_4(__preempt_count) & ~PREEMPT_NEED_RESCHED;
+	return raw_cpu_read_4(__preempt_count) & ~PREEMPT_NEED_RESCHED;
 }
 
 static __always_inline void preempt_count_set(int pc)
 {
-	__this_cpu_write_4(__preempt_count, pc);
+	raw_cpu_write_4(__preempt_count, pc);
 }
 
 /*
@@ -53,17 +53,17 @@
 
 static __always_inline void set_preempt_need_resched(void)
 {
-	__this_cpu_and_4(__preempt_count, ~PREEMPT_NEED_RESCHED);
+	raw_cpu_and_4(__preempt_count, ~PREEMPT_NEED_RESCHED);
 }
 
 static __always_inline void clear_preempt_need_resched(void)
 {
-	__this_cpu_or_4(__preempt_count, PREEMPT_NEED_RESCHED);
+	raw_cpu_or_4(__preempt_count, PREEMPT_NEED_RESCHED);
 }
 
 static __always_inline bool test_preempt_need_resched(void)
 {
-	return !(__this_cpu_read_4(__preempt_count) & PREEMPT_NEED_RESCHED);
+	return !(raw_cpu_read_4(__preempt_count) & PREEMPT_NEED_RESCHED);
 }
 
 /*
@@ -72,12 +72,12 @@
 
 static __always_inline void __preempt_count_add(int val)
 {
-	__this_cpu_add_4(__preempt_count, val);
+	raw_cpu_add_4(__preempt_count, val);
 }
 
 static __always_inline void __preempt_count_sub(int val)
 {
-	__this_cpu_add_4(__preempt_count, -val);
+	raw_cpu_add_4(__preempt_count, -val);
 }
 
 /*
@@ -95,7 +95,7 @@
  */
 static __always_inline bool should_resched(void)
 {
-	return unlikely(!__this_cpu_read_4(__preempt_count));
+	return unlikely(!raw_cpu_read_4(__preempt_count));
 }
 
 #ifdef CONFIG_PREEMPT



* [PATCH 02/48] mm: Use raw_cpu ops for determining current NUMA node
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
@ 2014-02-14 20:18   ` Christoph Lameter
  2014-02-14 20:18   ` Christoph Lameter
                     ` (47 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, linux-mm, Alex Shi

[-- Attachment #1: preempt_fix_numa_node --]
[-- Type: text/plain, Size: 4926 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

With the preemption checking logic for __this_cpu ops we will get
false positives from locations in the code that use numa_node_id().

Before the __this_cpu ops were introduced there were no preemption
checks present either; raw_smp_processor_id() was used. See
http://www.spinics.net/lists/linux-numa/msg00641.html

Therefore we need to use raw_cpu_read here to avoid false positives.

Note that this issue has been discussed in prior years. If the
process changes nodes after retrieving the current numa node, that is
acceptable, since most uses of numa_node etc. are for optimization
and not for correctness.

There were suggestions to implement a raw_numa_node_id() in order to
do preempt checks for numa_node_id() as well. But that is better
deferred to another patch, since it would mean investigating how
numa_node_id() is used throughout the kernel, which would increase
the scope of this patchset significantly. After all, preemption was
never checked before when numa_node_id() was used.
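
A typical optimization-only use looks like this (a sketch, not taken
from this patch); a stale node id merely means a less local
allocation:

static struct page *grab_local_page(void)
{
	/* Prefer the node we are currently running on; if we migrate
	 * right after reading the id, we merely allocate remotely. */
	return alloc_pages_node(numa_node_id(), GFP_KERNEL, 0);
}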

Some sample traces:

__this_cpu_read operation in preemptible [00000000] code: login/1456
caller is __this_cpu_preempt_check+0x2b/0x2d
CPU: 0 PID: 1456 Comm: login Not tainted 3.12.0-rc4-cl-00062-g2fe80d3-dirty #185
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
 000000000000013c ffff88001f31ba58 ffffffff8147cf5e ffff88001f31bfd8
 ffff88001f31ba88 ffffffff8127eea9 0000000000000000 ffff88001f3975c0
 00000000f7707000 ffff88001f3975c0 ffff88001f31bac0 ffffffff8127eeef
Call Trace:
 [<ffffffff8147cf5e>] dump_stack+0x4e/0x82
 [<ffffffff8127eea9>] check_preemption_disabled+0xc5/0xe0
 [<ffffffff8127eeef>] __this_cpu_preempt_check+0x2b/0x2d
 [<ffffffff81030ff5>] ? show_stack+0x3b/0x3d
 [<ffffffff810ebee3>] get_task_policy+0x1d/0x49
 [<ffffffff810ed705>] get_vma_policy+0x14/0x76
 [<ffffffff810ed8ff>] alloc_pages_vma+0x35/0xff
 [<ffffffff810dad97>] handle_mm_fault+0x290/0x73b
 [<ffffffff810503da>] __do_page_fault+0x3fe/0x44d
 [<ffffffff8109b360>] ? trace_hardirqs_on_caller+0x142/0x19e
 [<ffffffff8109b3c9>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff81278bed>] ? trace_hardirqs_off_thunk+0x3a/0x3c
 [<ffffffff810be97f>] ? find_get_pages_contig+0x18e/0x18e
 [<ffffffff810be97f>] ? find_get_pages_contig+0x18e/0x18e
 [<ffffffff81050451>] do_page_fault+0x9/0xc
 [<ffffffff81483602>] page_fault+0x22/0x30
 [<ffffffff810be97f>] ? find_get_pages_contig+0x18e/0x18e
 [<ffffffff810be97f>] ? find_get_pages_contig+0x18e/0x18e
 [<ffffffff810be4c3>] ? file_read_actor+0x3a/0x15a
 [<ffffffff810be97f>] ? find_get_pages_contig+0x18e/0x18e
 [<ffffffff810bffab>] generic_file_aio_read+0x38e/0x624
 [<ffffffff810f6d69>] do_sync_read+0x54/0x73
 [<ffffffff810f7890>] vfs_read+0x9d/0x12a
 [<ffffffff810f7a59>] SyS_read+0x47/0x7e
 [<ffffffff81484f21>] cstar_dispatch+0x7/0x23


caller is __this_cpu_preempt_check+0x2b/0x2d
CPU: 0 PID: 1456 Comm: login Not tainted 3.12.0-rc4-cl-00062-g2fe80d3-dirty #185
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
 00000000000000e8 ffff88001f31bbf8 ffffffff8147cf5e ffff88001f31bfd8
 ffff88001f31bc28 ffffffff8127eea9 ffffffff823c5c40 00000000000213da
 0000000000000000 0000000000000000 ffff88001f31bc60 ffffffff8127eeef
Call Trace:
 [<ffffffff8147cf5e>] dump_stack+0x4e/0x82
 [<ffffffff8127eea9>] check_preemption_disabled+0xc5/0xe0
 [<ffffffff8127eeef>] __this_cpu_preempt_check+0x2b/0x2d
 [<ffffffff810e006e>] ? install_special_mapping+0x11/0xe4
 [<ffffffff810ec8a8>] alloc_pages_current+0x8f/0xbc
 [<ffffffff810bec6b>] __page_cache_alloc+0xb/0xd
 [<ffffffff810c7e90>] __do_page_cache_readahead+0xf4/0x219
 [<ffffffff810c7e0e>] ? __do_page_cache_readahead+0x72/0x219
 [<ffffffff810c827c>] ra_submit+0x1c/0x20
 [<ffffffff810c850c>] ondemand_readahead+0x28c/0x2b4
 [<ffffffff810c85e9>] page_cache_sync_readahead+0x38/0x3a
 [<ffffffff810bfe7e>] generic_file_aio_read+0x261/0x624
 [<ffffffff810f6d69>] do_sync_read+0x54/0x73
 [<ffffffff810f7890>] vfs_read+0x9d/0x12a
 [<ffffffff810f7a59>] SyS_read+0x47/0x7e
 [<ffffffff81484f21>] cstar_dispatch+0x7/0x23

Cc: linux-mm@kvack.org
Cc: Alex Shi <alex.shi@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/include/linux/topology.h
===================================================================
--- linux.orig/include/linux/topology.h	2013-12-02 16:07:51.304591590 -0600
+++ linux/include/linux/topology.h	2013-12-02 16:07:51.304591590 -0600
@@ -188,7 +188,7 @@ DECLARE_PER_CPU(int, numa_node);
 /* Returns the number of the current Node. */
 static inline int numa_node_id(void)
 {
-	return __this_cpu_read(numa_node);
+	return raw_cpu_read(numa_node);
 }
 #endif
 
@@ -245,7 +245,7 @@ static inline void set_numa_mem(int node
 /* Returns the number of the nearest Node with memory */
 static inline int numa_mem_id(void)
 {
-	return __this_cpu_read(_numa_mem_);
+	return raw_cpu_read(_numa_mem_);
 }
 #endif
 



* [PATCH 03/48] modules: Use raw_cpu_write for initialization of per cpu refcount.
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
  2014-02-14 20:18 ` [PATCH 01/48] percpu: Add raw_cpu_ops Christoph Lameter
  2014-02-14 20:18   ` Christoph Lameter
@ 2014-02-14 20:18 ` Christoph Lameter
  2014-02-14 20:18 ` [PATCH 04/48] net: Replace __this_cpu_inc in route.c with raw_cpu_inc Christoph Lameter
                   ` (45 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: preempt_module --]
[-- Type: text/plain, Size: 1966 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

The initialization of a structure is not subject to synchronization:
nothing else can see the module at this point. The use of
__this_cpu_write() would therefore trigger a false positive from the
additional preemption checks for __this_cpu ops.

So simply bypass the check through the use of the raw_cpu op (see the
sketch below).
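
The general pattern (a sketch with made-up names, not from this
patch): per cpu state that is set up before an object is published
can safely use raw ops, since no concurrent access is possible yet.

struct foo {
	int __percpu *cnt;
};

static struct foo *foo_create(void)
{
	struct foo *f = kzalloc(sizeof(*f), GFP_KERNEL);

	if (!f)
		return NULL;
	f->cnt = alloc_percpu(int);
	if (!f->cnt) {
		kfree(f);
		return NULL;
	}
	/* f is not yet visible to anyone else; a preemption check
	 * here could only produce a false positive. */
	raw_cpu_write(*f->cnt, 1);
	return f;
}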

Trace:

[    0.668066] __this_cpu_write operation in preemptible [00000000] code: modprobe/286
[    0.668108] caller is __this_cpu_preempt_check+0x38/0x60
[    0.668111] CPU: 3 PID: 286 Comm: modprobe Tainted: GF            3.12.0-rc4+ #187
[    0.668112] Hardware name: FUJITSU CELSIUS W530 Power/D3227-A1, BIOS V4.6.5.4 R1.10.0 for D3227-A1x 09/16/2013
[    0.668113]  0000000000000003 ffff8807edda1d18 ffffffff816d5a57 ffff8807edda1fd8
[    0.668117]  ffff8807edda1d48 ffffffff8137359c ffff8807edda1ef8 ffffffffa0002178
[    0.668121]  ffffc90000067730 ffff8807edda1e48 ffff8807edda1d88 ffffffff813735f8
[    0.668124] Call Trace:
[    0.668129]  [<ffffffff816d5a57>] dump_stack+0x4e/0x82
[    0.668132]  [<ffffffff8137359c>] check_preemption_disabled+0xec/0x110
[    0.668135]  [<ffffffff813735f8>] __this_cpu_preempt_check+0x38/0x60
[    0.668139]  [<ffffffff810c24fd>] load_module+0xcfd/0x2650
[    0.668143]  [<ffffffff816dd922>] ? page_fault+0x22/0x30
[    0.668146]  [<ffffffff810c3ef6>] SyS_init_module+0xa6/0xd0
[    0.668150]  [<ffffffff816e4fd3>] tracesys+0xe1/0xe6

Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/kernel/module.c
===================================================================
--- linux.orig/kernel/module.c	2013-12-02 16:07:51.644582143 -0600
+++ linux/kernel/module.c	2013-12-02 16:07:51.634582418 -0600
@@ -640,7 +640,7 @@ static int module_unload_init(struct mod
 	INIT_LIST_HEAD(&mod->target_list);
 
 	/* Hold reference count during initialization. */
-	__this_cpu_write(mod->refptr->incs, 1);
+	raw_cpu_write(mod->refptr->incs, 1);
 
 	return 0;
 }



* [PATCH 04/48] net: Replace __this_cpu_inc in route.c with raw_cpu_inc
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (2 preceding siblings ...)
  2014-02-14 20:18 ` [PATCH 03/48] modules: Use raw_cpu_write for initialization of per cpu refcount Christoph Lameter
@ 2014-02-14 20:18 ` Christoph Lameter
  2014-02-14 20:18 ` [PATCH 05/48] percpu: Add preemption checks to __this_cpu ops Christoph Lameter
                   ` (44 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, netdev, edumazet, David S. Miller

[-- Attachment #1: preempt_rt_cache_stat --]
[-- Type: text/plain, Size: 3608 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

The RT_CACHE_STAT_INC macro triggers the new preemption checks
for __this_cpu ops.

I do not see any other synchronization that would allow the use of a
__this_cpu operation here. However, in commit
dbd2915ce87e811165da0717f8e159276ebb803e Andrew justified the use of
raw_smp_processor_id() because "we do not care" about races: in the
past we agreed that the price of disabling interrupts here just to
get consistent counters would be too high. These counters may be
inaccurate due to race conditions.

The use of a __this_cpu op already improves on what commit
dbd2915ce87e811165da0717f8e159276ebb803e did, since the single
instruction emitted on x86 does not allow the race to occur anymore.
However, non-x86 platforms could still experience a race here (see
the sketch below).
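
For reference, the generic (non-x86) expansion of raw_cpu_inc() from
the raw_cpu_ops patch is an ordinary read-modify-write, roughly:

	/* raw_cpu_generic_to_op(): separate load and store; preemption
	 * or an interrupt in between can lose an update. */
	*raw_cpu_ptr(&rt_cache_stat.field) += 1;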

Trace:

[ 1277.189084] __this_cpu_add operation in preemptible [00000000] code: avahi-daemon/1193
[ 1277.189085] caller is __this_cpu_preempt_check+0x38/0x60
[ 1277.189086] CPU: 1 PID: 1193 Comm: avahi-daemon Tainted: GF            3.12.0-rc4+ #187
[ 1277.189087] Hardware name: FUJITSU CELSIUS W530 Power/D3227-A1, BIOS V4.6.5.4 R1.10.0 for D3227-A1x 09/16/2013
[ 1277.189088]  0000000000000001 ffff8807ef78fa00 ffffffff816d5a57 ffff8807ef78ffd8
[ 1277.189089]  ffff8807ef78fa30 ffffffff8137359c ffff8807ef78fba0 ffff88079f822b40
[ 1277.189091]  0000000020000000 ffff8807ee32c800 ffff8807ef78fa70 ffffffff813735f8
[ 1277.189093] Call Trace:
[ 1277.189094]  [<ffffffff816d5a57>] dump_stack+0x4e/0x82
[ 1277.189096]  [<ffffffff8137359c>] check_preemption_disabled+0xec/0x110
[ 1277.189097]  [<ffffffff813735f8>] __this_cpu_preempt_check+0x38/0x60
[ 1277.189098]  [<ffffffff81610d65>] __ip_route_output_key+0x575/0x8c0
[ 1277.189100]  [<ffffffff816110d7>] ip_route_output_flow+0x27/0x70
[ 1277.189101]  [<ffffffff81616c80>] ? ip_copy_metadata+0x1a0/0x1a0
[ 1277.189102]  [<ffffffff81640b15>] udp_sendmsg+0x825/0xa20
[ 1277.189104]  [<ffffffff811b4aa9>] ? do_sys_poll+0x449/0x5d0
[ 1277.189105]  [<ffffffff8164c695>] inet_sendmsg+0x85/0xc0
[ 1277.189106]  [<ffffffff815c6e3c>] sock_sendmsg+0x9c/0xd0
[ 1277.189108]  [<ffffffff813735f8>] ? __this_cpu_preempt_check+0x38/0x60
[ 1277.189109]  [<ffffffff815c7550>] ? move_addr_to_kernel+0x40/0xa0
[ 1277.189111]  [<ffffffff815c71ec>] ___sys_sendmsg+0x37c/0x390
[ 1277.189112]  [<ffffffff8136613a>] ? string.isra.3+0x3a/0xd0
[ 1277.189113]  [<ffffffff8136613a>] ? string.isra.3+0x3a/0xd0
[ 1277.189115]  [<ffffffff81367b54>] ? vsnprintf+0x364/0x650
[ 1277.189116]  [<ffffffff81367ee9>] ? snprintf+0x39/0x40
[ 1277.189118]  [<ffffffff813735f8>] ? __this_cpu_preempt_check+0x38/0x60
[ 1277.189119]  [<ffffffff815c7ff9>] __sys_sendmsg+0x49/0x90
[ 1277.189121]  [<ffffffff815c8052>] SyS_sendmsg+0x12/0x20
[ 1277.189122]  [<ffffffff816e4fd3>] tracesys+0xe1/0xe6

Cc: netdev@vger.kernel.org
Cc: edumazet@google.com
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/net/ipv4/route.c
===================================================================
--- linux.orig/net/ipv4/route.c	2014-01-30 14:40:44.276650882 -0600
+++ linux/net/ipv4/route.c	2014-01-30 14:40:44.276650882 -0600
@@ -194,7 +194,7 @@
 EXPORT_SYMBOL(ip_tos2prio);
 
 static DEFINE_PER_CPU(struct rt_cache_stat, rt_cache_stat);
-#define RT_CACHE_STAT_INC(field) __this_cpu_inc(rt_cache_stat.field)
+#define RT_CACHE_STAT_INC(field) raw_cpu_inc(rt_cache_stat.field)
 
 #ifdef CONFIG_PROC_FS
 static void *rt_cache_seq_start(struct seq_file *seq, loff_t *pos)



* [PATCH 05/48] percpu: Add preemption checks to __this_cpu ops
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (3 preceding siblings ...)
  2014-02-14 20:18 ` [PATCH 04/48] net: Replace __this_cpu_inc in route.c with raw_cpu_inc Christoph Lameter
@ 2014-02-14 20:18 ` Christoph Lameter
  2014-03-04 22:27   ` Andrew Morton
  2014-02-14 20:18   ` Christoph Lameter
                   ` (43 subsequent siblings)
  48 siblings, 1 reply; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: preempt_check_this_cpu_ops --]
[-- Type: text/plain, Size: 5031 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

We define the check function out of line (in lib/smp_processor_id.c)
in order to avoid trouble with include file dependencies. The higher
level __this_cpu macros are then modified to invoke the preemption
check (see the sketch below).
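
For example (a sketch assuming CONFIG_DEBUG_PREEMPT and a made-up
counter), the added check fires like this:

DEFINE_PER_CPU(int, cnt);

static void bad(void)
{
	/* Preemptible context: the macro now calls
	 * __this_cpu_preempt_check("add") first, which prints
	 * "BUG: using __this_cpu_add() in preemptible ..." plus a
	 * backtrace, and then performs the raw op anyway. */
	__this_cpu_inc(cnt);
}

static void good(void)
{
	preempt_disable();
	__this_cpu_inc(cnt);	/* no warning: preemption is disabled */
	preempt_enable();
}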

Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/include/linux/percpu.h
===================================================================
--- linux.orig/include/linux/percpu.h	2014-01-30 14:40:50.936519233 -0600
+++ linux/include/linux/percpu.h	2014-01-30 14:40:50.936519233 -0600
@@ -173,6 +173,12 @@
 
 extern void __bad_size_call_parameter(void);
 
+#ifdef CONFIG_DEBUG_PREEMPT
+extern void __this_cpu_preempt_check(const char *op);
+#else
+static inline void __this_cpu_preempt_check(const char *op) { }
+#endif
+
 #define __pcpu_size_call_return(stem, variable)				\
 ({	typeof(variable) pscr_ret__;					\
 	__verify_pcpu_ptr(&(variable));					\
@@ -725,18 +731,24 @@
 
 /*
 * Generic percpu operations for contexts that are safe from preemption/interrupts.
- * Checks will be added here soon.
  */
 #ifndef __this_cpu_read
-# define __this_cpu_read(pcp)	__pcpu_size_call_return(raw_cpu_read_, (pcp))
+# define __this_cpu_read(pcp) \
+	(__this_cpu_preempt_check("read"),__pcpu_size_call_return(raw_cpu_read_, (pcp)))
 #endif
 
 #ifndef __this_cpu_write
-# define __this_cpu_write(pcp, val)	__pcpu_size_call(raw_cpu_write_, (pcp), (val))
+# define __this_cpu_write(pcp, val)					\
+do { __this_cpu_preempt_check("write");					\
+     __pcpu_size_call(raw_cpu_write_, (pcp), (val));			\
+} while (0)
 #endif
 
 #ifndef __this_cpu_add
-# define __this_cpu_add(pcp, val)	__pcpu_size_call(raw_cpu_add_, (pcp), (val))
+# define __this_cpu_add(pcp, val)					 \
+do { __this_cpu_preempt_check("add");					\
+	__pcpu_size_call(raw_cpu_add_, (pcp), (val));			\
+} while (0)
 #endif
 
 #ifndef __this_cpu_sub
@@ -752,16 +764,23 @@
 #endif
 
 #ifndef __this_cpu_and
-# define __this_cpu_and(pcp, val)	__pcpu_size_call(raw_cpu_and_, (pcp), (val))
+# define __this_cpu_and(pcp, val)					\
+do { __this_cpu_preempt_check("and");					\
+	__pcpu_size_call(raw_cpu_and_, (pcp), (val));			\
+} while (0)
+
 #endif
 
 #ifndef __this_cpu_or
-# define __this_cpu_or(pcp, val)	__pcpu_size_call(raw_cpu_or_, (pcp), (val))
+# define __this_cpu_or(pcp, val)					\
+do { __this_cpu_preempt_check("or");					\
+	__pcpu_size_call(raw_cpu_or_, (pcp), (val));			\
+} while (0)
 #endif
 
 #ifndef __this_cpu_add_return
 # define __this_cpu_add_return(pcp, val)	\
-	__pcpu_size_call_return2(raw_cpu_add_return_, pcp, val)
+	(__this_cpu_preempt_check("add_return"),__pcpu_size_call_return2(raw_cpu_add_return_, pcp, val))
 #endif
 
 #define __this_cpu_sub_return(pcp, val)	__this_cpu_add_return(pcp, -(typeof(pcp))(val))
@@ -770,17 +789,17 @@
 
 #ifndef __this_cpu_xchg
 # define __this_cpu_xchg(pcp, nval)	\
-	__pcpu_size_call_return2(raw_cpu_xchg_, (pcp), nval)
+	(__this_cpu_preempt_check("xchg"),__pcpu_size_call_return2(raw_cpu_xchg_, (pcp), nval))
 #endif
 
 #ifndef __this_cpu_cmpxchg
 # define __this_cpu_cmpxchg(pcp, oval, nval)	\
-	__pcpu_size_call_return2(raw_cpu_cmpxchg_, pcp, oval, nval)
+	(__this_cpu_preempt_check("cmpxchg"),__pcpu_size_call_return2(raw_cpu_cmpxchg_, pcp, oval, nval))
 #endif
 
 #ifndef __this_cpu_cmpxchg_double
 # define __this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2)	\
-	__pcpu_double_call_return_bool(raw_cpu_cmpxchg_double_, (pcp1), (pcp2), (oval1), (oval2), (nval1), (nval2))
+	(__this_cpu_preempt_check("cmpxchg_double"),__pcpu_double_call_return_bool(raw_cpu_cmpxchg_double_, (pcp1), (pcp2), (oval1), (oval2), (nval1), (nval2)))
 #endif
 
 #endif /* __LINUX_PERCPU_H */
Index: linux/lib/smp_processor_id.c
===================================================================
--- linux.orig/lib/smp_processor_id.c	2014-01-30 14:40:50.936519233 -0600
+++ linux/lib/smp_processor_id.c	2014-01-30 14:40:50.936519233 -0600
@@ -7,7 +7,7 @@
 #include <linux/kallsyms.h>
 #include <linux/sched.h>
 
-notrace unsigned int debug_smp_processor_id(void)
+notrace static unsigned int check_preemption_disabled(char *what)
 {
 	int this_cpu = raw_smp_processor_id();
 
@@ -38,9 +38,9 @@
 	if (!printk_ratelimit())
 		goto out_enable;
 
-	printk(KERN_ERR "BUG: using smp_processor_id() in preemptible [%08x] "
-			"code: %s/%d\n",
-			preempt_count() - 1, current->comm, current->pid);
+	printk(KERN_ERR "BUG: using %s in preemptible [%08x] code: %s/%d\n",
+		what, preempt_count() - 1, current->comm, current->pid);
+
 	print_symbol("caller is %s\n", (long)__builtin_return_address(0));
 	dump_stack();
 
@@ -50,5 +50,17 @@
 	return this_cpu;
 }
 
+notrace unsigned int debug_smp_processor_id(void)
+{
+	return check_preemption_disabled("smp_processor_id()");
+}
 EXPORT_SYMBOL(debug_smp_processor_id);
 
+notrace void __this_cpu_preempt_check(const char *op)
+{
+	char text[40];
+
+	snprintf(text, sizeof(text), "__this_cpu_%s()", op);
+	check_preemption_disabled(text);
+}
+EXPORT_SYMBOL(__this_cpu_preempt_check);



* [PATCH 06/48] mm: Replace __get_cpu_var uses with this_cpu_ptr
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
@ 2014-02-14 20:18   ` Christoph Lameter
  2014-02-14 20:18   ` Christoph Lameter
                     ` (47 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, akpm, linux-mm

[-- Attachment #1: this_mm --]
[-- Type: text/plain, Size: 6225 bytes --]

Replace places where __get_cpu_var() is used for an address calculation
with this_cpu_ptr().
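
The conversion makes the address-of explicit instead of hiding it in the
macro. A minimal sketch of the pattern (struct foo and foo_state are
hypothetical):

	#include <linux/percpu.h>

	struct foo {
		int n;
	};
	static DEFINE_PER_CPU(struct foo, foo_state);

	static void example(void)
	{
		struct foo *p;

		p = &__get_cpu_var(foo_state);	/* old: address-of the macro result */
		p = this_cpu_ptr(&foo_state);	/* new: & applied to the percpu symbol */
	}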

Cc: akpm@linux-foundation.org
Cc: linux-mm@kvack.org
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/lib/radix-tree.c
===================================================================
--- linux.orig/lib/radix-tree.c	2014-02-03 13:41:17.832645984 -0600
+++ linux/lib/radix-tree.c	2014-02-03 13:41:17.822646194 -0600
@@ -221,7 +221,7 @@
 		 * succeed in getting a node here (and never reach
 		 * kmem_cache_alloc)
 		 */
-		rtp = &__get_cpu_var(radix_tree_preloads);
+		rtp = this_cpu_ptr(&radix_tree_preloads);
 		if (rtp->nr) {
 			ret = rtp->nodes[rtp->nr - 1];
 			rtp->nodes[rtp->nr - 1] = NULL;
@@ -277,14 +277,14 @@
 	int ret = -ENOMEM;
 
 	preempt_disable();
-	rtp = &__get_cpu_var(radix_tree_preloads);
+	rtp = this_cpu_ptr(&radix_tree_preloads);
 	while (rtp->nr < ARRAY_SIZE(rtp->nodes)) {
 		preempt_enable();
 		node = kmem_cache_alloc(radix_tree_node_cachep, gfp_mask);
 		if (node == NULL)
 			goto out;
 		preempt_disable();
-		rtp = &__get_cpu_var(radix_tree_preloads);
+		rtp = this_cpu_ptr(&radix_tree_preloads);
 		if (rtp->nr < ARRAY_SIZE(rtp->nodes))
 			rtp->nodes[rtp->nr++] = node;
 		else
Index: linux/mm/memcontrol.c
===================================================================
--- linux.orig/mm/memcontrol.c	2014-02-03 13:41:17.832645984 -0600
+++ linux/mm/memcontrol.c	2014-02-03 13:41:17.822646194 -0600
@@ -2475,7 +2475,7 @@
  */
 static void drain_local_stock(struct work_struct *dummy)
 {
-	struct memcg_stock_pcp *stock = &__get_cpu_var(memcg_stock);
+	struct memcg_stock_pcp *stock = this_cpu_ptr(&memcg_stock);
 	drain_stock(stock);
 	clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
 }
Index: linux/mm/memory-failure.c
===================================================================
--- linux.orig/mm/memory-failure.c	2014-02-03 13:41:17.832645984 -0600
+++ linux/mm/memory-failure.c	2014-02-03 13:41:17.822646194 -0600
@@ -1297,7 +1297,7 @@
 	unsigned long proc_flags;
 	int gotten;
 
-	mf_cpu = &__get_cpu_var(memory_failure_cpu);
+	mf_cpu = this_cpu_ptr(&memory_failure_cpu);
 	for (;;) {
 		spin_lock_irqsave(&mf_cpu->lock, proc_flags);
 		gotten = kfifo_get(&mf_cpu->fifo, &entry);
Index: linux/mm/page-writeback.c
===================================================================
--- linux.orig/mm/page-writeback.c	2014-02-03 13:41:17.832645984 -0600
+++ linux/mm/page-writeback.c	2014-02-03 13:41:17.822646194 -0600
@@ -1623,7 +1623,7 @@
 	 * 1000+ tasks, all of them start dirtying pages at exactly the same
 	 * time, hence all honoured too large initial task->nr_dirtied_pause.
 	 */
-	p =  &__get_cpu_var(bdp_ratelimits);
+	p =  this_cpu_ptr(&bdp_ratelimits);
 	if (unlikely(current->nr_dirtied >= ratelimit))
 		*p = 0;
 	else if (unlikely(*p >= ratelimit_pages)) {
@@ -1635,7 +1635,7 @@
 	 * short-lived tasks (eg. gcc invocations in a kernel build) escaping
 	 * the dirty throttling and livelock other long-run dirtiers.
 	 */
-	p = &__get_cpu_var(dirty_throttle_leaks);
+	p = this_cpu_ptr(&dirty_throttle_leaks);
 	if (*p > 0 && current->nr_dirtied < ratelimit) {
 		unsigned long nr_pages_dirtied;
 		nr_pages_dirtied = min(*p, ratelimit - current->nr_dirtied);
Index: linux/mm/swap.c
===================================================================
--- linux.orig/mm/swap.c	2014-02-03 13:41:17.832645984 -0600
+++ linux/mm/swap.c	2014-02-03 13:41:17.822646194 -0600
@@ -441,7 +441,7 @@
 
 		page_cache_get(page);
 		local_irq_save(flags);
-		pvec = &__get_cpu_var(lru_rotate_pvecs);
+		pvec = this_cpu_ptr(&lru_rotate_pvecs);
 		if (!pagevec_add(pvec, page))
 			pagevec_move_tail(pvec);
 		local_irq_restore(flags);
Index: linux/mm/vmalloc.c
===================================================================
--- linux.orig/mm/vmalloc.c	2014-02-03 13:41:17.832645984 -0600
+++ linux/mm/vmalloc.c	2014-02-03 13:41:17.822646194 -0600
@@ -1488,7 +1488,7 @@
 	if (!addr)
 		return;
 	if (unlikely(in_interrupt())) {
-		struct vfree_deferred *p = &__get_cpu_var(vfree_deferred);
+		struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
 		if (llist_add((struct llist_node *)addr, &p->list))
 			schedule_work(&p->wq);
 	} else
Index: linux/mm/slub.c
===================================================================
--- linux.orig/mm/slub.c	2014-02-03 13:41:17.832645984 -0600
+++ linux/mm/slub.c	2014-02-03 13:41:17.822646194 -0600
@@ -2190,7 +2190,7 @@
 
 	page = new_slab(s, flags, node);
 	if (page) {
-		c = __this_cpu_ptr(s->cpu_slab);
+		c = raw_cpu_ptr(s->cpu_slab);
 		if (c->page)
 			flush_slab(s, c);
 
@@ -2410,7 +2410,7 @@
 	 * and the retrieval of the tid.
 	 */
 	preempt_disable();
-	c = __this_cpu_ptr(s->cpu_slab);
+	c = this_cpu_ptr(s->cpu_slab);
 
 	/*
 	 * The transaction ids are globally unique per cpu and per operation on
@@ -2666,7 +2666,7 @@
 	 * during the cmpxchg then the free will succedd.
 	 */
 	preempt_disable();
-	c = __this_cpu_ptr(s->cpu_slab);
+	c = this_cpu_ptr(s->cpu_slab);
 
 	tid = c->tid;
 	preempt_enable();
Index: linux/mm/vmstat.c
===================================================================
--- linux.orig/mm/vmstat.c	2014-02-03 13:41:17.832645984 -0600
+++ linux/mm/vmstat.c	2014-02-03 13:41:17.822646194 -0600
@@ -489,7 +489,7 @@
 			continue;
 
 		if (__this_cpu_read(p->pcp.count))
-			drain_zone_pages(zone, __this_cpu_ptr(&p->pcp));
+			drain_zone_pages(zone, this_cpu_ptr(&p->pcp));
 #endif
 	}
 	fold_diff(global_diff);
@@ -1218,7 +1218,7 @@
 static void vmstat_update(struct work_struct *w)
 {
 	refresh_cpu_vm_stats();
-	schedule_delayed_work(&__get_cpu_var(vmstat_work),
+	schedule_delayed_work(this_cpu_ptr(&vmstat_work),
 		round_jiffies_relative(sysctl_stat_interval));
 }
 
Index: linux/mm/zsmalloc.c
===================================================================
--- linux.orig/mm/zsmalloc.c	2014-01-31 09:15:37.674121110 -0600
+++ linux/mm/zsmalloc.c	2014-02-03 13:42:11.281526141 -0600
@@ -1071,7 +1071,7 @@
 	class = &pool->size_class[class_idx];
 	off = obj_idx_to_offset(page, obj_idx, class->size);
 
-	area = &__get_cpu_var(zs_map_area);
+	area = this_cpu_ptr(&zs_map_area);
 	if (off + class->size <= PAGE_SIZE)
 		kunmap_atomic(area->vm_addr);
 	else {




* [PATCH 07/48] tracing: Replace __get_cpu_var uses with this_cpu_ptr
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (5 preceding siblings ...)
  2014-02-14 20:18   ` Christoph Lameter
@ 2014-02-14 20:18 ` Christoph Lameter
  2014-02-14 20:18 ` [PATCH 08/48] percpu: Replace __get_cpu_var " Christoph Lameter
                   ` (41 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Frederic Weisbecker, Ingo Molnar,
	Masami Hiramatsu

[-- Attachment #1: this_trace --]
[-- Type: text/plain, Size: 1923 bytes --]

Replace uses of &__get_cpu_var for address calculation with this_cpu_ptr.
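
The trace.c hunk shows the conversion for a member array of a percpu
struct, where the explicit &...[0] goes away entirely. As a sketch
(struct stack_buf and stk are hypothetical):

	#include <linux/percpu.h>

	struct stack_buf {
		unsigned long calls[8];
	};
	static DEFINE_PER_CPU(struct stack_buf, stk);

	static void example(void)
	{
		unsigned long *entries;

		entries = &__get_cpu_var(stk).calls[0];	/* old */
		entries = this_cpu_ptr(stk.calls);	/* new: member array needs no & */
	}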

CC: Steven Rostedt <rostedt@goodmis.org>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Ingo Molnar <mingo@redhat.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/include/linux/kprobes.h
===================================================================
--- linux.orig/include/linux/kprobes.h	2014-01-30 14:39:56.047604482 -0600
+++ linux/include/linux/kprobes.h	2014-01-30 14:39:56.037604677 -0600
@@ -355,7 +355,7 @@
 
 static inline struct kprobe_ctlblk *get_kprobe_ctlblk(void)
 {
-	return (&__get_cpu_var(kprobe_ctlblk));
+	return this_cpu_ptr(&kprobe_ctlblk);
 }
 
 int register_kprobe(struct kprobe *p);
Index: linux/kernel/trace/ftrace.c
===================================================================
--- linux.orig/kernel/trace/ftrace.c	2014-01-30 14:39:56.047604482 -0600
+++ linux/kernel/trace/ftrace.c	2014-01-30 14:39:56.037604677 -0600
@@ -898,7 +898,7 @@
 
 	local_irq_save(flags);
 
-	stat = &__get_cpu_var(ftrace_profile_stats);
+	stat = this_cpu_ptr(&ftrace_profile_stats);
 	if (!stat->hash || !ftrace_profile_enabled)
 		goto out;
 
@@ -929,7 +929,7 @@
 	unsigned long flags;
 
 	local_irq_save(flags);
-	stat = &__get_cpu_var(ftrace_profile_stats);
+	stat = this_cpu_ptr(&ftrace_profile_stats);
 	if (!stat->hash || !ftrace_profile_enabled)
 		goto out;
 
Index: linux/kernel/trace/trace.c
===================================================================
--- linux.orig/kernel/trace/trace.c	2014-01-30 14:39:56.047604482 -0600
+++ linux/kernel/trace/trace.c	2014-01-30 14:39:56.037604677 -0600
@@ -1718,7 +1718,7 @@
 	 */
 	barrier();
 	if (use_stack == 1) {
-		trace.entries		= &__get_cpu_var(ftrace_stack).calls[0];
+		trace.entries		= this_cpu_ptr(ftrace_stack.calls);
 		trace.max_entries	= FTRACE_STACK_MAX_ENTRIES;
 
 		if (regs)



* [PATCH 08/48] percpu: Replace __get_cpu_var with this_cpu_ptr
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (6 preceding siblings ...)
  2014-02-14 20:18 ` [PATCH 07/48] tracing: " Christoph Lameter
@ 2014-02-14 20:18 ` Christoph Lameter
  2014-02-14 20:18 ` [PATCH 09/48] kernel misc: Replace __get_cpu_var uses Christoph Lameter
                   ` (40 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: this_percpu --]
[-- Type: text/plain, Size: 596 bytes --]

Convert the one case where __get_cpu_var is used for address
calculation inside the get_cpu_var macro.
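
Callers of get_cpu_var()/put_cpu_var() are unaffected; the change is
internal to the macro. Typical usage, as a sketch (counter and bump are
hypothetical):

	#include <linux/percpu.h>

	static DEFINE_PER_CPU(int, counter);

	static void bump(void)
	{
		get_cpu_var(counter)++;	/* disables preemption, updates this cpu's copy */
		put_cpu_var(counter);	/* reenables preemption */
	}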

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/include/linux/percpu.h
===================================================================
--- linux.orig/include/linux/percpu.h	2014-01-30 14:40:02.667473594 -0600
+++ linux/include/linux/percpu.h	2014-01-30 14:40:02.667473594 -0600
@@ -29,7 +29,7 @@
  */
 #define get_cpu_var(var) (*({				\
 	preempt_disable();				\
-	&__get_cpu_var(var); }))
+	this_cpu_ptr(&var); }))
 
 /*
  * The weird & is necessary because sparse considers (void)(var) to be



* [PATCH 09/48] kernel misc: Replace __get_cpu_var uses
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (7 preceding siblings ...)
  2014-02-14 20:18 ` [PATCH 08/48] percpu: Replace __get_cpu_var " Christoph Lameter
@ 2014-02-14 20:18 ` Christoph Lameter
  2014-02-14 20:18 ` [PATCH 10/48] drivers/char/random: " Christoph Lameter
                   ` (39 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, akpm

[-- Attachment #1: this_misc --]
[-- Type: text/plain, Size: 2212 bytes --]

Replace uses of __get_cpu_var for address calculation with this_cpu_ptr.
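
Note that printk_sched_buf is a per cpu array, so it is passed to
this_cpu_ptr() without a leading &. A sketch of both forms (buf and val
are hypothetical):

	#include <linux/percpu.h>

	static DEFINE_PER_CPU(char, buf[64]);
	static DEFINE_PER_CPU(int, val);

	static void example(void)
	{
		char *p = this_cpu_ptr(buf);	/* array name already yields an address */
		int *v = this_cpu_ptr(&val);	/* non-array: explicit & required */

		*p = 0;
		*v = 0;
	}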

Cc: akpm@linux-foundation.org
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/kernel/printk/printk.c
===================================================================
--- linux.orig/kernel/printk/printk.c	2014-02-03 13:20:17.278906256 -0600
+++ linux/kernel/printk/printk.c	2014-02-03 13:20:17.278906256 -0600
@@ -2446,7 +2446,7 @@
 	int pending = __this_cpu_xchg(printk_pending, 0);
 
 	if (pending & PRINTK_PENDING_SCHED) {
-		char *buf = __get_cpu_var(printk_sched_buf);
+		char *buf = this_cpu_ptr(printk_sched_buf);
 		pr_warn("[sched_delayed] %s", buf);
 	}
 
@@ -2464,7 +2464,7 @@
 	preempt_disable();
 	if (waitqueue_active(&log_wait)) {
 		this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
-		irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
+		irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
 	}
 	preempt_enable();
 }
@@ -2477,14 +2477,14 @@
 	int r;
 
 	local_irq_save(flags);
-	buf = __get_cpu_var(printk_sched_buf);
+	buf = this_cpu_ptr(printk_sched_buf);
 
 	va_start(args, fmt);
 	r = vsnprintf(buf, PRINTK_BUF_SIZE, fmt, args);
 	va_end(args);
 
 	__this_cpu_or(printk_pending, PRINTK_PENDING_SCHED);
-	irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
+	irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
 	local_irq_restore(flags);
 
 	return r;
Index: linux/kernel/smp.c
===================================================================
--- linux.orig/kernel/smp.c	2014-02-03 13:20:17.278906256 -0600
+++ linux/kernel/smp.c	2014-02-03 13:20:48.918248998 -0600
@@ -158,7 +158,7 @@
 	 */
 	WARN_ON_ONCE(!cpu_online(smp_processor_id()));
 
-	entry = llist_del_all(&__get_cpu_var(call_single_queue));
+	entry = llist_del_all(this_cpu_ptr(&call_single_queue));
 	entry = llist_reverse_order(entry);
 
 	while (entry) {
@@ -218,7 +218,7 @@
 			struct call_single_data *csd = &d;
 
 			if (!wait)
-				csd = &__get_cpu_var(csd_data);
+				csd = this_cpu_ptr(&csd_data);
 
 			csd_lock(csd);
 
@@ -366,7 +366,7 @@
 		return;
 	}
 
-	cfd = &__get_cpu_var(cfd_data);
+	cfd = this_cpu_ptr(&cfd_data);
 
 	cpumask_and(cfd->cpumask, mask, cpu_online_mask);
 	cpumask_clear_cpu(this_cpu, cfd->cpumask);



* [PATCH 10/48] drivers/char/random: Replace __get_cpu_var uses
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (8 preceding siblings ...)
  2014-02-14 20:18 ` [PATCH 09/48] kernel misc: Replace __get_cpu_var uses Christoph Lameter
@ 2014-02-14 20:18 ` Christoph Lameter
  2014-02-14 20:18 ` [PATCH 11/48] drivers/cpuidle: Replace __get_cpu_var uses for address calculation Christoph Lameter
                   ` (38 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Arnd Bergmann, Greg Kroah-Hartman

[-- Attachment #1: this_drivers_char --]
[-- Type: text/plain, Size: 832 bytes --]

A single case of using __get_cpu_var for address calculation.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/drivers/char/random.c
===================================================================
--- linux.orig/drivers/char/random.c	2013-12-02 16:07:47.924685509 -0600
+++ linux/drivers/char/random.c	2013-12-02 16:07:47.924685509 -0600
@@ -838,7 +838,7 @@ static DEFINE_PER_CPU(struct fast_pool,
 void add_interrupt_randomness(int irq, int irq_flags)
 {
 	struct entropy_store	*r;
-	struct fast_pool	*fast_pool = &__get_cpu_var(irq_randomness);
+	struct fast_pool	*fast_pool = this_cpu_ptr(&irq_randomness);
 	struct pt_regs		*regs = get_irq_regs();
 	unsigned long		now = jiffies;
 	cycles_t		cycles = random_get_entropy();



* [PATCH 11/48] drivers/cpuidle: Replace __get_cpu_var uses for address calculation
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (9 preceding siblings ...)
  2014-02-14 20:18 ` [PATCH 10/48] drivers/char/random: " Christoph Lameter
@ 2014-02-14 20:18 ` Christoph Lameter
  2014-02-14 20:18 ` [PATCH 12/48] drivers/oprofile: " Christoph Lameter
                   ` (37 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Daniel Lezcano, linux-pm, Rafael J. Wysocki

[-- Attachment #1: this_drivers_cpuidle --]
[-- Type: text/plain, Size: 2642 bytes --]

All of these are for address calculation. Replace with
this_cpu_ptr().

Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-pm@vger.kernel.org
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
[cpufreq changes]
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/drivers/cpuidle/governors/ladder.c
===================================================================
--- linux.orig/drivers/cpuidle/governors/ladder.c	2013-12-02 16:07:48.284675507 -0600
+++ linux/drivers/cpuidle/governors/ladder.c	2013-12-02 16:07:48.274675782 -0600
@@ -66,7 +66,7 @@ static inline void ladder_do_selection(s
 static int ladder_select_state(struct cpuidle_driver *drv,
 				struct cpuidle_device *dev)
 {
-	struct ladder_device *ldev = &__get_cpu_var(ladder_devices);
+	struct ladder_device *ldev = this_cpu_ptr(&ladder_devices);
 	struct ladder_device_state *last_state;
 	int last_residency, last_idx = ldev->last_state_idx;
 	int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
@@ -170,7 +170,7 @@ static int ladder_enable_device(struct c
  */
 static void ladder_reflect(struct cpuidle_device *dev, int index)
 {
-	struct ladder_device *ldev = &__get_cpu_var(ladder_devices);
+	struct ladder_device *ldev = this_cpu_ptr(&ladder_devices);
 	if (index > 0)
 		ldev->last_state_idx = index;
 }
Index: linux/drivers/cpuidle/governors/menu.c
===================================================================
--- linux.orig/drivers/cpuidle/governors/menu.c	2013-12-02 16:07:48.284675507 -0600
+++ linux/drivers/cpuidle/governors/menu.c	2013-12-02 16:07:48.274675782 -0600
@@ -286,7 +286,7 @@ again:
  */
 static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 {
-	struct menu_device *data = &__get_cpu_var(menu_devices);
+	struct menu_device *data = this_cpu_ptr(&menu_devices);
 	int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
 	int i;
 	int multiplier;
@@ -375,7 +375,7 @@ static int menu_select(struct cpuidle_dr
  */
 static void menu_reflect(struct cpuidle_device *dev, int index)
 {
-	struct menu_device *data = &__get_cpu_var(menu_devices);
+	struct menu_device *data = this_cpu_ptr(&menu_devices);
 	data->last_state_idx = index;
 	if (index >= 0)
 		data->needs_update = 1;
@@ -388,7 +388,7 @@ static void menu_reflect(struct cpuidle_
  */
 static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 {
-	struct menu_device *data = &__get_cpu_var(menu_devices);
+	struct menu_device *data = this_cpu_ptr(&menu_devices);
 	int last_idx = data->last_state_idx;
 	unsigned int last_idle_us = cpuidle_get_last_residency(dev);
 	struct cpuidle_state *target = &drv->states[last_idx];



* [PATCH 12/48] drivers/oprofile: Replace __get_cpu_var uses for address calculation
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (10 preceding siblings ...)
  2014-02-14 20:18 ` [PATCH 11/48] drivers/cpuidle: Replace __get_cpu_var uses for address calculation Christoph Lameter
@ 2014-02-14 20:18 ` Christoph Lameter
  2014-02-14 20:18 ` [PATCH 13/48] drivers/leds: Replace __get_cpu_var use through this_cpu_ptr Christoph Lameter
                   ` (36 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Robert Richter, oprofile-list

[-- Attachment #1: this_drivers_oprofile --]
[-- Type: text/plain, Size: 2503 bytes --]

Replace the uses of __get_cpu_var for address calculation with this_cpu_ptr.

Cc: Robert Richter <rric@kernel.org>
Cc: oprofile-list@lists.sf.net
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/drivers/oprofile/cpu_buffer.c
===================================================================
--- linux.orig/drivers/oprofile/cpu_buffer.c	2013-12-02 16:07:48.614666335 -0600
+++ linux/drivers/oprofile/cpu_buffer.c	2013-12-02 16:07:48.614666335 -0600
@@ -45,7 +45,7 @@ unsigned long oprofile_get_cpu_buffer_si
 
 void oprofile_cpu_buffer_inc_smpl_lost(void)
 {
-	struct oprofile_cpu_buffer *cpu_buf = &__get_cpu_var(op_cpu_buffer);
+	struct oprofile_cpu_buffer *cpu_buf = this_cpu_ptr(&op_cpu_buffer);
 
 	cpu_buf->sample_lost_overflow++;
 }
@@ -297,7 +297,7 @@ __oprofile_add_ext_sample(unsigned long
 			  unsigned long event, int is_kernel,
 			  struct task_struct *task)
 {
-	struct oprofile_cpu_buffer *cpu_buf = &__get_cpu_var(op_cpu_buffer);
+	struct oprofile_cpu_buffer *cpu_buf = this_cpu_ptr(&op_cpu_buffer);
 	unsigned long backtrace = oprofile_backtrace_depth;
 
 	/*
@@ -357,7 +357,7 @@ oprofile_write_reserve(struct op_entry *
 {
 	struct op_sample *sample;
 	int is_kernel = !user_mode(regs);
-	struct oprofile_cpu_buffer *cpu_buf = &__get_cpu_var(op_cpu_buffer);
+	struct oprofile_cpu_buffer *cpu_buf = this_cpu_ptr(&op_cpu_buffer);
 
 	cpu_buf->sample_received++;
 
@@ -412,13 +412,13 @@ int oprofile_write_commit(struct op_entr
 
 void oprofile_add_pc(unsigned long pc, int is_kernel, unsigned long event)
 {
-	struct oprofile_cpu_buffer *cpu_buf = &__get_cpu_var(op_cpu_buffer);
+	struct oprofile_cpu_buffer *cpu_buf = this_cpu_ptr(&op_cpu_buffer);
 	log_sample(cpu_buf, pc, 0, is_kernel, event, NULL);
 }
 
 void oprofile_add_trace(unsigned long pc)
 {
-	struct oprofile_cpu_buffer *cpu_buf = &__get_cpu_var(op_cpu_buffer);
+	struct oprofile_cpu_buffer *cpu_buf = this_cpu_ptr(&op_cpu_buffer);
 
 	if (!cpu_buf->tracing)
 		return;
Index: linux/drivers/oprofile/timer_int.c
===================================================================
--- linux.orig/drivers/oprofile/timer_int.c	2013-12-02 16:07:48.614666335 -0600
+++ linux/drivers/oprofile/timer_int.c	2013-12-02 16:07:48.614666335 -0600
@@ -32,7 +32,7 @@ static enum hrtimer_restart oprofile_hrt
 
 static void __oprofile_hrtimer_start(void *unused)
 {
-	struct hrtimer *hrtimer = &__get_cpu_var(oprofile_hrtimer);
+	struct hrtimer *hrtimer = this_cpu_ptr(&oprofile_hrtimer);
 
 	if (!ctr_running)
 		return;



* [PATCH 13/48] drivers/leds: Replace __get_cpu_var use through this_cpu_ptr
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (11 preceding siblings ...)
  2014-02-14 20:18 ` [PATCH 12/48] drivers/oprofile: " Christoph Lameter
@ 2014-02-14 20:18 ` Christoph Lameter
  2014-02-14 20:18 ` [PATCH 14/48] drivers/clocksource: Replace __get_cpu_var used for address calculation Christoph Lameter
                   ` (35 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Bryan Wu

[-- Attachment #1: this_drivers_leds --]
[-- Type: text/plain, Size: 731 bytes --]

Use this_cpu_ptr for the address calculation instead of __get_cpu_var.

Acked-by: Bryan Wu <cooloney@gmail.com>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/drivers/leds/trigger/ledtrig-cpu.c
===================================================================
--- linux.orig/drivers/leds/trigger/ledtrig-cpu.c	2013-12-02 16:07:48.974656330 -0600
+++ linux/drivers/leds/trigger/ledtrig-cpu.c	2013-12-02 16:07:48.964656610 -0600
@@ -46,7 +46,7 @@ static DEFINE_PER_CPU(struct led_trigger
  */
 void ledtrig_cpu(enum cpu_led_event ledevt)
 {
-	struct led_trigger_cpu *trig = &__get_cpu_var(cpu_trig);
+	struct led_trigger_cpu *trig = this_cpu_ptr(&cpu_trig);
 
 	/* Locate the correct CPU LED */
 	switch (ledevt) {



* [PATCH 14/48] drivers/clocksource: Replace __get_cpu_var used for address calculation
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (12 preceding siblings ...)
  2014-02-14 20:18 ` [PATCH 13/48] drivers/leds: Replace __get_cpu_var use through this_cpu_ptr Christoph Lameter
@ 2014-02-14 20:18 ` Christoph Lameter
  2014-02-14 20:18   ` Christoph Lameter
                   ` (34 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, James Hogan

[-- Attachment #1: this_drivers_clocksource --]
[-- Type: text/plain, Size: 750 bytes --]

Replace __get_cpu_var used for address calculation with this_cpu_ptr.

Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/drivers/clocksource/metag_generic.c
===================================================================
--- linux.orig/drivers/clocksource/metag_generic.c	2013-12-02 16:07:49.294647442 -0600
+++ linux/drivers/clocksource/metag_generic.c	2013-12-02 16:07:49.284647718 -0600
@@ -90,7 +90,7 @@ static struct clocksource clocksource_me
 
 static irqreturn_t metag_timer_interrupt(int irq, void *dummy)
 {
-	struct clock_event_device *evt = &__get_cpu_var(local_clockevent);
+	struct clock_event_device *evt = this_cpu_ptr(&local_clockevent);
 
 	evt->event_handler(evt);
 



* [PATCH 15/48] parisc: Replace __get_cpu_var uses for address calculation
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
@ 2014-02-14 20:18   ` Christoph Lameter
  2014-02-14 20:18   ` Christoph Lameter
                     ` (47 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, James E.J. Bottomley, Helge Deller,
	linux-parisc

Convert to the use of this_cpu_ptr().

Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/parisc/lib/memcpy.c
===================================================================
--- linux.orig/arch/parisc/lib/memcpy.c	2013-12-02 16:07:49.844632157 -0600
+++ linux/arch/parisc/lib/memcpy.c	2013-12-02 16:07:49.844632157 -0600
@@ -470,7 +470,7 @@ static unsigned long pa_memcpy(void *dst
 		return 0;
 
 	/* if a load or store fault occured we can get the faulty addr */
-	d = &__get_cpu_var(exception_data);
+	d = this_cpu_ptr(&exception_data);
 	fault_addr = d->fault_addr;
 
 	/* error in load or store? */
Index: linux/arch/parisc/mm/fault.c
===================================================================
--- linux.orig/arch/parisc/mm/fault.c	2013-12-02 16:07:49.844632157 -0600
+++ linux/arch/parisc/mm/fault.c	2013-12-02 16:07:49.844632157 -0600
@@ -151,7 +151,7 @@ int fixup_exception(struct pt_regs *regs
 	fix = search_exception_tables(regs->iaoq[0]);
 	if (fix) {
 		struct exception_data *d;
-		d = &__get_cpu_var(exception_data);
+		d = this_cpu_ptr(&exception_data);
 		d->fault_ip = regs->iaoq[0];
 		d->fault_space = regs->isr;
 		d->fault_addr = regs->ior;



* [PATCH 16/48] metag: Replace __get_cpu_var uses for address calculation
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (14 preceding siblings ...)
  2014-02-14 20:18   ` Christoph Lameter
@ 2014-02-14 20:18 ` Christoph Lameter
  2014-02-14 20:18 ` [PATCH 17/48] drivers/net/ethernet/tile: " Christoph Lameter
                   ` (32 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, James Hogan

[-- Attachment #1: this_metag --]
[-- Type: text/plain, Size: 2694 bytes --]

Replace __get_cpu_var uses for address calculation with this_cpu_ptr().

Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/metag/kernel/perf/perf_event.c
===================================================================
--- linux.orig/arch/metag/kernel/perf/perf_event.c	2013-12-02 16:07:50.134624099 -0600
+++ linux/arch/metag/kernel/perf/perf_event.c	2013-12-02 16:07:50.134624099 -0600
@@ -258,7 +258,7 @@ int metag_pmu_event_set_period(struct pe
 
 static void metag_pmu_start(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
@@ -306,7 +306,7 @@ static void metag_pmu_stop(struct perf_e
 
 static int metag_pmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = 0, ret = 0;
 
@@ -348,7 +348,7 @@ out:
 
 static void metag_pmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
@@ -607,7 +607,7 @@ static int _hw_perf_event_init(struct pe
 
 static void metag_pmu_enable_counter(struct hw_perf_event *event, int idx)
 {
-	struct cpu_hw_events *events = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *events = this_cpu_ptr(&cpu_hw_events);
 	unsigned int config = event->config;
 	unsigned int tmp = config & 0xf0;
 	unsigned long flags;
@@ -680,7 +680,7 @@ unlock:
 
 static void metag_pmu_disable_counter(struct hw_perf_event *event, int idx)
 {
-	struct cpu_hw_events *events = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *events = this_cpu_ptr(&cpu_hw_events);
 	unsigned int tmp = 0;
 	unsigned long flags;
 
@@ -728,7 +728,7 @@ out:
 
 static void metag_pmu_write_counter(int idx, u32 val)
 {
-	struct cpu_hw_events *events = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *events = this_cpu_ptr(&cpu_hw_events);
 	u32 tmp = 0;
 	unsigned long flags;
 
@@ -761,7 +761,7 @@ static int metag_pmu_event_map(int idx)
 static irqreturn_t metag_pmu_counter_overflow(int irq, void *dev)
 {
 	int idx = (int)dev;
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 	struct perf_event *event = cpuhw->events[idx];
 	struct hw_perf_event *hwc = &event->hw;
 	struct pt_regs *regs = get_irq_regs();



* [PATCH 17/48] drivers/net/ethernet/tile: Replace __get_cpu_var uses for address calculation
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (15 preceding siblings ...)
  2014-02-14 20:18 ` [PATCH 16/48] metag: " Christoph Lameter
@ 2014-02-14 20:18 ` Christoph Lameter
  2014-02-14 20:18 ` [PATCH 18/48] drivers/net/ethernet/tile: __get_cpu_var call introduced in 3.14 Christoph Lameter
                   ` (31 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Chris Metcalf

[-- Attachment #1: this_driver_net_ethernet_tile --]
[-- Type: text/plain, Size: 4295 bytes --]

Replace with this_cpu_ptr.

Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/drivers/net/ethernet/tile/tilegx.c
===================================================================
--- linux.orig/drivers/net/ethernet/tile/tilegx.c	2014-02-03 13:55:50.504361347 -0600
+++ linux/drivers/net/ethernet/tile/tilegx.c	2014-02-03 13:57:14.172605727 -0600
@@ -423,7 +423,7 @@
 /* Provide linux buffers to mPIPE. */
 static void tile_net_provide_needed_buffers(void)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	int instance, kind;
 	for (instance = 0; instance < NR_MPIPE_MAX &&
 		     info->mpipe[instance].has_iqueue; instance++)	{
@@ -585,7 +585,7 @@
 /* Handle a packet.  Return true if "processed", false if "filtered". */
 static bool tile_net_handle_packet(int instance, gxio_mpipe_idesc_t *idesc)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	struct mpipe_data *md = &mpipe_data[instance];
 	struct net_device *dev = md->tile_net_devs_for_channel[idesc->channel];
 	uint8_t l2_offset;
@@ -651,7 +651,7 @@
  */
 static int tile_net_poll(struct napi_struct *napi, int budget)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	unsigned int work = 0;
 	gxio_mpipe_idesc_t *idesc;
 	int instance, i, n;
@@ -697,7 +697,7 @@
 /* Handle an ingress interrupt from an instance on the current cpu. */
 static irqreturn_t tile_net_handle_ingress_irq(int irq, void *id)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	napi_schedule(&info->mpipe[(uint64_t)id].napi);
 	return IRQ_HANDLED;
 }
@@ -760,7 +760,7 @@
 /* Make sure the egress timer is scheduled. */
 static void tile_net_schedule_egress_timer(void)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 
 	if (!info->egress_timer_scheduled) {
 		hrtimer_start(&info->egress_timer,
@@ -777,7 +777,7 @@
  */
 static enum hrtimer_restart tile_net_handle_egress_timer(struct hrtimer *t)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	unsigned long irqflags;
 	bool pending = false;
 	int i, instance;
@@ -1992,7 +1992,7 @@
 /* Help the kernel transmit a packet. */
 static int tile_net_tx(struct sk_buff *skb, struct net_device *dev)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	struct tile_net_priv *priv = netdev_priv(dev);
 	int instance = priv->instance;
 	struct mpipe_data *md = &mpipe_data[instance];
@@ -2134,7 +2134,7 @@
 static void tile_net_netpoll(struct net_device *dev)
 {
 	int instance = mpipe_instance(dev);
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	struct mpipe_data *md = &mpipe_data[instance];
 
 	disable_percpu_irq(md->ingress_irq);
@@ -2241,7 +2241,7 @@
 /* Per-cpu module initialization. */
 static void tile_net_init_module_percpu(void *unused)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	int my_cpu = smp_processor_id();
 	int instance;
 
Index: linux/drivers/net/ethernet/tile/tilepro.c
===================================================================
--- linux.orig/drivers/net/ethernet/tile/tilepro.c	2014-02-03 13:55:50.504361347 -0600
+++ linux/drivers/net/ethernet/tile/tilepro.c	2014-02-03 13:55:50.504361347 -0600
@@ -993,13 +993,13 @@
 	PDEBUG("tile_net_register(queue_id %d)\n", queue_id);
 
 	if (!strcmp(dev->name, "xgbe0"))
-		info = &__get_cpu_var(hv_xgbe0);
+		info = this_cpu_ptr(&hv_xgbe0);
 	else if (!strcmp(dev->name, "xgbe1"))
-		info = &__get_cpu_var(hv_xgbe1);
+		info = this_cpu_ptr(&hv_xgbe1);
 	else if (!strcmp(dev->name, "gbe0"))
-		info = &__get_cpu_var(hv_gbe0);
+		info = this_cpu_ptr(&hv_gbe0);
 	else if (!strcmp(dev->name, "gbe1"))
-		info = &__get_cpu_var(hv_gbe1);
+		info = this_cpu_ptr(&hv_gbe1);
 	else
 		BUG();
 



* [PATCH 18/48] drivers/net/ethernet/tile: __get_cpu_var call introduced in 3.14
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (16 preceding siblings ...)
  2014-02-14 20:18 ` [PATCH 17/48] drivers/net/ethernet/tile: " Christoph Lameter
@ 2014-02-14 20:18 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 19/48] tilegx: Another case of get_cpu_var Christoph Lameter
                   ` (30 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: tilegx --]
[-- Type: text/plain, Size: 730 bytes --]

Another __get_cpu_var case was merged in 3.14-rc1.

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/drivers/net/ethernet/tile/tilegx.c
===================================================================
--- linux.orig/drivers/net/ethernet/tile/tilegx.c	2014-02-03 13:48:07.334069941 -0600
+++ linux/drivers/net/ethernet/tile/tilegx.c	2014-02-03 13:49:25.882425863 -0600
@@ -551,7 +551,7 @@
 static void tile_net_receive_skb(struct net_device *dev, struct sk_buff *skb,
 				 gxio_mpipe_idesc_t *idesc, unsigned long len)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	struct tile_net_priv *priv = netdev_priv(dev);
 	int instance = priv->instance;
 



* [PATCH 19/48] tilegx: Another case of get_cpu_var
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (17 preceding siblings ...)
  2014-02-14 20:18 ` [PATCH 18/48] drivers/net/ethernet/tile: __get_cpu_var call introduced in 3.14 Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 20/48] time: Replace __get_cpu_var uses Christoph Lameter
                   ` (29 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: tilegx_new --]
[-- Type: text/plain, Size: 715 bytes --]

This __get_cpu_var use seems to have been introduced in 3.14-rc1.

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/drivers/net/ethernet/tile/tilegx.c
===================================================================
--- linux.orig/drivers/net/ethernet/tile/tilegx.c	2014-02-03 14:00:01.159103055 -0600
+++ linux/drivers/net/ethernet/tile/tilegx.c	2014-02-03 14:01:06.237738404 -0600
@@ -1923,7 +1923,7 @@
  */
 static int tile_net_tx_tso(struct sk_buff *skb, struct net_device *dev)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	struct tile_net_priv *priv = netdev_priv(dev);
 	int channel = priv->echannel;
 	int instance = priv->instance;



* [PATCH 20/48] time: Replace __get_cpu_var uses
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (18 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 19/48] tilegx: Another case of get_cpu_var Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-15 11:33   ` Thomas Gleixner
  2014-02-14 20:19 ` [PATCH 21/48] scheduler: Replace __get_cpu_var with this_cpu_ptr Christoph Lameter
                   ` (28 subsequent siblings)
  48 siblings, 1 reply; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: this_time --]
[-- Type: text/plain, Size: 11778 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

Convert uses of __get_cpu_var for creating an address from a percpu
offset to this_cpu_ptr.

The two cases where get_cpu_var is used to actually access a percpu
variable are changed to use this_cpu_read/raw_cpu_read.
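
As a sketch of the two distinct cases (struct state and st are
hypothetical):

	#include <linux/percpu.h>

	struct state {
		int count;
	};
	static DEFINE_PER_CPU(struct state, st);

	static void example(void)
	{
		struct state *s = this_cpu_ptr(&st);	/* address calculation */
		int n = this_cpu_read(st.count);	/* value access, preempt safe */
		int m = raw_cpu_read(st.count);		/* value access, no preemption check */
	}

raw_cpu_ptr() (which replaces __raw_get_cpu_var) is kept where, as in
hrtimer_init(), the address may legitimately be computed while
preemptible.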

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/kernel/hrtimer.c
===================================================================
--- linux.orig/kernel/hrtimer.c	2014-02-03 13:22:35.586033208 -0600
+++ linux/kernel/hrtimer.c	2014-02-03 13:22:35.576033413 -0600
@@ -598,7 +598,7 @@
 static int hrtimer_reprogram(struct hrtimer *timer,
 			     struct hrtimer_clock_base *base)
 {
-	struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
+	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
 	ktime_t expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
 	int res;
 
@@ -681,7 +681,7 @@
  */
 static void retrigger_next_event(void *arg)
 {
-	struct hrtimer_cpu_base *base = &__get_cpu_var(hrtimer_bases);
+	struct hrtimer_cpu_base *base = this_cpu_ptr(&hrtimer_bases);
 
 	if (!hrtimer_hres_active())
 		return;
@@ -955,7 +955,7 @@
 		 */
 		debug_deactivate(timer);
 		timer_stats_hrtimer_clear_start_info(timer);
-		reprogram = base->cpu_base == &__get_cpu_var(hrtimer_bases);
+		reprogram = base->cpu_base == this_cpu_ptr(&hrtimer_bases);
 		/*
 		 * We must preserve the CALLBACK state flag here,
 		 * otherwise we could move the timer base in
@@ -1010,7 +1010,7 @@
 	 *
 	 * XXX send_remote_softirq() ?
 	 */
-	if (leftmost && new_base->cpu_base == &__get_cpu_var(hrtimer_bases)
+	if (leftmost && new_base->cpu_base == this_cpu_ptr(&hrtimer_bases)
 		&& hrtimer_enqueue_reprogram(timer, new_base)) {
 		if (wakeup) {
 			/*
@@ -1143,7 +1143,7 @@
  */
 ktime_t hrtimer_get_next_event(void)
 {
-	struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
+	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
 	struct hrtimer_clock_base *base = cpu_base->clock_base;
 	ktime_t delta, mindelta = { .tv64 = KTIME_MAX };
 	unsigned long flags;
@@ -1184,7 +1184,7 @@
 
 	memset(timer, 0, sizeof(struct hrtimer));
 
-	cpu_base = &__raw_get_cpu_var(hrtimer_bases);
+	cpu_base = raw_cpu_ptr(&hrtimer_bases);
 
 	if (clock_id == CLOCK_REALTIME && mode != HRTIMER_MODE_ABS)
 		clock_id = CLOCK_MONOTONIC;
@@ -1227,7 +1227,7 @@
 	struct hrtimer_cpu_base *cpu_base;
 	int base = hrtimer_clockid_to_base(which_clock);
 
-	cpu_base = &__raw_get_cpu_var(hrtimer_bases);
+	cpu_base = raw_cpu_ptr(&hrtimer_bases);
 	*tp = ktime_to_timespec(cpu_base->clock_base[base].resolution);
 
 	return 0;
@@ -1282,7 +1282,7 @@
  */
 void hrtimer_interrupt(struct clock_event_device *dev)
 {
-	struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
+	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
 	ktime_t expires_next, now, entry_time, delta;
 	int i, retries = 0;
 
@@ -1416,7 +1416,7 @@
 	if (!hrtimer_hres_active())
 		return;
 
-	td = &__get_cpu_var(tick_cpu_device);
+	td = this_cpu_ptr(&tick_cpu_device);
 	if (td && td->evtdev)
 		hrtimer_interrupt(td->evtdev);
 }
@@ -1480,7 +1480,7 @@
 void hrtimer_run_queues(void)
 {
 	struct timerqueue_node *node;
-	struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
+	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
 	struct hrtimer_clock_base *base;
 	int index, gettime = 1;
 
@@ -1718,7 +1718,7 @@
 
 	local_irq_disable();
 	old_base = &per_cpu(hrtimer_bases, scpu);
-	new_base = &__get_cpu_var(hrtimer_bases);
+	new_base = this_cpu_ptr(&hrtimer_bases);
 	/*
 	 * The caller is globally serialized and nobody else
 	 * takes two locks at once, deadlock is not possible.
Index: linux/kernel/irq_work.c
===================================================================
--- linux.orig/kernel/irq_work.c	2014-02-03 13:22:35.586033208 -0600
+++ linux/kernel/irq_work.c	2014-02-03 13:22:35.576033413 -0600
@@ -70,7 +70,7 @@
 	/* Queue the entry and raise the IPI if needed. */
 	preempt_disable();
 
-	llist_add(&work->llnode, &__get_cpu_var(irq_work_list));
+	llist_add(&work->llnode, this_cpu_ptr(&irq_work_list));
 
 	/*
 	 * If the work is not "lazy" or the tick is stopped, raise the irq
@@ -90,7 +90,7 @@
 {
 	struct llist_head *this_list;
 
-	this_list = &__get_cpu_var(irq_work_list);
+	this_list = this_cpu_ptr(&irq_work_list);
 	if (llist_empty(this_list))
 		return false;
 
@@ -115,7 +115,7 @@
 	__this_cpu_write(irq_work_raised, 0);
 	barrier();
 
-	this_list = &__get_cpu_var(irq_work_list);
+	this_list = this_cpu_ptr(&irq_work_list);
 	if (llist_empty(this_list))
 		return;
 
Index: linux/kernel/sched/clock.c
===================================================================
--- linux.orig/kernel/sched/clock.c	2014-02-03 13:22:35.586033208 -0600
+++ linux/kernel/sched/clock.c	2014-02-03 13:22:35.576033413 -0600
@@ -133,7 +133,7 @@
 
 static inline struct sched_clock_data *this_scd(void)
 {
-	return &__get_cpu_var(sched_clock_data);
+	return this_cpu_ptr(&sched_clock_data);
 }
 
 static inline struct sched_clock_data *cpu_sdc(int cpu)
Index: linux/kernel/softirq.c
===================================================================
--- linux.orig/kernel/softirq.c	2014-02-03 13:22:35.586033208 -0600
+++ linux/kernel/softirq.c	2014-02-03 13:22:35.586033208 -0600
@@ -486,7 +486,7 @@
 	local_irq_disable();
 	list = __this_cpu_read(tasklet_vec.head);
 	__this_cpu_write(tasklet_vec.head, NULL);
-	__this_cpu_write(tasklet_vec.tail, &__get_cpu_var(tasklet_vec).head);
+	__this_cpu_write(tasklet_vec.tail, this_cpu_ptr(&tasklet_vec.head));
 	local_irq_enable();
 
 	while (list) {
@@ -522,7 +522,7 @@
 	local_irq_disable();
 	list = __this_cpu_read(tasklet_hi_vec.head);
 	__this_cpu_write(tasklet_hi_vec.head, NULL);
-	__this_cpu_write(tasklet_hi_vec.tail, &__get_cpu_var(tasklet_hi_vec).head);
+	__this_cpu_write(tasklet_hi_vec.tail, this_cpu_ptr(&tasklet_hi_vec.head));
 	local_irq_enable();
 
 	while (list) {
Index: linux/kernel/time/tick-common.c
===================================================================
--- linux.orig/kernel/time/tick-common.c	2014-02-03 13:22:35.586033208 -0600
+++ linux/kernel/time/tick-common.c	2014-02-03 13:22:35.586033208 -0600
@@ -224,7 +224,7 @@
 
 void tick_install_replacement(struct clock_event_device *newdev)
 {
-	struct tick_device *td = &__get_cpu_var(tick_cpu_device);
+	struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
 	int cpu = smp_processor_id();
 
 	clockevents_exchange_device(td->evtdev, newdev);
@@ -374,14 +374,14 @@
 
 void tick_suspend(void)
 {
-	struct tick_device *td = &__get_cpu_var(tick_cpu_device);
+	struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
 
 	clockevents_shutdown(td->evtdev);
 }
 
 void tick_resume(void)
 {
-	struct tick_device *td = &__get_cpu_var(tick_cpu_device);
+	struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
 	int broadcast = tick_resume_broadcast();
 
 	clockevents_set_mode(td->evtdev, CLOCK_EVT_MODE_RESUME);
Index: linux/kernel/time/tick-oneshot.c
===================================================================
--- linux.orig/kernel/time/tick-oneshot.c	2014-02-03 13:22:35.586033208 -0600
+++ linux/kernel/time/tick-oneshot.c	2014-02-03 13:22:35.586033208 -0600
@@ -59,7 +59,7 @@
  */
 int tick_switch_to_oneshot(void (*handler)(struct clock_event_device *))
 {
-	struct tick_device *td = &__get_cpu_var(tick_cpu_device);
+	struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
 	struct clock_event_device *dev = td->evtdev;
 
 	if (!dev || !(dev->features & CLOCK_EVT_FEAT_ONESHOT) ||
Index: linux/kernel/time/tick-sched.c
===================================================================
--- linux.orig/kernel/time/tick-sched.c	2014-02-03 13:22:35.586033208 -0600
+++ linux/kernel/time/tick-sched.c	2014-02-03 13:22:35.586033208 -0600
@@ -201,7 +201,7 @@
  */
 void __tick_nohz_full_check(void)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 
 	if (tick_nohz_full_cpu(smp_processor_id())) {
 		if (ts->tick_stopped && !is_idle_task(current)) {
@@ -227,7 +227,7 @@
 void tick_nohz_full_kick(void)
 {
 	if (tick_nohz_full_cpu(smp_processor_id()))
-		irq_work_queue(&__get_cpu_var(nohz_full_kick_work));
+		irq_work_queue(this_cpu_ptr(&nohz_full_kick_work));
 }
 
 static void nohz_full_kick_ipi(void *info)
@@ -530,7 +530,7 @@
 	unsigned long seq, last_jiffies, next_jiffies, delta_jiffies;
 	ktime_t last_update, expires, ret = { .tv64 = 0 };
 	unsigned long rcu_delta_jiffies;
-	struct clock_event_device *dev = __get_cpu_var(tick_cpu_device).evtdev;
+	struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev);
 	u64 time_delta;
 
 	time_delta = timekeeping_max_deferment();
@@ -798,7 +798,7 @@
 
 	local_irq_disable();
 
-	ts = &__get_cpu_var(tick_cpu_sched);
+	ts = this_cpu_ptr(&tick_cpu_sched);
 	ts->inidle = 1;
 	__tick_nohz_idle_enter(ts);
 
@@ -816,7 +816,7 @@
  */
 void tick_nohz_irq_exit(void)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 
 	if (ts->inidle)
 		__tick_nohz_idle_enter(ts);
@@ -831,7 +831,7 @@
  */
 ktime_t tick_nohz_get_sleep_length(void)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 
 	return ts->sleep_length;
 }
@@ -944,7 +944,7 @@
  */
 static void tick_nohz_handler(struct clock_event_device *dev)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 	struct pt_regs *regs = get_irq_regs();
 	ktime_t now = ktime_get();
 
@@ -964,7 +964,7 @@
  */
 static void tick_nohz_switch_to_nohz(void)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 	ktime_t next;
 
 	if (!tick_nohz_active)
@@ -1100,7 +1100,7 @@
  */
 void tick_setup_sched_timer(void)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 	ktime_t now = ktime_get();
 
 	/*
@@ -1169,7 +1169,7 @@
  */
 void tick_oneshot_notify(void)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 
 	set_bit(0, &ts->check_clocks);
 }
@@ -1184,7 +1184,7 @@
  */
 int tick_check_oneshot_change(int allow_nohz)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 
 	if (!test_and_clear_bit(0, &ts->check_clocks))
 		return 0;
Index: linux/kernel/timer.c
===================================================================
--- linux.orig/kernel/timer.c	2014-02-03 13:22:35.586033208 -0600
+++ linux/kernel/timer.c	2014-02-03 13:22:35.586033208 -0600
@@ -621,7 +621,7 @@
 static void do_init_timer(struct timer_list *timer, unsigned int flags,
 			  const char *name, struct lock_class_key *key)
 {
-	struct tvec_base *base = __raw_get_cpu_var(tvec_bases);
+	struct tvec_base *base = raw_cpu_read(tvec_bases);
 
 	timer->entry.next = NULL;
 	timer->base = (void *)((unsigned long)base | flags);
Index: linux/drivers/clocksource/dummy_timer.c
===================================================================
--- linux.orig/drivers/clocksource/dummy_timer.c	2014-02-03 13:22:35.586033208 -0600
+++ linux/drivers/clocksource/dummy_timer.c	2014-02-03 13:22:35.586033208 -0600
@@ -28,7 +28,7 @@
 static void dummy_timer_setup(void)
 {
 	int cpu = smp_processor_id();
-	struct clock_event_device *evt = __this_cpu_ptr(&dummy_timer_evt);
+	struct clock_event_device *evt = raw_cpu_ptr(&dummy_timer_evt);
 
 	evt->name	= "dummy_timer";
 	evt->features	= CLOCK_EVT_FEAT_PERIODIC |


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 21/48] scheduler: Replace __get_cpu_var with this_cpu_ptr
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (19 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 20/48] time: Replace __get_cpu_var uses Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 22/48] tick-sched: Fix two new uses of __get_cpu_ptr Christoph Lameter
                   ` (27 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: this_scheduler --]
[-- Type: text/plain, Size: 8005 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

Convert all uses of __get_cpu_var for address calculation to use
this_cpu_ptr instead.
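
A minimal sketch of the pattern, using the this_rq() definition from
kernel/sched/sched.h converted in the hunk below (both forms yield the
same address):

	DEFINE_PER_CPU(struct rq, runqueues);

	rq = &__get_cpu_var(runqueues);	/* old: explicit address calculation */
	rq = this_cpu_ptr(&runqueues);	/* new: same address, consistent this_cpu API */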

Cc: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/include/linux/kernel_stat.h
===================================================================
--- linux.orig/include/linux/kernel_stat.h	2014-01-30 14:41:01.816304114 -0600
+++ linux/include/linux/kernel_stat.h	2014-01-30 14:41:01.816304114 -0600
@@ -44,8 +44,8 @@
 DECLARE_PER_CPU(struct kernel_cpustat, kernel_cpustat);
 
 /* Must have preemption disabled for this to be meaningful. */
-#define kstat_this_cpu (&__get_cpu_var(kstat))
-#define kcpustat_this_cpu (&__get_cpu_var(kernel_cpustat))
+#define kstat_this_cpu this_cpu_ptr(&kstat)
+#define kcpustat_this_cpu this_cpu_ptr(&kernel_cpustat)
 #define kstat_cpu(cpu) per_cpu(kstat, cpu)
 #define kcpustat_cpu(cpu) per_cpu(kernel_cpustat, cpu)
 
Index: linux/kernel/events/callchain.c
===================================================================
--- linux.orig/kernel/events/callchain.c	2014-01-30 14:41:01.816304114 -0600
+++ linux/kernel/events/callchain.c	2014-01-30 14:41:01.816304114 -0600
@@ -137,7 +137,7 @@
 	int cpu;
 	struct callchain_cpus_entries *entries;
 
-	*rctx = get_recursion_context(__get_cpu_var(callchain_recursion));
+	*rctx = get_recursion_context(this_cpu_ptr(callchain_recursion));
 	if (*rctx == -1)
 		return NULL;
 
@@ -153,7 +153,7 @@
 static void
 put_callchain_entry(int rctx)
 {
-	put_recursion_context(__get_cpu_var(callchain_recursion), rctx);
+	put_recursion_context(this_cpu_ptr(callchain_recursion), rctx);
 }
 
 struct perf_callchain_entry *
Index: linux/kernel/events/core.c
===================================================================
--- linux.orig/kernel/events/core.c	2014-01-30 14:41:01.816304114 -0600
+++ linux/kernel/events/core.c	2014-01-30 14:41:01.816304114 -0600
@@ -241,10 +241,10 @@
 		return;
 
 	/* decay the counter by 1 average sample */
-	local_samples_len = __get_cpu_var(running_sample_length);
+	local_samples_len = __this_cpu_read(running_sample_length);
 	local_samples_len -= local_samples_len/NR_ACCUMULATED_SAMPLES;
 	local_samples_len += sample_len_ns;
-	__get_cpu_var(running_sample_length) = local_samples_len;
+	__this_cpu_write(running_sample_length, local_samples_len);
 
 	/*
 	 * note: this will be biased artifically low until we have
@@ -870,7 +870,7 @@
 static void perf_pmu_rotate_start(struct pmu *pmu)
 {
 	struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
-	struct list_head *head = &__get_cpu_var(rotation_list);
+	struct list_head *head = this_cpu_ptr(&rotation_list);
 
 	WARN_ON(!irqs_disabled());
 
@@ -2366,7 +2366,7 @@
 	 * to check if we have to switch out PMU state.
 	 * cgroup event are system-wide mode only
 	 */
-	if (atomic_read(&__get_cpu_var(perf_cgroup_events)))
+	if (atomic_read(this_cpu_ptr(&perf_cgroup_events)))
 		perf_cgroup_sched_out(task, next);
 }
 
@@ -2611,11 +2611,11 @@
 	 * to check if we have to switch in PMU state.
 	 * cgroup event are system-wide mode only
 	 */
-	if (atomic_read(&__get_cpu_var(perf_cgroup_events)))
+	if (atomic_read(this_cpu_ptr(&perf_cgroup_events)))
 		perf_cgroup_sched_in(prev, task);
 
 	/* check for system-wide branch_stack events */
-	if (atomic_read(&__get_cpu_var(perf_branch_stack_events)))
+	if (atomic_read(this_cpu_ptr(&perf_branch_stack_events)))
 		perf_branch_stack_sched_in(prev, task);
 }
 
@@ -2870,7 +2870,7 @@
 
 void perf_event_task_tick(void)
 {
-	struct list_head *head = &__get_cpu_var(rotation_list);
+	struct list_head *head = this_cpu_ptr(&rotation_list);
 	struct perf_cpu_context *cpuctx, *tmp;
 	struct perf_event_context *ctx;
 	int throttled;
@@ -5584,7 +5584,7 @@
 				    struct perf_sample_data *data,
 				    struct pt_regs *regs)
 {
-	struct swevent_htable *swhash = &__get_cpu_var(swevent_htable);
+	struct swevent_htable *swhash = this_cpu_ptr(&swevent_htable);
 	struct perf_event *event;
 	struct hlist_head *head;
 
@@ -5603,7 +5603,7 @@
 
 int perf_swevent_get_recursion_context(void)
 {
-	struct swevent_htable *swhash = &__get_cpu_var(swevent_htable);
+	struct swevent_htable *swhash = this_cpu_ptr(&swevent_htable);
 
 	return get_recursion_context(swhash->recursion);
 }
@@ -5611,7 +5611,7 @@
 
 inline void perf_swevent_put_recursion_context(int rctx)
 {
-	struct swevent_htable *swhash = &__get_cpu_var(swevent_htable);
+	struct swevent_htable *swhash = this_cpu_ptr(&swevent_htable);
 
 	put_recursion_context(swhash->recursion, rctx);
 }
@@ -5640,7 +5640,7 @@
 
 static int perf_swevent_add(struct perf_event *event, int flags)
 {
-	struct swevent_htable *swhash = &__get_cpu_var(swevent_htable);
+	struct swevent_htable *swhash = this_cpu_ptr(&swevent_htable);
 	struct hw_perf_event *hwc = &event->hw;
 	struct hlist_head *head;
 
Index: linux/kernel/sched/fair.c
===================================================================
--- linux.orig/kernel/sched/fair.c	2014-01-30 14:41:01.816304114 -0600
+++ linux/kernel/sched/fair.c	2014-01-30 14:41:01.816304114 -0600
@@ -6123,7 +6123,7 @@
 	struct sched_group *group;
 	struct rq *busiest;
 	unsigned long flags;
-	struct cpumask *cpus = __get_cpu_var(load_balance_mask);
+	struct cpumask *cpus = this_cpu_ptr(load_balance_mask);
 
 	struct lb_env env = {
 		.sd		= sd,
Index: linux/kernel/sched/rt.c
===================================================================
--- linux.orig/kernel/sched/rt.c	2014-01-30 14:41:01.816304114 -0600
+++ linux/kernel/sched/rt.c	2014-01-30 14:41:01.816304114 -0600
@@ -1401,7 +1401,7 @@
 static int find_lowest_rq(struct task_struct *task)
 {
 	struct sched_domain *sd;
-	struct cpumask *lowest_mask = __get_cpu_var(local_cpu_mask);
+	struct cpumask *lowest_mask = this_cpu_ptr(local_cpu_mask);
 	int this_cpu = smp_processor_id();
 	int cpu      = task_cpu(task);
 
Index: linux/kernel/sched/sched.h
===================================================================
--- linux.orig/kernel/sched/sched.h	2014-01-30 14:41:01.816304114 -0600
+++ linux/kernel/sched/sched.h	2014-01-30 14:41:01.816304114 -0600
@@ -668,10 +668,10 @@
 DECLARE_PER_CPU(struct rq, runqueues);
 
 #define cpu_rq(cpu)		(&per_cpu(runqueues, (cpu)))
-#define this_rq()		(&__get_cpu_var(runqueues))
+#define this_rq()		this_cpu_ptr(&runqueues)
 #define task_rq(p)		cpu_rq(task_cpu(p))
 #define cpu_curr(cpu)		(cpu_rq(cpu)->curr)
-#define raw_rq()		(&__raw_get_cpu_var(runqueues))
+#define raw_rq()		raw_cpu_ptr(&runqueues)
 
 static inline u64 rq_clock(struct rq *rq)
 {
Index: linux/kernel/user-return-notifier.c
===================================================================
--- linux.orig/kernel/user-return-notifier.c	2014-01-30 14:41:01.816304114 -0600
+++ linux/kernel/user-return-notifier.c	2014-01-30 14:41:01.816304114 -0600
@@ -14,7 +14,7 @@
 void user_return_notifier_register(struct user_return_notifier *urn)
 {
 	set_tsk_thread_flag(current, TIF_USER_RETURN_NOTIFY);
-	hlist_add_head(&urn->link, &__get_cpu_var(return_notifier_list));
+	hlist_add_head(&urn->link, this_cpu_ptr(&return_notifier_list));
 }
 EXPORT_SYMBOL_GPL(user_return_notifier_register);
 
@@ -25,7 +25,7 @@
 void user_return_notifier_unregister(struct user_return_notifier *urn)
 {
 	hlist_del(&urn->link);
-	if (hlist_empty(&__get_cpu_var(return_notifier_list)))
+	if (hlist_empty(this_cpu_ptr(&return_notifier_list)))
 		clear_tsk_thread_flag(current, TIF_USER_RETURN_NOTIFY);
 }
 EXPORT_SYMBOL_GPL(user_return_notifier_unregister);
Index: linux/kernel/taskstats.c
===================================================================
--- linux.orig/kernel/taskstats.c	2014-01-30 14:41:01.816304114 -0600
+++ linux/kernel/taskstats.c	2014-01-30 14:41:01.816304114 -0600
@@ -638,7 +638,7 @@
 		fill_tgid_exit(tsk);
 	}
 
-	listeners = __this_cpu_ptr(&listener_array);
+	listeners = raw_cpu_ptr(&listener_array);
 	if (list_empty(&listeners->list))
 		return;
 


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 22/48] tick-sched: Fix two new uses of __get_cpu_ptr
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (20 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 21/48] scheduler: Replace __get_cpu_var with this_cpu_ptr Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-15 11:33   ` Thomas Gleixner
  2014-02-14 20:19 ` [PATCH 23/48] block: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
                   ` (26 subsequent siblings)
  48 siblings, 1 reply; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: fix_sched --]
[-- Type: text/plain, Size: 806 bytes --]

Two new uses of __get_cpu_var() were introduced in 3.14-rc1; convert them
to this_cpu_ptr().

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/kernel/time/tick-sched.c
===================================================================
--- linux.orig/kernel/time/tick-sched.c	2014-02-03 13:36:18.968910485 -0600
+++ linux/kernel/time/tick-sched.c	2014-02-03 13:36:18.968910485 -0600
@@ -909,7 +909,7 @@
  */
 void tick_nohz_idle_exit(void)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 	ktime_t now;
 
 	local_irq_disable();
@@ -1026,7 +1026,7 @@
 
 static inline void tick_nohz_irq_enter(void)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 	ktime_t now;
 
 	if (!ts->idle_active && !ts->tick_stopped)


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 23/48] block: Replace __this_cpu_ptr with raw_cpu_ptr
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (21 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 22/48] tick-sched: Fix two new uses of __get_cpu_ptr Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 24/48] rcu: Replace __this_cpu_ptr uses " Christoph Lameter
                   ` (25 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Jens Axboe

[-- Attachment #1: this_block --]
[-- Type: text/plain, Size: 755 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

__this_cpu_ptr is being phased out.

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/fs/ext4/mballoc.c
===================================================================
--- linux.orig/fs/ext4/mballoc.c	2014-02-03 13:22:58.795551096 -0600
+++ linux/fs/ext4/mballoc.c	2014-02-03 13:22:58.785551304 -0600
@@ -4090,7 +4090,7 @@
 	 * per cpu locality group is to reduce the contention between block
 	 * request from multiple CPUs.
 	 */
-	ac->ac_lg = __this_cpu_ptr(sbi->s_locality_groups);
+	ac->ac_lg = raw_cpu_ptr(sbi->s_locality_groups);
 
 	/* we're going to use group allocation */
 	ac->ac_flags |= EXT4_MB_HINT_GROUP_ALLOC;


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 24/48] rcu: Replace __this_cpu_ptr uses with raw_cpu_ptr
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (22 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 23/48] block: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-16 16:17   ` Paul E. McKenney
  2014-02-14 20:19 ` [PATCH 25/48] watchdog: Replace __raw_get_cpu_var uses Christoph Lameter
                   ` (24 subsequent siblings)
  48 siblings, 1 reply; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Dipankar Sarma, Paul E. McKenney

[-- Attachment #1: this_rcu --]
[-- Type: text/plain, Size: 2163 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

__this_cpu_ptr is being phased out.

One special case is increment_cpu_stall_ticks(): a per cpu variable is
incremented there, so raw_cpu_inc() is used instead.
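
Concretely, the tree_plugin.h hunk below changes

	__this_cpu_ptr(rsp->rda)->ticks_this_gp++;

into a single per cpu RMW operation:

	raw_cpu_inc(rsp->rda->ticks_this_gp);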

Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/kernel/rcu/tree.c
===================================================================
--- linux.orig/kernel/rcu/tree.c	2014-02-03 13:23:21.855072103 -0600
+++ linux/kernel/rcu/tree.c	2014-02-03 13:23:21.845072311 -0600
@@ -1951,7 +1951,7 @@
 static void rcu_adopt_orphan_cbs(struct rcu_state *rsp, unsigned long flags)
 {
 	int i;
-	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
+	struct rcu_data *rdp = raw_cpu_ptr(rsp->rda);
 
 	/* No-CBs CPUs are handled specially. */
 	if (rcu_nocb_adopt_orphan_cbs(rsp, rdp, flags))
@@ -2334,7 +2334,7 @@
 __rcu_process_callbacks(struct rcu_state *rsp)
 {
 	unsigned long flags;
-	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
+	struct rcu_data *rdp = raw_cpu_ptr(rsp->rda);
 
 	WARN_ON_ONCE(rdp->beenonline == 0);
 
@@ -2936,7 +2936,7 @@
 static void rcu_barrier_func(void *type)
 {
 	struct rcu_state *rsp = type;
-	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
+	struct rcu_data *rdp = raw_cpu_ptr(rsp->rda);
 
 	_rcu_barrier_trace(rsp, "IRQ", -1, rsp->n_barrier_done);
 	atomic_inc(&rsp->barrier_cpu_count);
Index: linux/kernel/rcu/tree_plugin.h
===================================================================
--- linux.orig/kernel/rcu/tree_plugin.h	2014-02-03 13:23:21.855072103 -0600
+++ linux/kernel/rcu/tree_plugin.h	2014-02-03 13:23:21.845072311 -0600
@@ -1848,7 +1848,7 @@
 	struct rcu_data *rdp;
 
 	for_each_rcu_flavor(rsp) {
-		rdp = __this_cpu_ptr(rsp->rda);
+		rdp = raw_cpu_ptr(rsp->rda);
 		if (rdp->qlen_lazy != 0) {
 			atomic_inc(&oom_callback_count);
 			rsp->call(&rdp->oom_head, rcu_oom_callback);
@@ -1990,7 +1990,7 @@
 	struct rcu_state *rsp;
 
 	for_each_rcu_flavor(rsp)
-		__this_cpu_ptr(rsp->rda)->ticks_this_gp++;
+		raw_cpu_inc(rsp->rda->ticks_this_gp);
 }
 
 #else /* #ifdef CONFIG_RCU_CPU_STALL_INFO */


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 25/48] watchdog: Replace __raw_get_cpu_var uses
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (23 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 24/48] rcu: Replace __this_cpu_ptr uses " Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 26/48] net: Replace get_cpu_var through this_cpu_ptr Christoph Lameter
                   ` (23 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Wim Van Sebroeck, linux-watchdog

[-- Attachment #1: this_watchdog --]
[-- Type: text/plain, Size: 1876 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

Most of these are uses of &__raw_get_cpu_var() for address calculation;
they are converted to raw_cpu_ptr().

touch_softlockup_watchdog_sync() uses __raw_get_cpu_var() to write to
per cpu variables.  Use __this_cpu_write() instead.
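
Concretely:

	__raw_get_cpu_var(softlockup_touch_sync) = true;	/* old: lvalue store */
	__this_cpu_write(softlockup_touch_sync, true);		/* new: per cpu write op */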

Cc: Wim Van Sebroeck <wim@iguana.be>
Cc: linux-watchdog@vger.kernel.org
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/kernel/watchdog.c
===================================================================
--- linux.orig/kernel/watchdog.c	2013-12-02 16:07:54.234510172 -0600
+++ linux/kernel/watchdog.c	2013-12-02 16:07:54.234510172 -0600
@@ -174,8 +174,8 @@ EXPORT_SYMBOL(touch_nmi_watchdog);
 
 void touch_softlockup_watchdog_sync(void)
 {
-	__raw_get_cpu_var(softlockup_touch_sync) = true;
-	__raw_get_cpu_var(watchdog_touch_ts) = 0;
+	__this_cpu_write(softlockup_touch_sync, true);
+	__this_cpu_write(watchdog_touch_ts, 0);
 }
 
 #ifdef CONFIG_HARDLOCKUP_DETECTOR
@@ -341,7 +341,7 @@ static void watchdog_set_prio(unsigned i
 
 static void watchdog_enable(unsigned int cpu)
 {
-	struct hrtimer *hrtimer = &__raw_get_cpu_var(watchdog_hrtimer);
+	struct hrtimer *hrtimer = raw_cpu_ptr(&watchdog_hrtimer);
 
 	/* kick off the timer for the hardlockup detector */
 	hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
@@ -361,7 +361,7 @@ static void watchdog_enable(unsigned int
 
 static void watchdog_disable(unsigned int cpu)
 {
-	struct hrtimer *hrtimer = &__raw_get_cpu_var(watchdog_hrtimer);
+	struct hrtimer *hrtimer = raw_cpu_ptr(&watchdog_hrtimer);
 
 	watchdog_set_prio(SCHED_NORMAL, 0);
 	hrtimer_cancel(hrtimer);
@@ -488,7 +488,7 @@ static struct smp_hotplug_thread watchdo
 
 static void restart_watchdog_hrtimer(void *info)
 {
-	struct hrtimer *hrtimer = &__raw_get_cpu_var(watchdog_hrtimer);
+	struct hrtimer *hrtimer = raw_cpu_ptr(&watchdog_hrtimer);
 	int ret;
 
 	/*


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 26/48] net: Replace get_cpu_var through this_cpu_ptr
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (24 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 25/48] watchdog: Replace __raw_get_cpu_var uses Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 27/48] md: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
                   ` (22 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, netdev, Eric Dumazet, David S. Miller

[-- Attachment #1: this_net --]
[-- Type: text/plain, Size: 7880 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

Replace uses of __get_cpu_var() for address calculation with this_cpu_ptr().
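
The pattern, as in the net/core/dev.c hunks below:

	struct softnet_data *sd;

	sd = &__get_cpu_var(softnet_data);	/* old: explicit address calculation */
	sd = this_cpu_ptr(&softnet_data);	/* new: same address via this_cpu_ptr() */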

Cc: netdev@vger.kernel.org
Cc: Eric Dumazet <edumazet@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/net/core/dev.c
===================================================================
--- linux.orig/net/core/dev.c	2014-02-03 13:23:33.324833855 -0600
+++ linux/net/core/dev.c	2014-02-03 13:25:26.382485504 -0600
@@ -2134,7 +2134,7 @@
 	unsigned long flags;
 
 	local_irq_save(flags);
-	sd = &__get_cpu_var(softnet_data);
+	sd = this_cpu_ptr(&softnet_data);
 	q->next_sched = NULL;
 	*sd->output_queue_tailp = q;
 	sd->output_queue_tailp = &q->next_sched;
@@ -3125,7 +3125,7 @@
 static int rps_ipi_queued(struct softnet_data *sd)
 {
 #ifdef CONFIG_RPS
-	struct softnet_data *mysd = &__get_cpu_var(softnet_data);
+	struct softnet_data *mysd = this_cpu_ptr(&softnet_data);
 
 	if (sd != mysd) {
 		sd->rps_ipi_next = mysd->rps_ipi_list;
@@ -3152,7 +3152,7 @@
 	if (qlen < (netdev_max_backlog >> 1))
 		return false;
 
-	sd = &__get_cpu_var(softnet_data);
+	sd = this_cpu_ptr(&softnet_data);
 
 	rcu_read_lock();
 	fl = rcu_dereference(sd->flow_limit);
@@ -3303,7 +3303,7 @@
 
 static void net_tx_action(struct softirq_action *h)
 {
-	struct softnet_data *sd = &__get_cpu_var(softnet_data);
+	struct softnet_data *sd = this_cpu_ptr(&softnet_data);
 
 	if (sd->completion_queue) {
 		struct sk_buff *clist;
@@ -3733,7 +3733,7 @@
 static void flush_backlog(void *arg)
 {
 	struct net_device *dev = arg;
-	struct softnet_data *sd = &__get_cpu_var(softnet_data);
+	struct softnet_data *sd = this_cpu_ptr(&softnet_data);
 	struct sk_buff *skb, *tmp;
 
 	rps_lock(sd);
@@ -4205,7 +4205,7 @@
 	unsigned long flags;
 
 	local_irq_save(flags);
-	____napi_schedule(&__get_cpu_var(softnet_data), n);
+	____napi_schedule(this_cpu_ptr(&softnet_data), n);
 	local_irq_restore(flags);
 }
 EXPORT_SYMBOL(__napi_schedule);
@@ -4326,7 +4326,7 @@
 
 static void net_rx_action(struct softirq_action *h)
 {
-	struct softnet_data *sd = &__get_cpu_var(softnet_data);
+	struct softnet_data *sd = this_cpu_ptr(&softnet_data);
 	unsigned long time_limit = jiffies + 2;
 	int budget = netdev_budget;
 	void *have;
Index: linux/net/core/drop_monitor.c
===================================================================
--- linux.orig/net/core/drop_monitor.c	2014-02-03 13:23:33.324833855 -0600
+++ linux/net/core/drop_monitor.c	2014-02-03 13:23:33.314834063 -0600
@@ -146,7 +146,7 @@
 	unsigned long flags;
 
 	local_irq_save(flags);
-	data = &__get_cpu_var(dm_cpu_data);
+	data = this_cpu_ptr(&dm_cpu_data);
 	spin_lock(&data->lock);
 	dskb = data->skb;
 
Index: linux/net/core/skbuff.c
===================================================================
--- linux.orig/net/core/skbuff.c	2014-02-03 13:23:33.324833855 -0600
+++ linux/net/core/skbuff.c	2014-02-03 13:23:33.314834063 -0600
@@ -344,7 +344,7 @@
 	unsigned long flags;
 
 	local_irq_save(flags);
-	nc = &__get_cpu_var(netdev_alloc_cache);
+	nc = this_cpu_ptr(&netdev_alloc_cache);
 	if (unlikely(!nc->frag.page)) {
 refill:
 		for (order = NETDEV_FRAG_PAGE_MAX_ORDER; ;) {
Index: linux/net/ipv4/tcp_output.c
===================================================================
--- linux.orig/net/ipv4/tcp_output.c	2014-02-03 13:23:33.324833855 -0600
+++ linux/net/ipv4/tcp_output.c	2014-02-03 13:23:33.314834063 -0600
@@ -817,7 +817,7 @@
 
 		/* queue this socket to tasklet queue */
 		local_irq_save(flags);
-		tsq = &__get_cpu_var(tsq_tasklet);
+		tsq = this_cpu_ptr(&tsq_tasklet);
 		list_add(&tp->tsq_node, &tsq->head);
 		tasklet_schedule(&tsq->tasklet);
 		local_irq_restore(flags);
Index: linux/net/ipv6/syncookies.c
===================================================================
--- linux.orig/net/ipv6/syncookies.c	2014-02-03 13:23:33.324833855 -0600
+++ linux/net/ipv6/syncookies.c	2014-02-03 13:23:33.314834063 -0600
@@ -67,7 +67,7 @@
 
 	net_get_random_once(syncookie6_secret, sizeof(syncookie6_secret));
 
-	tmp  = __get_cpu_var(ipv6_cookie_scratch);
+	tmp  = this_cpu_ptr(ipv6_cookie_scratch);
 
 	/*
 	 * we have 320 bits of information to hash, copy in the remaining
Index: linux/net/rds/ib_rdma.c
===================================================================
--- linux.orig/net/rds/ib_rdma.c	2014-02-03 13:23:33.324833855 -0600
+++ linux/net/rds/ib_rdma.c	2014-02-03 13:23:33.314834063 -0600
@@ -267,7 +267,7 @@
 	unsigned long *flag;
 
 	preempt_disable();
-	flag = &__get_cpu_var(clean_list_grace);
+	flag = this_cpu_ptr(&clean_list_grace);
 	set_bit(CLEAN_LIST_BUSY_BIT, flag);
 	ret = llist_del_first(&pool->clean_list);
 	if (ret)
Index: linux/include/net/netfilter/nf_conntrack.h
===================================================================
--- linux.orig/include/net/netfilter/nf_conntrack.h	2014-02-03 13:23:33.324833855 -0600
+++ linux/include/net/netfilter/nf_conntrack.h	2014-02-03 13:23:33.314834063 -0600
@@ -235,7 +235,7 @@
 DECLARE_PER_CPU(struct nf_conn, nf_conntrack_untracked);
 static inline struct nf_conn *nf_ct_untracked_get(void)
 {
-	return &__raw_get_cpu_var(nf_conntrack_untracked);
+	return raw_cpu_ptr(&nf_conntrack_untracked);
 }
 void nf_ct_untracked_status_or(unsigned long bits);
 
Index: linux/include/net/snmp.h
===================================================================
--- linux.orig/include/net/snmp.h	2014-02-03 13:23:33.324833855 -0600
+++ linux/include/net/snmp.h	2014-02-03 13:23:33.314834063 -0600
@@ -170,7 +170,7 @@
 
 #define SNMP_ADD_STATS64_BH(mib, field, addend) 			\
 	do {								\
-		__typeof__(*mib[0]) *ptr = __this_cpu_ptr((mib)[0]);	\
+		__typeof__(*mib[0]) *ptr = raw_cpu_ptr((mib)[0]);	\
 		u64_stats_update_begin(&ptr->syncp);			\
 		ptr->mibs[field] += addend;				\
 		u64_stats_update_end(&ptr->syncp);			\
@@ -192,7 +192,7 @@
 #define SNMP_UPD_PO_STATS64_BH(mib, basefield, addend)			\
 	do {								\
 		__typeof__(*mib[0]) *ptr;				\
-		ptr = __this_cpu_ptr((mib)[0]);				\
+		ptr = raw_cpu_ptr((mib)[0]);				\
 		u64_stats_update_begin(&ptr->syncp);			\
 		ptr->mibs[basefield##PKTS]++;				\
 		ptr->mibs[basefield##OCTETS] += addend;			\
Index: linux/net/ipv4/route.c
===================================================================
--- linux.orig/net/ipv4/route.c	2014-02-03 13:23:33.324833855 -0600
+++ linux/net/ipv4/route.c	2014-02-03 13:23:33.314834063 -0600
@@ -1303,7 +1303,7 @@
 	if (rt_is_input_route(rt)) {
 		p = (struct rtable **)&nh->nh_rth_input;
 	} else {
-		p = (struct rtable **)__this_cpu_ptr(nh->nh_pcpu_rth_output);
+		p = (struct rtable **)raw_cpu_ptr(nh->nh_pcpu_rth_output);
 	}
 	orig = *p;
 
@@ -1929,7 +1929,7 @@
 				do_cache = false;
 				goto add;
 			}
-			prth = __this_cpu_ptr(nh->nh_pcpu_rth_output);
+			prth = raw_cpu_ptr(nh->nh_pcpu_rth_output);
 		}
 		rth = rcu_dereference(*prth);
 		if (rt_cache_valid(rth)) {
Index: linux/net/ipv4/tcp.c
===================================================================
--- linux.orig/net/ipv4/tcp.c	2014-02-03 13:23:33.324833855 -0600
+++ linux/net/ipv4/tcp.c	2014-02-03 13:23:33.314834063 -0600
@@ -3024,7 +3024,7 @@
 	local_bh_disable();
 	p = ACCESS_ONCE(tcp_md5sig_pool);
 	if (p)
-		return __this_cpu_ptr(p);
+		return raw_cpu_ptr(p);
 
 	local_bh_enable();
 	return NULL;
Index: linux/net/ipv4/syncookies.c
===================================================================
--- linux.orig/net/ipv4/syncookies.c	2014-02-03 13:23:33.324833855 -0600
+++ linux/net/ipv4/syncookies.c	2014-02-03 13:23:33.324833855 -0600
@@ -40,7 +40,7 @@
 
 	net_get_random_once(syncookie_secret, sizeof(syncookie_secret));
 
-	tmp  = __get_cpu_var(ipv4_cookie_scratch);
+	tmp  = this_cpu_ptr(ipv4_cookie_scratch);
 	memcpy(tmp + 4, syncookie_secret[c], sizeof(syncookie_secret[c]));
 	tmp[0] = (__force u32)saddr;
 	tmp[1] = (__force u32)daddr;


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 27/48] md: Replace __this_cpu_ptr with raw_cpu_ptr
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (25 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 26/48] net: Replace get_cpu_var through this_cpu_ptr Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 28/48] irqchips: Replace __this_cpu_ptr uses Christoph Lameter
                   ` (21 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: this_drivers_md --]
[-- Type: text/plain, Size: 814 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

__this_cpu_ptr is being phased out.

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/drivers/md/dm-stats.c
===================================================================
--- linux.orig/drivers/md/dm-stats.c	2013-12-02 16:07:54.904491557 -0600
+++ linux/drivers/md/dm-stats.c	2013-12-02 16:07:54.904491557 -0600
@@ -548,7 +548,7 @@ void dm_stats_account_io(struct dm_stats
 		 * A race condition can at worst result in the merged flag being
 		 * misrepresented, so we don't have to disable preemption here.
 		 */
-		last = __this_cpu_ptr(stats->last);
+		last = raw_cpu_ptr(stats->last);
 		stats_aux->merged =
 			(bi_sector == (ACCESS_ONCE(last->last_sector) &&
 				       ((bi_rw & (REQ_WRITE | REQ_DISCARD)) ==


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 28/48] irqchips: Replace __this_cpu_ptr uses
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (26 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 27/48] md: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 29/48] x86: Replace __get_cpu_var uses Christoph Lameter
                   ` (20 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, nicolas.pitre, Russell King

[-- Attachment #1: this_irqchip --]
[-- Type: text/plain, Size: 2639 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

These are generally replaced with raw_cpu_ptr(). However,
gic_get_percpu_base() immediately dereferences the pointer, which is
equivalent to a raw_cpu_read(), so that operation is used there.
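
In gic_get_percpu_base() the conversion is therefore:

	return *__this_cpu_ptr(base->percpu_base);	/* old: compute pointer, then dereference */
	return raw_cpu_read(base->percpu_base);		/* new: direct read of the per cpu slot */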

Cc: nicolas.pitre@linaro.org
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/drivers/irqchip/irq-gic.c
===================================================================
--- linux.orig/drivers/irqchip/irq-gic.c	2013-12-02 16:07:55.564473217 -0600
+++ linux/drivers/irqchip/irq-gic.c	2013-12-02 16:07:55.554473493 -0600
@@ -102,7 +102,7 @@ static struct gic_chip_data gic_data[MAX
 #ifdef CONFIG_GIC_NON_BANKED
 static void __iomem *gic_get_percpu_base(union gic_base *base)
 {
-	return *__this_cpu_ptr(base->percpu_base);
+	return raw_cpu_read(base->percpu_base);
 }
 
 static void __iomem *gic_get_common_base(union gic_base *base)
@@ -552,11 +552,11 @@ static void gic_cpu_save(unsigned int gi
 	if (!dist_base || !cpu_base)
 		return;
 
-	ptr = __this_cpu_ptr(gic_data[gic_nr].saved_ppi_enable);
+	ptr = raw_cpu_ptr(gic_data[gic_nr].saved_ppi_enable);
 	for (i = 0; i < DIV_ROUND_UP(32, 32); i++)
 		ptr[i] = readl_relaxed(dist_base + GIC_DIST_ENABLE_SET + i * 4);
 
-	ptr = __this_cpu_ptr(gic_data[gic_nr].saved_ppi_conf);
+	ptr = raw_cpu_ptr(gic_data[gic_nr].saved_ppi_conf);
 	for (i = 0; i < DIV_ROUND_UP(32, 16); i++)
 		ptr[i] = readl_relaxed(dist_base + GIC_DIST_CONFIG + i * 4);
 
@@ -578,11 +578,11 @@ static void gic_cpu_restore(unsigned int
 	if (!dist_base || !cpu_base)
 		return;
 
-	ptr = __this_cpu_ptr(gic_data[gic_nr].saved_ppi_enable);
+	ptr = raw_cpu_ptr(gic_data[gic_nr].saved_ppi_enable);
 	for (i = 0; i < DIV_ROUND_UP(32, 32); i++)
 		writel_relaxed(ptr[i], dist_base + GIC_DIST_ENABLE_SET + i * 4);
 
-	ptr = __this_cpu_ptr(gic_data[gic_nr].saved_ppi_conf);
+	ptr = raw_cpu_ptr(gic_data[gic_nr].saved_ppi_conf);
 	for (i = 0; i < DIV_ROUND_UP(32, 16); i++)
 		writel_relaxed(ptr[i], dist_base + GIC_DIST_CONFIG + i * 4);
 
Index: linux/kernel/irq/chip.c
===================================================================
--- linux.orig/kernel/irq/chip.c	2013-12-02 16:07:55.564473217 -0600
+++ linux/kernel/irq/chip.c	2013-12-02 16:07:55.554473493 -0600
@@ -638,7 +638,7 @@ void handle_percpu_devid_irq(unsigned in
 {
 	struct irq_chip *chip = irq_desc_get_chip(desc);
 	struct irqaction *action = desc->action;
-	void *dev_id = __this_cpu_ptr(action->percpu_dev_id);
+	void *dev_id = raw_cpu_ptr(action->percpu_dev_id);
 	irqreturn_t res;
 
 	kstat_incr_irqs_this_cpu(irq, desc);


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 29/48] x86: Replace __get_cpu_var uses
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (27 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 28/48] irqchips: Replace __this_cpu_ptr uses Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 30/48] x86: Change __get_cpu_var calls introduced in 3.14 Christoph Lameter
                   ` (19 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, x86, H. Peter Anvin

[-- Attachment #1: this_x86 --]
[-- Type: text/plain, Size: 45035 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are storing to and retrieving data from the current
processor's percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as:

#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))

__get_cpu_var() only ever performs an address calculation. However, store
and retrieve operations could use a segment prefix (or a global register on
other platforms) to avoid the address calculation entirely.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.
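
Roughly, on x86 this means (illustrative sketch, not exact compiler
output):

	x = __get_cpu_var(y);	/* compute &y plus the per cpu offset, then load */
	x = this_cpu_read(y);	/* a single gs-prefixed mov, no address arithmetic */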


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
take the offset directly.  Thereby address calculations are avoided and
fewer registers are used when code is generated.

Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y);

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++;

   Converts to

	__this_cpu_inc(y);


Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/x86/kernel/cpu/mcheck/mce_intel.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce_intel.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/cpu/mcheck/mce_intel.c	2014-02-03 13:25:42.032160449 -0600
@@ -85,7 +85,7 @@
 {
 	if (__this_cpu_read(cmci_storm_state) == CMCI_STORM_NONE)
 		return;
-	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
+	machine_check_poll(MCP_TIMESTAMP, this_cpu_ptr(&mce_banks_owned));
 }
 
 void mce_intel_hcpu_update(unsigned long cpu)
@@ -178,7 +178,7 @@
 {
 	if (cmci_storm_detect())
 		return;
-	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
+	machine_check_poll(MCP_TIMESTAMP, this_cpu_ptr(&mce_banks_owned));
 	mce_notify_irq();
 }
 
@@ -189,7 +189,7 @@
  */
 static void cmci_discover(int banks)
 {
-	unsigned long *owned = (void *)&__get_cpu_var(mce_banks_owned);
+	unsigned long *owned = (void *)this_cpu_ptr(&mce_banks_owned);
 	unsigned long flags;
 	int i;
 	int bios_wrong_thresh = 0;
@@ -211,7 +211,7 @@
 		/* Already owned by someone else? */
 		if (val & MCI_CTL2_CMCI_EN) {
 			clear_bit(i, owned);
-			__clear_bit(i, __get_cpu_var(mce_poll_banks));
+			__clear_bit(i, this_cpu_ptr(mce_poll_banks));
 			continue;
 		}
 
@@ -235,7 +235,7 @@
 		/* Did the enable bit stick? -- the bank supports CMCI */
 		if (val & MCI_CTL2_CMCI_EN) {
 			set_bit(i, owned);
-			__clear_bit(i, __get_cpu_var(mce_poll_banks));
+			__clear_bit(i, this_cpu_ptr(mce_poll_banks));
 			/*
 			 * We are able to set thresholds for some banks that
 			 * had a threshold of 0. This means the BIOS has not
@@ -246,7 +246,7 @@
 					(val & MCI_CTL2_CMCI_THRESHOLD_MASK))
 				bios_wrong_thresh = 1;
 		} else {
-			WARN_ON(!test_bit(i, __get_cpu_var(mce_poll_banks)));
+			WARN_ON(!test_bit(i, this_cpu_ptr(mce_poll_banks)));
 		}
 	}
 	raw_spin_unlock_irqrestore(&cmci_discover_lock, flags);
@@ -267,10 +267,10 @@
 	unsigned long flags;
 	int banks;
 
-	if (!mce_available(__this_cpu_ptr(&cpu_info)) || !cmci_supported(&banks))
+	if (!mce_available(raw_cpu_ptr(&cpu_info)) || !cmci_supported(&banks))
 		return;
 	local_irq_save(flags);
-	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
+	machine_check_poll(MCP_TIMESTAMP, this_cpu_ptr(&mce_banks_owned));
 	local_irq_restore(flags);
 }
 
@@ -279,12 +279,12 @@
 {
 	u64 val;
 
-	if (!test_bit(bank, __get_cpu_var(mce_banks_owned)))
+	if (!test_bit(bank, this_cpu_ptr(mce_banks_owned)))
 		return;
 	rdmsrl(MSR_IA32_MCx_CTL2(bank), val);
 	val &= ~MCI_CTL2_CMCI_EN;
 	wrmsrl(MSR_IA32_MCx_CTL2(bank), val);
-	__clear_bit(bank, __get_cpu_var(mce_banks_owned));
+	__clear_bit(bank, this_cpu_ptr(mce_banks_owned));
 }
 
 /*
Index: linux/arch/x86/kernel/irq_64.c
===================================================================
--- linux.orig/arch/x86/kernel/irq_64.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/irq_64.c	2014-02-03 13:25:42.032160449 -0600
@@ -52,13 +52,13 @@
 	    regs->sp <= curbase + THREAD_SIZE)
 		return;
 
-	irq_stack_top = (u64)__get_cpu_var(irq_stack_union.irq_stack) +
+	irq_stack_top = (u64)this_cpu_ptr(irq_stack_union.irq_stack) +
 			STACK_TOP_MARGIN;
-	irq_stack_bottom = (u64)__get_cpu_var(irq_stack_ptr);
+	irq_stack_bottom = (u64)__this_cpu_read(irq_stack_ptr);
 	if (regs->sp >= irq_stack_top && regs->sp <= irq_stack_bottom)
 		return;
 
-	oist = &__get_cpu_var(orig_ist);
+	oist = this_cpu_ptr(&orig_ist);
 	estack_top = (u64)oist->ist[0] - EXCEPTION_STKSZ + STACK_TOP_MARGIN;
 	estack_bottom = (u64)oist->ist[N_EXCEPTION_STACKS - 1];
 	if (regs->sp >= estack_top && regs->sp <= estack_bottom)
Index: linux/arch/x86/kernel/kvm.c
===================================================================
--- linux.orig/arch/x86/kernel/kvm.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/kvm.c	2014-02-03 13:25:42.032160449 -0600
@@ -243,9 +243,9 @@
 {
 	u32 reason = 0;
 
-	if (__get_cpu_var(apf_reason).enabled) {
-		reason = __get_cpu_var(apf_reason).reason;
-		__get_cpu_var(apf_reason).reason = 0;
+	if (__this_cpu_read(apf_reason.enabled)) {
+		reason = __this_cpu_read(apf_reason.reason);
+		__this_cpu_write(apf_reason.reason, 0);
 	}
 
 	return reason;
@@ -316,7 +316,7 @@
 	 * there's no need for lock or memory barriers.
 	 * An optimization barrier is implied in apic write.
 	 */
-	if (__test_and_clear_bit(KVM_PV_EOI_BIT, &__get_cpu_var(kvm_apic_eoi)))
+	if (__test_and_clear_bit(KVM_PV_EOI_BIT, this_cpu_ptr(&kvm_apic_eoi)))
 		return;
 	apic_write(APIC_EOI, APIC_EOI_ACK);
 }
@@ -327,13 +327,13 @@
 		return;
 
 	if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF) && kvmapf) {
-		u64 pa = slow_virt_to_phys(&__get_cpu_var(apf_reason));
+		u64 pa = slow_virt_to_phys(this_cpu_ptr(&apf_reason));
 
 #ifdef CONFIG_PREEMPT
 		pa |= KVM_ASYNC_PF_SEND_ALWAYS;
 #endif
 		wrmsrl(MSR_KVM_ASYNC_PF_EN, pa | KVM_ASYNC_PF_ENABLED);
-		__get_cpu_var(apf_reason).enabled = 1;
+		__this_cpu_write(apf_reason.enabled, 1);
 		printk(KERN_INFO"KVM setup async PF for cpu %d\n",
 		       smp_processor_id());
 	}
@@ -342,8 +342,8 @@
 		unsigned long pa;
 		/* Size alignment is implied but just to make it explicit. */
 		BUILD_BUG_ON(__alignof__(kvm_apic_eoi) < 4);
-		__get_cpu_var(kvm_apic_eoi) = 0;
-		pa = slow_virt_to_phys(&__get_cpu_var(kvm_apic_eoi))
+		__this_cpu_write(kvm_apic_eoi, 0);
+		pa = slow_virt_to_phys(this_cpu_ptr(&kvm_apic_eoi))
 			| KVM_MSR_ENABLED;
 		wrmsrl(MSR_KVM_PV_EOI_EN, pa);
 	}
@@ -354,11 +354,11 @@
 
 static void kvm_pv_disable_apf(void)
 {
-	if (!__get_cpu_var(apf_reason).enabled)
+	if (!__this_cpu_read(apf_reason.enabled))
 		return;
 
 	wrmsrl(MSR_KVM_ASYNC_PF_EN, 0);
-	__get_cpu_var(apf_reason).enabled = 0;
+	__this_cpu_write(apf_reason.enabled, 0);
 
 	printk(KERN_INFO"Unregister pv shared memory for cpu %d\n",
 	       smp_processor_id());
@@ -715,7 +715,7 @@
 	if (in_nmi())
 		return;
 
-	w = &__get_cpu_var(klock_waiting);
+	w = this_cpu_ptr(&klock_waiting);
 	cpu = smp_processor_id();
 	start = spin_time_start();
 
Index: linux/arch/x86/kvm/svm.c
===================================================================
--- linux.orig/arch/x86/kvm/svm.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kvm/svm.c	2014-02-03 13:25:42.032160449 -0600
@@ -654,7 +654,7 @@
 
 	if (static_cpu_has(X86_FEATURE_TSCRATEMSR)) {
 		wrmsrl(MSR_AMD64_TSC_RATIO, TSC_RATIO_DEFAULT);
-		__get_cpu_var(current_tsc_ratio) = TSC_RATIO_DEFAULT;
+		__this_cpu_write(current_tsc_ratio, TSC_RATIO_DEFAULT);
 	}
 
 
@@ -1312,8 +1312,8 @@
 		rdmsrl(host_save_user_msrs[i], svm->host_user_msrs[i]);
 
 	if (static_cpu_has(X86_FEATURE_TSCRATEMSR) &&
-	    svm->tsc_ratio != __get_cpu_var(current_tsc_ratio)) {
-		__get_cpu_var(current_tsc_ratio) = svm->tsc_ratio;
+	    svm->tsc_ratio != __this_cpu_read(current_tsc_ratio)) {
+		__this_cpu_write(current_tsc_ratio, svm->tsc_ratio);
 		wrmsrl(MSR_AMD64_TSC_RATIO, svm->tsc_ratio);
 	}
 }
Index: linux/arch/x86/kvm/x86.c
===================================================================
--- linux.orig/arch/x86/kvm/x86.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kvm/x86.c	2014-02-03 13:25:42.032160449 -0600
@@ -1535,7 +1535,7 @@
 
 	/* Keep irq disabled to prevent changes to the clock */
 	local_irq_save(flags);
-	this_tsc_khz = __get_cpu_var(cpu_tsc_khz);
+	this_tsc_khz = __this_cpu_read(cpu_tsc_khz);
 	if (unlikely(this_tsc_khz == 0)) {
 		local_irq_restore(flags);
 		kvm_make_request(KVM_REQ_CLOCK_UPDATE, v);
Index: linux/arch/x86/oprofile/op_model_p4.c
===================================================================
--- linux.orig/arch/x86/oprofile/op_model_p4.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/oprofile/op_model_p4.c	2014-02-03 13:25:42.042160239 -0600
@@ -372,7 +372,7 @@
 {
 #ifdef CONFIG_SMP
 	int cpu = smp_processor_id();
-	return cpu != cpumask_first(__get_cpu_var(cpu_sibling_map));
+	return cpu != cpumask_first(this_cpu_ptr(cpu_sibling_map));
 #endif
 	return 0;
 }
Index: linux/arch/x86/xen/enlighten.c
===================================================================
--- linux.orig/arch/x86/xen/enlighten.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/xen/enlighten.c	2014-02-03 13:25:42.042160239 -0600
@@ -821,7 +821,7 @@
 
 void xen_copy_trap_info(struct trap_info *traps)
 {
-	const struct desc_ptr *desc = &__get_cpu_var(idt_desc);
+	const struct desc_ptr *desc = this_cpu_ptr(&idt_desc);
 
 	xen_convert_trap_info(desc, traps);
 }
@@ -838,7 +838,7 @@
 
 	spin_lock(&lock);
 
-	__get_cpu_var(idt_desc) = *desc;
+	memcpy(this_cpu_ptr(&idt_desc), desc, sizeof(idt_desc));
 
 	xen_convert_trap_info(desc, traps);
 
Index: linux/arch/x86/kernel/apb_timer.c
===================================================================
--- linux.orig/arch/x86/kernel/apb_timer.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/apb_timer.c	2014-02-03 13:25:42.042160239 -0600
@@ -146,7 +146,7 @@
 static int __init apbt_clockevent_register(void)
 {
 	struct sfi_timer_table_entry *mtmr;
-	struct apbt_dev *adev = &__get_cpu_var(cpu_apbt_dev);
+	struct apbt_dev *adev = this_cpu_ptr(&cpu_apbt_dev);
 
 	mtmr = sfi_get_mtmr(APBT_CLOCKEVENT0_NUM);
 	if (mtmr == NULL) {
@@ -200,7 +200,7 @@
 	if (!cpu)
 		return;
 
-	adev = &__get_cpu_var(cpu_apbt_dev);
+	adev = this_cpu_ptr(&cpu_apbt_dev);
 	if (!adev->timer) {
 		adev->timer = dw_apb_clockevent_init(cpu, adev->name,
 			APBT_CLOCKEVENT_RATING, adev_virt_addr(adev),
Index: linux/arch/x86/kernel/apic/apic.c
===================================================================
--- linux.orig/arch/x86/kernel/apic/apic.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/apic/apic.c	2014-02-03 13:25:42.042160239 -0600
@@ -554,7 +554,7 @@
  */
 static void setup_APIC_timer(void)
 {
-	struct clock_event_device *levt = &__get_cpu_var(lapic_events);
+	struct clock_event_device *levt = this_cpu_ptr(&lapic_events);
 
 	if (this_cpu_has(X86_FEATURE_ARAT)) {
 		lapic_clockevent.features &= ~CLOCK_EVT_FEAT_C3STOP;
@@ -689,7 +689,7 @@
 
 static int __init calibrate_APIC_clock(void)
 {
-	struct clock_event_device *levt = &__get_cpu_var(lapic_events);
+	struct clock_event_device *levt = this_cpu_ptr(&lapic_events);
 	void (*real_handler)(struct clock_event_device *dev);
 	unsigned long deltaj;
 	long delta, deltatsc;
Index: linux/arch/x86/kernel/cpu/mcheck/mce-inject.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce-inject.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/cpu/mcheck/mce-inject.c	2014-02-03 13:25:42.042160239 -0600
@@ -83,7 +83,7 @@
 static int mce_raise_notify(unsigned int cmd, struct pt_regs *regs)
 {
 	int cpu = smp_processor_id();
-	struct mce *m = &__get_cpu_var(injectm);
+	struct mce *m = this_cpu_ptr(&injectm);
 	if (!cpumask_test_cpu(cpu, mce_inject_cpumask))
 		return NMI_DONE;
 	cpumask_clear_cpu(cpu, mce_inject_cpumask);
@@ -97,7 +97,7 @@
 static void mce_irq_ipi(void *info)
 {
 	int cpu = smp_processor_id();
-	struct mce *m = &__get_cpu_var(injectm);
+	struct mce *m = this_cpu_ptr(&injectm);
 
 	if (cpumask_test_cpu(cpu, mce_inject_cpumask) &&
 			m->inject_flags & MCJ_EXCEPTION) {
@@ -109,7 +109,7 @@
 /* Inject mce on current CPU */
 static int raise_local(void)
 {
-	struct mce *m = &__get_cpu_var(injectm);
+	struct mce *m = this_cpu_ptr(&injectm);
 	int context = MCJ_CTX(m->inject_flags);
 	int ret = 0;
 	int cpu = m->extcpu;
Index: linux/arch/x86/kernel/cpu/mcheck/mce.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/cpu/mcheck/mce.c	2014-02-03 13:25:42.042160239 -0600
@@ -399,7 +399,7 @@
 
 		if (offset < 0)
 			return 0;
-		return *(u64 *)((char *)&__get_cpu_var(injectm) + offset);
+		return *(u64 *)((char *)this_cpu_ptr(&injectm) + offset);
 	}
 
 	if (rdmsrl_safe(msr, &v)) {
@@ -421,7 +421,7 @@
 		int offset = msr_to_offset(msr);
 
 		if (offset >= 0)
-			*(u64 *)((char *)&__get_cpu_var(injectm) + offset) = v;
+			*(u64 *)((char *)this_cpu_ptr(&injectm) + offset) = v;
 		return;
 	}
 	wrmsrl(msr, v);
@@ -477,7 +477,7 @@
 /* Runs with CPU affinity in workqueue */
 static int mce_ring_empty(void)
 {
-	struct mce_ring *r = &__get_cpu_var(mce_ring);
+	struct mce_ring *r = this_cpu_ptr(&mce_ring);
 
 	return r->start == r->end;
 }
@@ -489,7 +489,7 @@
 
 	*pfn = 0;
 	get_cpu();
-	r = &__get_cpu_var(mce_ring);
+	r = this_cpu_ptr(&mce_ring);
 	if (r->start == r->end)
 		goto out;
 	*pfn = r->ring[r->start];
@@ -503,7 +503,7 @@
 /* Always runs in MCE context with preempt off */
 static int mce_ring_add(unsigned long pfn)
 {
-	struct mce_ring *r = &__get_cpu_var(mce_ring);
+	struct mce_ring *r = this_cpu_ptr(&mce_ring);
 	unsigned next;
 
 	next = (r->end + 1) % MCE_RING_SIZE;
@@ -525,7 +525,7 @@
 static void mce_schedule_work(void)
 {
 	if (!mce_ring_empty())
-		schedule_work(&__get_cpu_var(mce_work));
+		schedule_work(this_cpu_ptr(&mce_work));
 }
 
 DEFINE_PER_CPU(struct irq_work, mce_irq_work);
@@ -550,7 +550,7 @@
 		return;
 	}
 
-	irq_work_queue(&__get_cpu_var(mce_irq_work));
+	irq_work_queue(this_cpu_ptr(&mce_irq_work));
 }
 
 /*
@@ -1046,7 +1046,7 @@
 
 	mce_gather_info(&m, regs);
 
-	final = &__get_cpu_var(mces_seen);
+	final = this_cpu_ptr(&mces_seen);
 	*final = m;
 
 	memset(valid_banks, 0, sizeof(valid_banks));
@@ -1280,14 +1280,14 @@
 
 static void mce_timer_fn(unsigned long data)
 {
-	struct timer_list *t = &__get_cpu_var(mce_timer);
+	struct timer_list *t = this_cpu_ptr(&mce_timer);
 	unsigned long iv;
 
 	WARN_ON(smp_processor_id() != data);
 
-	if (mce_available(__this_cpu_ptr(&cpu_info))) {
+	if (mce_available(this_cpu_ptr(&cpu_info))) {
 		machine_check_poll(MCP_TIMESTAMP,
-				&__get_cpu_var(mce_poll_banks));
+				this_cpu_ptr(&mce_poll_banks));
 		mce_intel_cmci_poll();
 	}
 
@@ -1315,7 +1315,7 @@
  */
 void mce_timer_kick(unsigned long interval)
 {
-	struct timer_list *t = &__get_cpu_var(mce_timer);
+	struct timer_list *t = this_cpu_ptr(&mce_timer);
 	unsigned long when = jiffies + interval;
 	unsigned long iv = __this_cpu_read(mce_next_interval);
 
@@ -1651,7 +1651,7 @@
 
 static void __mcheck_cpu_init_timer(void)
 {
-	struct timer_list *t = &__get_cpu_var(mce_timer);
+	struct timer_list *t = this_cpu_ptr(&mce_timer);
 	unsigned int cpu = smp_processor_id();
 
 	setup_timer(t, mce_timer_fn, cpu);
@@ -1694,8 +1694,8 @@
 	__mcheck_cpu_init_generic();
 	__mcheck_cpu_init_vendor(c);
 	__mcheck_cpu_init_timer();
-	INIT_WORK(&__get_cpu_var(mce_work), mce_process_work);
-	init_irq_work(&__get_cpu_var(mce_irq_work), &mce_irq_work_cb);
+	INIT_WORK(this_cpu_ptr(&mce_work), mce_process_work);
+	init_irq_work(this_cpu_ptr(&mce_irq_work), &mce_irq_work_cb);
 }
 
 /*
@@ -1947,7 +1947,7 @@
 static void __mce_disable_bank(void *arg)
 {
 	int bank = *((int *)arg);
-	__clear_bit(bank, __get_cpu_var(mce_poll_banks));
+	__clear_bit(bank, this_cpu_ptr(mce_poll_banks));
 	cmci_disable_bank(bank);
 }
 
@@ -2057,7 +2057,7 @@
 static void mce_syscore_resume(void)
 {
 	__mcheck_cpu_init_generic();
-	__mcheck_cpu_init_vendor(__this_cpu_ptr(&cpu_info));
+	__mcheck_cpu_init_vendor(raw_cpu_ptr(&cpu_info));
 }
 
 static struct syscore_ops mce_syscore_ops = {
@@ -2072,7 +2072,7 @@
 
 static void mce_cpu_restart(void *data)
 {
-	if (!mce_available(__this_cpu_ptr(&cpu_info)))
+	if (!mce_available(raw_cpu_ptr(&cpu_info)))
 		return;
 	__mcheck_cpu_init_generic();
 	__mcheck_cpu_init_timer();
@@ -2088,14 +2088,14 @@
 /* Toggle features for corrected errors */
 static void mce_disable_cmci(void *data)
 {
-	if (!mce_available(__this_cpu_ptr(&cpu_info)))
+	if (!mce_available(raw_cpu_ptr(&cpu_info)))
 		return;
 	cmci_clear();
 }
 
 static void mce_enable_ce(void *all)
 {
-	if (!mce_available(__this_cpu_ptr(&cpu_info)))
+	if (!mce_available(raw_cpu_ptr(&cpu_info)))
 		return;
 	cmci_reenable();
 	cmci_recheck();
@@ -2328,7 +2328,7 @@
 	unsigned long action = *(unsigned long *)h;
 	int i;
 
-	if (!mce_available(__this_cpu_ptr(&cpu_info)))
+	if (!mce_available(raw_cpu_ptr(&cpu_info)))
 		return;
 
 	if (!(action & CPU_TASKS_FROZEN))
@@ -2346,7 +2346,7 @@
 	unsigned long action = *(unsigned long *)h;
 	int i;
 
-	if (!mce_available(__this_cpu_ptr(&cpu_info)))
+	if (!mce_available(raw_cpu_ptr(&cpu_info)))
 		return;
 
 	if (!(action & CPU_TASKS_FROZEN))
Index: linux/arch/x86/kernel/cpu/mcheck/mce_amd.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce_amd.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/cpu/mcheck/mce_amd.c	2014-02-03 13:25:42.042160239 -0600
@@ -310,7 +310,7 @@
 			 * event.
 			 */
 			machine_check_poll(MCP_TIMESTAMP,
-					&__get_cpu_var(mce_poll_banks));
+					this_cpu_ptr(&mce_poll_banks));
 
 			if (high & MASK_OVERFLOW_HI) {
 				rdmsrl(address, m.misc);
Index: linux/arch/x86/kernel/cpu/perf_event.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/perf_event.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/cpu/perf_event.c	2014-02-03 13:25:42.042160239 -0600
@@ -493,7 +493,7 @@
 
 void x86_pmu_disable_all(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx;
 
 	for (idx = 0; idx < x86_pmu.num_counters; idx++) {
@@ -511,7 +511,7 @@
 
 static void x86_pmu_disable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (!x86_pmu_initialized())
 		return;
@@ -528,7 +528,7 @@
 
 void x86_pmu_enable_all(int added)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx;
 
 	for (idx = 0; idx < x86_pmu.num_counters; idx++) {
@@ -874,7 +874,7 @@
 
 static void x86_pmu_enable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_event *event;
 	struct hw_perf_event *hwc;
 	int i, added = cpuc->n_added;
@@ -1023,7 +1023,7 @@
  */
 static int x86_pmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc;
 	int assign[X86_PMC_IDX_MAX];
 	int n, n0, ret;
@@ -1070,7 +1070,7 @@
 
 static void x86_pmu_start(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx = event->hw.idx;
 
 	if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED)))
@@ -1149,7 +1149,7 @@
 
 void x86_pmu_stop(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 
 	if (__test_and_clear_bit(hwc->idx, cpuc->active_mask)) {
@@ -1171,7 +1171,7 @@
 
 static void x86_pmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int i;
 
 	/*
@@ -1213,7 +1213,7 @@
 	int idx, handled = 0;
 	u64 val;
 
-	cpuc = &__get_cpu_var(cpu_hw_events);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	/*
 	 * Some chipsets need to unmask the LVTPC in a particular spot
@@ -1608,7 +1608,7 @@
  */
 static int x86_pmu_commit_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int assign[X86_PMC_IDX_MAX];
 	int n, ret;
 
@@ -1964,7 +1964,7 @@
 		if (idx > GDT_ENTRIES)
 			return 0;
 
-		desc = __this_cpu_ptr(&gdt_page.gdt[0]);
+		desc = raw_cpu_ptr(gdt_page.gdt);
 	}
 
 	return get_desc_base(desc + idx);
Index: linux/arch/x86/kernel/cpu/perf_event_amd.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/perf_event_amd.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/cpu/perf_event_amd.c	2014-02-03 13:25:42.042160239 -0600
@@ -699,7 +699,7 @@
 
 void amd_pmu_enable_virt(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	cpuc->perf_ctr_virt_mask = 0;
 
@@ -711,7 +711,7 @@
 
 void amd_pmu_disable_virt(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	/*
 	 * We only mask out the Host-only bit so that host-only counting works
Index: linux/arch/x86/kernel/cpu/perf_event_intel.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/perf_event_intel.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/cpu/perf_event_intel.c	2014-02-03 13:25:42.042160239 -0600
@@ -1046,7 +1046,7 @@
 
 static void intel_pmu_disable_all(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0);
 
@@ -1059,7 +1059,7 @@
 
 static void intel_pmu_enable_all(int added)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	intel_pmu_pebs_enable_all();
 	intel_pmu_lbr_enable_all();
@@ -1093,7 +1093,7 @@
  */
 static void intel_pmu_nhm_workaround(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	static const unsigned long nhm_magic[4] = {
 		0x4300B5,
 		0x4300D2,
@@ -1192,7 +1192,7 @@
 static void intel_pmu_disable_event(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
 		intel_pmu_disable_bts();
@@ -1256,7 +1256,7 @@
 static void intel_pmu_enable_event(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
 		if (!__this_cpu_read(cpu_hw_events.enabled))
@@ -1350,7 +1350,7 @@
 	u64 status;
 	int handled;
 
-	cpuc = &__get_cpu_var(cpu_hw_events);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	/*
 	 * No known reason to not always do late ACK,
@@ -1775,7 +1775,7 @@
 
 static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
 
 	arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL;
@@ -1796,7 +1796,7 @@
 
 static struct perf_guest_switch_msr *core_guest_get_msrs(int *nr)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
 	int idx;
 
@@ -1830,7 +1830,7 @@
 
 static void core_pmu_enable_all(int added)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx;
 
 	for (idx = 0; idx < x86_pmu.num_counters; idx++) {
Index: linux/arch/x86/kernel/cpu/perf_event_intel_ds.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/perf_event_intel_ds.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/cpu/perf_event_intel_ds.c	2014-02-03 13:25:42.042160239 -0600
@@ -457,7 +457,7 @@
 
 void intel_pmu_disable_bts(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	unsigned long debugctlmsr;
 
 	if (!cpuc->ds)
@@ -474,7 +474,7 @@
 
 int intel_pmu_drain_bts_buffer(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct debug_store *ds = cpuc->ds;
 	struct bts_record {
 		u64	from;
@@ -694,7 +694,7 @@
 
 void intel_pmu_pebs_enable(struct perf_event *event)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 
 	hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
@@ -709,7 +709,7 @@
 
 void intel_pmu_pebs_disable(struct perf_event *event)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 
 	cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
@@ -727,7 +727,7 @@
 
 void intel_pmu_pebs_enable_all(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (cpuc->pebs_enabled)
 		wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
@@ -735,7 +735,7 @@
 
 void intel_pmu_pebs_disable_all(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (cpuc->pebs_enabled)
 		wrmsrl(MSR_IA32_PEBS_ENABLE, 0);
@@ -743,7 +743,7 @@
 
 static int intel_pmu_pebs_fixup_ip(struct pt_regs *regs)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	unsigned long from = cpuc->lbr_entries[0].from;
 	unsigned long old_to, to = cpuc->lbr_entries[0].to;
 	unsigned long ip = regs->ip;
@@ -850,7 +850,7 @@
 	 * We cast to the biggest pebs_record but are careful not to
 	 * unconditionally access the 'extra' entries.
 	 */
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct pebs_record_hsw *pebs = __pebs;
 	struct perf_sample_data data;
 	struct pt_regs regs;
@@ -939,7 +939,7 @@
 
 static void intel_pmu_drain_pebs_core(struct pt_regs *iregs)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct debug_store *ds = cpuc->ds;
 	struct perf_event *event = cpuc->events[0]; /* PMC0 only */
 	struct pebs_record_core *at, *top;
@@ -980,7 +980,7 @@
 
 static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct debug_store *ds = cpuc->ds;
 	struct perf_event *event = NULL;
 	void *at, *top;
Index: linux/arch/x86/kernel/cpu/perf_event_intel_lbr.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/perf_event_intel_lbr.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/cpu/perf_event_intel_lbr.c	2014-02-03 13:25:42.042160239 -0600
@@ -133,7 +133,7 @@
 static void __intel_pmu_lbr_enable(void)
 {
 	u64 debugctl;
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (cpuc->lbr_sel)
 		wrmsrl(MSR_LBR_SELECT, cpuc->lbr_sel->config);
@@ -183,7 +183,7 @@
 
 void intel_pmu_lbr_enable(struct perf_event *event)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (!x86_pmu.lbr_nr)
 		return;
@@ -203,7 +203,7 @@
 
 void intel_pmu_lbr_disable(struct perf_event *event)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (!x86_pmu.lbr_nr)
 		return;
@@ -220,7 +220,7 @@
 
 void intel_pmu_lbr_enable_all(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (cpuc->lbr_users)
 		__intel_pmu_lbr_enable();
@@ -228,7 +228,7 @@
 
 void intel_pmu_lbr_disable_all(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (cpuc->lbr_users)
 		__intel_pmu_lbr_disable();
@@ -332,7 +332,7 @@
 
 void intel_pmu_lbr_read(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (!cpuc->lbr_users)
 		return;
Index: linux/arch/x86/kernel/cpu/perf_event_knc.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/perf_event_knc.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/cpu/perf_event_knc.c	2014-02-03 13:25:42.042160239 -0600
@@ -217,7 +217,7 @@
 	int bit, loops;
 	u64 status;
 
-	cpuc = &__get_cpu_var(cpu_hw_events);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	knc_pmu_disable_all();
 
Index: linux/arch/x86/kernel/cpu/perf_event_p4.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/perf_event_p4.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/cpu/perf_event_p4.c	2014-02-03 13:25:42.052160034 -0600
@@ -915,7 +915,7 @@
 
 static void p4_pmu_disable_all(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx;
 
 	for (idx = 0; idx < x86_pmu.num_counters; idx++) {
@@ -984,7 +984,7 @@
 
 static void p4_pmu_enable_all(int added)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx;
 
 	for (idx = 0; idx < x86_pmu.num_counters; idx++) {
@@ -1004,7 +1004,7 @@
 	int idx, handled = 0;
 	u64 val;
 
-	cpuc = &__get_cpu_var(cpu_hw_events);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	for (idx = 0; idx < x86_pmu.num_counters; idx++) {
 		int overflow;
Index: linux/arch/x86/kernel/hw_breakpoint.c
===================================================================
--- linux.orig/arch/x86/kernel/hw_breakpoint.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/hw_breakpoint.c	2014-02-03 13:25:42.052160034 -0600
@@ -109,7 +109,7 @@
 	int i;
 
 	for (i = 0; i < HBP_NUM; i++) {
-		struct perf_event **slot = &__get_cpu_var(bp_per_reg[i]);
+		struct perf_event **slot = this_cpu_ptr(&bp_per_reg[i]);
 
 		if (!*slot) {
 			*slot = bp;
@@ -123,7 +123,7 @@
 	set_debugreg(info->address, i);
 	__this_cpu_write(cpu_debugreg[i], info->address);
 
-	dr7 = &__get_cpu_var(cpu_dr7);
+	dr7 = this_cpu_ptr(&cpu_dr7);
 	*dr7 |= encode_dr7(i, info->len, info->type);
 
 	set_debugreg(*dr7, 7);
@@ -147,7 +147,7 @@
 	int i;
 
 	for (i = 0; i < HBP_NUM; i++) {
-		struct perf_event **slot = &__get_cpu_var(bp_per_reg[i]);
+		struct perf_event **slot = this_cpu_ptr(&bp_per_reg[i]);
 
 		if (*slot == bp) {
 			*slot = NULL;
@@ -158,7 +158,7 @@
 	if (WARN_ONCE(i == HBP_NUM, "Can't find any breakpoint slot"))
 		return;
 
-	dr7 = &__get_cpu_var(cpu_dr7);
+	dr7 = this_cpu_ptr(&cpu_dr7);
 	*dr7 &= ~__encode_dr7(i, info->len, info->type);
 
 	set_debugreg(*dr7, 7);
Index: linux/arch/x86/kvm/vmx.c
===================================================================
--- linux.orig/arch/x86/kvm/vmx.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kvm/vmx.c	2014-02-03 13:25:42.052160034 -0600
@@ -1581,7 +1581,7 @@
 	/*
 	 * VT restores TR but not its size.  Useless.
 	 */
-	struct desc_ptr *gdt = &__get_cpu_var(host_gdt);
+	struct desc_ptr *gdt = this_cpu_ptr(&host_gdt);
 	struct desc_struct *descs;
 
 	descs = (void *)gdt->address;
@@ -1627,7 +1627,7 @@
 
 static unsigned long segment_base(u16 selector)
 {
-	struct desc_ptr *gdt = &__get_cpu_var(host_gdt);
+	struct desc_ptr *gdt = this_cpu_ptr(&host_gdt);
 	struct desc_struct *d;
 	unsigned long table_base;
 	unsigned long v;
@@ -1753,7 +1753,7 @@
 	 */
 	if (!user_has_fpu() && !vmx->vcpu.guest_fpu_loaded)
 		stts();
-	load_gdt(&__get_cpu_var(host_gdt));
+	load_gdt(this_cpu_ptr(&host_gdt));
 }
 
 static void vmx_load_host_state(struct vcpu_vmx *vmx)
@@ -1783,7 +1783,7 @@
 	}
 
 	if (vmx->loaded_vmcs->cpu != cpu) {
-		struct desc_ptr *gdt = &__get_cpu_var(host_gdt);
+		struct desc_ptr *gdt = this_cpu_ptr(&host_gdt);
 		unsigned long sysenter_esp;
 
 		kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
@@ -2695,7 +2695,7 @@
 		ept_sync_global();
 	}
 
-	native_store_gdt(&__get_cpu_var(host_gdt));
+	native_store_gdt(this_cpu_ptr(&host_gdt));
 
 	return 0;
 }
Index: linux/arch/x86/mm/kmemcheck/kmemcheck.c
===================================================================
--- linux.orig/arch/x86/mm/kmemcheck/kmemcheck.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/mm/kmemcheck/kmemcheck.c	2014-02-03 13:25:42.052160034 -0600
@@ -134,7 +134,7 @@
 
 bool kmemcheck_active(struct pt_regs *regs)
 {
-	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+	struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
 
 	return data->balance > 0;
 }
@@ -142,7 +142,7 @@
 /* Save an address that needs to be shown/hidden */
 static void kmemcheck_save_addr(unsigned long addr)
 {
-	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+	struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
 
 	BUG_ON(data->n_addrs >= ARRAY_SIZE(data->addr));
 	data->addr[data->n_addrs++] = addr;
@@ -150,7 +150,7 @@
 
 static unsigned int kmemcheck_show_all(void)
 {
-	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+	struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
 	unsigned int i;
 	unsigned int n;
 
@@ -163,7 +163,7 @@
 
 static unsigned int kmemcheck_hide_all(void)
 {
-	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+	struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
 	unsigned int i;
 	unsigned int n;
 
@@ -179,7 +179,7 @@
  */
 void kmemcheck_show(struct pt_regs *regs)
 {
-	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+	struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
 
 	BUG_ON(!irqs_disabled());
 
@@ -220,7 +220,7 @@
  */
 void kmemcheck_hide(struct pt_regs *regs)
 {
-	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+	struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
 	int n;
 
 	BUG_ON(!irqs_disabled());
@@ -522,7 +522,7 @@
 	const uint8_t *insn_primary;
 	unsigned int size;
 
-	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+	struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
 
 	/* Recursive fault -- ouch. */
 	if (data->busy) {
Index: linux/arch/x86/oprofile/nmi_int.c
===================================================================
--- linux.orig/arch/x86/oprofile/nmi_int.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/oprofile/nmi_int.c	2014-02-03 13:25:42.052160034 -0600
@@ -64,11 +64,11 @@
 static int profile_exceptions_notify(unsigned int val, struct pt_regs *regs)
 {
 	if (ctr_running)
-		model->check_ctrs(regs, &__get_cpu_var(cpu_msrs));
+		model->check_ctrs(regs, this_cpu_ptr(&cpu_msrs));
 	else if (!nmi_enabled)
 		return NMI_DONE;
 	else
-		model->stop(&__get_cpu_var(cpu_msrs));
+		model->stop(this_cpu_ptr(&cpu_msrs));
 	return NMI_HANDLED;
 }
 
@@ -91,7 +91,7 @@
 
 static void nmi_cpu_start(void *dummy)
 {
-	struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs);
+	struct op_msrs const *msrs = this_cpu_ptr(&cpu_msrs);
 	if (!msrs->controls)
 		WARN_ON_ONCE(1);
 	else
@@ -111,7 +111,7 @@
 
 static void nmi_cpu_stop(void *dummy)
 {
-	struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs);
+	struct op_msrs const *msrs = this_cpu_ptr(&cpu_msrs);
 	if (!msrs->controls)
 		WARN_ON_ONCE(1);
 	else
Index: linux/arch/x86/platform/uv/uv_time.c
===================================================================
--- linux.orig/arch/x86/platform/uv/uv_time.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/platform/uv/uv_time.c	2014-02-03 13:25:42.052160034 -0600
@@ -365,7 +365,7 @@
 
 static __init void uv_rtc_register_clockevents(struct work_struct *dummy)
 {
-	struct clock_event_device *ced = &__get_cpu_var(cpu_ced);
+	struct clock_event_device *ced = this_cpu_ptr(&cpu_ced);
 
 	*ced = clock_event_device_uv;
 	ced->cpumask = cpumask_of(smp_processor_id());
Index: linux/arch/x86/xen/multicalls.c
===================================================================
--- linux.orig/arch/x86/xen/multicalls.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/xen/multicalls.c	2014-02-03 13:25:42.052160034 -0600
@@ -54,7 +54,7 @@
 
 void xen_mc_flush(void)
 {
-	struct mc_buffer *b = &__get_cpu_var(mc_buffer);
+	struct mc_buffer *b = this_cpu_ptr(&mc_buffer);
 	struct multicall_entry *mc;
 	int ret = 0;
 	unsigned long flags;
@@ -131,7 +131,7 @@
 
 struct multicall_space __xen_mc_entry(size_t args)
 {
-	struct mc_buffer *b = &__get_cpu_var(mc_buffer);
+	struct mc_buffer *b = this_cpu_ptr(&mc_buffer);
 	struct multicall_space ret;
 	unsigned argidx = roundup(b->argidx, sizeof(u64));
 
@@ -162,7 +162,7 @@
 
 struct multicall_space xen_mc_extend_args(unsigned long op, size_t size)
 {
-	struct mc_buffer *b = &__get_cpu_var(mc_buffer);
+	struct mc_buffer *b = this_cpu_ptr(&mc_buffer);
 	struct multicall_space ret = { NULL, NULL };
 
 	BUG_ON(preemptible());
@@ -192,7 +192,7 @@
 
 void xen_mc_callback(void (*fn)(void *), void *data)
 {
-	struct mc_buffer *b = &__get_cpu_var(mc_buffer);
+	struct mc_buffer *b = this_cpu_ptr(&mc_buffer);
 	struct callback *cb;
 
 	if (b->cbidx == MC_BATCH) {
Index: linux/arch/x86/xen/time.c
===================================================================
--- linux.orig/arch/x86/xen/time.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/xen/time.c	2014-02-03 13:25:42.052160034 -0600
@@ -80,7 +80,7 @@
 
 	BUG_ON(preemptible());
 
-	state = &__get_cpu_var(xen_runstate);
+	state = this_cpu_ptr(&xen_runstate);
 
 	/*
 	 * The runstate info is always updated by the hypervisor on
@@ -123,7 +123,7 @@
 
 	WARN_ON(state.state != RUNSTATE_running);
 
-	snap = &__get_cpu_var(xen_runstate_snapshot);
+	snap = this_cpu_ptr(&xen_runstate_snapshot);
 
 	/* work out how much time the VCPU has not been runn*ing*  */
 	runnable = state.time[RUNSTATE_runnable] - snap->time[RUNSTATE_runnable];
@@ -158,7 +158,7 @@
 	cycle_t ret;
 
 	preempt_disable_notrace();
-	src = &__get_cpu_var(xen_vcpu)->time;
+	src = this_cpu_ptr(&xen_vcpu->time);
 	ret = pvclock_clocksource_read(src);
 	preempt_enable_notrace();
 	return ret;
@@ -397,7 +397,7 @@
 
 static irqreturn_t xen_timer_interrupt(int irq, void *dev_id)
 {
-	struct clock_event_device *evt = &__get_cpu_var(xen_clock_events).evt;
+	struct clock_event_device *evt = this_cpu_ptr(&xen_clock_events.evt);
 	irqreturn_t ret;
 
 	ret = IRQ_NONE;
@@ -460,7 +460,7 @@
 {
 	BUG_ON(preemptible());
 
-	clockevents_register_device(&__get_cpu_var(xen_clock_events).evt);
+	clockevents_register_device(this_cpu_ptr(&xen_clock_events.evt));
 }
 
 void xen_timer_resume(void)
Index: linux/arch/x86/include/asm/uv/uv_hub.h
===================================================================
--- linux.orig/arch/x86/include/asm/uv/uv_hub.h	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/include/asm/uv/uv_hub.h	2014-02-03 13:25:42.052160034 -0600
@@ -164,7 +164,7 @@
 };
 
 DECLARE_PER_CPU(struct uv_hub_info_s, __uv_hub_info);
-#define uv_hub_info		(&__get_cpu_var(__uv_hub_info))
+#define uv_hub_info		this_cpu_ptr(&__uv_hub_info)
 #define uv_cpu_hub_info(cpu)	(&per_cpu(__uv_hub_info, cpu))
 
 /*
Index: linux/arch/x86/include/asm/perf_event_p4.h
===================================================================
--- linux.orig/arch/x86/include/asm/perf_event_p4.h	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/include/asm/perf_event_p4.h	2014-02-03 13:25:42.052160034 -0600
@@ -189,7 +189,7 @@
 {
 #ifdef CONFIG_SMP
 	if (smp_num_siblings == 2)
-		return cpu != cpumask_first(__get_cpu_var(cpu_sibling_map));
+		return cpu != cpumask_first(this_cpu_ptr(cpu_sibling_map));
 #endif
 	return 0;
 }
Index: linux/arch/x86/kernel/apic/x2apic_cluster.c
===================================================================
--- linux.orig/arch/x86/kernel/apic/x2apic_cluster.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/apic/x2apic_cluster.c	2014-02-03 13:25:42.052160034 -0600
@@ -42,7 +42,7 @@
 	 * We are to modify mask, so we need an own copy
 	 * and be sure it's manipulated with irq off.
 	 */
-	ipi_mask_ptr = __raw_get_cpu_var(ipi_mask);
+	ipi_mask_ptr = this_cpu_ptr(ipi_mask);
 	cpumask_copy(ipi_mask_ptr, mask);
 
 	/*
Index: linux/arch/x86/include/asm/debugreg.h
===================================================================
--- linux.orig/arch/x86/include/asm/debugreg.h	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/include/asm/debugreg.h	2014-02-03 13:25:42.052160034 -0600
@@ -97,11 +97,11 @@
 DECLARE_PER_CPU(int, debug_stack_usage);
 static inline void debug_stack_usage_inc(void)
 {
-	__get_cpu_var(debug_stack_usage)++;
+	__this_cpu_inc(debug_stack_usage);
 }
 static inline void debug_stack_usage_dec(void)
 {
-	__get_cpu_var(debug_stack_usage)--;
+	__this_cpu_dec(debug_stack_usage);
 }
 int is_debug_stack(unsigned long addr);
 void debug_stack_set_zero(void);
Index: linux/arch/x86/kernel/cpu/common.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/common.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/kernel/cpu/common.c	2014-02-03 13:25:42.052160034 -0600
@@ -1150,9 +1150,9 @@
 
 int is_debug_stack(unsigned long addr)
 {
-	return __get_cpu_var(debug_stack_usage) ||
-		(addr <= __get_cpu_var(debug_stack_addr) &&
-		 addr > (__get_cpu_var(debug_stack_addr) - DEBUG_STKSZ));
+	return __this_cpu_read(debug_stack_usage) ||
+		(addr <= __this_cpu_read(debug_stack_addr) &&
+		 addr > (__this_cpu_read(debug_stack_addr) - DEBUG_STKSZ));
 }
 
 DEFINE_PER_CPU(u32, debug_idt_ctr);
Index: linux/arch/x86/xen/spinlock.c
===================================================================
--- linux.orig/arch/x86/xen/spinlock.c	2014-02-03 13:25:42.062159830 -0600
+++ linux/arch/x86/xen/spinlock.c	2014-02-03 13:25:42.052160034 -0600
@@ -109,7 +109,7 @@
 __visible void xen_lock_spinning(struct arch_spinlock *lock, __ticket_t want)
 {
 	int irq = __this_cpu_read(lock_kicker_irq);
-	struct xen_lock_waiting *w = &__get_cpu_var(lock_waiting);
+	struct xen_lock_waiting *w = this_cpu_ptr(&lock_waiting);
 	int cpu = smp_processor_id();
 	u64 start;
 	unsigned long flags;


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 30/48] x86: Change __get_cpu_var calls introduced in 3.14
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (28 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 29/48] x86: Replace __get_cpu_var uses Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 31/48] uv: Replace __get_cpu_var Christoph Lameter
                   ` (18 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, x86, H. Peter Anvin

[-- Attachment #1: fix_rapl --]
[-- Type: text/plain, Size: 3176 bytes --]

More __get_cpu_var() calls were introduced in 3.14. Convert them as well.
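
A typical case converted here, in sketch form (rapl_pmu is a per-cpu
pointer in perf_event_intel_rapl.c):

	DEFINE_PER_CPU(struct rapl_pmu *, rapl_pmu);
	struct rapl_pmu *pmu;

	/* old: address calculation followed by an ordinary load */
	pmu = __get_cpu_var(rapl_pmu);

	/* new: a single segment-prefixed load on x86 */
	pmu = __this_cpu_read(rapl_pmu);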

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/x86/kernel/cpu/perf_event_intel_rapl.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/perf_event_intel_rapl.c	2014-02-03 13:36:13.429026651 -0600
+++ linux/arch/x86/kernel/cpu/perf_event_intel_rapl.c	2014-02-03 13:36:13.429026651 -0600
@@ -129,7 +129,7 @@
 	 * or use ldexp(count, -32).
 	 * Watts = Joules/Time delta
 	 */
-	return v << (32 - __get_cpu_var(rapl_pmu)->hw_unit);
+	return v << (32 - __this_cpu_read(rapl_pmu->hw_unit));
 }
 
 static u64 rapl_event_update(struct perf_event *event)
@@ -181,7 +181,7 @@
 
 static enum hrtimer_restart rapl_hrtimer_handle(struct hrtimer *hrtimer)
 {
-	struct rapl_pmu *pmu = __get_cpu_var(rapl_pmu);
+	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
 	struct perf_event *event;
 	unsigned long flags;
 
@@ -228,7 +228,7 @@
 
 static void rapl_pmu_event_start(struct perf_event *event, int mode)
 {
-	struct rapl_pmu *pmu = __get_cpu_var(rapl_pmu);
+	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
 	unsigned long flags;
 
 	spin_lock_irqsave(&pmu->lock, flags);
@@ -238,7 +238,7 @@
 
 static void rapl_pmu_event_stop(struct perf_event *event, int mode)
 {
-	struct rapl_pmu *pmu = __get_cpu_var(rapl_pmu);
+	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned long flags;
 
@@ -272,7 +272,7 @@
 
 static int rapl_pmu_event_add(struct perf_event *event, int mode)
 {
-	struct rapl_pmu *pmu = __get_cpu_var(rapl_pmu);
+	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned long flags;
 
@@ -662,7 +662,7 @@
 		return -1;
 	}
 
-	pmu = __get_cpu_var(rapl_pmu);
+	pmu = __this_cpu_read(rapl_pmu);
 
 	pr_info("RAPL PMU detected, hw unit 2^-%d Joules,"
 		" API unit is 2^-32 Joules,"
Index: linux/kernel/sched/deadline.c
===================================================================
--- linux.orig/kernel/sched/deadline.c	2014-02-03 13:36:13.429026651 -0600
+++ linux/kernel/sched/deadline.c	2014-02-03 13:36:13.429026651 -0600
@@ -1115,7 +1115,7 @@
 static int find_later_rq(struct task_struct *task)
 {
 	struct sched_domain *sd;
-	struct cpumask *later_mask = __get_cpu_var(local_cpu_mask_dl);
+	struct cpumask *later_mask = this_cpu_ptr(local_cpu_mask_dl);
 	int this_cpu = smp_processor_id();
 	int best_cpu, cpu = task_cpu(task);
 
Index: linux/kernel/time/tick-broadcast.c
===================================================================
--- linux.orig/kernel/time/tick-broadcast.c	2014-02-03 13:36:13.429026651 -0600
+++ linux/kernel/time/tick-broadcast.c	2014-02-03 13:36:13.429026651 -0600
@@ -541,7 +541,7 @@
 void tick_check_oneshot_broadcast_this_cpu(void)
 {
 	if (cpumask_test_cpu(smp_processor_id(), tick_broadcast_oneshot_mask)) {
-		struct tick_device *td = &__get_cpu_var(tick_cpu_device);
+		struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
 
 		/*
 		 * We might be in the middle of switching over from


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 31/48] uv: Replace __get_cpu_var
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (29 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 30/48] x86: Change __get_cpu_var calls introduced in 3.14 Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-03-04 23:02   ` Andrew Morton
  2014-02-14 20:19 ` [PATCH 32/48] arm: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
                   ` (17 subsequent siblings)
  48 siblings, 1 reply; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Hedi Berriche, Mike Travis, Dimitri Sivanich

[-- Attachment #1: more_fixes --]
[-- Type: text/plain, Size: 799 bytes --]

Use __this_cpu_read instead.

Cc: Hedi Berriche <hedi@sgi.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/x86/include/asm/uv/uv_hub.h
===================================================================
--- linux.orig/arch/x86/include/asm/uv/uv_hub.h	2014-02-03 14:16:53.987889372 -0600
+++ linux/arch/x86/include/asm/uv/uv_hub.h	2014-02-03 14:16:53.987889372 -0600
@@ -618,7 +618,7 @@
 };
 
 DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
-#define uv_cpu_nmi			(__get_cpu_var(__uv_cpu_nmi))
+#define uv_cpu_nmi			__this_cpu_read(__uv_cpu_nmi)
 #define uv_hub_nmi			(uv_cpu_nmi.hub)
 #define uv_cpu_nmi_per(cpu)		(per_cpu(__uv_cpu_nmi, cpu))
 #define uv_hub_nmi_per(cpu)		(uv_cpu_nmi_per(cpu).hub)


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 32/48] arm: Replace __this_cpu_ptr with raw_cpu_ptr
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (30 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 31/48] uv: Replace __get_cpu_var Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 33/48] MIPS: Replace __get_cpu_var uses in FPU emulator Christoph Lameter
                   ` (16 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Russell King, Catalin Marinas, Will Deacon

[-- Attachment #1: this_arm --]
[-- Type: text/plain, Size: 2824 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

__this_cpu_ptr is being phased out, so replace it with raw_cpu_ptr.
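
The rename is mechanical: both spellings compute the address of the
current processor's instance without a preemption check. In sketch form,
using the per-cpu pointer from smp_twd.c:

	static struct clock_event_device __percpu *twd_evt;
	struct clock_event_device *clk;

	/* old spelling */
	clk = __this_cpu_ptr(twd_evt);

	/* new spelling, identical semantics: no preemption check */
	clk = raw_cpu_ptr(twd_evt);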

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
CC: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/arm/kernel/smp_twd.c
===================================================================
--- linux.orig/arch/arm/kernel/smp_twd.c	2013-12-02 16:07:57.604416529 -0600
+++ linux/arch/arm/kernel/smp_twd.c	2013-12-02 16:07:57.604416529 -0600
@@ -92,7 +92,7 @@ static int twd_timer_ack(void)
 
 static void twd_timer_stop(void)
 {
-	struct clock_event_device *clk = __this_cpu_ptr(twd_evt);
+	struct clock_event_device *clk = raw_cpu_ptr(twd_evt);
 
 	twd_set_mode(CLOCK_EVT_MODE_UNUSED, clk);
 	disable_percpu_irq(clk->irq);
@@ -108,7 +108,7 @@ static void twd_update_frequency(void *n
 {
 	twd_timer_rate = *((unsigned long *) new_rate);
 
-	clockevents_update_freq(__this_cpu_ptr(twd_evt), twd_timer_rate);
+	clockevents_update_freq(raw_cpu_ptr(twd_evt), twd_timer_rate);
 }
 
 static int twd_rate_change(struct notifier_block *nb,
@@ -134,7 +134,7 @@ static struct notifier_block twd_clk_nb
 
 static int twd_clk_init(void)
 {
-	if (twd_evt && __this_cpu_ptr(twd_evt) && !IS_ERR(twd_clk))
+	if (twd_evt && raw_cpu_ptr(twd_evt) && !IS_ERR(twd_clk))
 		return clk_notifier_register(twd_clk, &twd_clk_nb);
 
 	return 0;
@@ -153,7 +153,7 @@ static void twd_update_frequency(void *d
 {
 	twd_timer_rate = clk_get_rate(twd_clk);
 
-	clockevents_update_freq(__this_cpu_ptr(twd_evt), twd_timer_rate);
+	clockevents_update_freq(raw_cpu_ptr(twd_evt), twd_timer_rate);
 }
 
 static int twd_cpufreq_transition(struct notifier_block *nb,
@@ -179,7 +179,7 @@ static struct notifier_block twd_cpufreq
 
 static int twd_cpufreq_init(void)
 {
-	if (twd_evt && __this_cpu_ptr(twd_evt) && !IS_ERR(twd_clk))
+	if (twd_evt && raw_cpu_ptr(twd_evt) && !IS_ERR(twd_clk))
 		return cpufreq_register_notifier(&twd_cpufreq_nb,
 			CPUFREQ_TRANSITION_NOTIFIER);
 
@@ -269,7 +269,7 @@ static void twd_get_clock(struct device_
  */
 static void twd_timer_setup(void)
 {
-	struct clock_event_device *clk = __this_cpu_ptr(twd_evt);
+	struct clock_event_device *clk = raw_cpu_ptr(twd_evt);
 	int cpu = smp_processor_id();
 
 	/*
Index: linux/arch/arm/mach-msm/timer.c
===================================================================
--- linux.orig/arch/arm/mach-msm/timer.c	2013-12-02 16:07:57.604416529 -0600
+++ linux/arch/arm/mach-msm/timer.c	2013-12-02 16:07:57.604416529 -0600
@@ -221,7 +221,7 @@ static void __init msm_timer_init(u32 dg
 		}
 
 		/* Immediately configure the timer on the boot CPU */
-		msm_local_timer_setup(__this_cpu_ptr(msm_evt));
+		msm_local_timer_setup(raw_cpu_ptr(msm_evt));
 	}
 
 err:


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 33/48] MIPS: Replace __get_cpu_var uses in FPU emulator.
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (31 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 32/48] arm: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 34/48] mips: Replace __get_cpu_var uses Christoph Lameter
                   ` (15 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, David Daney

[-- Attachment #1: 0001-MIPS-Replace-__get_cpu_var-uses-in-FPU-emulator.patch --]
[-- Type: text/plain, Size: 2077 bytes --]


From: David Daney <david.daney@cavium.com>

The use of __this_cpu_inc() requires a fundamental integer type, so
change the type of all the counters to unsigned long, which has the same
width as before but is not wrapped in local_t.
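
The counters are only updated with preemption disabled (see
MIPS_FPU_EMU_INC_STATS below), so a plain unsigned long is safe here.
The pattern, in sketch form:

	/* old: a local_t member needs the local_* accessors */
	__local_inc(&__get_cpu_var(fpuemustats).M);

	/* new: a fundamental type works with the this_cpu operations */
	__this_cpu_inc(fpuemustats.M);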

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/mips/include/asm/fpu_emulator.h | 14 +++++++-------
 arch/mips/math-emu/cp1emu.c          |  6 +++---
 2 files changed, 10 insertions(+), 10 deletions(-)

Index: linux/arch/mips/include/asm/fpu_emulator.h
===================================================================
--- linux.orig/arch/mips/include/asm/fpu_emulator.h	2014-02-03 13:25:55.311884622 -0600
+++ linux/arch/mips/include/asm/fpu_emulator.h	2014-02-03 13:25:55.311884622 -0600
@@ -30,12 +30,12 @@
 #ifdef CONFIG_DEBUG_FS
 
 struct mips_fpu_emulator_stats {
-	local_t emulated;
-	local_t loads;
-	local_t stores;
-	local_t cp1ops;
-	local_t cp1xops;
-	local_t errors;
+	unsigned long emulated;
+	unsigned long loads;
+	unsigned long stores;
+	unsigned long cp1ops;
+	unsigned long cp1xops;
+	unsigned long errors;
 };
 
 DECLARE_PER_CPU(struct mips_fpu_emulator_stats, fpuemustats);
@@ -43,7 +43,7 @@
 #define MIPS_FPU_EMU_INC_STATS(M)					\
 do {									\
 	preempt_disable();						\
-	__local_inc(&__get_cpu_var(fpuemustats).M);			\
+	__this_cpu_inc(fpuemustats.M);					\
 	preempt_enable();						\
 } while (0)
 
Index: linux/arch/mips/math-emu/cp1emu.c
===================================================================
--- linux.orig/arch/mips/math-emu/cp1emu.c	2014-02-03 13:25:55.311884622 -0600
+++ linux/arch/mips/math-emu/cp1emu.c	2014-02-03 13:25:55.311884622 -0600
@@ -2140,13 +2140,13 @@
 static int fpuemu_stat_get(void *data, u64 *val)
 {
 	int cpu;
-	unsigned long sum = 0;
+	u64 sum = 0;
 	for_each_online_cpu(cpu) {
 		struct mips_fpu_emulator_stats *ps;
-		local_t *pv;
+		unsigned long *pv;
 		ps = &per_cpu(fpuemustats, cpu);
 		pv = (void *)ps + (unsigned long)data;
-		sum += local_read(pv);
+		sum += *pv;
 	}
 	*val = sum;
 	return 0;


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 34/48] mips: Replace __get_cpu_var uses
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (32 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 33/48] MIPS: Replace __get_cpu_var uses in FPU emulator Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 35/48] s390: rename __this_cpu_ptr to raw_cpu_ptr Christoph Lameter
                   ` (14 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Ralf Baechle

[-- Attachment #1: this_mips --]
[-- Type: text/plain, Size: 10685 bytes --]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processor's percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as:


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() always performs only an address calculation. However,
store and retrieve operations could use a segment prefix (or a global
register on other platforms) to avoid that address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and fewer
registers are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these
operations are used throughout, specialized macros can be defined in
non-x86 arches as well in order to optimize per cpu access, e.g. by using
a global register that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)


Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/mips/cavium-octeon/octeon-irq.c
===================================================================
--- linux.orig/arch/mips/cavium-octeon/octeon-irq.c	2014-02-03 13:26:05.861665500 -0600
+++ linux/arch/mips/cavium-octeon/octeon-irq.c	2014-02-03 13:26:05.851665711 -0600
@@ -264,13 +264,13 @@
 	unsigned long *pen;
 	unsigned long flags;
 	union octeon_ciu_chip_data cd;
-	raw_spinlock_t *lock = &__get_cpu_var(octeon_irq_ciu_spinlock);
+	raw_spinlock_t *lock = this_cpu_ptr(&octeon_irq_ciu_spinlock);
 
 	cd.p = irq_data_get_irq_chip_data(data);
 
 	raw_spin_lock_irqsave(lock, flags);
 	if (cd.s.line == 0) {
-		pen = &__get_cpu_var(octeon_irq_ciu0_en_mirror);
+		pen = this_cpu_ptr(&octeon_irq_ciu0_en_mirror);
 		__set_bit(cd.s.bit, pen);
 		/*
 		 * Must be visible to octeon_irq_ip{2,3}_ciu() before
@@ -279,7 +279,7 @@
 		wmb();
 		cvmx_write_csr(CVMX_CIU_INTX_EN0(cvmx_get_core_num() * 2), *pen);
 	} else {
-		pen = &__get_cpu_var(octeon_irq_ciu1_en_mirror);
+		pen = this_cpu_ptr(&octeon_irq_ciu1_en_mirror);
 		__set_bit(cd.s.bit, pen);
 		/*
 		 * Must be visible to octeon_irq_ip{2,3}_ciu() before
@@ -296,13 +296,13 @@
 	unsigned long *pen;
 	unsigned long flags;
 	union octeon_ciu_chip_data cd;
-	raw_spinlock_t *lock = &__get_cpu_var(octeon_irq_ciu_spinlock);
+	raw_spinlock_t *lock = this_cpu_ptr(&octeon_irq_ciu_spinlock);
 
 	cd.p = irq_data_get_irq_chip_data(data);
 
 	raw_spin_lock_irqsave(lock, flags);
 	if (cd.s.line == 0) {
-		pen = &__get_cpu_var(octeon_irq_ciu0_en_mirror);
+		pen = this_cpu_ptr(&octeon_irq_ciu0_en_mirror);
 		__clear_bit(cd.s.bit, pen);
 		/*
 		 * Must be visible to octeon_irq_ip{2,3}_ciu() before
@@ -311,7 +311,7 @@
 		wmb();
 		cvmx_write_csr(CVMX_CIU_INTX_EN0(cvmx_get_core_num() * 2), *pen);
 	} else {
-		pen = &__get_cpu_var(octeon_irq_ciu1_en_mirror);
+		pen = this_cpu_ptr(&octeon_irq_ciu1_en_mirror);
 		__clear_bit(cd.s.bit, pen);
 		/*
 		 * Must be visible to octeon_irq_ip{2,3}_ciu() before
@@ -431,11 +431,11 @@
 
 	if (cd.s.line == 0) {
 		int index = cvmx_get_core_num() * 2;
-		set_bit(cd.s.bit, &__get_cpu_var(octeon_irq_ciu0_en_mirror));
+		set_bit(cd.s.bit, this_cpu_ptr(&octeon_irq_ciu0_en_mirror));
 		cvmx_write_csr(CVMX_CIU_INTX_EN0_W1S(index), mask);
 	} else {
 		int index = cvmx_get_core_num() * 2 + 1;
-		set_bit(cd.s.bit, &__get_cpu_var(octeon_irq_ciu1_en_mirror));
+		set_bit(cd.s.bit, this_cpu_ptr(&octeon_irq_ciu1_en_mirror));
 		cvmx_write_csr(CVMX_CIU_INTX_EN1_W1S(index), mask);
 	}
 }
@@ -450,11 +450,11 @@
 
 	if (cd.s.line == 0) {
 		int index = cvmx_get_core_num() * 2;
-		clear_bit(cd.s.bit, &__get_cpu_var(octeon_irq_ciu0_en_mirror));
+		clear_bit(cd.s.bit, this_cpu_ptr(&octeon_irq_ciu0_en_mirror));
 		cvmx_write_csr(CVMX_CIU_INTX_EN0_W1C(index), mask);
 	} else {
 		int index = cvmx_get_core_num() * 2 + 1;
-		clear_bit(cd.s.bit, &__get_cpu_var(octeon_irq_ciu1_en_mirror));
+		clear_bit(cd.s.bit, this_cpu_ptr(&octeon_irq_ciu1_en_mirror));
 		cvmx_write_csr(CVMX_CIU_INTX_EN1_W1C(index), mask);
 	}
 }
@@ -1063,7 +1063,7 @@
 	const unsigned long core_id = cvmx_get_core_num();
 	u64 ciu_sum = cvmx_read_csr(CVMX_CIU_INTX_SUM0(core_id * 2));
 
-	ciu_sum &= __get_cpu_var(octeon_irq_ciu0_en_mirror);
+	ciu_sum &= __this_cpu_read(octeon_irq_ciu0_en_mirror);
 	if (likely(ciu_sum)) {
 		int bit = fls64(ciu_sum) - 1;
 		int irq = octeon_irq_ciu_to_irq[0][bit];
@@ -1080,7 +1080,7 @@
 {
 	u64 ciu_sum = cvmx_read_csr(CVMX_CIU_INT_SUM1);
 
-	ciu_sum &= __get_cpu_var(octeon_irq_ciu1_en_mirror);
+	ciu_sum &= __this_cpu_read(octeon_irq_ciu1_en_mirror);
 	if (likely(ciu_sum)) {
 		int bit = fls64(ciu_sum) - 1;
 		int irq = octeon_irq_ciu_to_irq[1][bit];
@@ -1129,10 +1129,10 @@
 	int coreid = cvmx_get_core_num();
 
 
-	__get_cpu_var(octeon_irq_ciu0_en_mirror) = 0;
-	__get_cpu_var(octeon_irq_ciu1_en_mirror) = 0;
+	__this_cpu_write(octeon_irq_ciu0_en_mirror, 0);
+	__this_cpu_write(octeon_irq_ciu1_en_mirror, 0);
 	wmb();
-	raw_spin_lock_init(&__get_cpu_var(octeon_irq_ciu_spinlock));
+	raw_spin_lock_init(this_cpu_ptr(&octeon_irq_ciu_spinlock));
 	/*
 	 * Disable All CIU Interrupts. The ones we need will be
 	 * enabled later.  Read the SUM register so we know the write
Index: linux/arch/mips/kernel/kprobes.c
===================================================================
--- linux.orig/arch/mips/kernel/kprobes.c	2014-02-03 13:26:05.861665500 -0600
+++ linux/arch/mips/kernel/kprobes.c	2014-02-03 13:26:05.851665711 -0600
@@ -224,7 +224,7 @@
 
 static void restore_previous_kprobe(struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = kcb->prev_kprobe.kp;
+	__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
 	kcb->kprobe_status = kcb->prev_kprobe.status;
 	kcb->kprobe_old_SR = kcb->prev_kprobe.old_SR;
 	kcb->kprobe_saved_SR = kcb->prev_kprobe.saved_SR;
@@ -234,7 +234,7 @@
 static void set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
 			       struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = p;
+	__this_cpu_write(current_kprobe, p);
 	kcb->kprobe_saved_SR = kcb->kprobe_old_SR = (regs->cp0_status & ST0_IE);
 	kcb->kprobe_saved_epc = regs->cp0_epc;
 }
@@ -385,7 +385,7 @@
 				ret = 1;
 				goto no_kprobe;
 			}
-			p = __get_cpu_var(current_kprobe);
+			p = __this_cpu_read(current_kprobe);
 			if (p->break_handler && p->break_handler(p, regs))
 				goto ss_probe;
 		}
Index: linux/arch/mips/kernel/perf_event_mipsxx.c
===================================================================
--- linux.orig/arch/mips/kernel/perf_event_mipsxx.c	2014-02-03 13:26:05.861665500 -0600
+++ linux/arch/mips/kernel/perf_event_mipsxx.c	2014-02-03 13:26:05.861665500 -0600
@@ -340,7 +340,7 @@
 
 static void mipsxx_pmu_enable_event(struct hw_perf_event *evt, int idx)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	WARN_ON(idx < 0 || idx >= mipspmu.num_counters);
 
@@ -360,7 +360,7 @@
 
 static void mipsxx_pmu_disable_event(int idx)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	unsigned long flags;
 
 	WARN_ON(idx < 0 || idx >= mipspmu.num_counters);
@@ -460,7 +460,7 @@
 
 static int mipspmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx;
 	int err = 0;
@@ -496,7 +496,7 @@
 
 static void mipspmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
@@ -1270,7 +1270,7 @@
 
 static void pause_local_counters(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int ctr = mipspmu.num_counters;
 	unsigned long flags;
 
@@ -1286,7 +1286,7 @@
 
 static void resume_local_counters(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int ctr = mipspmu.num_counters;
 
 	do {
@@ -1297,7 +1297,7 @@
 
 static int mipsxx_pmu_handle_shared_irq(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_sample_data data;
 	unsigned int counters = mipspmu.num_counters;
 	u64 counter;
Index: linux/arch/mips/kernel/smp-bmips.c
===================================================================
--- linux.orig/arch/mips/kernel/smp-bmips.c	2014-02-03 13:26:05.861665500 -0600
+++ linux/arch/mips/kernel/smp-bmips.c	2014-02-03 13:26:05.861665500 -0600
@@ -353,7 +353,7 @@
 	int action, cpu = irq - IPI0_IRQ;
 
 	spin_lock_irqsave(&ipi_lock, flags);
-	action = __get_cpu_var(ipi_action_mask);
+	action = __this_cpu_read(ipi_action_mask);
 	per_cpu(ipi_action_mask, cpu) = 0;
 	clear_c0_cause(cpu ? C_SW1 : C_SW0);
 	spin_unlock_irqrestore(&ipi_lock, flags);


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 35/48] s390: rename __this_cpu_ptr to raw_cpu_ptr
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (33 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 34/48] mips: Replace __get_cpu_var uses Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 36/48] s390: Replace __get_cpu_var uses Christoph Lameter
                   ` (13 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: this_s390_corrupt --]
[-- Type: text/plain, Size: 2521 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

Use raw_cpu_ptr instead.
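
The rename does not change behavior: the converted macros bracket the
access with preempt_disable()/preempt_enable(), so the unchecked
accessor is the appropriate one. In sketch form:

	preempt_disable();
	ptr__ = raw_cpu_ptr(&(pcp));	/* was __this_cpu_ptr(&(pcp)) */
	/* ... operate on *ptr__ ... */
	preempt_enable();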

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/s390/include/asm/percpu.h
===================================================================
--- linux.orig/arch/s390/include/asm/percpu.h	2013-12-18 13:37:42.757203647 -0600
+++ linux/arch/s390/include/asm/percpu.h	2013-12-18 13:37:42.747203967 -0600
@@ -31,7 +31,7 @@
 	pcp_op_T__ old__, new__, prev__;				\
 	pcp_op_T__ *ptr__;						\
 	preempt_disable();						\
-	ptr__ = __this_cpu_ptr(&(pcp));					\
+	ptr__ = raw_cpu_ptr(&(pcp));					\
 	prev__ = *ptr__;						\
 	do {								\
 		old__ = prev__;						\
@@ -70,7 +70,7 @@
 	pcp_op_T__ val__ = (val);					\
 	pcp_op_T__ old__, *ptr__;					\
 	preempt_disable();						\
-	ptr__ = __this_cpu_ptr(&(pcp)); 				\
+	ptr__ = raw_cpu_ptr(&(pcp)); 				\
 	if (__builtin_constant_p(val__) &&				\
 	    ((szcast)val__ > -129) && ((szcast)val__ < 128)) {		\
 		asm volatile(						\
@@ -97,7 +97,7 @@
 	pcp_op_T__ val__ = (val);					\
 	pcp_op_T__ old__, *ptr__;					\
 	preempt_disable();						\
-	ptr__ = __this_cpu_ptr(&(pcp)); 				\
+	ptr__ = raw_cpu_ptr(&(pcp));	 				\
 	asm volatile(							\
 		op "    %[old__],%[val__],%[ptr__]\n"			\
 		: [old__] "=d" (old__), [ptr__] "+Q" (*ptr__)		\
@@ -116,7 +116,7 @@
 	pcp_op_T__ val__ = (val);					\
 	pcp_op_T__ old__, *ptr__;					\
 	preempt_disable();						\
-	ptr__ = __this_cpu_ptr(&(pcp)); 				\
+	ptr__ = raw_cpu_ptr(&(pcp));	 				\
 	asm volatile(							\
 		op "    %[old__],%[val__],%[ptr__]\n"			\
 		: [old__] "=d" (old__), [ptr__] "+Q" (*ptr__)		\
@@ -138,7 +138,7 @@
 	pcp_op_T__ ret__;						\
 	pcp_op_T__ *ptr__;						\
 	preempt_disable();						\
-	ptr__ = __this_cpu_ptr(&(pcp));					\
+	ptr__ = raw_cpu_ptr(&(pcp));					\
 	ret__ = cmpxchg(ptr__, oval, nval);				\
 	preempt_enable();						\
 	ret__;								\
@@ -154,7 +154,7 @@
 	typeof(pcp) *ptr__;						\
 	typeof(pcp) ret__;						\
 	preempt_disable();						\
-	ptr__ = __this_cpu_ptr(&(pcp));					\
+	ptr__ = raw_cpu_ptr(&(pcp));					\
 	ret__ = xchg(ptr__, nval);					\
 	preempt_enable();						\
 	ret__;								\
@@ -173,8 +173,8 @@
 	typeof(pcp2) *p2__;						\
 	int ret__;							\
 	preempt_disable();						\
-	p1__ = __this_cpu_ptr(&(pcp1));					\
-	p2__ = __this_cpu_ptr(&(pcp2));					\
+	p1__ = raw_cpu_ptr(&(pcp1));					\
+	p2__ = raw_cpu_ptr(&(pcp2));					\
 	ret__ = __cmpxchg_double(p1__, p2__, o1__, o2__, n1__, n2__);	\
 	preempt_enable();						\
 	ret__;								\


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 36/48] s390: Replace __get_cpu_var uses
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (34 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 35/48] s390: rename __this_cpu_ptr to raw_cpu_ptr Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 37/48] s390: Handle new __get_cpu_var calls added in 3.14 Christoph Lameter
                   ` (12 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Martin Schwidefsky, linux390, Heiko Carstens

[-- Attachment #1: this_s390 --]
[-- Type: text/plain, Size: 12041 bytes --]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processor's percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as:


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() always performs only an address calculation. However,
store and retrieve operations could use a segment prefix (or a global
register on other platforms) to avoid that address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and fewer
registers are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed,
so the macro itself can be removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout, specialized macros can be defined in non-x86 arches as
well in order to optimize per cpu access, e.g. by using a global register
that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processor's instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	this_cpu_inc(y)
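
Taken together, rules 1, 3, 5 and 6 combine as in this minimal sketch
(the per cpu variable and the function are hypothetical, purely for
illustration):

	#include <linux/percpu.h>
	#include <linux/printk.h>

	DEFINE_PER_CPU(int, demo_count);	/* hypothetical variable */

	static void demo(void)	/* assumes preemption is already disabled */
	{
		int *p = this_cpu_ptr(&demo_count);	/* rule 1: address only */

		this_cpu_write(demo_count, 0);		/* rule 5: assignment */
		this_cpu_inc(demo_count);		/* rule 6: increment */
		if (__this_cpu_read(demo_count))	/* rule 3: retrieve */
			pr_debug("count=%d\n", *p);
	}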

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
CC: linux390@de.ibm.com
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/s390/include/asm/irq.h
===================================================================
--- linux.orig/arch/s390/include/asm/irq.h	2014-02-03 14:15:34.299556667 -0600
+++ linux/arch/s390/include/asm/irq.h	2014-02-03 14:15:34.299556667 -0600
@@ -66,7 +66,7 @@
 
 static __always_inline void inc_irq_stat(enum interruption_class irq)
 {
-	__get_cpu_var(irq_stat).irqs[irq]++;
+	__this_cpu_inc(irq_stat.irqs[irq]);
 }
 
 struct ext_code {
Index: linux/arch/s390/include/asm/cputime.h
===================================================================
--- linux.orig/arch/s390/include/asm/cputime.h	2014-02-03 14:15:34.299556667 -0600
+++ linux/arch/s390/include/asm/cputime.h	2014-02-03 14:15:34.299556667 -0600
@@ -184,7 +184,7 @@
 
 static inline int s390_nohz_delay(int cpu)
 {
-	return __get_cpu_var(s390_idle).nohz_delay != 0;
+	return __this_cpu_read(s390_idle.nohz_delay) != 0;
 }
 
 #define arch_needs_cpu(cpu) s390_nohz_delay(cpu)
Index: linux/arch/s390/kernel/kprobes.c
===================================================================
--- linux.orig/arch/s390/kernel/kprobes.c	2014-02-03 14:15:34.299556667 -0600
+++ linux/arch/s390/kernel/kprobes.c	2014-02-03 14:15:34.299556667 -0600
@@ -366,9 +366,9 @@
  */
 static void __kprobes push_kprobe(struct kprobe_ctlblk *kcb, struct kprobe *p)
 {
-	kcb->prev_kprobe.kp = __get_cpu_var(current_kprobe);
+	kcb->prev_kprobe.kp = __this_cpu_read(current_kprobe);
 	kcb->prev_kprobe.status = kcb->kprobe_status;
-	__get_cpu_var(current_kprobe) = p;
+	__this_cpu_write(current_kprobe, p);
 }
 
 /*
@@ -378,7 +378,7 @@
  */
 static void __kprobes pop_kprobe(struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = kcb->prev_kprobe.kp;
+	__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
 	kcb->kprobe_status = kcb->prev_kprobe.status;
 }
 
@@ -459,7 +459,7 @@
 		enable_singlestep(kcb, regs, (unsigned long) p->ainsn.insn);
 		return 1;
 	} else if (kprobe_running()) {
-		p = __get_cpu_var(current_kprobe);
+		p = __this_cpu_read(current_kprobe);
 		if (p->break_handler && p->break_handler(p, regs)) {
 			/*
 			 * Continuation after the jprobe completed and
Index: linux/arch/s390/kernel/nmi.c
===================================================================
--- linux.orig/arch/s390/kernel/nmi.c	2014-02-03 14:15:34.299556667 -0600
+++ linux/arch/s390/kernel/nmi.c	2014-02-03 14:15:34.299556667 -0600
@@ -53,8 +53,8 @@
 	 */
 	local_irq_save(flags);
 	local_mcck_disable();
-	mcck = __get_cpu_var(cpu_mcck);
-	memset(&__get_cpu_var(cpu_mcck), 0, sizeof(struct mcck_struct));
+	mcck = __this_cpu_read(cpu_mcck);
+	memset(this_cpu_ptr(&cpu_mcck), 0, sizeof(struct mcck_struct));
 	clear_thread_flag(TIF_MCCK_PENDING);
 	local_mcck_enable();
 	local_irq_restore(flags);
@@ -253,7 +253,7 @@
 	nmi_enter();
 	inc_irq_stat(NMI_NMI);
 	mci = (struct mci *) &S390_lowcore.mcck_interruption_code;
-	mcck = &__get_cpu_var(cpu_mcck);
+	mcck = this_cpu_ptr(&cpu_mcck);
 	umode = user_mode(regs);
 
 	if (mci->sd) {
Index: linux/arch/s390/kernel/perf_cpum_cf.c
===================================================================
--- linux.orig/arch/s390/kernel/perf_cpum_cf.c	2014-02-03 14:15:34.299556667 -0600
+++ linux/arch/s390/kernel/perf_cpum_cf.c	2014-02-03 14:15:34.299556667 -0600
@@ -173,7 +173,7 @@
  */
 static void cpumf_pmu_enable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 	int err;
 
 	if (cpuhw->flags & PMU_F_ENABLED)
@@ -196,7 +196,7 @@
  */
 static void cpumf_pmu_disable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 	int err;
 	u64 inactive;
 
@@ -230,7 +230,7 @@
 		return;
 
 	inc_irq_stat(IRQEXT_CMC);
-	cpuhw = &__get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	/* Measurement alerts are shared and might happen when the PMU
 	 * is not reserved.  Ignore these alerts in this case. */
@@ -250,7 +250,7 @@
 #define PMC_RELEASE   1
 static void setup_pmc_cpu(void *flags)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	switch (*((int *) flags)) {
 	case PMC_INIT:
@@ -481,7 +481,7 @@
 
 static void cpumf_pmu_start(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 
 	if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED)))
@@ -512,7 +512,7 @@
 
 static void cpumf_pmu_stop(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 
 	if (!(hwc->state & PERF_HES_STOPPED)) {
@@ -533,7 +533,7 @@
 
 static int cpumf_pmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	/* Check authorization for the counter set to which this
 	 * counter belongs.
@@ -557,7 +557,7 @@
 
 static void cpumf_pmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	cpumf_pmu_stop(event, PERF_EF_UPDATE);
 
@@ -581,7 +581,7 @@
  */
 static void cpumf_pmu_start_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	perf_pmu_disable(pmu);
 	cpuhw->flags |= PERF_EVENT_TXN;
@@ -595,7 +595,7 @@
  */
 static void cpumf_pmu_cancel_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	WARN_ON(cpuhw->tx_state != cpuhw->state);
 
@@ -610,7 +610,7 @@
  */
 static int cpumf_pmu_commit_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 	u64 state;
 
 	/* check if the updated state can be scheduled */
Index: linux/arch/s390/kernel/processor.c
===================================================================
--- linux.orig/arch/s390/kernel/processor.c	2014-02-03 14:15:34.299556667 -0600
+++ linux/arch/s390/kernel/processor.c	2014-02-03 14:15:34.299556667 -0600
@@ -23,8 +23,8 @@
  */
 void cpu_init(void)
 {
-	struct s390_idle_data *idle = &__get_cpu_var(s390_idle);
-	struct cpuid *id = &__get_cpu_var(cpu_id);
+	struct s390_idle_data *idle = this_cpu_ptr(&s390_idle);
+	struct cpuid *id = this_cpu_ptr(&cpu_id);
 
 	get_cpu_id(id);
 	atomic_inc(&init_mm.mm_count);
Index: linux/arch/s390/kernel/time.c
===================================================================
--- linux.orig/arch/s390/kernel/time.c	2014-02-03 14:15:34.299556667 -0600
+++ linux/arch/s390/kernel/time.c	2014-02-03 14:15:34.299556667 -0600
@@ -92,7 +92,7 @@
 	struct clock_event_device *cd;
 
 	S390_lowcore.clock_comparator = -1ULL;
-	cd = &__get_cpu_var(comparators);
+	cd = this_cpu_ptr(&comparators);
 	cd->event_handler(cd);
 }
 
@@ -360,7 +360,7 @@
  */
 static void disable_sync_clock(void *dummy)
 {
-	atomic_t *sw_ptr = &__get_cpu_var(clock_sync_word);
+	atomic_t *sw_ptr = this_cpu_ptr(&clock_sync_word);
 	/*
 	 * Clear the in-sync bit 2^31. All get_sync_clock calls will
 	 * fail until the sync bit is turned back on. In addition
@@ -377,7 +377,7 @@
  */
 static void enable_sync_clock(void)
 {
-	atomic_t *sw_ptr = &__get_cpu_var(clock_sync_word);
+	atomic_t *sw_ptr = this_cpu_ptr(&clock_sync_word);
 	atomic_set_mask(0x80000000, sw_ptr);
 }
 
Index: linux/arch/s390/kernel/vtime.c
===================================================================
--- linux.orig/arch/s390/kernel/vtime.c	2014-02-03 14:15:34.299556667 -0600
+++ linux/arch/s390/kernel/vtime.c	2014-02-03 14:15:34.299556667 -0600
@@ -154,7 +154,7 @@
 
 void __kprobes vtime_stop_cpu(void)
 {
-	struct s390_idle_data *idle = &__get_cpu_var(s390_idle);
+	struct s390_idle_data *idle = this_cpu_ptr(&s390_idle);
 	unsigned long long idle_time;
 	unsigned long psw_mask;
 
Index: linux/arch/s390/oprofile/hwsampler.c
===================================================================
--- linux.orig/arch/s390/oprofile/hwsampler.c	2014-02-03 14:15:34.299556667 -0600
+++ linux/arch/s390/oprofile/hwsampler.c	2014-02-03 14:15:34.299556667 -0600
@@ -178,7 +178,7 @@
 static void hws_ext_handler(struct ext_code ext_code,
 			    unsigned int param32, unsigned long param64)
 {
-	struct hws_cpu_buffer *cb = &__get_cpu_var(sampler_cpu_buffer);
+	struct hws_cpu_buffer *cb = this_cpu_ptr(&sampler_cpu_buffer);
 
 	if (!(param32 & CPU_MF_INT_SF_MASK))
 		return;
Index: linux/arch/s390/kernel/irq.c
===================================================================
--- linux.orig/arch/s390/kernel/irq.c	2014-02-03 14:15:34.299556667 -0600
+++ linux/arch/s390/kernel/irq.c	2014-02-03 14:15:34.299556667 -0600
@@ -252,7 +252,7 @@
 
 	ext_code = *(struct ext_code *) &regs->int_code;
 	if (ext_code.code != 0x1004)
-		__get_cpu_var(s390_idle).nohz_delay = 1;
+		__this_cpu_write(s390_idle.nohz_delay, 1);
 
 	index = ext_hash(ext_code.code);
 	rcu_read_lock();


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 37/48] s390: Handle new __get_cpu_var calls added in 3.14
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (35 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 36/48] s390: Replace __get_cpu_var uses Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19   ` Christoph Lameter
                   ` (11 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: s390_new --]
[-- Type: text/plain, Size: 2404 bytes --]

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/s390/kernel/perf_cpum_sf.c
===================================================================
--- linux.orig/arch/s390/kernel/perf_cpum_sf.c	2014-01-20 16:18:06.634974387 -0600
+++ linux/arch/s390/kernel/perf_cpum_sf.c	2014-02-03 14:20:50.982932405 -0600
@@ -562,7 +562,7 @@
 static void setup_pmc_cpu(void *flags)
 {
 	int err;
-	struct cpu_hw_sf *cpusf = &__get_cpu_var(cpu_hw_sf);
+	struct cpu_hw_sf *cpusf = this_cpu_ptr(&cpu_hw_sf);
 
 	err = 0;
 	switch (*((int *) flags)) {
@@ -849,7 +849,7 @@
 
 static void cpumsf_pmu_enable(struct pmu *pmu)
 {
-	struct cpu_hw_sf *cpuhw = &__get_cpu_var(cpu_hw_sf);
+	struct cpu_hw_sf *cpuhw = this_cpu_ptr(&cpu_hw_sf);
 	struct hw_perf_event *hwc;
 	int err;
 
@@ -898,7 +898,7 @@
 
 static void cpumsf_pmu_disable(struct pmu *pmu)
 {
-	struct cpu_hw_sf *cpuhw = &__get_cpu_var(cpu_hw_sf);
+	struct cpu_hw_sf *cpuhw = this_cpu_ptr(&cpu_hw_sf);
 	struct hws_lsctl_request_block inactive;
 	struct hws_qsi_info_block si;
 	int err;
@@ -1306,7 +1306,7 @@
  */
 static void cpumsf_pmu_start(struct perf_event *event, int flags)
 {
-	struct cpu_hw_sf *cpuhw = &__get_cpu_var(cpu_hw_sf);
+	struct cpu_hw_sf *cpuhw = this_cpu_ptr(&cpu_hw_sf);
 
 	if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED)))
 		return;
@@ -1327,7 +1327,7 @@
  */
 static void cpumsf_pmu_stop(struct perf_event *event, int flags)
 {
-	struct cpu_hw_sf *cpuhw = &__get_cpu_var(cpu_hw_sf);
+	struct cpu_hw_sf *cpuhw = this_cpu_ptr(&cpu_hw_sf);
 
 	if (event->hw.state & PERF_HES_STOPPED)
 		return;
@@ -1346,7 +1346,7 @@
 
 static int cpumsf_pmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_sf *cpuhw = &__get_cpu_var(cpu_hw_sf);
+	struct cpu_hw_sf *cpuhw = this_cpu_ptr(&cpu_hw_sf);
 	int err;
 
 	if (cpuhw->flags & PMU_F_IN_USE)
@@ -1397,7 +1397,7 @@
 
 static void cpumsf_pmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_sf *cpuhw = &__get_cpu_var(cpu_hw_sf);
+	struct cpu_hw_sf *cpuhw = this_cpu_ptr(&cpu_hw_sf);
 
 	perf_pmu_disable(event->pmu);
 	cpumsf_pmu_stop(event, PERF_EF_UPDATE);
@@ -1470,7 +1470,7 @@
 	if (!(alert & CPU_MF_INT_SF_MASK))
 		return;
 	inc_irq_stat(IRQEXT_CMS);
-	cpuhw = &__get_cpu_var(cpu_hw_sf);
+	cpuhw = this_cpu_ptr(&cpu_hw_sf);
 
 	/* Measurement alerts are shared and might happen when the PMU
 	 * is not reserved.  Ignore these alerts in this case. */


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 38/48] ia64: Replace __get_cpu_var uses
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
@ 2014-02-14 20:19   ` Christoph Lameter
  2014-02-14 20:18   ` Christoph Lameter
                     ` (47 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Tony Luck, Fenghua Yu, linux-ia64

[-- Attachment #1: this_ia64 --]
[-- Type: text/plain, Size: 14340 bytes --]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are storing data to and retrieving data from the current
processor's percpu area.  __get_cpu_var() can be used as an lvalue when
writing data, or on the right-hand side of an assignment.

__get_cpu_var() is defined as:


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() only ever performs an address calculation. However, store
and retrieve operations could use a segment prefix (or a global register on
other platforms) to avoid the address calculation entirely.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
work on the offset directly.  This avoids the address calculations, and
fewer registers are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed,
so the macro itself can be removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout, specialized macros can be defined in non-x86 arches as
well in order to optimize per cpu access, e.g. by using a global register
that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processor's instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)
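
The array forms (rules 1 and 2) are the interesting ones here, since the
ia64 conversions below hit them via per cpu tables such as vector_irq.
A short sketch with a made-up table:

	#include <linux/percpu.h>

	DEFINE_PER_CPU(int, demo_vec[16]);	/* hypothetical table */

	static int demo_lookup(unsigned int vec)
	{
		int *base = this_cpu_ptr(demo_vec);	/* rule 2: array, no '&' */

		if (base[0] < 0)			/* plain access via the pointer */
			return -1;
		return __this_cpu_read(demo_vec[vec]);	/* one element via the offset */
	}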


Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64@vger.kernel.org
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/alpha/kernel/perf_event.c
===================================================================
--- linux.orig/arch/alpha/kernel/perf_event.c	2014-01-20 16:21:25.571294202 -0600
+++ linux/arch/alpha/kernel/perf_event.c	2014-01-20 16:21:25.561294390 -0600
@@ -814,7 +814,7 @@
 	struct hw_perf_event *hwc;
 	int idx, j;
 
-	__get_cpu_var(irq_pmi_count)++;
+	__this_cpu_inc(irq_pmi_count);
 	cpuc = &__get_cpu_var(cpu_hw_events);
 
 	/* Completely counting through the PMC's period to trigger a new PMC
Index: linux/arch/ia64/kernel/irq.c
===================================================================
--- linux.orig/arch/ia64/kernel/irq.c	2014-01-20 16:21:25.571294202 -0600
+++ linux/arch/ia64/kernel/irq.c	2014-01-20 16:21:25.561294390 -0600
@@ -42,7 +42,7 @@
 
 unsigned int __ia64_local_vector_to_irq (ia64_vector vec)
 {
-	return __get_cpu_var(vector_irq)[vec];
+	return __this_cpu_read(vector_irq[vec]);
 }
 #endif
 
Index: linux/arch/ia64/kernel/irq_ia64.c
===================================================================
--- linux.orig/arch/ia64/kernel/irq_ia64.c	2014-01-20 16:21:25.571294202 -0600
+++ linux/arch/ia64/kernel/irq_ia64.c	2014-01-20 16:21:25.561294390 -0600
@@ -338,7 +338,7 @@
 		int irq;
 		struct irq_desc *desc;
 		struct irq_cfg *cfg;
-		irq = __get_cpu_var(vector_irq)[vector];
+		irq = __this_cpu_read(vector_irq[vector]);
 		if (irq < 0)
 			continue;
 
@@ -352,7 +352,7 @@
 			goto unlock;
 
 		spin_lock_irqsave(&vector_lock, flags);
-		__get_cpu_var(vector_irq)[vector] = -1;
+		__this_cpu_write(vector_irq[vector], -1);
 		cpu_clear(me, vector_table[vector]);
 		spin_unlock_irqrestore(&vector_lock, flags);
 		cfg->move_cleanup_count--;
Index: linux/arch/ia64/kernel/kprobes.c
===================================================================
--- linux.orig/arch/ia64/kernel/kprobes.c	2014-01-20 16:21:25.571294202 -0600
+++ linux/arch/ia64/kernel/kprobes.c	2014-01-20 16:21:25.561294390 -0600
@@ -396,7 +396,7 @@
 {
 	unsigned int i;
 	i = atomic_read(&kcb->prev_kprobe_index);
-	__get_cpu_var(current_kprobe) = kcb->prev_kprobe[i-1].kp;
+	__this_cpu_write(current_kprobe, kcb->prev_kprobe[i-1].kp);
 	kcb->kprobe_status = kcb->prev_kprobe[i-1].status;
 	atomic_sub(1, &kcb->prev_kprobe_index);
 }
@@ -404,7 +404,7 @@
 static void __kprobes set_current_kprobe(struct kprobe *p,
 			struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = p;
+	__this_cpu_write(current_kprobe, p);
 }
 
 static void kretprobe_trampoline(void)
@@ -823,7 +823,7 @@
 			/*
 			 * jprobe instrumented function just completed
 			 */
-			p = __get_cpu_var(current_kprobe);
+			p = __this_cpu_read(current_kprobe);
 			if (p->break_handler && p->break_handler(p, regs)) {
 				goto ss_probe;
 			}
Index: linux/arch/ia64/kernel/mca.c
===================================================================
--- linux.orig/arch/ia64/kernel/mca.c	2014-01-20 16:21:25.571294202 -0600
+++ linux/arch/ia64/kernel/mca.c	2014-01-20 16:21:25.561294390 -0600
@@ -1341,7 +1341,7 @@
 		ia64_mlogbuf_finish(1);
 	}
 
-	if (__get_cpu_var(ia64_mca_tr_reload)) {
+	if (__this_cpu_read(ia64_mca_tr_reload)) {
 		mca_insert_tr(0x1); /*Reload dynamic itrs*/
 		mca_insert_tr(0x2); /*Reload dynamic itrs*/
 	}
@@ -1874,14 +1874,14 @@
 		"MCA", cpu);
 	format_mca_init_stack(data, offsetof(struct ia64_mca_cpu, init_stack),
 		"INIT", cpu);
-	__get_cpu_var(ia64_mca_data) = __per_cpu_mca[cpu] = __pa(data);
+	__this_cpu_write(ia64_mca_data, (__per_cpu_mca[cpu] = __pa(data)));
 
 	/*
 	 * Stash away a copy of the PTE needed to map the per-CPU page.
 	 * We may need it during MCA recovery.
 	 */
-	__get_cpu_var(ia64_mca_per_cpu_pte) =
-		pte_val(mk_pte_phys(__pa(cpu_data), PAGE_KERNEL));
+	__this_cpu_write(ia64_mca_per_cpu_pte,
+		pte_val(mk_pte_phys(__pa(cpu_data), PAGE_KERNEL)));
 
 	/*
 	 * Also, stash away a copy of the PAL address and the PTE
@@ -1890,10 +1890,10 @@
 	pal_vaddr = efi_get_pal_addr();
 	if (!pal_vaddr)
 		return;
-	__get_cpu_var(ia64_mca_pal_base) =
-		GRANULEROUNDDOWN((unsigned long) pal_vaddr);
-	__get_cpu_var(ia64_mca_pal_pte) = pte_val(mk_pte_phys(__pa(pal_vaddr),
-							      PAGE_KERNEL));
+	__this_cpu_write(ia64_mca_pal_base,
+		GRANULEROUNDDOWN((unsigned long) pal_vaddr));
+	__this_cpu_write(ia64_mca_pal_pte, pte_val(mk_pte_phys(__pa(pal_vaddr),
+							      PAGE_KERNEL)));
 }
 
 static void ia64_mca_cmc_vector_adjust(void *dummy)
Index: linux/arch/ia64/kernel/process.c
===================================================================
--- linux.orig/arch/ia64/kernel/process.c	2014-01-20 16:21:25.571294202 -0600
+++ linux/arch/ia64/kernel/process.c	2014-01-20 16:21:25.561294390 -0600
@@ -215,7 +215,7 @@
 	unsigned int this_cpu = smp_processor_id();
 
 	/* Ack it */
-	__get_cpu_var(cpu_state) = CPU_DEAD;
+	__this_cpu_write(cpu_state, CPU_DEAD);
 
 	max_xtp();
 	local_irq_disable();
@@ -273,7 +273,7 @@
 	if ((task->thread.flags & IA64_THREAD_PM_VALID) != 0)
 		pfm_save_regs(task);
 
-	info = __get_cpu_var(pfm_syst_info);
+	info = __this_cpu_read(pfm_syst_info);
 	if (info & PFM_CPUINFO_SYST_WIDE)
 		pfm_syst_wide_update_task(task, info, 0);
 #endif
@@ -293,7 +293,7 @@
 	if ((task->thread.flags & IA64_THREAD_PM_VALID) != 0)
 		pfm_load_regs(task);
 
-	info = __get_cpu_var(pfm_syst_info);
+	info = __this_cpu_read(pfm_syst_info);
 	if (info & PFM_CPUINFO_SYST_WIDE) 
 		pfm_syst_wide_update_task(task, info, 1);
 #endif
Index: linux/arch/ia64/sn/kernel/sn2/sn2_smp.c
===================================================================
--- linux.orig/arch/ia64/sn/kernel/sn2/sn2_smp.c	2014-01-20 16:21:25.571294202 -0600
+++ linux/arch/ia64/sn/kernel/sn2/sn2_smp.c	2014-01-20 16:21:25.561294390 -0600
@@ -134,8 +134,8 @@
 	itc = ia64_get_itc();
 	smp_flush_tlb_cpumask(*mm_cpumask(mm));
 	itc = ia64_get_itc() - itc;
-	__get_cpu_var(ptcstats).shub_ipi_flushes_itc_clocks += itc;
-	__get_cpu_var(ptcstats).shub_ipi_flushes++;
+	__this_cpu_add(ptcstats.shub_ipi_flushes_itc_clocks, itc);
+	__this_cpu_inc(ptcstats.shub_ipi_flushes);
 }
 
 /**
@@ -199,14 +199,14 @@
 			start += (1UL << nbits);
 		} while (start < end);
 		ia64_srlz_i();
-		__get_cpu_var(ptcstats).ptc_l++;
+		__this_cpu_inc(ptcstats.ptc_l);
 		preempt_enable();
 		return;
 	}
 
 	if (atomic_read(&mm->mm_users) == 1 && mymm) {
 		flush_tlb_mm(mm);
-		__get_cpu_var(ptcstats).change_rid++;
+		__this_cpu_inc(ptcstats.change_rid);
 		preempt_enable();
 		return;
 	}
@@ -250,11 +250,11 @@
 	spin_lock_irqsave(PTC_LOCK(shub1), flags);
 	itc2 = ia64_get_itc();
 
-	__get_cpu_var(ptcstats).lock_itc_clocks += itc2 - itc;
-	__get_cpu_var(ptcstats).shub_ptc_flushes++;
-	__get_cpu_var(ptcstats).nodes_flushed += nix;
+	__this_cpu_add(ptcstats.lock_itc_clocks, itc2 - itc);
+	__this_cpu_inc(ptcstats.shub_ptc_flushes);
+	__this_cpu_add(ptcstats.nodes_flushed, nix);
 	if (!mymm)
-		 __get_cpu_var(ptcstats).shub_ptc_flushes_not_my_mm++;
+		 __this_cpu_inc(ptcstats.shub_ptc_flushes_not_my_mm);
 
 	if (use_cpu_ptcga && !mymm) {
 		old_rr = ia64_get_rr(start);
@@ -299,9 +299,9 @@
 
 done:
 	itc2 = ia64_get_itc() - itc2;
-	__get_cpu_var(ptcstats).shub_itc_clocks += itc2;
-	if (itc2 > __get_cpu_var(ptcstats).shub_itc_clocks_max)
-		__get_cpu_var(ptcstats).shub_itc_clocks_max = itc2;
+	__this_cpu_add(ptcstats.shub_itc_clocks, itc2);
+	if (itc2 > __this_cpu_read(ptcstats.shub_itc_clocks_max))
+		__this_cpu_write(ptcstats.shub_itc_clocks_max, itc2);
 
 	if (old_rr) {
 		ia64_set_rr(start, old_rr);
@@ -311,7 +311,7 @@
 	spin_unlock_irqrestore(PTC_LOCK(shub1), flags);
 
 	if (flush_opt == 1 && deadlock) {
-		__get_cpu_var(ptcstats).deadlocks++;
+		__this_cpu_inc(ptcstats.deadlocks);
 		sn2_ipi_flush_all_tlb(mm);
 	}
 
@@ -334,7 +334,7 @@
 	short nasid, i;
 	unsigned long *piows, zeroval, n;
 
-	__get_cpu_var(ptcstats).deadlocks++;
+	__this_cpu_inc(ptcstats.deadlocks);
 
 	piows = (unsigned long *) pda->pio_write_status_addr;
 	zeroval = pda->pio_write_status_val;
@@ -349,7 +349,7 @@
 			ptc1 = CHANGE_NASID(nasid, ptc1);
 
 		n = sn2_ptc_deadlock_recovery_core(ptc0, data0, ptc1, data1, piows, zeroval);
-		__get_cpu_var(ptcstats).deadlocks2 += n;
+		__this_cpu_add(ptcstats.deadlocks2, n);
 	}
 
 }
Index: linux/arch/ia64/include/asm/hw_irq.h
===================================================================
--- linux.orig/arch/ia64/include/asm/hw_irq.h	2014-01-20 16:21:25.571294202 -0600
+++ linux/arch/ia64/include/asm/hw_irq.h	2014-01-20 16:21:25.561294390 -0600
@@ -160,7 +160,7 @@
 static inline unsigned int
 __ia64_local_vector_to_irq (ia64_vector vec)
 {
-	return __get_cpu_var(vector_irq)[vec];
+	return __this_cpu_read(vector_irq[vec]);
 }
 #endif
 
Index: linux/arch/ia64/include/asm/sn/nodepda.h
===================================================================
--- linux.orig/arch/ia64/include/asm/sn/nodepda.h	2014-01-20 16:21:25.571294202 -0600
+++ linux/arch/ia64/include/asm/sn/nodepda.h	2014-01-20 16:21:25.561294390 -0600
@@ -70,7 +70,7 @@
  */
 
 DECLARE_PER_CPU(struct nodepda_s *, __sn_nodepda);
-#define sn_nodepda		(__get_cpu_var(__sn_nodepda))
+#define sn_nodepda		__this_cpu_read(__sn_nodepda)
 #define	NODEPDA(cnodeid)	(sn_nodepda->pernode_pdaindr[cnodeid])
 
 /*
Index: linux/arch/ia64/include/asm/switch_to.h
===================================================================
--- linux.orig/arch/ia64/include/asm/switch_to.h	2014-01-20 16:21:25.571294202 -0600
+++ linux/arch/ia64/include/asm/switch_to.h	2014-01-20 16:21:25.561294390 -0600
@@ -32,7 +32,7 @@
 
 #ifdef CONFIG_PERFMON
   DECLARE_PER_CPU(unsigned long, pfm_syst_info);
-# define PERFMON_IS_SYSWIDE() (__get_cpu_var(pfm_syst_info) & 0x1)
+# define PERFMON_IS_SYSWIDE() (__this_cpu_read(pfm_syst_info) & 0x1)
 #else
 # define PERFMON_IS_SYSWIDE() (0)
 #endif
Index: linux/arch/ia64/include/asm/sn/arch.h
===================================================================
--- linux.orig/arch/ia64/include/asm/sn/arch.h	2014-01-20 16:21:25.571294202 -0600
+++ linux/arch/ia64/include/asm/sn/arch.h	2014-01-20 16:21:25.561294390 -0600
@@ -57,7 +57,7 @@
 	u16 nasid_bitmask;
 };
 DECLARE_PER_CPU(struct sn_hub_info_s, __sn_hub_info);
-#define sn_hub_info 	(&__get_cpu_var(__sn_hub_info))
+#define sn_hub_info 	this_cpu_ptr(&__sn_hub_info)
 #define is_shub2()	(sn_hub_info->shub2)
 #define is_shub1()	(sn_hub_info->shub2 == 0)
 
@@ -72,7 +72,7 @@
  * cpu.
  */
 DECLARE_PER_CPU(short, __sn_cnodeid_to_nasid[MAX_COMPACT_NODES]);
-#define sn_cnodeid_to_nasid	(&__get_cpu_var(__sn_cnodeid_to_nasid[0]))
+#define sn_cnodeid_to_nasid	this_cpu_ptr(&__sn_cnodeid_to_nasid[0])
 
 
 extern u8 sn_partition_id;
Index: linux/arch/ia64/include/asm/uv/uv_hub.h
===================================================================
--- linux.orig/arch/ia64/include/asm/uv/uv_hub.h	2014-01-20 16:21:25.571294202 -0600
+++ linux/arch/ia64/include/asm/uv/uv_hub.h	2014-01-20 16:21:25.561294390 -0600
@@ -108,7 +108,7 @@
 	unsigned char	n_val;
 };
 DECLARE_PER_CPU(struct uv_hub_info_s, __uv_hub_info);
-#define uv_hub_info 		(&__get_cpu_var(__uv_hub_info))
+#define uv_hub_info 		this_cpu_ptr(&__uv_hub_info)
 #define uv_cpu_hub_info(cpu)	(&per_cpu(__uv_hub_info, cpu))
 
 /*
Index: linux/arch/ia64/kernel/traps.c
===================================================================
--- linux.orig/arch/ia64/kernel/traps.c	2014-01-20 16:21:25.571294202 -0600
+++ linux/arch/ia64/kernel/traps.c	2014-01-20 16:21:25.561294390 -0600
@@ -299,7 +299,7 @@
 
 	if (!(current->thread.flags & IA64_THREAD_FPEMU_NOPRINT))  {
 		unsigned long count, current_jiffies = jiffies;
-		struct fpu_swa_msg *cp = &__get_cpu_var(cpulast);
+		struct fpu_swa_msg *cp = this_cpu_ptr(&cpulast);
 
 		if (unlikely(current_jiffies > cp->time))
 			cp->count = 0;


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 39/48] powerpc: Replace __get_cpu_var uses
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (37 preceding siblings ...)
  2014-02-14 20:19   ` Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-15  3:50   ` Benjamin Herrenschmidt
  2014-02-14 20:19 ` [PATCH 40/48] powerpc: Handle new __get_cpu_var calls in 3.14 Christoph Lameter
                   ` (9 subsequent siblings)
  48 siblings, 1 reply; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Benjamin Herrenschmidt, Paul Mackerras

[-- Attachment #1: this_powerpc --]
[-- Type: text/plain, Size: 31373 bytes --]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are storing data to and retrieving data from the current
processor's percpu area.  __get_cpu_var() can be used as an lvalue when
writing data, or on the right-hand side of an assignment.

__get_cpu_var() is defined as:


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() only ever performs an address calculation. However, store
and retrieve operations could use a segment prefix (or a global register on
other platforms) to avoid the address calculation entirely.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
work on the offset directly.  This avoids the address calculations, and
fewer registers are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed,
so the macro itself can be removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout, specialized macros can be defined in non-x86 arches as
well in order to optimize per cpu access, e.g. by using a global register
that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processor's instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)
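
Several of the powerpc conversions below wrap these forms in one-line
macros (see the irq_work_pending flags in kernel/time.c); the same idea
on a hypothetical per cpu flag:

	#include <linux/types.h>
	#include <linux/percpu.h>

	DEFINE_PER_CPU(u8, demo_pending);	/* hypothetical flag */

	#define set_demo_pending()	__this_cpu_write(demo_pending, 1)
	#define test_demo_pending()	__this_cpu_read(demo_pending)
	#define clear_demo_pending()	__this_cpu_write(demo_pending, 0)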


Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/powerpc/include/asm/cputime.h
===================================================================
--- linux.orig/arch/powerpc/include/asm/cputime.h	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/include/asm/cputime.h	2014-02-03 13:26:45.060851321 -0600
@@ -56,10 +56,10 @@
 static inline cputime_t cputime_to_scaled(const cputime_t ct)
 {
 	if (cpu_has_feature(CPU_FTR_SPURR) &&
-	    __get_cpu_var(cputime_last_delta))
+	    __this_cpu_read(cputime_last_delta))
 		return (__force u64) ct *
-			__get_cpu_var(cputime_scaled_last_delta) /
-			__get_cpu_var(cputime_last_delta);
+			__this_cpu_read(cputime_scaled_last_delta) /
+			__this_cpu_read(cputime_last_delta);
 	return ct;
 }
 
Index: linux/arch/powerpc/include/asm/hardirq.h
===================================================================
--- linux.orig/arch/powerpc/include/asm/hardirq.h	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/include/asm/hardirq.h	2014-02-03 13:26:45.060851321 -0600
@@ -20,7 +20,7 @@
 
 #define __ARCH_IRQ_STAT
 
-#define local_softirq_pending()	__get_cpu_var(irq_stat).__softirq_pending
+#define local_softirq_pending()	__this_cpu_read(irq_stat.__softirq_pending)
 
 static inline void ack_bad_irq(unsigned int irq)
 {
Index: linux/arch/powerpc/include/asm/tlbflush.h
===================================================================
--- linux.orig/arch/powerpc/include/asm/tlbflush.h	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/include/asm/tlbflush.h	2014-02-03 13:26:45.060851321 -0600
@@ -107,14 +107,14 @@
 
 static inline void arch_enter_lazy_mmu_mode(void)
 {
-	struct ppc64_tlb_batch *batch = &__get_cpu_var(ppc64_tlb_batch);
+	struct ppc64_tlb_batch *batch = this_cpu_ptr(&ppc64_tlb_batch);
 
 	batch->active = 1;
 }
 
 static inline void arch_leave_lazy_mmu_mode(void)
 {
-	struct ppc64_tlb_batch *batch = &__get_cpu_var(ppc64_tlb_batch);
+	struct ppc64_tlb_batch *batch = this_cpu_ptr(&ppc64_tlb_batch);
 
 	if (batch->index)
 		__flush_tlb_pending(batch);
Index: linux/arch/powerpc/include/asm/xics.h
===================================================================
--- linux.orig/arch/powerpc/include/asm/xics.h	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/include/asm/xics.h	2014-02-03 13:26:45.060851321 -0600
@@ -97,7 +97,7 @@
 
 static inline void xics_push_cppr(unsigned int vec)
 {
-	struct xics_cppr *os_cppr = &__get_cpu_var(xics_cppr);
+	struct xics_cppr *os_cppr = this_cpu_ptr(&xics_cppr);
 
 	if (WARN_ON(os_cppr->index >= MAX_NUM_PRIORITIES - 1))
 		return;
@@ -110,7 +110,7 @@
 
 static inline unsigned char xics_pop_cppr(void)
 {
-	struct xics_cppr *os_cppr = &__get_cpu_var(xics_cppr);
+	struct xics_cppr *os_cppr = this_cpu_ptr(&xics_cppr);
 
 	if (WARN_ON(os_cppr->index < 1))
 		return LOWEST_PRIORITY;
@@ -120,7 +120,7 @@
 
 static inline void xics_set_base_cppr(unsigned char cppr)
 {
-	struct xics_cppr *os_cppr = &__get_cpu_var(xics_cppr);
+	struct xics_cppr *os_cppr = this_cpu_ptr(&xics_cppr);
 
 	/* we only really want to set the priority when there's
 	 * just one cppr value on the stack
@@ -132,7 +132,7 @@
 
 static inline unsigned char xics_cppr_top(void)
 {
-	struct xics_cppr *os_cppr = &__get_cpu_var(xics_cppr);
+	struct xics_cppr *os_cppr = this_cpu_ptr(&xics_cppr);
 	
 	return os_cppr->stack[os_cppr->index];
 }
Index: linux/arch/powerpc/kernel/dbell.c
===================================================================
--- linux.orig/arch/powerpc/kernel/dbell.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/kernel/dbell.c	2014-02-03 13:26:45.060851321 -0600
@@ -41,7 +41,7 @@
 
 	may_hard_irq_enable();
 
-	__get_cpu_var(irq_stat).doorbell_irqs++;
+	__this_cpu_inc(irq_stat.doorbell_irqs);
 
 	smp_ipi_demux();
 
Index: linux/arch/powerpc/kernel/hw_breakpoint.c
===================================================================
--- linux.orig/arch/powerpc/kernel/hw_breakpoint.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/kernel/hw_breakpoint.c	2014-02-03 13:26:45.060851321 -0600
@@ -63,7 +63,7 @@
 int arch_install_hw_breakpoint(struct perf_event *bp)
 {
 	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
-	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
+	struct perf_event **slot = this_cpu_ptr(&bp_per_reg);
 
 	*slot = bp;
 
@@ -88,7 +88,7 @@
  */
 void arch_uninstall_hw_breakpoint(struct perf_event *bp)
 {
-	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
+	struct perf_event **slot = this_cpu_ptr(&bp_per_reg);
 
 	if (*slot != bp) {
 		WARN_ONCE(1, "Can't find the breakpoint");
@@ -226,7 +226,7 @@
 	 */
 	rcu_read_lock();
 
-	bp = __get_cpu_var(bp_per_reg);
+	bp = __this_cpu_read(bp_per_reg);
 	if (!bp)
 		goto out;
 	info = counter_arch_bp(bp);
Index: linux/arch/powerpc/kernel/irq.c
===================================================================
--- linux.orig/arch/powerpc/kernel/irq.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/kernel/irq.c	2014-02-03 13:26:45.060851321 -0600
@@ -114,7 +114,7 @@
 static inline notrace int decrementer_check_overflow(void)
 {
  	u64 now = get_tb_or_rtc();
- 	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
+	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
  
 	return now >= *next_tb;
 }
Index: linux/arch/powerpc/kernel/kprobes.c
===================================================================
--- linux.orig/arch/powerpc/kernel/kprobes.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/kernel/kprobes.c	2014-02-03 13:26:45.060851321 -0600
@@ -118,7 +118,7 @@
 
 static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = kcb->prev_kprobe.kp;
+	__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
 	kcb->kprobe_status = kcb->prev_kprobe.status;
 	kcb->kprobe_saved_msr = kcb->prev_kprobe.saved_msr;
 }
@@ -126,7 +126,7 @@
 static void __kprobes set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
 				struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = p;
+	__this_cpu_write(current_kprobe, p);
 	kcb->kprobe_saved_msr = regs->msr;
 }
 
@@ -191,7 +191,7 @@
 				ret = 1;
 				goto no_kprobe;
 			}
-			p = __get_cpu_var(current_kprobe);
+			p = __this_cpu_read(current_kprobe);
 			if (p->break_handler && p->break_handler(p, regs)) {
 				goto ss_probe;
 			}
Index: linux/arch/powerpc/kernel/process.c
===================================================================
--- linux.orig/arch/powerpc/kernel/process.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/kernel/process.c	2014-02-03 13:28:48.338290902 -0600
@@ -497,7 +497,7 @@
 
 int set_breakpoint(struct arch_hw_breakpoint *brk)
 {
-	__get_cpu_var(current_brk) = *brk;
+	__this_cpu_write(current_brk, *brk);
 
 	if (cpu_has_feature(CPU_FTR_DAWR))
 		return set_dawr(brk);
@@ -811,7 +811,7 @@
  * schedule DABR
  */
 #ifndef CONFIG_HAVE_HW_BREAKPOINT
-	if (unlikely(!hw_brk_match(&__get_cpu_var(current_brk), &new->thread.hw_brk)))
+	if (unlikely(!hw_brk_match(this_cpu_ptr(&current_brk), &new->thread.hw_brk)))
 		set_breakpoint(&new->thread.hw_brk);
 #endif /* CONFIG_HAVE_HW_BREAKPOINT */
 #endif
@@ -825,7 +825,7 @@
 	 * Collect processor utilization data per process
 	 */
 	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
-		struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array);
+		struct cpu_usage *cu = this_cpu_ptr(&cpu_usage_array);
 		long unsigned start_tb, current_tb;
 		start_tb = old_thread->start_tb;
 		cu->current_tb = current_tb = mfspr(SPRN_PURR);
@@ -835,7 +835,7 @@
 #endif /* CONFIG_PPC64 */
 
 #ifdef CONFIG_PPC_BOOK3S_64
-	batch = &__get_cpu_var(ppc64_tlb_batch);
+	batch = this_cpu_ptr(&ppc64_tlb_batch);
 	if (batch->active) {
 		current_thread_info()->local_flags |= _TLF_LAZY_MMU;
 		if (batch->index)
@@ -858,7 +858,7 @@
 #ifdef CONFIG_PPC_BOOK3S_64
 	if (current_thread_info()->local_flags & _TLF_LAZY_MMU) {
 		current_thread_info()->local_flags &= ~_TLF_LAZY_MMU;
-		batch = &__get_cpu_var(ppc64_tlb_batch);
+		batch = this_cpu_ptr(&ppc64_tlb_batch);
 		batch->active = 1;
 	}
 #endif /* CONFIG_PPC_BOOK3S_64 */
Index: linux/arch/powerpc/kernel/smp.c
===================================================================
--- linux.orig/arch/powerpc/kernel/smp.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/kernel/smp.c	2014-02-03 13:26:45.060851321 -0600
@@ -240,7 +240,7 @@
 
 irqreturn_t smp_ipi_demux(void)
 {
-	struct cpu_messages *info = &__get_cpu_var(ipi_message);
+	struct cpu_messages *info = this_cpu_ptr(&ipi_message);
 	unsigned int all;
 
 	mb();	/* order any irq clear */
@@ -420,9 +420,9 @@
 	idle_task_exit();
 	cpu = smp_processor_id();
 	printk(KERN_DEBUG "CPU%d offline\n", cpu);
-	__get_cpu_var(cpu_state) = CPU_DEAD;
+	__this_cpu_write(cpu_state, CPU_DEAD);
 	smp_wmb();
-	while (__get_cpu_var(cpu_state) != CPU_UP_PREPARE)
+	while (__this_cpu_read(cpu_state) != CPU_UP_PREPARE)
 		cpu_relax();
 }
 
Index: linux/arch/powerpc/kernel/sysfs.c
===================================================================
--- linux.orig/arch/powerpc/kernel/sysfs.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/kernel/sysfs.c	2014-02-03 13:26:45.060851321 -0600
@@ -394,10 +394,10 @@
 	ppc_set_pmu_inuse(1);
 
 	/* Only need to enable them once */
-	if (__get_cpu_var(pmcs_enabled))
+	if (__this_cpu_read(pmcs_enabled))
 		return;
 
-	__get_cpu_var(pmcs_enabled) = 1;
+	__this_cpu_write(pmcs_enabled, 1);
 
 	if (ppc_md.enable_pmcs)
 		ppc_md.enable_pmcs();
Index: linux/arch/powerpc/kernel/time.c
===================================================================
--- linux.orig/arch/powerpc/kernel/time.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/kernel/time.c	2014-02-03 13:30:05.936679287 -0600
@@ -457,9 +457,9 @@
 
 DEFINE_PER_CPU(u8, irq_work_pending);
 
-#define set_irq_work_pending_flag()	__get_cpu_var(irq_work_pending) = 1
-#define test_irq_work_pending()		__get_cpu_var(irq_work_pending)
-#define clear_irq_work_pending()	__get_cpu_var(irq_work_pending) = 0
+#define set_irq_work_pending_flag()	__this_cpu_write(irq_work_pending, 1)
+#define test_irq_work_pending()		__this_cpu_read(irq_work_pending)
+#define clear_irq_work_pending()	__this_cpu_write(irq_work_pending, 0)
 
 #endif /* 32 vs 64 bit */
 
@@ -485,8 +485,8 @@
 void timer_interrupt(struct pt_regs * regs)
 {
 	struct pt_regs *old_regs;
-	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
-	struct clock_event_device *evt = &__get_cpu_var(decrementers);
+	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
+	struct clock_event_device *evt = this_cpu_ptr(&decrementers);
 	u64 now;
 
 	/* Ensure a positive value is written to the decrementer, or else
@@ -545,7 +544,7 @@
 #ifdef CONFIG_PPC64
 	/* collect purr register values often, for accurate calculations */
 	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
-		struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array);
+		struct cpu_usage *cu = this_cpu_ptr(&cpu_usage_array);
 		cu->current_tb = mfspr(SPRN_PURR);
 	}
 #endif
@@ -808,7 +807,7 @@
 	/* Don't adjust the decrementer if some irq work is pending */
 	if (test_irq_work_pending())
 		return 0;
-	__get_cpu_var(decrementers_next_tb) = get_tb_or_rtc() + evt;
+	__this_cpu_write(decrementers_next_tb, get_tb_or_rtc() + evt);
 	set_dec(evt);
 
 	/* We may have raced with new irq work */
Index: linux/arch/powerpc/kernel/traps.c
===================================================================
--- linux.orig/arch/powerpc/kernel/traps.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/kernel/traps.c	2014-02-03 13:26:45.060851321 -0600
@@ -688,7 +688,7 @@
 	enum ctx_state prev_state = exception_enter();
 	int recover = 0;
 
-	__get_cpu_var(irq_stat).mce_exceptions++;
+	__this_cpu_inc(irq_stat.mce_exceptions);
 
 	/* See if any machine dependent calls. In theory, we would want
 	 * to call the CPU first, and call the ppc_md. one if the CPU
@@ -1492,7 +1492,7 @@
 
 void performance_monitor_exception(struct pt_regs *regs)
 {
-	__get_cpu_var(irq_stat).pmu_irqs++;
+	__this_cpu_inc(irq_stat.pmu_irqs);
 
 	perf_irq(regs);
 }
Index: linux/arch/powerpc/kvm/e500.c
===================================================================
--- linux.orig/arch/powerpc/kvm/e500.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/kvm/e500.c	2014-02-03 13:26:45.060851321 -0600
@@ -76,11 +76,11 @@
 	unsigned long sid;
 	int ret = -1;
 
-	sid = ++(__get_cpu_var(pcpu_last_used_sid));
+	sid = __this_cpu_inc_return(pcpu_last_used_sid);
 	if (sid < NUM_TIDS) {
-		__get_cpu_var(pcpu_sids).entry[sid] = entry;
+		__this_cpu_write(pcpu_sids.entry[sid], entry);
 		entry->val = sid;
-		entry->pentry = &__get_cpu_var(pcpu_sids).entry[sid];
+		entry->pentry = this_cpu_ptr(&pcpu_sids.entry[sid]);
 		ret = sid;
 	}
 
@@ -108,8 +108,8 @@
 static inline int local_sid_lookup(struct id *entry)
 {
 	if (entry && entry->val != 0 &&
-	    __get_cpu_var(pcpu_sids).entry[entry->val] == entry &&
-	    entry->pentry == &__get_cpu_var(pcpu_sids).entry[entry->val])
+	    __this_cpu_read(pcpu_sids.entry[entry->val]) == entry &&
+	    entry->pentry == this_cpu_ptr(&pcpu_sids.entry[entry->val]))
 		return entry->val;
 	return -1;
 }
@@ -117,8 +117,8 @@
 /* Invalidate all id mappings on local core -- call with preempt disabled */
 static inline void local_sid_destroy_all(void)
 {
-	__get_cpu_var(pcpu_last_used_sid) = 0;
-	memset(&__get_cpu_var(pcpu_sids), 0, sizeof(__get_cpu_var(pcpu_sids)));
+	__this_cpu_write(pcpu_last_used_sid, 0);
+	memset(this_cpu_ptr(&pcpu_sids), 0, sizeof(pcpu_sids));
 }
 
 static void *kvmppc_e500_id_table_alloc(struct kvmppc_vcpu_e500 *vcpu_e500)
Index: linux/arch/powerpc/kvm/e500mc.c
===================================================================
--- linux.orig/arch/powerpc/kvm/e500mc.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/kvm/e500mc.c	2014-02-03 13:26:45.060851321 -0600
@@ -141,9 +141,9 @@
 	mtspr(SPRN_GESR, vcpu->arch.shared->esr);
 
 	if (vcpu->arch.oldpir != mfspr(SPRN_PIR) ||
-	    __get_cpu_var(last_vcpu_on_cpu) != vcpu) {
+	    __this_cpu_read(last_vcpu_on_cpu) != vcpu) {
 		kvmppc_e500_tlbil_all(vcpu_e500);
-		__get_cpu_var(last_vcpu_on_cpu) = vcpu;
+		__this_cpu_write(last_vcpu_on_cpu, vcpu);
 	}
 
 	kvmppc_load_guest_fp(vcpu);
Index: linux/arch/powerpc/mm/hash_native_64.c
===================================================================
--- linux.orig/arch/powerpc/mm/hash_native_64.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/mm/hash_native_64.c	2014-02-03 13:26:45.060851321 -0600
@@ -649,7 +649,7 @@
 	unsigned long want_v;
 	unsigned long flags;
 	real_pte_t pte;
-	struct ppc64_tlb_batch *batch = &__get_cpu_var(ppc64_tlb_batch);
+	struct ppc64_tlb_batch *batch = this_cpu_ptr(&ppc64_tlb_batch);
 	unsigned long psize = batch->psize;
 	int ssize = batch->ssize;
 	int i;
Index: linux/arch/powerpc/mm/hash_utils_64.c
===================================================================
--- linux.orig/arch/powerpc/mm/hash_utils_64.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/mm/hash_utils_64.c	2014-02-03 13:26:45.060851321 -0600
@@ -1284,7 +1284,7 @@
 	else {
 		int i;
 		struct ppc64_tlb_batch *batch =
-			&__get_cpu_var(ppc64_tlb_batch);
+			this_cpu_ptr(&ppc64_tlb_batch);
 
 		for (i = 0; i < number; i++)
 			flush_hash_page(batch->vpn[i], batch->pte[i],
Index: linux/arch/powerpc/mm/hugetlbpage-book3e.c
===================================================================
--- linux.orig/arch/powerpc/mm/hugetlbpage-book3e.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/mm/hugetlbpage-book3e.c	2014-02-03 13:31:42.464674595 -0600
@@ -37,9 +37,9 @@
 
 	/* Just round-robin the entries and wrap when we hit the end */
 	if (unlikely(index == ncams - 1))
-		__get_cpu_var(next_tlbcam_idx) = tlbcam_index;
+		__this_cpu_write(next_tlbcam_idx, tlbcam_index);
 	else
-		__get_cpu_var(next_tlbcam_idx)++;
+		__this_cpu_inc(next_tlbcam_idx);
 
 	return index;
 }
Index: linux/arch/powerpc/mm/hugetlbpage.c
===================================================================
--- linux.orig/arch/powerpc/mm/hugetlbpage.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/mm/hugetlbpage.c	2014-02-03 13:32:30.863669474 -0600
@@ -472,7 +472,7 @@
 {
 	struct hugepd_freelist **batchp;
 
-	batchp = &get_cpu_var(hugepd_freelist_cur);
+	batchp = this_cpu_ptr(&hugepd_freelist_cur);
 
 	if (atomic_read(&tlb->mm->mm_users) < 2 ||
 	    cpumask_equal(mm_cpumask(tlb->mm),
Index: linux/arch/powerpc/mm/stab.c
===================================================================
--- linux.orig/arch/powerpc/mm/stab.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/mm/stab.c	2014-02-03 13:26:45.060851321 -0600
@@ -133,12 +133,12 @@
 	stab_entry = make_ste(get_paca()->stab_addr, GET_ESID(ea), vsid);
 
 	if (!is_kernel_addr(ea)) {
-		offset = __get_cpu_var(stab_cache_ptr);
+		offset = __this_cpu_read(stab_cache_ptr);
 		if (offset < NR_STAB_CACHE_ENTRIES)
-			__get_cpu_var(stab_cache[offset++]) = stab_entry;
+			__this_cpu_write(stab_cache[offset++], stab_entry);
 		else
 			offset = NR_STAB_CACHE_ENTRIES+1;
-		__get_cpu_var(stab_cache_ptr) = offset;
+		__this_cpu_write(stab_cache_ptr, offset);
 
 		/* Order update */
 		asm volatile("sync":::"memory");
@@ -177,12 +177,12 @@
 	 */
 	hard_irq_disable();
 
-	offset = __get_cpu_var(stab_cache_ptr);
+	offset = __this_cpu_read(stab_cache_ptr);
 	if (offset <= NR_STAB_CACHE_ENTRIES) {
 		int i;
 
 		for (i = 0; i < offset; i++) {
-			ste = stab + __get_cpu_var(stab_cache[i]);
+			ste = stab + __this_cpu_read(stab_cache[i]);
 			ste->esid_data = 0; /* invalidate entry */
 		}
 	} else {
@@ -206,7 +206,7 @@
 
 	asm volatile("sync; slbia; sync":::"memory");
 
-	__get_cpu_var(stab_cache_ptr) = 0;
+	__this_cpu_write(stab_cache_ptr, 0);
 
 	/* Now preload some entries for the new task */
 	if (test_tsk_thread_flag(tsk, TIF_32BIT))
Index: linux/arch/powerpc/perf/core-book3s.c
===================================================================
--- linux.orig/arch/powerpc/perf/core-book3s.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/perf/core-book3s.c	2014-02-03 13:26:45.060851321 -0600
@@ -332,7 +332,7 @@
 
 static void power_pmu_bhrb_enable(struct perf_event *event)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	if (!ppmu->bhrb_nr)
 		return;
@@ -347,7 +347,7 @@
 
 static void power_pmu_bhrb_disable(struct perf_event *event)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	if (!ppmu->bhrb_nr)
 		return;
@@ -961,7 +961,7 @@
 	if (!ppmu)
 		return;
 	local_irq_save(flags);
-	cpuhw = &__get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	if (!cpuhw->disabled) {
 		/*
@@ -1027,7 +1027,7 @@
 		return;
 	local_irq_save(flags);
 
-	cpuhw = &__get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 	if (!cpuhw->disabled)
 		goto out;
 
@@ -1211,7 +1211,7 @@
 	 * Add the event to the list (if there is room)
 	 * and check whether the total set is still feasible.
 	 */
-	cpuhw = &__get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 	n0 = cpuhw->n_events;
 	if (n0 >= ppmu->n_counter)
 		goto out;
@@ -1277,7 +1277,7 @@
 
 	power_pmu_read(event);
 
-	cpuhw = &__get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 	for (i = 0; i < cpuhw->n_events; ++i) {
 		if (event == cpuhw->event[i]) {
 			while (++i < cpuhw->n_events) {
@@ -1383,7 +1383,7 @@
  */
 void power_pmu_start_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	perf_pmu_disable(pmu);
 	cpuhw->group_flag |= PERF_EVENT_TXN;
@@ -1397,7 +1397,7 @@
  */
 void power_pmu_cancel_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	cpuhw->group_flag &= ~PERF_EVENT_TXN;
 	perf_pmu_enable(pmu);
@@ -1415,7 +1415,7 @@
 
 	if (!ppmu)
 		return -EAGAIN;
-	cpuhw = &__get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 	n = cpuhw->n_events;
 	if (check_excludes(cpuhw->event, cpuhw->flags, 0, n))
 		return -EAGAIN;
@@ -1772,7 +1772,7 @@
 
 		if (event->attr.sample_type & PERF_SAMPLE_BRANCH_STACK) {
 			struct cpu_hw_events *cpuhw;
-			cpuhw = &__get_cpu_var(cpu_hw_events);
+			cpuhw = this_cpu_ptr(&cpu_hw_events);
 			power_pmu_bhrb_read(cpuhw);
 			data.br_stack = &cpuhw->bhrb_stack;
 		}
@@ -1845,7 +1845,7 @@
 static void perf_event_interrupt(struct pt_regs *regs)
 {
 	int i, j;
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 	struct perf_event *event;
 	unsigned long val[8];
 	int found, active;
Index: linux/arch/powerpc/perf/core-fsl-emb.c
===================================================================
--- linux.orig/arch/powerpc/perf/core-fsl-emb.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/perf/core-fsl-emb.c	2014-02-03 13:26:45.060851321 -0600
@@ -210,7 +210,7 @@
 	unsigned long flags;
 
 	local_irq_save(flags);
-	cpuhw = &__get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	if (!cpuhw->disabled) {
 		cpuhw->disabled = 1;
@@ -249,7 +249,7 @@
 	unsigned long flags;
 
 	local_irq_save(flags);
-	cpuhw = &__get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 	if (!cpuhw->disabled)
 		goto out;
 
@@ -653,7 +653,7 @@
 static void perf_event_interrupt(struct pt_regs *regs)
 {
 	int i;
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 	struct perf_event *event;
 	unsigned long val;
 	int found = 0;
Index: linux/arch/powerpc/platforms/cell/interrupt.c
===================================================================
--- linux.orig/arch/powerpc/platforms/cell/interrupt.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/platforms/cell/interrupt.c	2014-02-03 13:26:45.060851321 -0600
@@ -82,7 +82,7 @@
 
 static void iic_eoi(struct irq_data *d)
 {
-	struct iic *iic = &__get_cpu_var(cpu_iic);
+	struct iic *iic = this_cpu_ptr(&cpu_iic);
 	out_be64(&iic->regs->prio, iic->eoi_stack[--iic->eoi_ptr]);
 	BUG_ON(iic->eoi_ptr < 0);
 }
@@ -148,7 +148,7 @@
 	struct iic *iic;
 	unsigned int virq;
 
-	iic = &__get_cpu_var(cpu_iic);
+	iic = this_cpu_ptr(&cpu_iic);
 	*(unsigned long *) &pending =
 		in_be64((u64 __iomem *) &iic->regs->pending_destr);
 	if (!(pending.flags & CBE_IIC_IRQ_VALID))
@@ -163,7 +163,7 @@
 
 void iic_setup_cpu(void)
 {
-	out_be64(&__get_cpu_var(cpu_iic).regs->prio, 0xff);
+	out_be64(&this_cpu_ptr(&cpu_iic)->regs->prio, 0xff);
 }
 
 u8 iic_get_target_id(int cpu)
Index: linux/arch/powerpc/platforms/ps3/interrupt.c
===================================================================
--- linux.orig/arch/powerpc/platforms/ps3/interrupt.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/platforms/ps3/interrupt.c	2014-02-03 13:26:45.060851321 -0600
@@ -711,7 +711,7 @@
 
 static unsigned int ps3_get_irq(void)
 {
-	struct ps3_private *pd = &__get_cpu_var(ps3_private);
+	struct ps3_private *pd = this_cpu_ptr(&ps3_private);
 	u64 x = (pd->bmp.status & pd->bmp.mask);
 	unsigned int plug;
 
Index: linux/arch/powerpc/platforms/pseries/dtl.c
===================================================================
--- linux.orig/arch/powerpc/platforms/pseries/dtl.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/platforms/pseries/dtl.c	2014-02-03 13:26:45.060851321 -0600
@@ -74,7 +74,7 @@
  */
 static void consume_dtle(struct dtl_entry *dtle, u64 index)
 {
-	struct dtl_ring *dtlr = &__get_cpu_var(dtl_rings);
+	struct dtl_ring *dtlr = this_cpu_ptr(&dtl_rings);
 	struct dtl_entry *wp = dtlr->write_ptr;
 	struct lppaca *vpa = local_paca->lppaca_ptr;
 
Index: linux/arch/powerpc/platforms/pseries/hvCall_inst.c
===================================================================
--- linux.orig/arch/powerpc/platforms/pseries/hvCall_inst.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/platforms/pseries/hvCall_inst.c	2014-02-03 13:26:45.070851115 -0600
@@ -109,7 +109,7 @@
 	if (opcode > MAX_HCALL_OPCODE)
 		return;
 
-	h = &__get_cpu_var(hcall_stats)[opcode / 4];
+	h = this_cpu_ptr(&hcall_stats[opcode / 4]);
 	h->tb_start = mftb();
 	h->purr_start = mfspr(SPRN_PURR);
 }
@@ -122,7 +122,7 @@
 	if (opcode > MAX_HCALL_OPCODE)
 		return;
 
-	h = &__get_cpu_var(hcall_stats)[opcode / 4];
+	h = this_cpu_ptr(&hcall_stats[opcode / 4]);
 	h->num_calls++;
 	h->tb_total += mftb() - h->tb_start;
 	h->purr_total += mfspr(SPRN_PURR) - h->purr_start;
Index: linux/arch/powerpc/platforms/pseries/iommu.c
===================================================================
--- linux.orig/arch/powerpc/platforms/pseries/iommu.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/platforms/pseries/iommu.c	2014-02-03 13:26:45.070851115 -0600
@@ -200,7 +200,7 @@
 
 	local_irq_save(flags);	/* to protect tcep and the page behind it */
 
-	tcep = __get_cpu_var(tce_page);
+	tcep = __this_cpu_read(tce_page);
 
 	/* This is safe to do since interrupts are off when we're called
 	 * from iommu_alloc{,_sg}()
@@ -213,7 +213,7 @@
 			return tce_build_pSeriesLP(tbl, tcenum, npages, uaddr,
 					    direction, attrs);
 		}
-		__get_cpu_var(tce_page) = tcep;
+		__this_cpu_write(tce_page, tcep);
 	}
 
 	rpn = __pa(uaddr) >> TCE_SHIFT;
@@ -399,7 +399,7 @@
 	long l, limit;
 
 	local_irq_disable();	/* to protect tcep and the page behind it */
-	tcep = __get_cpu_var(tce_page);
+	tcep = __this_cpu_read(tce_page);
 
 	if (!tcep) {
 		tcep = (__be64 *)__get_free_page(GFP_ATOMIC);
@@ -407,7 +407,7 @@
 			local_irq_enable();
 			return -ENOMEM;
 		}
-		__get_cpu_var(tce_page) = tcep;
+		__this_cpu_write(tce_page, tcep);
 	}
 
 	proto_tce = TCE_PCI_READ | TCE_PCI_WRITE;
Index: linux/arch/powerpc/platforms/pseries/lpar.c
===================================================================
--- linux.orig/arch/powerpc/platforms/pseries/lpar.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/platforms/pseries/lpar.c	2014-02-03 13:26:45.070851115 -0600
@@ -514,7 +514,7 @@
 	unsigned long vpn;
 	unsigned long i, pix, rc;
 	unsigned long flags = 0;
-	struct ppc64_tlb_batch *batch = &__get_cpu_var(ppc64_tlb_batch);
+	struct ppc64_tlb_batch *batch = this_cpu_ptr(&ppc64_tlb_batch);
 	int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE);
 	unsigned long param[9];
 	unsigned long hash, index, shift, hidx, slot;
@@ -689,7 +689,7 @@
 
 	local_irq_save(flags);
 
-	depth = &__get_cpu_var(hcall_trace_depth);
+	depth = this_cpu_ptr(&hcall_trace_depth);
 
 	if (*depth)
 		goto out;
@@ -714,7 +714,7 @@
 
 	local_irq_save(flags);
 
-	depth = &__get_cpu_var(hcall_trace_depth);
+	depth = this_cpu_ptr(&hcall_trace_depth);
 
 	if (*depth)
 		goto out;
Index: linux/arch/powerpc/platforms/pseries/ras.c
===================================================================
--- linux.orig/arch/powerpc/platforms/pseries/ras.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/platforms/pseries/ras.c	2014-02-03 13:26:45.070851115 -0600
@@ -301,8 +301,8 @@
 	/* If it isn't an extended log we can use the per cpu 64bit buffer */
 	h = (struct rtas_error_log *)&savep[1];
 	if (!h->extended) {
-		memcpy(&__get_cpu_var(mce_data_buf), h, sizeof(__u64));
-		errhdr = (struct rtas_error_log *)&__get_cpu_var(mce_data_buf);
+		memcpy(this_cpu_ptr(&mce_data_buf), h, sizeof(__u64));
+		errhdr = (struct rtas_error_log *)this_cpu_ptr(&mce_data_buf);
 	} else {
 		int len;
 
Index: linux/arch/powerpc/sysdev/xics/xics-common.c
===================================================================
--- linux.orig/arch/powerpc/sysdev/xics/xics-common.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/sysdev/xics/xics-common.c	2014-02-03 13:26:45.070851115 -0600
@@ -155,7 +155,7 @@
 
 void xics_teardown_cpu(void)
 {
-	struct xics_cppr *os_cppr = &__get_cpu_var(xics_cppr);
+	struct xics_cppr *os_cppr = this_cpu_ptr(&xics_cppr);
 
 	/*
 	 * we have to reset the cppr index to 0 because we're
Index: linux/arch/powerpc/kernel/iommu.c
===================================================================
--- linux.orig/arch/powerpc/kernel/iommu.c	2014-02-03 13:26:45.070851115 -0600
+++ linux/arch/powerpc/kernel/iommu.c	2014-02-03 13:26:45.070851115 -0600
@@ -208,7 +208,7 @@
 	 * We don't need to disable preemption here because any CPU can
 	 * safely use any IOMMU pool.
 	 */
-	pool_nr = __raw_get_cpu_var(iommu_pool_hash) & (tbl->nr_pools - 1);
+	pool_nr = __this_cpu_read(iommu_pool_hash) & (tbl->nr_pools - 1);
 
 	if (largealloc)
 		pool = &(tbl->large_pool);


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 40/48] powerpc: Handle new __get_cpu_var calls in 3.14
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (38 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 39/48] powerpc: " Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19   ` Christoph Lameter
                   ` (8 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: powerpc_new --]
[-- Type: text/plain, Size: 4716 bytes --]
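
A note on the __this_cpu_inc_return() conversions below: __get_cpu_var(x)++
is a post-increment and yields the old value, while __this_cpu_inc_return(x)
yields the incremented value. Where the old value is used as an array index,
the conversion therefore subtracts one to preserve the original semantics.
A minimal sketch (the variable name y is illustrative only):

	DEFINE_PER_CPU(int, y);

	/* before: index receives the value prior to the increment */
	index = __get_cpu_var(y)++;

	/* after: inc_return yields the new value, so subtract one */
	index = __this_cpu_inc_return(y) - 1;

The mce_nest_count and mce_queue_count conversions in mce.c below use this
form.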

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/powerpc/kernel/irq.c
===================================================================
--- linux.orig/arch/powerpc/kernel/irq.c	2014-02-03 14:16:29.028411561 -0600
+++ linux/arch/powerpc/kernel/irq.c	2014-02-03 14:27:59.283978219 -0600
@@ -486,7 +486,7 @@
 
 	/* And finally process it */
 	if (unlikely(irq == NO_IRQ))
-		__get_cpu_var(irq_stat).spurious_irqs++;
+		__this_cpu_inc(irq_stat.spurious_irqs);
 	else {
 		desc = irq_to_desc(irq);
 		if (likely(desc))
Index: linux/arch/powerpc/kernel/kgdb.c
===================================================================
--- linux.orig/arch/powerpc/kernel/kgdb.c	2014-01-28 11:24:05.156637470 -0600
+++ linux/arch/powerpc/kernel/kgdb.c	2014-02-03 14:22:38.680680583 -0600
@@ -155,7 +155,7 @@
 {
 	struct thread_info *thread_info, *exception_thread_info;
 	struct thread_info *backup_current_thread_info =
-		&__get_cpu_var(kgdb_thread_info);
+		this_cpu_ptr(&kgdb_thread_info);
 
 	if (user_mode(regs))
 		return 0;
Index: linux/arch/powerpc/kernel/mce.c
===================================================================
--- linux.orig/arch/powerpc/kernel/mce.c	2014-01-28 11:24:05.156637470 -0600
+++ linux/arch/powerpc/kernel/mce.c	2014-02-03 14:26:27.685893903 -0600
@@ -73,8 +73,8 @@
 		    uint64_t addr)
 {
 	uint64_t srr1;
-	int index = __get_cpu_var(mce_nest_count)++;
-	struct machine_check_event *mce = &__get_cpu_var(mce_event[index]);
+	int index = __this_cpu_inc_return(mce_nest_count) - 1;
+	struct machine_check_event *mce = this_cpu_ptr(&mce_event[index]);
 
 	/*
 	 * Return if we don't have enough space to log mce event.
@@ -143,7 +143,7 @@
  */
 int get_mce_event(struct machine_check_event *mce, bool release)
 {
-	int index = __get_cpu_var(mce_nest_count) - 1;
+	int index = __this_cpu_read(mce_nest_count) - 1;
 	struct machine_check_event *mc_evt;
 	int ret = 0;
 
@@ -153,7 +153,7 @@
 
 	/* Check if we have MCE info to process. */
 	if (index < MAX_MC_EVT) {
-		mc_evt = &__get_cpu_var(mce_event[index]);
+		mc_evt = this_cpu_ptr(&mce_event[index]);
 		/* Copy the event structure and release the original */
 		if (mce)
 			*mce = *mc_evt;
@@ -163,7 +163,7 @@
 	}
 	/* Decrement the count to free the slot. */
 	if (release)
-		__get_cpu_var(mce_nest_count)--;
+		__this_cpu_dec(mce_nest_count);
 
 	return ret;
 }
@@ -184,13 +184,13 @@
 	if (!get_mce_event(&evt, MCE_EVENT_RELEASE))
 		return;
 
-	index = __get_cpu_var(mce_queue_count)++;
+	index = __this_cpu_inc_return(mce_queue_count) - 1;
 	/* If queue is full, just return for now. */
 	if (index >= MAX_MC_EVT) {
-		__get_cpu_var(mce_queue_count)--;
+		__this_cpu_dec(mce_queue_count);
 		return;
 	}
-	__get_cpu_var(mce_event_queue[index]) = evt;
+	__this_cpu_write(mce_event_queue[index], evt);
 
 	/* Queue irq work to process this event later. */
 	irq_work_queue(&mce_event_process_work);
@@ -208,11 +208,11 @@
 	 * For now just print it to console.
 	 * TODO: log this error event to FSP or nvram.
 	 */
-	while (__get_cpu_var(mce_queue_count) > 0) {
-		index = __get_cpu_var(mce_queue_count) - 1;
+	while (__this_cpu_read(mce_queue_count) > 0) {
+		index = __this_cpu_read(mce_queue_count) - 1;
 		machine_check_print_event_info(
-				&__get_cpu_var(mce_event_queue[index]));
-		__get_cpu_var(mce_queue_count)--;
+				this_cpu_ptr(&mce_event_queue[index]));
+		__this_cpu_dec(mce_queue_count);
 	}
 }
 
Index: linux/arch/powerpc/kernel/time.c
===================================================================
--- linux.orig/arch/powerpc/kernel/time.c	2014-02-03 14:16:29.028411561 -0600
+++ linux/arch/powerpc/kernel/time.c	2014-02-03 14:27:26.854657482 -0600
@@ -531,7 +531,7 @@
 		*next_tb = ~(u64)0;
 		if (evt->event_handler)
 			evt->event_handler(evt);
-		__get_cpu_var(irq_stat).timer_irqs_event++;
+		__this_cpu_inc(irq_stat.timer_irqs_event);
 	} else {
 		now = *next_tb - now;
 		if (now <= DECREMENTER_MAX)
@@ -539,7 +539,7 @@
 		/* We may have raced with new irq work */
 		if (test_irq_work_pending())
 			set_dec(1);
-		__get_cpu_var(irq_stat).timer_irqs_others++;
+		__this_cpu_inc(irq_stat.timer_irqs_others);
 	}
 
 #ifdef CONFIG_PPC64
Index: linux/arch/powerpc/mm/hugetlbpage-book3e.c
===================================================================
--- linux.orig/arch/powerpc/mm/hugetlbpage-book3e.c	2014-02-03 14:16:29.038411354 -0600
+++ linux/arch/powerpc/mm/hugetlbpage-book3e.c	2014-02-03 14:22:04.551394132 -0600
@@ -33,7 +33,7 @@
 
 	ncams = mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY;
 
-	index = __get_cpu_var(next_tlbcam_idx);
+	index = this_cpu_read(next_tlbcam_idx);
 
 	/* Just round-robin the entries and wrap when we hit the end */
 	if (unlikely(index == ncams - 1))


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 41/48] sparc: Replace __get_cpu_var uses
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
@ 2014-02-14 20:19   ` Christoph Lameter
  2014-02-14 20:18   ` Christoph Lameter
                     ` (47 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, sparclinux, David S. Miller

[-- Attachment #1: this_sparc --]
[-- Type: text/plain, Size: 13729 bytes --]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address of the current processor's instance of the percpu variable
based on an offset.

Other use cases are storing data to and retrieving data from the current
processor's percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right-hand side of an assignment.

__get_cpu_var() is defined as:


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() only ever performs an address calculation. However, store
and retrieve operations could use a segment prefix (or a global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts uses of __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into this_cpu operations that work on
the offset directly.  Address calculations are thereby avoided and fewer
registers are used in the generated code.

At the end of the patch set all uses of __get_cpu_var have been removed, so
the macro itself is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout, specialized macros can be defined on non-x86 arches as
well in order to optimize per cpu access, e.g. by using a global register
that is set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processor's instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)
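
As an illustration of the gain from the RMW forms, compare a simple per cpu
counter update in both styles.  This is a minimal sketch with a made-up
variable name, not code from this patch:

	DEFINE_PER_CPU(unsigned long, nr_events);

	/* old form: compute the address of this cpu's instance, then
	 * load, add and store through the pointer
	 */
	__get_cpu_var(nr_events) += n;

	/* new form: a single RMW operation; on x86 this can become one
	 * segment-prefixed add instruction without address arithmetic
	 */
	__this_cpu_add(nr_events, n);

Both forms assume that the caller already runs with preemption disabled;
the __this_cpu form merely avoids the intermediate address calculation.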


Cc: sparclinux@vger.kernel.org
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/sparc/include/asm/cpudata_32.h
===================================================================
--- linux.orig/arch/sparc/include/asm/cpudata_32.h	2014-02-03 13:35:53.129452338 -0600
+++ linux/arch/sparc/include/asm/cpudata_32.h	2014-02-03 13:35:53.119452546 -0600
@@ -26,6 +26,6 @@
 
 DECLARE_PER_CPU(cpuinfo_sparc, __cpu_data);
 #define cpu_data(__cpu) per_cpu(__cpu_data, (__cpu))
-#define local_cpu_data() __get_cpu_var(__cpu_data)
+#define local_cpu_data() __this_cpu_read(__cpu_data)
 
 #endif /* _SPARC_CPUDATA_H */
Index: linux/arch/sparc/include/asm/cpudata_64.h
===================================================================
--- linux.orig/arch/sparc/include/asm/cpudata_64.h	2014-02-03 13:35:53.129452338 -0600
+++ linux/arch/sparc/include/asm/cpudata_64.h	2014-02-03 13:35:53.119452546 -0600
@@ -33,7 +33,7 @@
 
 DECLARE_PER_CPU(cpuinfo_sparc, __cpu_data);
 #define cpu_data(__cpu)		per_cpu(__cpu_data, (__cpu))
-#define local_cpu_data()	__get_cpu_var(__cpu_data)
+#define local_cpu_data()	__this_cpu_read(__cpu_data)
 
 extern const struct seq_operations cpuinfo_op;
 
Index: linux/arch/sparc/kernel/kprobes.c
===================================================================
--- linux.orig/arch/sparc/kernel/kprobes.c	2014-02-03 13:35:53.129452338 -0600
+++ linux/arch/sparc/kernel/kprobes.c	2014-02-03 13:35:53.129452338 -0600
@@ -83,7 +83,7 @@
 
 static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = kcb->prev_kprobe.kp;
+	__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
 	kcb->kprobe_status = kcb->prev_kprobe.status;
 	kcb->kprobe_orig_tnpc = kcb->prev_kprobe.orig_tnpc;
 	kcb->kprobe_orig_tstate_pil = kcb->prev_kprobe.orig_tstate_pil;
@@ -92,7 +92,7 @@
 static void __kprobes set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
 				struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = p;
+	__this_cpu_write(current_kprobe, p);
 	kcb->kprobe_orig_tnpc = regs->tnpc;
 	kcb->kprobe_orig_tstate_pil = (regs->tstate & TSTATE_PIL);
 }
@@ -155,7 +155,7 @@
 				ret = 1;
 				goto no_kprobe;
 			}
-			p = __get_cpu_var(current_kprobe);
+			p = __this_cpu_read(current_kprobe);
 			if (p->break_handler && p->break_handler(p, regs))
 				goto ss_probe;
 		}
Index: linux/arch/sparc/kernel/leon_smp.c
===================================================================
--- linux.orig/arch/sparc/kernel/leon_smp.c	2014-02-03 13:35:53.129452338 -0600
+++ linux/arch/sparc/kernel/leon_smp.c	2014-02-03 13:35:53.129452338 -0600
@@ -354,7 +354,7 @@
 
 void leonsmp_ipi_interrupt(void)
 {
-	struct leon_ipi_work *work = &__get_cpu_var(leon_ipi_work);
+	struct leon_ipi_work *work = this_cpu_ptr(&leon_ipi_work);
 
 	if (work->single) {
 		work->single = 0;
Index: linux/arch/sparc/kernel/nmi.c
===================================================================
--- linux.orig/arch/sparc/kernel/nmi.c	2014-02-03 13:35:53.129452338 -0600
+++ linux/arch/sparc/kernel/nmi.c	2014-02-03 13:35:53.129452338 -0600
@@ -111,20 +111,20 @@
 		pcr_ops->write_pcr(0, pcr_ops->pcr_nmi_disable);
 
 	sum = local_cpu_data().irq0_irqs;
-	if (__get_cpu_var(nmi_touch)) {
-		__get_cpu_var(nmi_touch) = 0;
+	if (__this_cpu_read(nmi_touch)) {
+		__this_cpu_write(nmi_touch, 0);
 		touched = 1;
 	}
-	if (!touched && __get_cpu_var(last_irq_sum) == sum) {
+	if (!touched && __this_cpu_read(last_irq_sum) == sum) {
 		__this_cpu_inc(alert_counter);
 		if (__this_cpu_read(alert_counter) == 30 * nmi_hz)
 			die_nmi("BUG: NMI Watchdog detected LOCKUP",
 				regs, panic_on_timeout);
 	} else {
-		__get_cpu_var(last_irq_sum) = sum;
+		__this_cpu_write(last_irq_sum, sum);
 		__this_cpu_write(alert_counter, 0);
 	}
-	if (__get_cpu_var(wd_enabled)) {
+	if (__this_cpu_read(wd_enabled)) {
 		pcr_ops->write_pic(0, pcr_ops->nmi_picl_value(nmi_hz));
 		pcr_ops->write_pcr(0, pcr_ops->pcr_nmi_enable);
 	}
@@ -166,7 +166,7 @@
 void stop_nmi_watchdog(void *unused)
 {
 	pcr_ops->write_pcr(0, pcr_ops->pcr_nmi_disable);
-	__get_cpu_var(wd_enabled) = 0;
+	__this_cpu_write(wd_enabled, 0);
 	atomic_dec(&nmi_active);
 }
 
@@ -219,7 +219,7 @@
 
 void start_nmi_watchdog(void *unused)
 {
-	__get_cpu_var(wd_enabled) = 1;
+	__this_cpu_write(wd_enabled, 1);
 	atomic_inc(&nmi_active);
 
 	pcr_ops->write_pcr(0, pcr_ops->pcr_nmi_disable);
@@ -230,7 +230,7 @@
 
 static void nmi_adjust_hz_one(void *unused)
 {
-	if (!__get_cpu_var(wd_enabled))
+	if (!__this_cpu_read(wd_enabled))
 		return;
 
 	pcr_ops->write_pcr(0, pcr_ops->pcr_nmi_disable);
Index: linux/arch/sparc/kernel/pci_sun4v.c
===================================================================
--- linux.orig/arch/sparc/kernel/pci_sun4v.c	2014-02-03 13:35:53.129452338 -0600
+++ linux/arch/sparc/kernel/pci_sun4v.c	2014-02-03 13:35:53.129452338 -0600
@@ -48,7 +48,7 @@
 /* Interrupts must be disabled.  */
 static inline void iommu_batch_start(struct device *dev, unsigned long prot, unsigned long entry)
 {
-	struct iommu_batch *p = &__get_cpu_var(iommu_batch);
+	struct iommu_batch *p = this_cpu_ptr(&iommu_batch);
 
 	p->dev		= dev;
 	p->prot		= prot;
@@ -94,7 +94,7 @@
 
 static inline void iommu_batch_new_entry(unsigned long entry)
 {
-	struct iommu_batch *p = &__get_cpu_var(iommu_batch);
+	struct iommu_batch *p = this_cpu_ptr(&iommu_batch);
 
 	if (p->entry + p->npages == entry)
 		return;
@@ -106,7 +106,7 @@
 /* Interrupts must be disabled.  */
 static inline long iommu_batch_add(u64 phys_page)
 {
-	struct iommu_batch *p = &__get_cpu_var(iommu_batch);
+	struct iommu_batch *p = this_cpu_ptr(&iommu_batch);
 
 	BUG_ON(p->npages >= PGLIST_NENTS);
 
@@ -120,7 +120,7 @@
 /* Interrupts must be disabled.  */
 static inline long iommu_batch_end(void)
 {
-	struct iommu_batch *p = &__get_cpu_var(iommu_batch);
+	struct iommu_batch *p = this_cpu_ptr(&iommu_batch);
 
 	BUG_ON(p->npages >= PGLIST_NENTS);
 
Index: linux/arch/sparc/kernel/perf_event.c
===================================================================
--- linux.orig/arch/sparc/kernel/perf_event.c	2014-02-03 13:35:53.129452338 -0600
+++ linux/arch/sparc/kernel/perf_event.c	2014-02-03 13:35:53.129452338 -0600
@@ -1013,7 +1013,7 @@
 
 static void sparc_pmu_enable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int i;
 
 	if (cpuc->enabled)
@@ -1031,7 +1031,7 @@
 
 static void sparc_pmu_disable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int i;
 
 	if (!cpuc->enabled)
@@ -1065,7 +1065,7 @@
 
 static void sparc_pmu_start(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx = active_event_index(cpuc, event);
 
 	if (flags & PERF_EF_RELOAD) {
@@ -1080,7 +1080,7 @@
 
 static void sparc_pmu_stop(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx = active_event_index(cpuc, event);
 
 	if (!(event->hw.state & PERF_HES_STOPPED)) {
@@ -1096,7 +1096,7 @@
 
 static void sparc_pmu_del(struct perf_event *event, int _flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	unsigned long flags;
 	int i;
 
@@ -1133,7 +1133,7 @@
 
 static void sparc_pmu_read(struct perf_event *event)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx = active_event_index(cpuc, event);
 	struct hw_perf_event *hwc = &event->hw;
 
@@ -1145,7 +1145,7 @@
 
 static void perf_stop_nmi_watchdog(void *unused)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int i;
 
 	stop_nmi_watchdog(NULL);
@@ -1356,7 +1356,7 @@
 
 static int sparc_pmu_add(struct perf_event *event, int ef_flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int n0, ret = -EAGAIN;
 	unsigned long flags;
 
@@ -1498,7 +1498,7 @@
  */
 static void sparc_pmu_start_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	perf_pmu_disable(pmu);
 	cpuhw->group_flag |= PERF_EVENT_TXN;
@@ -1511,7 +1511,7 @@
  */
 static void sparc_pmu_cancel_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	cpuhw->group_flag &= ~PERF_EVENT_TXN;
 	perf_pmu_enable(pmu);
@@ -1524,13 +1524,13 @@
  */
 static int sparc_pmu_commit_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int n;
 
 	if (!sparc_pmu)
 		return -EINVAL;
 
-	cpuc = &__get_cpu_var(cpu_hw_events);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
 	n = cpuc->n_events;
 	if (check_excludes(cpuc->event, 0, n))
 		return -EINVAL;
@@ -1601,7 +1601,7 @@
 
 	regs = args->regs;
 
-	cpuc = &__get_cpu_var(cpu_hw_events);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	/* If the PMU has the TOE IRQ enable bits, we need to do a
 	 * dummy write to the %pcr to clear the overflow bits and thus
Index: linux/arch/sparc/kernel/sun4d_smp.c
===================================================================
--- linux.orig/arch/sparc/kernel/sun4d_smp.c	2014-02-03 13:35:53.129452338 -0600
+++ linux/arch/sparc/kernel/sun4d_smp.c	2014-02-03 13:35:53.129452338 -0600
@@ -204,7 +204,7 @@
 
 void sun4d_ipi_interrupt(void)
 {
-	struct sun4d_ipi_work *work = &__get_cpu_var(sun4d_ipi_work);
+	struct sun4d_ipi_work *work = this_cpu_ptr(&sun4d_ipi_work);
 
 	if (work->single) {
 		work->single = 0;
Index: linux/arch/sparc/kernel/time_64.c
===================================================================
--- linux.orig/arch/sparc/kernel/time_64.c	2014-02-03 13:35:53.129452338 -0600
+++ linux/arch/sparc/kernel/time_64.c	2014-02-03 13:35:53.129452338 -0600
@@ -766,7 +766,7 @@
 			     : /* no outputs */
 			     : "r" (pstate));
 
-	sevt = &__get_cpu_var(sparc64_events);
+	sevt = this_cpu_ptr(&sparc64_events);
 
 	memcpy(sevt, &sparc64_clockevent, sizeof(*sevt));
 	sevt->cpumask = cpumask_of(smp_processor_id());
Index: linux/arch/sparc/mm/tlb.c
===================================================================
--- linux.orig/arch/sparc/mm/tlb.c	2014-02-03 13:35:53.129452338 -0600
+++ linux/arch/sparc/mm/tlb.c	2014-02-03 13:35:53.129452338 -0600
@@ -52,14 +52,14 @@
 
 void arch_enter_lazy_mmu_mode(void)
 {
-	struct tlb_batch *tb = &__get_cpu_var(tlb_batch);
+	struct tlb_batch *tb = this_cpu_ptr(&tlb_batch);
 
 	tb->active = 1;
 }
 
 void arch_leave_lazy_mmu_mode(void)
 {
-	struct tlb_batch *tb = &__get_cpu_var(tlb_batch);
+	struct tlb_batch *tb = this_cpu_ptr(&tlb_batch);
 
 	if (tb->tlb_nr)
 		flush_tlb_pending();


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 42/48] tile: Replace __get_cpu_var uses
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (40 preceding siblings ...)
  2014-02-14 20:19   ` Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 43/48] blackfin: " Christoph Lameter
                   ` (6 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Chris Metcalf

[-- Attachment #1: this_tile --]
[-- Type: text/plain, Size: 14452 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address of the current processor's instance of the percpu variable
based on an offset.

Other use cases are storing data to and retrieving data from the current
processor's percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right-hand side of an assignment.

__get_cpu_var() is defined as:


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() only ever performs an address calculation. However, store
and retrieve operations could use a segment prefix (or a global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts uses of __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into this_cpu operations that work on
the offset directly.  Address calculations are thereby avoided and fewer
registers are used in the generated code.

At the end of the patch set all uses of __get_cpu_var have been removed, so
the macro itself is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout, specialized macros can be defined on non-x86 arches as
well in order to optimize per cpu access, e.g. by using a global register
that is set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processor's instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)
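
One hunk below deserves a note: tile_dev_intr() previously did
"int depth = __get_cpu_var(irq_depth)++;" and later tested "depth == 0" to
detect the outermost interrupt. Since __this_cpu_inc_return() yields the
incremented value, the converted code tests "depth == 1" instead. A minimal
sketch of the pattern, where outermost() is a hypothetical placeholder:

	DEFINE_PER_CPU(int, depth);

	/* old: post-increment, test the prior value */
	if (__get_cpu_var(depth)++ == 0)
		outermost();

	/* new: inc_return yields the new value, so test against 1 */
	if (__this_cpu_inc_return(depth) == 1)
		outermost();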


Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/tile/include/asm/mmu_context.h
===================================================================
--- linux.orig/arch/tile/include/asm/mmu_context.h	2013-12-18 13:40:08.322514483 -0600
+++ linux/arch/tile/include/asm/mmu_context.h	2013-12-18 13:40:08.322514483 -0600
@@ -84,7 +84,7 @@ static inline void enter_lazy_tlb(struct
 	 * clear any pending DMA interrupts.
 	 */
 	if (current->thread.tile_dma_state.enabled)
-		install_page_table(mm->pgd, __get_cpu_var(current_asid));
+		install_page_table(mm->pgd, __this_cpu_read(current_asid));
 #endif
 }
 
@@ -96,12 +96,12 @@ static inline void switch_mm(struct mm_s
 		int cpu = smp_processor_id();
 
 		/* Pick new ASID. */
-		int asid = __get_cpu_var(current_asid) + 1;
+		int asid = __this_cpu_read(current_asid) + 1;
 		if (asid > max_asid) {
 			asid = min_asid;
 			local_flush_tlb();
 		}
-		__get_cpu_var(current_asid) = asid;
+		__this_cpu_write(current_asid, asid);
 
 		/* Clear cpu from the old mm, and set it in the new one. */
 		cpumask_clear_cpu(cpu, mm_cpumask(prev));
Index: linux/arch/tile/kernel/irq.c
===================================================================
--- linux.orig/arch/tile/kernel/irq.c	2013-12-18 13:40:08.322514483 -0600
+++ linux/arch/tile/kernel/irq.c	2013-12-18 13:40:08.322514483 -0600
@@ -79,7 +79,7 @@ static DEFINE_SPINLOCK(available_irqs_lo
  */
 void tile_dev_intr(struct pt_regs *regs, int intnum)
 {
-	int depth = __get_cpu_var(irq_depth)++;
+	int depth = __this_cpu_inc_return(irq_depth);
 	unsigned long original_irqs;
 	unsigned long remaining_irqs;
 	struct pt_regs *old_regs;
@@ -126,7 +126,7 @@ void tile_dev_intr(struct pt_regs *regs,
 
 		/* Count device irqs; Linux IPIs are counted elsewhere. */
 		if (irq != IRQ_RESCHEDULE)
-			__get_cpu_var(irq_stat).irq_dev_intr_count++;
+			__this_cpu_inc(irq_stat.irq_dev_intr_count);
 
 		generic_handle_irq(irq);
 	}
@@ -136,10 +136,10 @@ void tile_dev_intr(struct pt_regs *regs,
 	 * including any that were reenabled during interrupt
 	 * handling.
 	 */
-	if (depth == 0)
-		unmask_irqs(~__get_cpu_var(irq_disable_mask));
+	if (depth == 1)
+		unmask_irqs(~__this_cpu_read(irq_disable_mask));
 
-	__get_cpu_var(irq_depth)--;
+	__this_cpu_dec(irq_depth);
 
 	/*
 	 * Track time spent against the current process again and
@@ -157,7 +157,7 @@ void tile_dev_intr(struct pt_regs *regs,
 static void tile_irq_chip_enable(struct irq_data *d)
 {
 	get_cpu_var(irq_disable_mask) &= ~(1UL << d->irq);
-	if (__get_cpu_var(irq_depth) == 0)
+	if (__this_cpu_read(irq_depth) == 0)
 		unmask_irqs(1UL << d->irq);
 	put_cpu_var(irq_disable_mask);
 }
@@ -203,7 +203,7 @@ static void tile_irq_chip_ack(struct irq
  */
 static void tile_irq_chip_eoi(struct irq_data *d)
 {
-	if (!(__get_cpu_var(irq_disable_mask) & (1UL << d->irq)))
+	if (!(__this_cpu_read(irq_disable_mask) & (1UL << d->irq)))
 		unmask_irqs(1UL << d->irq);
 }
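
Note that the irq_depth conversion above is not purely mechanical: the
post-increment form yields the value of irq_depth before the increment,
while __this_cpu_inc_return() yields the value after the increment, which
is why the outermost-nesting test changes from depth == 0 to depth == 1:

	int depth = __get_cpu_var(irq_depth)++;		/* value before increment */
	int depth = __this_cpu_inc_return(irq_depth);	/* value after increment */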
 
Index: linux/arch/tile/kernel/messaging.c
===================================================================
--- linux.orig/arch/tile/kernel/messaging.c	2013-12-18 13:40:08.322514483 -0600
+++ linux/arch/tile/kernel/messaging.c	2013-12-18 13:40:08.322514483 -0600
@@ -28,7 +28,7 @@ static DEFINE_PER_CPU(HV_MsgState, msg_s
 void init_messaging(void)
 {
 	/* Allocate storage for messages in kernel space */
-	HV_MsgState *state = &__get_cpu_var(msg_state);
+	HV_MsgState *state = this_cpu_ptr(&msg_state);
 	int rc = hv_register_message_state(state);
 	if (rc != HV_OK)
 		panic("hv_register_message_state: error %d", rc);
@@ -68,7 +68,7 @@ void hv_message_intr(struct pt_regs *reg
 #endif
 
 	while (1) {
-		rmi = hv_receive_message(__get_cpu_var(msg_state),
+		rmi = hv_receive_message(__this_cpu_read(msg_state),
 					 (HV_VirtAddr) message,
 					 sizeof(message));
 		if (rmi.msglen == 0)
@@ -96,7 +96,7 @@ void hv_message_intr(struct pt_regs *reg
 			struct hv_driver_cb *cb =
 				(struct hv_driver_cb *)him->intarg;
 			cb->callback(cb, him->intdata);
-			__get_cpu_var(irq_stat).irq_hv_msg_count++;
+			__this_cpu_inc(irq_stat.irq_hv_msg_count);
 		}
 	}
 
Index: linux/arch/tile/kernel/process.c
===================================================================
--- linux.orig/arch/tile/kernel/process.c	2013-12-18 13:40:08.322514483 -0600
+++ linux/arch/tile/kernel/process.c	2013-12-18 13:40:08.322514483 -0600
@@ -64,7 +64,7 @@ early_param("idle", idle_setup);
 
 void arch_cpu_idle(void)
 {
-	__get_cpu_var(irq_stat).idle_timestamp = jiffies;
+	__this_cpu_write(irq_stat.idle_timestamp, jiffies);
 	_cpu_idle();
 }
 
Index: linux/arch/tile/kernel/setup.c
===================================================================
--- linux.orig/arch/tile/kernel/setup.c	2013-12-18 13:40:08.322514483 -0600
+++ linux/arch/tile/kernel/setup.c	2013-12-18 13:40:08.322514483 -0600
@@ -1220,7 +1220,8 @@ static void __init validate_hv(void)
 	 * various asid variables to their appropriate initial states.
 	 */
 	asid_range = hv_inquire_asid(0);
-	__get_cpu_var(current_asid) = min_asid = asid_range.start;
+	min_asid = asid_range.start;
+	__this_cpu_write(current_asid, min_asid);
 	max_asid = asid_range.start + asid_range.size - 1;
 
 	if (hv_confstr(HV_CONFSTR_CHIP_MODEL, (HV_VirtAddr)chip_model,
Index: linux/arch/tile/kernel/single_step.c
===================================================================
--- linux.orig/arch/tile/kernel/single_step.c	2013-12-18 13:40:08.322514483 -0600
+++ linux/arch/tile/kernel/single_step.c	2013-12-18 13:40:08.322514483 -0600
@@ -740,7 +740,7 @@ static DEFINE_PER_CPU(unsigned long, ss_
 
 void gx_singlestep_handle(struct pt_regs *regs, int fault_num)
 {
-	unsigned long *ss_pc = &__get_cpu_var(ss_saved_pc);
+	unsigned long *ss_pc = this_cpu_ptr(&ss_saved_pc);
 	struct thread_info *info = (void *)current_thread_info();
 	int is_single_step = test_ti_thread_flag(info, TIF_SINGLESTEP);
 	unsigned long control = __insn_mfspr(SPR_SINGLE_STEP_CONTROL_K);
@@ -766,7 +766,7 @@ void gx_singlestep_handle(struct pt_regs
 
 void single_step_once(struct pt_regs *regs)
 {
-	unsigned long *ss_pc = &__get_cpu_var(ss_saved_pc);
+	unsigned long *ss_pc = this_cpu_ptr(&ss_saved_pc);
 	unsigned long control = __insn_mfspr(SPR_SINGLE_STEP_CONTROL_K);
 
 	*ss_pc = regs->pc;
Index: linux/arch/tile/kernel/smp.c
===================================================================
--- linux.orig/arch/tile/kernel/smp.c	2013-12-18 13:40:08.322514483 -0600
+++ linux/arch/tile/kernel/smp.c	2013-12-18 13:40:08.322514483 -0600
@@ -188,7 +188,7 @@ void flush_icache_range(unsigned long st
 /* Called when smp_send_reschedule() triggers IRQ_RESCHEDULE. */
 static irqreturn_t handle_reschedule_ipi(int irq, void *token)
 {
-	__get_cpu_var(irq_stat).irq_resched_count++;
+	__this_cpu_inc(irq_stat.irq_resched_count);
 	scheduler_ipi();
 
 	return IRQ_HANDLED;
Index: linux/arch/tile/kernel/smpboot.c
===================================================================
--- linux.orig/arch/tile/kernel/smpboot.c	2013-12-18 13:40:08.322514483 -0600
+++ linux/arch/tile/kernel/smpboot.c	2013-12-18 13:40:08.322514483 -0600
@@ -41,7 +41,7 @@ void __init smp_prepare_boot_cpu(void)
 	int cpu = smp_processor_id();
 	set_cpu_online(cpu, 1);
 	set_cpu_present(cpu, 1);
-	__get_cpu_var(cpu_state) = CPU_ONLINE;
+	__this_cpu_write(cpu_state, CPU_ONLINE);
 
 	init_messaging();
 }
@@ -158,7 +158,7 @@ static void start_secondary(void)
 	/* printk(KERN_DEBUG "Initializing CPU#%d\n", cpuid); */
 
 	/* Initialize the current asid for our first page table. */
-	__get_cpu_var(current_asid) = min_asid;
+	__this_cpu_write(current_asid, min_asid);
 
 	/* Set up this thread as another owner of the init_mm */
 	atomic_inc(&init_mm.mm_count);
@@ -201,7 +201,7 @@ void online_secondary(void)
 	notify_cpu_starting(smp_processor_id());
 
 	set_cpu_online(smp_processor_id(), 1);
-	__get_cpu_var(cpu_state) = CPU_ONLINE;
+	__this_cpu_write(cpu_state, CPU_ONLINE);
 
 	/* Set up tile-specific state for this cpu. */
 	setup_cpu(0);
Index: linux/arch/tile/kernel/time.c
===================================================================
--- linux.orig/arch/tile/kernel/time.c	2013-12-18 13:40:08.322514483 -0600
+++ linux/arch/tile/kernel/time.c	2013-12-18 13:40:08.322514483 -0600
@@ -162,7 +162,7 @@ static DEFINE_PER_CPU(struct clock_event
 
 void setup_tile_timer(void)
 {
-	struct clock_event_device *evt = &__get_cpu_var(tile_timer);
+	struct clock_event_device *evt = this_cpu_ptr(&tile_timer);
 
 	/* Fill in fields that are speed-specific. */
 	clockevents_calc_mult_shift(evt, cycles_per_sec, TILE_MINSEC);
@@ -182,7 +182,7 @@ void setup_tile_timer(void)
 void do_timer_interrupt(struct pt_regs *regs, int fault_num)
 {
 	struct pt_regs *old_regs = set_irq_regs(regs);
-	struct clock_event_device *evt = &__get_cpu_var(tile_timer);
+	struct clock_event_device *evt = this_cpu_ptr(&tile_timer);
 
 	/*
 	 * Mask the timer interrupt here, since we are a oneshot timer
@@ -194,7 +194,7 @@ void do_timer_interrupt(struct pt_regs *
 	irq_enter();
 
 	/* Track interrupt count. */
-	__get_cpu_var(irq_stat).irq_timer_count++;
+	__this_cpu_inc(irq_stat.irq_timer_count);
 
 	/* Call the generic timer handler */
 	evt->event_handler(evt);
@@ -235,7 +235,7 @@ cycles_t ns2cycles(unsigned long nsecs)
 	 * We do not have to disable preemption here as each core has the same
 	 * clock frequency.
 	 */
-	struct clock_event_device *dev = &__raw_get_cpu_var(tile_timer);
+	struct clock_event_device *dev = raw_cpu_ptr(&tile_timer);
 	return ((u64)nsecs * dev->mult) >> dev->shift;
 }
 
Index: linux/arch/tile/mm/highmem.c
===================================================================
--- linux.orig/arch/tile/mm/highmem.c	2013-12-18 13:40:08.322514483 -0600
+++ linux/arch/tile/mm/highmem.c	2013-12-18 13:40:08.322514483 -0600
@@ -103,7 +103,7 @@ static void kmap_atomic_register(struct
 	spin_lock(&amp_lock);
 
 	/* With interrupts disabled, now fill in the per-cpu info. */
-	amp = &__get_cpu_var(amps).per_type[type];
+	amp = this_cpu_ptr(&amps.per_type[type]);
 	amp->page = page;
 	amp->cpu = smp_processor_id();
 	amp->va = va;
Index: linux/arch/tile/mm/init.c
===================================================================
--- linux.orig/arch/tile/mm/init.c	2013-12-18 13:40:08.322514483 -0600
+++ linux/arch/tile/mm/init.c	2013-12-18 13:40:08.322514483 -0600
@@ -593,14 +593,14 @@ static void __init kernel_physical_mappi
 	interrupt_mask_set_mask(-1ULL);
 	rc = flush_and_install_context(__pa(pgtables),
 				       init_pgprot((unsigned long)pgtables),
-				       __get_cpu_var(current_asid),
+				       __this_cpu_read(current_asid),
 				       cpumask_bits(my_cpu_mask));
 	interrupt_mask_restore_mask(irqmask);
 	BUG_ON(rc != 0);
 
 	/* Copy the page table back to the normal swapper_pg_dir. */
 	memcpy(pgd_base, pgtables, sizeof(pgtables));
-	__install_page_table(pgd_base, __get_cpu_var(current_asid),
+	__install_page_table(pgd_base, __this_cpu_read(current_asid),
 			     swapper_pgprot);
 
 	/*
Index: linux/arch/tile/include/asm/irqflags.h
===================================================================
--- linux.orig/arch/tile/include/asm/irqflags.h	2013-12-18 13:37:21.000000000 -0600
+++ linux/arch/tile/include/asm/irqflags.h	2013-12-18 13:40:24.781984237 -0600
@@ -140,12 +140,12 @@ extern unsigned int debug_smp_processor_
 
 /*
  * Read the set of maskable interrupts.
- * We avoid the preemption warning here via __this_cpu_ptr since even
+ * We avoid the preemption warning here via raw_cpu_ptr since even
  * if irqs are already enabled, it's harmless to read the wrong cpu's
  * enabled mask.
  */
 #define arch_local_irqs_enabled() \
-	(*__this_cpu_ptr(&interrupts_enabled_mask))
+	(*raw_cpu_ptr(&interrupts_enabled_mask))
 
 /* Re-enable all maskable interrupts. */
 #define arch_local_irq_enable() \


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 43/48] blackfin: Replace __get_cpu_var uses
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (41 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 42/48] tile: " Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 44/48] avr32: Replace __get_cpu_var with __this_cpu_write Christoph Lameter
                   ` (5 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Mike Frysinger

[-- Attachment #1: this_blackfin --]
[-- Type: text/plain, Size: 6426 bytes --]

[Patch depends on another patch in this series that introduces raw_cpu_ops]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processor's percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as:


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() only ever performs an address calculation. However, store
and retrieve operations could use a segment prefix (or a global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Address calculations are thereby avoided and fewer
registers are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout, specialized macros can be defined in non-x86 arches as
well in order to optimize per cpu access, e.g. by using a global register
that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processor's instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)


CC: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/blackfin/include/asm/ipipe.h
===================================================================
--- linux.orig/arch/blackfin/include/asm/ipipe.h	2014-02-03 13:35:59.579317079 -0600
+++ linux/arch/blackfin/include/asm/ipipe.h	2014-02-03 13:35:59.569317289 -0600
@@ -157,7 +157,7 @@
 }
 
 #define __ipipe_do_root_xirq(ipd, irq)					\
-	((ipd)->irqs[irq].handler(irq, &__raw_get_cpu_var(__ipipe_tick_regs)))
+	((ipd)->irqs[irq].handler(irq, raw_cpu_ptr(&__ipipe_tick_regs)))
 
 #define __ipipe_run_irqtail(irq)  /* Must be a macro */			\
 	do {								\
Index: linux/arch/blackfin/kernel/perf_event.c
===================================================================
--- linux.orig/arch/blackfin/kernel/perf_event.c	2014-02-03 13:35:59.579317079 -0600
+++ linux/arch/blackfin/kernel/perf_event.c	2014-02-03 13:35:59.569317289 -0600
@@ -300,7 +300,7 @@
 
 static void bfin_pmu_stop(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
@@ -318,7 +318,7 @@
 
 static void bfin_pmu_start(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
@@ -335,7 +335,7 @@
 
 static void bfin_pmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	bfin_pmu_stop(event, PERF_EF_UPDATE);
 	__clear_bit(event->hw.idx, cpuc->used_mask);
@@ -345,7 +345,7 @@
 
 static int bfin_pmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 	int ret = -EAGAIN;
@@ -429,7 +429,7 @@
 
 static void bfin_pmu_enable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_event *event;
 	struct hw_perf_event *hwc;
 	int i;
Index: linux/arch/blackfin/mach-common/ints-priority.c
===================================================================
--- linux.orig/arch/blackfin/mach-common/ints-priority.c	2014-02-03 13:35:59.579317079 -0600
+++ linux/arch/blackfin/mach-common/ints-priority.c	2014-02-03 13:35:59.569317289 -0600
@@ -1311,12 +1311,12 @@
 		bfin_write_TIMER_STATUS(1); /* Latch TIMIL0 */
 #endif
 		/* This is basically what we need from the register frame. */
-		__raw_get_cpu_var(__ipipe_tick_regs).ipend = regs->ipend;
-		__raw_get_cpu_var(__ipipe_tick_regs).pc = regs->pc;
+		__this_cpu_write(__ipipe_tick_regs.ipend, regs->ipend);
+		__this_cpu_write(__ipipe_tick_regs.pc, regs->pc);
 		if (this_domain != ipipe_root_domain)
-			__raw_get_cpu_var(__ipipe_tick_regs).ipend &= ~0x10;
+			__this_cpu_and(__ipipe_tick_regs.ipend, ~0x10);
 		else
-			__raw_get_cpu_var(__ipipe_tick_regs).ipend |= 0x10;
+			__this_cpu_or(__ipipe_tick_regs.ipend, 0x10);
 	}
 
 	/*
Index: linux/arch/blackfin/mach-common/smp.c
===================================================================
--- linux.orig/arch/blackfin/mach-common/smp.c	2014-02-03 13:35:59.579317079 -0600
+++ linux/arch/blackfin/mach-common/smp.c	2014-02-03 13:35:59.569317289 -0600
@@ -146,7 +146,7 @@
 	platform_clear_ipi(cpu, IRQ_SUPPLE_1);
 
 	smp_rmb();
-	bfin_ipi_data = &__get_cpu_var(bfin_ipi);
+	bfin_ipi_data = this_cpu_ptr(&bfin_ipi);
 	while ((pending = atomic_xchg(&bfin_ipi_data->bits, 0)) != 0) {
 		msg = 0;
 		do {


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 44/48] avr32: Replace __get_cpu_var with __this_cpu_write
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (42 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 43/48] blackfin: " Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 45/48] alpha: Replace __get_cpu_var Christoph Lameter
                   ` (4 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Haavard Skinnemoen, Hans-Christian Egtvedt

[-- Attachment #1: this_avr32 --]
[-- Type: text/plain, Size: 754 bytes --]

Replace the single use of __get_cpu_var in avr32 with
__this_cpu_write.

Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/avr32/kernel/kprobes.c
===================================================================
--- linux.orig/arch/avr32/kernel/kprobes.c	2013-12-02 16:08:00.844326498 -0600
+++ linux/arch/avr32/kernel/kprobes.c	2013-12-02 16:08:00.834326779 -0600
@@ -104,7 +104,7 @@ static void __kprobes resume_execution(s
 
 static void __kprobes set_current_kprobe(struct kprobe *p)
 {
-	__get_cpu_var(current_kprobe) = p;
+	__this_cpu_write(current_kprobe, p);
 }
 
 static int __kprobes kprobe_handler(struct pt_regs *regs)


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 45/48] alpha: Replace __get_cpu_var
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (43 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 44/48] avr32: Replace __get_cpu_var with __this_cpu_write Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19   ` Christoph Lameter
                   ` (3 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Ivan Kokshaysky, Matt Turner, Richard Henderson

[-- Attachment #1: this_alpha --]
[-- Type: text/plain, Size: 6057 bytes --]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processor's percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as:


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() only ever performs an address calculation. However, store
and retrieve operations could use a segment prefix (or a global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Address calculations are thereby avoided and fewer
registers are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout, specialized macros can be defined in non-x86 arches as
well in order to optimize per cpu access, e.g. by using a global register
that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processor's instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)

CC: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/alpha/kernel/perf_event.c
===================================================================
--- linux.orig/arch/alpha/kernel/perf_event.c	2013-12-02 16:08:01.194316776 -0600
+++ linux/arch/alpha/kernel/perf_event.c	2013-12-02 16:08:01.194316776 -0600
@@ -431,7 +431,7 @@ static void maybe_change_configuration(s
  */
 static int alpha_pmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int n0;
 	int ret;
@@ -483,7 +483,7 @@ static int alpha_pmu_add(struct perf_eve
  */
 static void alpha_pmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned long irq_flags;
 	int j;
@@ -531,7 +531,7 @@ static void alpha_pmu_read(struct perf_e
 static void alpha_pmu_stop(struct perf_event *event, int flags)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (!(hwc->state & PERF_HES_STOPPED)) {
 		cpuc->idx_mask &= ~(1UL<<hwc->idx);
@@ -551,7 +551,7 @@ static void alpha_pmu_stop(struct perf_e
 static void alpha_pmu_start(struct perf_event *event, int flags)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED)))
 		return;
@@ -724,7 +724,7 @@ static int alpha_pmu_event_init(struct p
  */
 static void alpha_pmu_enable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (cpuc->enabled)
 		return;
@@ -750,7 +750,7 @@ static void alpha_pmu_enable(struct pmu
 
 static void alpha_pmu_disable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (!cpuc->enabled)
 		return;
@@ -815,7 +815,7 @@ static void alpha_perf_event_irq_handler
 	int idx, j;
 
 	__this_cpu_inc(irq_pmi_count);
-	cpuc = &__get_cpu_var(cpu_hw_events);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	/* Completely counting through the PMC's period to trigger a new PMC
 	 * overflow interrupt while in this interrupt routine is utterly
Index: linux/arch/alpha/kernel/time.c
===================================================================
--- linux.orig/arch/alpha/kernel/time.c	2013-12-02 16:08:01.194316776 -0600
+++ linux/arch/alpha/kernel/time.c	2013-12-02 16:08:01.194316776 -0600
@@ -56,9 +56,9 @@ unsigned long est_cycle_freq;
 
 DEFINE_PER_CPU(u8, irq_work_pending);
 
-#define set_irq_work_pending_flag()  __get_cpu_var(irq_work_pending) = 1
-#define test_irq_work_pending()      __get_cpu_var(irq_work_pending)
-#define clear_irq_work_pending()     __get_cpu_var(irq_work_pending) = 0
+#define set_irq_work_pending_flag()  __this_cpu_write(irq_work_pending, 1)
+#define test_irq_work_pending()      __this_cpu_read(irq_work_pending)
+#define clear_irq_work_pending()     __this_cpu_write(irq_work_pending, 0)
 
 void arch_irq_work_raise(void)
 {


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 46/48] sh: Replace __get_cpu_var uses
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
@ 2014-02-14 20:19   ` Christoph Lameter
  2014-02-14 20:18   ` Christoph Lameter
                     ` (47 subsequent siblings)
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Paul Mundt, linux-sh

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processor's percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as:


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() only ever performs an address calculation. However, store
and retrieve operations could use a segment prefix (or a global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Address calculations are thereby avoided and fewer
registers are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout, specialized macros can be defined in non-x86 arches as
well in order to optimize per cpu access, e.g. by using a global register
that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processor's instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)

Cc: Paul Mundt <lethal@linux-sh.org>
CC: linux-sh@vger.kernel.org
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/sh/kernel/hw_breakpoint.c
===================================================================
--- linux.orig/arch/sh/kernel/hw_breakpoint.c	2013-12-02 16:08:01.534307329 -0600
+++ linux/arch/sh/kernel/hw_breakpoint.c	2013-12-02 16:08:01.524307605 -0600
@@ -52,7 +52,7 @@ int arch_install_hw_breakpoint(struct pe
 	int i;
 
 	for (i = 0; i < sh_ubc->num_events; i++) {
-		struct perf_event **slot = &__get_cpu_var(bp_per_reg[i]);
+		struct perf_event **slot = this_cpu_ptr(&bp_per_reg[i]);
 
 		if (!*slot) {
 			*slot = bp;
@@ -84,7 +84,7 @@ void arch_uninstall_hw_breakpoint(struct
 	int i;
 
 	for (i = 0; i < sh_ubc->num_events; i++) {
-		struct perf_event **slot = &__get_cpu_var(bp_per_reg[i]);
+		struct perf_event **slot = this_cpu_ptr(&bp_per_reg[i]);
 
 		if (*slot == bp) {
 			*slot = NULL;
Index: linux/arch/sh/kernel/kprobes.c
===================================================================
--- linux.orig/arch/sh/kernel/kprobes.c	2013-12-02 16:08:01.534307329 -0600
+++ linux/arch/sh/kernel/kprobes.c	2013-12-02 16:08:01.524307605 -0600
@@ -102,7 +102,7 @@ int __kprobes kprobe_handle_illslot(unsi
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)
 {
-	struct kprobe *saved = &__get_cpu_var(saved_next_opcode);
+	struct kprobe *saved = this_cpu_ptr(&saved_next_opcode);
 
 	if (saved->addr) {
 		arch_disarm_kprobe(p);
@@ -111,7 +111,7 @@ void __kprobes arch_remove_kprobe(struct
 		saved->addr = NULL;
 		saved->opcode = 0;
 
-		saved = &__get_cpu_var(saved_next_opcode2);
+		saved = this_cpu_ptr(&saved_next_opcode2);
 		if (saved->addr) {
 			arch_disarm_kprobe(saved);
 
@@ -129,14 +129,14 @@ static void __kprobes save_previous_kpro
 
 static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = kcb->prev_kprobe.kp;
+	__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
 	kcb->kprobe_status = kcb->prev_kprobe.status;
 }
 
 static void __kprobes set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
 					 struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = p;
+	__this_cpu_write(current_kprobe, p);
 }
 
 /*
@@ -146,15 +146,15 @@ static void __kprobes set_current_kprobe
  */
 static void __kprobes prepare_singlestep(struct kprobe *p, struct pt_regs *regs)
 {
-	__get_cpu_var(saved_current_opcode).addr = (kprobe_opcode_t *)regs->pc;
+	__this_cpu_write(saved_current_opcode.addr, (kprobe_opcode_t *)regs->pc);
 
 	if (p != NULL) {
 		struct kprobe *op1, *op2;
 
 		arch_disarm_kprobe(p);
 
-		op1 = &__get_cpu_var(saved_next_opcode);
-		op2 = &__get_cpu_var(saved_next_opcode2);
+		op1 = this_cpu_ptr(&saved_next_opcode);
+		op2 = this_cpu_ptr(&saved_next_opcode2);
 
 		if (OPCODE_JSR(p->opcode) || OPCODE_JMP(p->opcode)) {
 			unsigned int reg_nr = ((p->opcode >> 8) & 0x000F);
@@ -249,7 +249,7 @@ static int __kprobes kprobe_handler(stru
 			kcb->kprobe_status = KPROBE_REENTER;
 			return 1;
 		} else {
-			p = __get_cpu_var(current_kprobe);
+			p = __this_cpu_read(current_kprobe);
 			if (p->break_handler && p->break_handler(p, regs)) {
 				goto ss_probe;
 			}
@@ -336,9 +336,9 @@ int __kprobes trampoline_probe_handler(s
 			continue;
 
 		if (ri->rp && ri->rp->handler) {
-			__get_cpu_var(current_kprobe) = &ri->rp->kp;
+			__this_cpu_write(current_kprobe, &ri->rp->kp);
 			ri->rp->handler(ri, regs);
-			__get_cpu_var(current_kprobe) = NULL;
+			__this_cpu_write(current_kprobe, NULL);
 		}
 
 		orig_ret_address = (unsigned long)ri->ret_addr;
@@ -383,19 +383,19 @@ static int __kprobes post_kprobe_handler
 		cur->post_handler(cur, regs, 0);
 	}
 
-	p = &__get_cpu_var(saved_next_opcode);
+	p = this_cpu_ptr(&saved_next_opcode);
 	if (p->addr) {
 		arch_disarm_kprobe(p);
 		p->addr = NULL;
 		p->opcode = 0;
 
-		addr = __get_cpu_var(saved_current_opcode).addr;
-		__get_cpu_var(saved_current_opcode).addr = NULL;
+		addr = __this_cpu_read(saved_current_opcode).addr;
+		__this_cpu_write(saved_current_opcode.addr, NULL);
 
 		p = get_kprobe(addr);
 		arch_arm_kprobe(p);
 
-		p = &__get_cpu_var(saved_next_opcode2);
+		p = this_cpu_ptr(&saved_next_opcode2);
 		if (p->addr) {
 			arch_disarm_kprobe(p);
 			p->addr = NULL;
@@ -511,7 +511,7 @@ int __kprobes kprobe_exceptions_notify(s
 				if (kprobe_handler(args->regs)) {
 					ret = NOTIFY_STOP;
 				} else {
-					p = __get_cpu_var(current_kprobe);
+					p = __this_cpu_read(current_kprobe);
 					if (p->break_handler &&
 					    p->break_handler(p, args->regs))
 						ret = NOTIFY_STOP;
Index: linux/arch/sh/kernel/localtimer.c
===================================================================
--- linux.orig/arch/sh/kernel/localtimer.c	2013-12-02 16:08:01.534307329 -0600
+++ linux/arch/sh/kernel/localtimer.c	2013-12-02 16:08:01.524307605 -0600
@@ -32,7 +32,7 @@ static DEFINE_PER_CPU(struct clock_event
  */
 void local_timer_interrupt(void)
 {
-	struct clock_event_device *clk = &__get_cpu_var(local_clockevent);
+	struct clock_event_device *clk = this_cpu_ptr(&local_clockevent);
 
 	irq_enter();
 	clk->event_handler(clk);
Index: linux/arch/sh/kernel/perf_event.c
===================================================================
--- linux.orig/arch/sh/kernel/perf_event.c	2013-12-02 16:08:01.534307329 -0600
+++ linux/arch/sh/kernel/perf_event.c	2013-12-02 16:08:01.524307605 -0600
@@ -227,7 +227,7 @@ again:
 
 static void sh_pmu_stop(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
@@ -245,7 +245,7 @@ static void sh_pmu_stop(struct perf_even
 
 static void sh_pmu_start(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
@@ -262,7 +262,7 @@ static void sh_pmu_start(struct perf_eve
 
 static void sh_pmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	sh_pmu_stop(event, PERF_EF_UPDATE);
 	__clear_bit(event->hw.idx, cpuc->used_mask);
@@ -272,7 +272,7 @@ static void sh_pmu_del(struct perf_event
 
 static int sh_pmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 	int ret = -EAGAIN;
Index: linux/arch/sh/kernel/smp.c
===================================================================
--- linux.orig/arch/sh/kernel/smp.c	2013-12-02 16:08:01.534307329 -0600
+++ linux/arch/sh/kernel/smp.c	2013-12-02 16:08:01.524307605 -0600
@@ -111,7 +111,7 @@ void play_dead_common(void)
 	irq_ctx_exit(raw_smp_processor_id());
 	mb();
 
-	__get_cpu_var(cpu_state) = CPU_DEAD;
+	__this_cpu_write(cpu_state, CPU_DEAD);
 	local_irq_disable();
 }
 


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 46/48] sh: Replace __get_cpu_var uses
@ 2014-02-14 20:19   ` Christoph Lameter
  0 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Paul Mundt, linux-sh

[-- Attachment #1: this_sh --]
[-- Type: text/plain, Size: 9644 bytes --]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processor's percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as:


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() only ever performs an address calculation. However, store
and retrieve operations could use a segment prefix (or a global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Address calculations are thereby avoided and fewer
registers are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout, specialized macros can be defined in non-x86 arches as
well in order to optimize per cpu access, e.g. by using a global register
that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processor's instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)

Cc: Paul Mundt <lethal@linux-sh.org>
CC: linux-sh@vger.kernel.org
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/sh/kernel/hw_breakpoint.c
===================================================================
--- linux.orig/arch/sh/kernel/hw_breakpoint.c	2013-12-02 16:08:01.534307329 -0600
+++ linux/arch/sh/kernel/hw_breakpoint.c	2013-12-02 16:08:01.524307605 -0600
@@ -52,7 +52,7 @@ int arch_install_hw_breakpoint(struct pe
 	int i;
 
 	for (i = 0; i < sh_ubc->num_events; i++) {
-		struct perf_event **slot = &__get_cpu_var(bp_per_reg[i]);
+		struct perf_event **slot = this_cpu_ptr(&bp_per_reg[i]);
 
 		if (!*slot) {
 			*slot = bp;
@@ -84,7 +84,7 @@ void arch_uninstall_hw_breakpoint(struct
 	int i;
 
 	for (i = 0; i < sh_ubc->num_events; i++) {
-		struct perf_event **slot = &__get_cpu_var(bp_per_reg[i]);
+		struct perf_event **slot = this_cpu_ptr(&bp_per_reg[i]);
 
 		if (*slot == bp) {
 			*slot = NULL;
Index: linux/arch/sh/kernel/kprobes.c
===================================================================
--- linux.orig/arch/sh/kernel/kprobes.c	2013-12-02 16:08:01.534307329 -0600
+++ linux/arch/sh/kernel/kprobes.c	2013-12-02 16:08:01.524307605 -0600
@@ -102,7 +102,7 @@ int __kprobes kprobe_handle_illslot(unsi
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)
 {
-	struct kprobe *saved = &__get_cpu_var(saved_next_opcode);
+	struct kprobe *saved = this_cpu_ptr(&saved_next_opcode);
 
 	if (saved->addr) {
 		arch_disarm_kprobe(p);
@@ -111,7 +111,7 @@ void __kprobes arch_remove_kprobe(struct
 		saved->addr = NULL;
 		saved->opcode = 0;
 
-		saved = &__get_cpu_var(saved_next_opcode2);
+		saved = this_cpu_ptr(&saved_next_opcode2);
 		if (saved->addr) {
 			arch_disarm_kprobe(saved);
 
@@ -129,14 +129,14 @@ static void __kprobes save_previous_kpro
 
 static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = kcb->prev_kprobe.kp;
+	__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
 	kcb->kprobe_status = kcb->prev_kprobe.status;
 }
 
 static void __kprobes set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
 					 struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = p;
+	__this_cpu_write(current_kprobe, p);
 }
 
 /*
@@ -146,15 +146,15 @@ static void __kprobes set_current_kprobe
  */
 static void __kprobes prepare_singlestep(struct kprobe *p, struct pt_regs *regs)
 {
-	__get_cpu_var(saved_current_opcode).addr = (kprobe_opcode_t *)regs->pc;
+	__this_cpu_write(saved_current_opcode.addr, (kprobe_opcode_t *)regs->pc);
 
 	if (p != NULL) {
 		struct kprobe *op1, *op2;
 
 		arch_disarm_kprobe(p);
 
-		op1 = &__get_cpu_var(saved_next_opcode);
-		op2 = &__get_cpu_var(saved_next_opcode2);
+		op1 = this_cpu_ptr(&saved_next_opcode);
+		op2 = this_cpu_ptr(&saved_next_opcode2);
 
 		if (OPCODE_JSR(p->opcode) || OPCODE_JMP(p->opcode)) {
 			unsigned int reg_nr = ((p->opcode >> 8) & 0x000F);
@@ -249,7 +249,7 @@ static int __kprobes kprobe_handler(stru
 			kcb->kprobe_status = KPROBE_REENTER;
 			return 1;
 		} else {
-			p = __get_cpu_var(current_kprobe);
+			p = __this_cpu_read(current_kprobe);
 			if (p->break_handler && p->break_handler(p, regs)) {
 				goto ss_probe;
 			}
@@ -336,9 +336,9 @@ int __kprobes trampoline_probe_handler(s
 			continue;
 
 		if (ri->rp && ri->rp->handler) {
-			__get_cpu_var(current_kprobe) = &ri->rp->kp;
+			__this_cpu_write(current_kprobe, &ri->rp->kp);
 			ri->rp->handler(ri, regs);
-			__get_cpu_var(current_kprobe) = NULL;
+			__this_cpu_write(current_kprobe, NULL);
 		}
 
 		orig_ret_address = (unsigned long)ri->ret_addr;
@@ -383,19 +383,19 @@ static int __kprobes post_kprobe_handler
 		cur->post_handler(cur, regs, 0);
 	}
 
-	p = &__get_cpu_var(saved_next_opcode);
+	p = this_cpu_ptr(&saved_next_opcode);
 	if (p->addr) {
 		arch_disarm_kprobe(p);
 		p->addr = NULL;
 		p->opcode = 0;
 
-		addr = __get_cpu_var(saved_current_opcode).addr;
-		__get_cpu_var(saved_current_opcode).addr = NULL;
+		addr = __this_cpu_read(saved_current_opcode).addr;
+		__this_cpu_write(saved_current_opcode.addr, NULL);
 
 		p = get_kprobe(addr);
 		arch_arm_kprobe(p);
 
-		p = &__get_cpu_var(saved_next_opcode2);
+		p = this_cpu_ptr(&saved_next_opcode2);
 		if (p->addr) {
 			arch_disarm_kprobe(p);
 			p->addr = NULL;
@@ -511,7 +511,7 @@ int __kprobes kprobe_exceptions_notify(s
 				if (kprobe_handler(args->regs)) {
 					ret = NOTIFY_STOP;
 				} else {
-					p = __get_cpu_var(current_kprobe);
+					p = __this_cpu_read(current_kprobe);
 					if (p->break_handler &&
 					    p->break_handler(p, args->regs))
 						ret = NOTIFY_STOP;
Index: linux/arch/sh/kernel/localtimer.c
===================================================================
--- linux.orig/arch/sh/kernel/localtimer.c	2013-12-02 16:08:01.534307329 -0600
+++ linux/arch/sh/kernel/localtimer.c	2013-12-02 16:08:01.524307605 -0600
@@ -32,7 +32,7 @@ static DEFINE_PER_CPU(struct clock_event
  */
 void local_timer_interrupt(void)
 {
-	struct clock_event_device *clk = &__get_cpu_var(local_clockevent);
+	struct clock_event_device *clk = this_cpu_ptr(&local_clockevent);
 
 	irq_enter();
 	clk->event_handler(clk);
Index: linux/arch/sh/kernel/perf_event.c
===================================================================
--- linux.orig/arch/sh/kernel/perf_event.c	2013-12-02 16:08:01.534307329 -0600
+++ linux/arch/sh/kernel/perf_event.c	2013-12-02 16:08:01.524307605 -0600
@@ -227,7 +227,7 @@ again:
 
 static void sh_pmu_stop(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
@@ -245,7 +245,7 @@ static void sh_pmu_stop(struct perf_even
 
 static void sh_pmu_start(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
@@ -262,7 +262,7 @@ static void sh_pmu_start(struct perf_eve
 
 static void sh_pmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	sh_pmu_stop(event, PERF_EF_UPDATE);
 	__clear_bit(event->hw.idx, cpuc->used_mask);
@@ -272,7 +272,7 @@ static void sh_pmu_del(struct perf_event
 
 static int sh_pmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 	int ret = -EAGAIN;
Index: linux/arch/sh/kernel/smp.c
===================================================================
--- linux.orig/arch/sh/kernel/smp.c	2013-12-02 16:08:01.534307329 -0600
+++ linux/arch/sh/kernel/smp.c	2013-12-02 16:08:01.524307605 -0600
@@ -111,7 +111,7 @@ void play_dead_common(void)
 	irq_ctx_exit(raw_smp_processor_id());
 	mb();
 
-	__get_cpu_var(cpu_state) = CPU_DEAD;
+	__this_cpu_write(cpu_state, CPU_DEAD);
 	local_irq_disable();
 }
 


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 47/48] Remove __get_cpu_var and __raw_get_cpu_var macros [only in 3.16]
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (45 preceding siblings ...)
  2014-02-14 20:19   ` Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-02-14 20:19 ` [PATCH 48/48] percpu: Remove __this_cpu_ptr Christoph Lameter
  2014-03-04 22:27 ` [PATCH 00/48] percpu: Consistent per cpu operations V4 Andrew Morton
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: this_drop_get_cpu_Var --]
[-- Type: text/plain, Size: 1187 bytes --]

No user is left in the kernel source tree. Therefore we can
drop the definitions.

[Patch should not be merged until all the replacement patches have been
merged. Probably this means hold until the 3.16 merge window]

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/include/asm-generic/percpu.h
===================================================================
--- linux.orig/include/asm-generic/percpu.h	2013-12-02 16:08:01.914296770 -0600
+++ linux/include/asm-generic/percpu.h	2013-12-02 16:08:01.914296770 -0600
@@ -65,9 +65,6 @@ extern unsigned long __per_cpu_offset[NR
 #define this_cpu_ptr(ptr) raw_cpu_ptr(ptr)
 #endif
 
-#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
-#define __raw_get_cpu_var(var) (*raw_cpu_ptr(&(var)))
-
 #ifdef CONFIG_HAVE_SETUP_PER_CPU_AREA
 extern void setup_per_cpu_areas(void);
 #endif
@@ -80,8 +77,6 @@ extern void setup_per_cpu_areas(void);
 })
 
 #define per_cpu(var, cpu)	(*((void)(cpu), VERIFY_PERCPU_PTR(&(var))))
-#define __get_cpu_var(var)	(*VERIFY_PERCPU_PTR(&(var)))
-#define __raw_get_cpu_var(var)	(*VERIFY_PERCPU_PTR(&(var)))
 #define this_cpu_ptr(ptr)	per_cpu_ptr(ptr, 0)
 #define raw_cpu_ptr(ptr)	this_cpu_ptr(ptr)
 


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH 48/48] percpu: Remove __this_cpu_ptr
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (46 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 47/48] Remove __get_cpu_var and __raw_get_cpu_var macros [only in 3.16] Christoph Lameter
@ 2014-02-14 20:19 ` Christoph Lameter
  2014-03-04 22:27 ` [PATCH 00/48] percpu: Consistent per cpu operations V4 Andrew Morton
  48 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-02-14 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: __this_cpu_ptr_gone --]
[-- Type: text/plain, Size: 610 bytes --]

The __this_cpu_ptr macro is no longer in use so drop it.

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/include/asm-generic/percpu.h
===================================================================
--- linux.orig/include/asm-generic/percpu.h	2013-12-18 13:41:37.359646058 -0600
+++ linux/include/asm-generic/percpu.h	2013-12-18 13:42:00.428902830 -0600
@@ -117,7 +117,4 @@ extern void setup_per_cpu_areas(void);
 #define PER_CPU_DEF_ATTRIBUTES
 #endif
 
-/* Keep until we have removed all uses of __this_cpu_ptr */
-#define __this_cpu_ptr raw_cpu_ptr
-
 #endif /* _ASM_GENERIC_PERCPU_H_ */


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 39/48] powerpc: Replace __get_cpu_var uses
  2014-02-14 20:19 ` [PATCH 39/48] powerpc: " Christoph Lameter
@ 2014-02-15  3:50   ` Benjamin Herrenschmidt
  2014-02-15  4:26     ` Steven Rostedt
  2014-02-15  9:42     ` Peter Zijlstra
  0 siblings, 2 replies; 87+ messages in thread
From: Benjamin Herrenschmidt @ 2014-02-15  3:50 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Paul Mackerras

For some reason I'm still getting these as attachments instead of inline
in the email, which makes reviewing a major PITA. Christoph, I'm not
sure what you are doing wrong here but you should consider fixing it :-)

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 39/48] powerpc: Replace __get_cpu_var uses
  2014-02-15  3:50   ` Benjamin Herrenschmidt
@ 2014-02-15  4:26     ` Steven Rostedt
  2014-02-15  7:54       ` Mike Galbraith
  2014-02-15  9:59       ` Benjamin Herrenschmidt
  2014-02-15  9:42     ` Peter Zijlstra
  1 sibling, 2 replies; 87+ messages in thread
From: Steven Rostedt @ 2014-02-15  4:26 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Christoph Lameter, Tejun Heo, akpm, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Paul Mackerras

On Sat, 15 Feb 2014 14:50:08 +1100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> For some reason I'm still getting these as attachments instead of inline
> in the email, which makes reviewing a major PITA. Christoph, I'm not
> sure what you are doing wrong here but you should consider fixing it :-)

Get a better email client ;-)

They show up fine for me. Looking at the raw format, could it possibly
be the DKIM-Signature? Here's (most of) the header:

Return-Path: cl@linux.com
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on goliath
X-Spam-Level: 
X-Spam-Status: No, score=-0.3 required=5.0 tests=DKIM_SIGNED,DKIM_VALID,
	LOCAL_SUBJECT_PATCH,RCVD_IN_DNSWL_NONE autolearn=unavailable version=3.3.2
Delivered-To: rostedt@goodmis.org
X-FDA: 68539149672.07.nose31_2242917219f26
X-HE-Tag: nose31_2242917219f26
X-Filterd-Recvd-Size: 34239
Received: from mail.hover.com.cust.hostedemail.com [216.40.42.134]
	by goliath with IMAP (fetchmail-6.3.26)
	for <rostedt@localhost> (single-drop); Fri, 14 Feb 2014 15:23:03 -0500 (EST)
Received: from qmta09.emeryville.ca.mail.comcast.net (qmta09.emeryville.ca.mail.comcast.net [76.96.30.96])
	by imf20.hostedemail.com (Postfix) with ESMTP
	for <rostedt@goodmis.org>; Fri, 14 Feb 2014 20:20:35 +0000 (UTC)
Received: from omta15.emeryville.ca.mail.comcast.net ([76.96.30.71])
	by qmta09.emeryville.ca.mail.comcast.net with comcast
	id SKft1n0011Y3wxoA9LLbGd; Fri, 14 Feb 2014 20:20:35 +0000
Received: from gentwo.org ([98.213.233.247])
	by omta15.emeryville.ca.mail.comcast.net with comcast
	id SLLY1n0095Lw0ES8bLLZFV; Fri, 14 Feb 2014 20:20:34 +0000
Received: by gentwo.org (Postfix, from userid 1001)
	id 2941A68181; Fri, 14 Feb 2014 14:19:08 -0600 (CST)
Message-Id: <20140214201908.082769265@linux.com>
Date: Fri, 14 Feb 2014 14:19:20 -0600
From: Christoph Lameter <cl@linux.com>
To: Tejun Heo <tj@kernel.org>
Cc: akpm@linuxfoundation.org,
 rostedt@goodmis.org,
 linux-kernel@vger.kernel.org,
 Ingo Molnar <mingo@kernel.org>,
 Peter Zijlstra <peterz@infradead.org>,
 Thomas Gleixner <tglx@linutronix.de>,
 Benjamin Herrenschmidt <benh@kernel.crashing.org>,
 Paul Mackerras <paulus@samba.org>
Subject: [PATCH 39/48] powerpc: Replace __get_cpu_var uses
References: <20140214201841.826179349@linux.com>
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline; filename=this_powerpc
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net;
	s=q20121106; t=1392409235;
	bh=3eMNjwtVIqOOdbqtLUxQGKw+WvtREJiNlp1vuKEVub8=;
	h=Received:Received:Received:Message-Id:Date:From:To:Subject:
	 Content-Type;
	b=AtOZLlCPJL9tvUnoSyw0qPmNQym1dGDConwMdjEnayz2rNubuobF/JYdewBUypgbZ
	 3sq/JQoBriYX8NsMHQhzZL4bh+7gM6OPWhGSIczq6MNuzFvw7E9MVTeX+t+yhVXMNG
	 O+sWkLbWkM18luLRPlJU7wD6ANEFrGgQvt7i1GqSrCVZ+qKgXq9brzu9NTsMp4ODzv
	 Djs9LTZEBTjbYxllW+16x+lV/Y6zY9LYQzsB6VEEpjO2J5Z6Wbk12K4puGL0rPPKoW
	 rpxrQjP44g+Tgr5iguuBUQAqIuzcgbDu3MtOwz+HllZuKiye+qQB+Eq/U5+CqtaIxl
	 FN9CB0XgHTDJA==


-- Steve

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 39/48] powerpc: Replace __get_cpu_var uses
  2014-02-15  4:26     ` Steven Rostedt
@ 2014-02-15  7:54       ` Mike Galbraith
  2014-02-15 10:00         ` Benjamin Herrenschmidt
  2014-02-15  9:59       ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 87+ messages in thread
From: Mike Galbraith @ 2014-02-15  7:54 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Benjamin Herrenschmidt, Christoph Lameter, Tejun Heo, akpm,
	linux-kernel, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Paul Mackerras

On Fri, 2014-02-14 at 23:26 -0500, Steven Rostedt wrote: 
> On Sat, 15 Feb 2014 14:50:08 +1100
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> 
> > For some reason I'm still getting these as attachments instead of inline
> > in the email, which makes reviewing a major PITA. Christoph, I'm not
> > sure what you are doing wrong here but you should consider fixing it :-)
> 
> Get a better email client ;-)

Yeah, evolution shows them inline just fine (though you made it do bad
things too, it couldn't save signed patches or such).

-Mike


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 39/48] powerpc: Replace __get_cpu_var uses
  2014-02-15  3:50   ` Benjamin Herrenschmidt
  2014-02-15  4:26     ` Steven Rostedt
@ 2014-02-15  9:42     ` Peter Zijlstra
  2014-02-15 10:01       ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 87+ messages in thread
From: Peter Zijlstra @ 2014-02-15  9:42 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Christoph Lameter, Tejun Heo, akpm, rostedt, linux-kernel,
	Ingo Molnar, Thomas Gleixner, Paul Mackerras

On Sat, Feb 15, 2014 at 02:50:08PM +1100, Benjamin Herrenschmidt wrote:
> For some reason I'm still getting these as attachments instead of inline
> in the email, which makes reviewing a major PITA. Christoph, I'm not
> sure what you are doing wrong here but you should consider fixing it :-)

Yeah, what the others have already said; evolution is a pile of steaming
crap. Get yourself a real MUA :-)

For some reason it does absurd things with "Content-Disposition:
inline".

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 39/48] powerpc: Replace __get_cpu_var uses
  2014-02-15  4:26     ` Steven Rostedt
  2014-02-15  7:54       ` Mike Galbraith
@ 2014-02-15  9:59       ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 87+ messages in thread
From: Benjamin Herrenschmidt @ 2014-02-15  9:59 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Christoph Lameter, Tejun Heo, akpm, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Paul Mackerras

On Fri, 2014-02-14 at 23:26 -0500, Steven Rostedt wrote:
> On Sat, 15 Feb 2014 14:50:08 +1100
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> 
> > For some reason I'm still getting these as attachments instead of inline
> > in the email, which makes reviewing a major PITA. Christoph, I'm not
> > sure what you are doing wrong here but you should consider fixing it :-)
> 
> Get a better email client ;-)

Hah, maybe, I use evolution... but those patches from Christoph are the
only ones to do that for me; everything else works just fine.

Cheers,
Ben.

> They show up fine for me. Looking at the raw format, could it possibly
> be the DKIM-Signature? Here's (most of) the header:
> 
> Return-Path: cl@linux.com
> X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on goliath
> X-Spam-Level: 
> X-Spam-Status: No, score=-0.3 required=5.0 tests=DKIM_SIGNED,DKIM_VALID,
> 	LOCAL_SUBJECT_PATCH,RCVD_IN_DNSWL_NONE autolearn=unavailable version=3.3.2
> Delivered-To: rostedt@goodmis.org
> X-FDA: 68539149672.07.nose31_2242917219f26
> X-HE-Tag: nose31_2242917219f26
> X-Filterd-Recvd-Size: 34239
> Received: from mail.hover.com.cust.hostedemail.com [216.40.42.134]
> 	by goliath with IMAP (fetchmail-6.3.26)
> 	for <rostedt@localhost> (single-drop); Fri, 14 Feb 2014 15:23:03 -0500 (EST)
> Received: from qmta09.emeryville.ca.mail.comcast.net (qmta09.emeryville.ca.mail.comcast.net [76.96.30.96])
> 	by imf20.hostedemail.com (Postfix) with ESMTP
> 	for <rostedt@goodmis.org>; Fri, 14 Feb 2014 20:20:35 +0000 (UTC)
> Received: from omta15.emeryville.ca.mail.comcast.net ([76.96.30.71])
> 	by qmta09.emeryville.ca.mail.comcast.net with comcast
> 	id SKft1n0011Y3wxoA9LLbGd; Fri, 14 Feb 2014 20:20:35 +0000
> Received: from gentwo.org ([98.213.233.247])
> 	by omta15.emeryville.ca.mail.comcast.net with comcast
> 	id SLLY1n0095Lw0ES8bLLZFV; Fri, 14 Feb 2014 20:20:34 +0000
> Received: by gentwo.org (Postfix, from userid 1001)
> 	id 2941A68181; Fri, 14 Feb 2014 14:19:08 -0600 (CST)
> Message-Id: <20140214201908.082769265@linux.com>
> Date: Fri, 14 Feb 2014 14:19:20 -0600
> From: Christoph Lameter <cl@linux.com>
> To: Tejun Heo <tj@kernel.org>
> Cc: akpm@linuxfoundation.org,
>  rostedt@goodmis.org,
>  linux-kernel@vger.kernel.org,
>  Ingo Molnar <mingo@kernel.org>,
>  Peter Zijlstra <peterz@infradead.org>,
>  Thomas Gleixner <tglx@linutronix.de>,
>  Benjamin Herrenschmidt <benh@kernel.crashing.org>,
>  Paul Mackerras <paulus@samba.org>
> Subject: [PATCH 39/48] powerpc: Replace __get_cpu_var uses
> References: <20140214201841.826179349@linux.com>
> Content-Type: text/plain; charset=UTF-8
> Content-Disposition: inline; filename=this_powerpc
> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net;
> 	s=q20121106; t=1392409235;
> 	bh=3eMNjwtVIqOOdbqtLUxQGKw+WvtREJiNlp1vuKEVub8=;
> 	h=Received:Received:Received:Message-Id:Date:From:To:Subject:
> 	 Content-Type;
> 	b=AtOZLlCPJL9tvUnoSyw0qPmNQym1dGDConwMdjEnayz2rNubuobF/JYdewBUypgbZ
> 	 3sq/JQoBriYX8NsMHQhzZL4bh+7gM6OPWhGSIczq6MNuzFvw7E9MVTeX+t+yhVXMNG
> 	 O+sWkLbWkM18luLRPlJU7wD6ANEFrGgQvt7i1GqSrCVZ+qKgXq9brzu9NTsMp4ODzv
> 	 Djs9LTZEBTjbYxllW+16x+lV/Y6zY9LYQzsB6VEEpjO2J5Z6Wbk12K4puGL0rPPKoW
> 	 rpxrQjP44g+Tgr5iguuBUQAqIuzcgbDu3MtOwz+HllZuKiye+qQB+Eq/U5+CqtaIxl
> 	 FN9CB0XgHTDJA==
> 
> 
> -- Steve



^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 39/48] powerpc: Replace __get_cpu_var uses
  2014-02-15  7:54       ` Mike Galbraith
@ 2014-02-15 10:00         ` Benjamin Herrenschmidt
  2014-02-15 11:29           ` Mike Galbraith
  0 siblings, 1 reply; 87+ messages in thread
From: Benjamin Herrenschmidt @ 2014-02-15 10:00 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Steven Rostedt, Christoph Lameter, Tejun Heo, akpm, linux-kernel,
	Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Paul Mackerras

On Sat, 2014-02-15 at 08:54 +0100, Mike Galbraith wrote:
> > > For some reason I'm still getting these as attachments instead of inline
> > > in the email, which makes reviewing a major PITA. Christoph, I'm not
> > > sure what you are doing wrong here but you should consider fixing it :-)
> > 
> > Get a better email client ;-)
> 
> > Yeah, evolution shows them inline just fine (though you made it do bad
> > things too; it couldn't save signed patches or such).

Evo 3.8.4 here and they show as attachments only... odd.

Ben.



^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 39/48] powerpc: Replace __get_cpu_var uses
  2014-02-15  9:42     ` Peter Zijlstra
@ 2014-02-15 10:01       ` Benjamin Herrenschmidt
  2014-02-15 12:07         ` Andreas Schwab
  2014-02-15 15:45         ` Peter Zijlstra
  0 siblings, 2 replies; 87+ messages in thread
From: Benjamin Herrenschmidt @ 2014-02-15 10:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Christoph Lameter, Tejun Heo, akpm, rostedt, linux-kernel,
	Ingo Molnar, Thomas Gleixner, Paul Mackerras

On Sat, 2014-02-15 at 10:42 +0100, Peter Zijlstra wrote:
> On Sat, Feb 15, 2014 at 02:50:08PM +1100, Benjamin Herrenschmidt wrote:
> > For some reason I'm still getting these as attachments instead of inline
> > in the email, which makes reviewing a major PITA. Christoph, I'm not
> > sure what you are doing wrong here but you should consider fixing it :-)
> 
> Yeah, what the others have already said; evolution is a pile of steaming
> crap. Get yourself a real MUA :-)

I have yet to find one that isn't a steaming pile ... :-) Sadly I got
used to evo and transferring all my filters etc... to another one is
something I'm really not looking fwd to, though I suppose it will have
to happen sooner or later.

Plus dwmw2 keeps convincing me that evo doesn't suck as much as I think
it does :-)

Cheers,
Ben.

> For some reason it does absurd things with "Content-Disposition:
> inline".



^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 39/48] powerpc: Replace __get_cpu_var uses
  2014-02-15 10:00         ` Benjamin Herrenschmidt
@ 2014-02-15 11:29           ` Mike Galbraith
  0 siblings, 0 replies; 87+ messages in thread
From: Mike Galbraith @ 2014-02-15 11:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Steven Rostedt, Christoph Lameter, Tejun Heo, akpm, linux-kernel,
	Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Paul Mackerras

On Sat, 2014-02-15 at 21:00 +1100, Benjamin Herrenschmidt wrote: 
> On Sat, 2014-02-15 at 08:54 +0100, Mike Galbraith wrote:
> > > > For some reason I'm still getting these as attachments instead of inline
> > > > in the email, which makes reviewing a major PITA. Christoph, I'm not
> > > > sure what you are doing wrong here but you should consider fixing it :-)
> > > 
> > > Get a better email client ;-)
> > 
> > > Yeah, evolution shows them inline just fine (though you made it do bad
> > > things too; it couldn't save signed patches or such).
> 
> Evo 3.8.4 here and they show as attachments only... odd.

Heh, 3.2.3 here.  I may never "upgrade" this box again.  No daemon from
hell, no grub2, everything including evolution (ok, mostly) just works.

-Mike


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 20/48] time: Replace __get_cpu_var uses
  2014-02-14 20:19 ` [PATCH 20/48] time: Replace __get_cpu_var uses Christoph Lameter
@ 2014-02-15 11:33   ` Thomas Gleixner
  0 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2014-02-15 11:33 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra

On Fri, 14 Feb 2014, Christoph Lameter wrote:

> [Patch depends on another patch in this series that introduces raw_cpu_ops]
> 
> Convert uses of __get_cpu_var for creating a address from a percpu
> offset to this_cpu_ptr.
> 
> The two cases where get_cpu_var is used to actually access a percpu
> variable are changed to use this_cpu_read/raw_cpu_read.
> 
> CC: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Christoph Lameter <cl@linux.com>

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
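
For reference, the conversion pattern under review is roughly the
following (a sketch with made-up names, not lines from the patch):

    struct clock_data { u64 epoch; };           /* hypothetical type */
    DEFINE_PER_CPU(struct clock_data, cd);

    /* creating an address from a percpu offset */
    struct clock_data *p = &__get_cpu_var(cd);  /* old form */
    struct clock_data *q = this_cpu_ptr(&cd);   /* new form, same address */

    /* actually reading this CPU's value */
    u64 t = __get_cpu_var(cd).epoch;            /* old form */
    u64 u = this_cpu_read(cd.epoch);            /* new form; on x86 a single
                                                   segment-prefixed load */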

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 22/48] tick-sched: Fix two new uses of __get_cpu_ptr
  2014-02-14 20:19 ` [PATCH 22/48] tick-sched: Fix two new uses of __get_cpu_ptr Christoph Lameter
@ 2014-02-15 11:33   ` Thomas Gleixner
  0 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2014-02-15 11:33 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra

On Fri, 14 Feb 2014, Christoph Lameter wrote:

> Two new uses introduced in 3.14-rc1.
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 39/48] powerpc: Replace __get_cpu_var uses
  2014-02-15 10:01       ` Benjamin Herrenschmidt
@ 2014-02-15 12:07         ` Andreas Schwab
  2014-02-15 15:45         ` Peter Zijlstra
  1 sibling, 0 replies; 87+ messages in thread
From: Andreas Schwab @ 2014-02-15 12:07 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Peter Zijlstra, Christoph Lameter, Tejun Heo, akpm, rostedt,
	linux-kernel, Ingo Molnar, Thomas Gleixner, Paul Mackerras

Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> On Sat, 2014-02-15 at 10:42 +0100, Peter Zijlstra wrote:
>> On Sat, Feb 15, 2014 at 02:50:08PM +1100, Benjamin Herrenschmidt wrote:
>> > For some reason I'm still getting these as attachments instead of inline
>> > in the email, which makes reviewing a major PITA. Christoph, I'm not
>> > sure what you are doing wrong here but you should consider fixing it :-)
>> 
>> Yeah, what the others have already said; evolution is a pile of steaming
>> crap. Get yourself a real MUA :-)
>
> I have yet to find one that isn't a steaming pile ... :-) 

Gnus.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 39/48] powerpc: Replace __get_cpu_var uses
  2014-02-15 10:01       ` Benjamin Herrenschmidt
  2014-02-15 12:07         ` Andreas Schwab
@ 2014-02-15 15:45         ` Peter Zijlstra
  2014-02-15 18:12           ` David Woodhouse
  1 sibling, 1 reply; 87+ messages in thread
From: Peter Zijlstra @ 2014-02-15 15:45 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Christoph Lameter, Tejun Heo, akpm, rostedt, linux-kernel,
	Ingo Molnar, Thomas Gleixner, Paul Mackerras, David Woodhouse

On Sat, Feb 15, 2014 at 09:01:19PM +1100, Benjamin Herrenschmidt wrote:
> On Sat, 2014-02-15 at 10:42 +0100, Peter Zijlstra wrote:
> > On Sat, Feb 15, 2014 at 02:50:08PM +1100, Benjamin Herrenschmidt wrote:
> > > For some reason I'm still getting these as attachments instead of inline
> > > in the email, which makes reviewing a major PITA. Christoph, I'm not
> > > sure what you are doing wrong here but you should consider fixing it :-)
> > 
> > Yeah, what the others have already said; evolution is a pile of steaming
> > crap. Get yourself a real MUA :-)
> 
> I have yet to find one that isn't a steaming pile ... :-) Sadly I got
> used to evo and transferring all my filters etc... to another one is
> something I'm really not looking fwd to, though I suppose it will have
> to happen sooner or later.
> 
> Plus dwmw2 keeps convincing me that evo doesn't suck as much as I think
> it does :-)

There's a good solution there; make dwmw2 fix it ;-)

David; in particular; quilt mail adds "Content-Disposition: inline"
headers and Evo, in its infinite wisdom, makes the entire body into an
attachment.


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 39/48] powerpc: Replace __get_cpu_var uses
  2014-02-15 15:45         ` Peter Zijlstra
@ 2014-02-15 18:12           ` David Woodhouse
  2014-02-15 20:32             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 87+ messages in thread
From: David Woodhouse @ 2014-02-15 18:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Benjamin Herrenschmidt, Christoph Lameter, Tejun Heo, akpm,
	rostedt, linux-kernel, Ingo Molnar, Thomas Gleixner,
	Paul Mackerras

[-- Attachment #1: Type: text/plain, Size: 720 bytes --]

On Sat, 2014-02-15 at 16:45 +0100, Peter Zijlstra wrote:
> 
> David; in particular; quilt mail adds "Content-Disposition: inline"
> headers and Evo, in its infinite wisdom, makes the entire body into an
> attachment.

Hm, looking at the original email I see it as an inline part, displayed
by default. I can read it, select parts of it and hit 'reply' to cite
only the selected part... everything I can do with a normal email.

Ben, what precisely is it that makes it hard to read and review? All I
can see is a trivial cosmetic issue; it's a few pixels down and right
from where it would normally be.

But even that looks like it should be considered a bug. Please file in
GNOME bugzilla.

-- 
dwmw2

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5745 bytes --]

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 39/48] powerpc: Replace __get_cpu_var uses
  2014-02-15 18:12           ` David Woodhouse
@ 2014-02-15 20:32             ` Benjamin Herrenschmidt
  2014-02-15 20:52               ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 87+ messages in thread
From: Benjamin Herrenschmidt @ 2014-02-15 20:32 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Peter Zijlstra, Christoph Lameter, Tejun Heo, akpm, rostedt,
	linux-kernel, Ingo Molnar, Thomas Gleixner, Paul Mackerras

On Sat, 2014-02-15 at 18:12 +0000, David Woodhouse wrote:
> On Sat, 2014-02-15 at 16:45 +0100, Peter Zijlstra wrote:
> > 
> > David; in particular; quilt mail adds "Content-Disposition: inline"
> > headers and Evo, in its infinite wisdom, makes the entire body into an
> > attachment.
> 
> Hm, looking at the original email I see it as an inline part, displayed
> by default. I can read it, select parts of it and hit 'reply' to cite
> only the selected part... everything I can do with a normal email.
> 
> Ben, what precisely is it that makes it hard to read and review? All I
> can see is a trivial cosmetic issue; it's a few pixels down and right
> from where it would normally be.
> 
> But even that looks like it should be considered a bug. Please file in
> GNOME bugzilla.

In my case it doesn't have an inline part. I only get an attachment.

I'll file it in gnome bz.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 39/48] powerpc: Replace __get_cpu_var uses
  2014-02-15 20:32             ` Benjamin Herrenschmidt
@ 2014-02-15 20:52               ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 87+ messages in thread
From: Benjamin Herrenschmidt @ 2014-02-15 20:52 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Peter Zijlstra, Christoph Lameter, Tejun Heo, akpm, rostedt,
	linux-kernel, Ingo Molnar, Thomas Gleixner, Paul Mackerras


On Sun, 2014-02-16 at 07:32 +1100, Benjamin Herrenschmidt wrote:

> > But even that looks like it should be considered a bug. Please file in
> > GNOME bugzilla.
> 
> In my case it doesn't have an inline part. I only get an attachment.
> 
> I'll file it in gnome bz.

Hah, story of my life: 3.8.x is marked "obsolete" in gnome BZ :)

I suppose that's the price for using Ubuntu: the version of evolution is
always just old enough that nobody cares about bugs in it anymore :-)

Anyway, the gnome3 PPA gave me 3.10.3 and the problem is still there; filing
as https://bugzilla.gnome.org/show_bug.cgi?id=724437

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 24/48] rcu: Replace __this_cpu_ptr uses with raw_cpu_ptr
  2014-02-14 20:19 ` [PATCH 24/48] rcu: Replace __this_cpu_ptr uses " Christoph Lameter
@ 2014-02-16 16:17   ` Paul E. McKenney
  0 siblings, 0 replies; 87+ messages in thread
From: Paul E. McKenney @ 2014-02-16 16:17 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Dipankar Sarma

On Fri, Feb 14, 2014 at 02:19:05PM -0600, Christoph Lameter wrote:
> [Patch depends on another patch in this series that introduces raw_cpu_ops]
> 
> __this_cpu_ptr is being phased out.
> 
> One special case is increment_cpu_stall_ticks().
> A per cpu variable is incremented so use raw_cpu_inc().
> 
> Cc: Dipankar Sarma <dipankar@in.ibm.com>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Christoph Lameter <cl@linux.com>

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> Index: linux/kernel/rcu/tree.c
> ===================================================================
> --- linux.orig/kernel/rcu/tree.c	2014-02-03 13:23:21.855072103 -0600
> +++ linux/kernel/rcu/tree.c	2014-02-03 13:23:21.845072311 -0600
> @@ -1951,7 +1951,7 @@
>  static void rcu_adopt_orphan_cbs(struct rcu_state *rsp, unsigned long flags)
>  {
>  	int i;
> -	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
> +	struct rcu_data *rdp = raw_cpu_ptr(rsp->rda);
> 
>  	/* No-CBs CPUs are handled specially. */
>  	if (rcu_nocb_adopt_orphan_cbs(rsp, rdp, flags))
> @@ -2334,7 +2334,7 @@
>  __rcu_process_callbacks(struct rcu_state *rsp)
>  {
>  	unsigned long flags;
> -	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
> +	struct rcu_data *rdp = raw_cpu_ptr(rsp->rda);
> 
>  	WARN_ON_ONCE(rdp->beenonline == 0);
> 
> @@ -2936,7 +2936,7 @@
>  static void rcu_barrier_func(void *type)
>  {
>  	struct rcu_state *rsp = type;
> -	struct rcu_data *rdp = __this_cpu_ptr(rsp->rda);
> +	struct rcu_data *rdp = raw_cpu_ptr(rsp->rda);
> 
>  	_rcu_barrier_trace(rsp, "IRQ", -1, rsp->n_barrier_done);
>  	atomic_inc(&rsp->barrier_cpu_count);
> Index: linux/kernel/rcu/tree_plugin.h
> ===================================================================
> --- linux.orig/kernel/rcu/tree_plugin.h	2014-02-03 13:23:21.855072103 -0600
> +++ linux/kernel/rcu/tree_plugin.h	2014-02-03 13:23:21.845072311 -0600
> @@ -1848,7 +1848,7 @@
>  	struct rcu_data *rdp;
> 
>  	for_each_rcu_flavor(rsp) {
> -		rdp = __this_cpu_ptr(rsp->rda);
> +		rdp = raw_cpu_ptr(rsp->rda);
>  		if (rdp->qlen_lazy != 0) {
>  			atomic_inc(&oom_callback_count);
>  			rsp->call(&rdp->oom_head, rcu_oom_callback);
> @@ -1990,7 +1990,7 @@
>  	struct rcu_state *rsp;
> 
>  	for_each_rcu_flavor(rsp)
> -		__this_cpu_ptr(rsp->rda)->ticks_this_gp++;
> +		raw_cpu_inc(rsp->rda->ticks_this_gp);
>  }
> 
>  #else /* #ifdef CONFIG_RCU_CPU_STALL_INFO */
> 
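
The last hunk is the interesting one: an address calculation followed by
a memory increment collapses into a single percpu RMW operation.
Schematically (a sketch, not patch text):

    /* before: compute this CPU's address, then increment through it */
    __this_cpu_ptr(rsp->rda)->ticks_this_gp++;

    /* after: one RMW op; on x86 this can compile to a single
     * segment-prefixed add with no separate address calculation.
     * raw_ rather than __this_cpu_ because, like __this_cpu_ptr,
     * it skips the preemption checks this series adds. */
    raw_cpu_inc(rsp->rda->ticks_this_gp);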


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 05/48] percpu: Add preemption checks to __this_cpu ops
  2014-02-14 20:18 ` [PATCH 05/48] percpu: Add preemption checks to __this_cpu ops Christoph Lameter
@ 2014-03-04 22:27   ` Andrew Morton
  2014-03-04 23:27     ` Steven Rostedt
  2014-03-05  3:27     ` Christoph Lameter
  0 siblings, 2 replies; 87+ messages in thread
From: Andrew Morton @ 2014-03-04 22:27 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner

On Fri, 14 Feb 2014 14:18:46 -0600 Christoph Lameter <cl@linux.com> wrote:

> [Patch depends on another patch in this series that introduces raw_cpu_ops]
> 
> We define a check function in order to avoid trouble with the
> include files. Then the higher level __this_cpu macros are
> modified to invoke the preemption check.
> 
> --- linux.orig/lib/smp_processor_id.c	2014-01-30 14:40:50.936519233 -0600
> +++ linux/lib/smp_processor_id.c	2014-01-30 14:40:50.936519233 -0600
> @@ -7,7 +7,7 @@
>  #include <linux/kallsyms.h>
>  #include <linux/sched.h>
>  
> -notrace unsigned int debug_smp_processor_id(void)
> +notrace static unsigned int check_preemption_disabled(char *what)
>  {
>  	int this_cpu = raw_smp_processor_id();
>  
> @@ -38,9 +38,9 @@
>  	if (!printk_ratelimit())
>  		goto out_enable;
>  
> -	printk(KERN_ERR "BUG: using smp_processor_id() in preemptible [%08x] "
> -			"code: %s/%d\n",
> -			preempt_count() - 1, current->comm, current->pid);
> +	printk(KERN_ERR "BUG: using %s in preemptible [%08x] code: %s/%d\n",
> +		what, preempt_count() - 1, current->comm, current->pid);
> +
>  	print_symbol("caller is %s\n", (long)__builtin_return_address(0));
>  	dump_stack();

I wonder if there's any point in printing __builtin_return_address. 
Doesn't dump_stack() tell us the same thing?

> @@ -50,5 +50,17 @@
>  	return this_cpu;
>  }
>  
> +notrace unsigned int debug_smp_processor_id(void)
> +{
> +	return check_preemption_disabled("smp_processor_id()");
> +}
>  EXPORT_SYMBOL(debug_smp_processor_id);
>  
> +notrace void __this_cpu_preempt_check(const char *op)
> +{
> +	char text[40];
> +
> +	snprintf(text, sizeof(text), "__this_cpu_%s()", op);
> +	check_preemption_disabled(text);
> +}

I'd like to see a comment here telling scared readers why this can
never overflow text[].
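
For context, the macro wiring the patch adds on top of this check
function looks roughly like the following (a sketch of the shape, not
the exact kernel text):

    /* each higher-level __this_cpu op invokes the check, then the
     * preemption-check-free raw_cpu op */
    #define __this_cpu_read(pcp)                        \
    ({                                                  \
            __this_cpu_preempt_check("read");           \
            raw_cpu_read(pcp);                          \
    })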


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 00/48] percpu: Consistent per cpu operations V4
  2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
                   ` (47 preceding siblings ...)
  2014-02-14 20:19 ` [PATCH 48/48] percpu: Remove __this_cpu_ptr Christoph Lameter
@ 2014-03-04 22:27 ` Andrew Morton
  2014-03-05  3:29   ` Christoph Lameter
  48 siblings, 1 reply; 87+ messages in thread
From: Andrew Morton @ 2014-03-04 22:27 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner

On Fri, 14 Feb 2014 14:18:41 -0600 Christoph Lameter <cl@linux.com> wrote:

> Can we please get this merged? The first patch alone would at least define
> the functions required to enable the merging of the rest in any order and
> through any tree.

This series is structured as

[patch 1]: make changes which trigger lots of runtime warnings
[patch 2-n]: fix up those warnings

yes?

So we're proposing adding a 48-patch bisection hole in which scary
warnings will be emitted.

I guess that's liveable with - we *could* fix it by starting out with
do-nothing wrappers, then all the fixes, and then finish up with patches
which turn do-nothing-wrappers into do-something-functions.  But I'm
not sure that the resulting obscuration is worth the effort.

> The kernel has never been audited to ensure that this_cpu operations are
> consistently used throughout the kernel. The code generated in many
> places can be improved through the use of this_cpu operations (which uses
> a segment register for relocation of per cpu offsets instead of
> performing address calculations).
> 
> The patch set also addresses various consistency issues in general with
> the per cpu macros.
> 
> A. The semantics of __this_cpu_ptr() differs from this_cpu_ptr only
>    because checks are skipped. This is typically shown through a raw_
>    prefix. So this patch set changes the places where __this_cpu_ptr()
>    is used to raw_cpu_ptr().
> 
> B. There has been the long term wish by some that __this_cpu operations
>    would check for preemption. However, there are cases where preemption
>    checks need to be skipped. This patch set adds raw_cpu operations that
>    do not check for preemption and then adds preemption checks to the
>    __this_cpu operations.
> 
> C. The use of __get_cpu_var is always a reference to a percpu variable
>    that can also be handled via a this_cpu operation. This patch set
>    replaces all uses of __get_cpu_var with this_cpu operations.
> 
> D. We can then use this_cpu RMW operations in various places replacing
>    sequences of instructions by a single one.
> 
> E. The use of this_cpu operations throughout will allow other arches than
>    x86 to implement optimized references and RMV operations to work with
>    per cpu local data.
> 
> F. The use of this_cpu operations opens up the possibility to
>    further optimize code that relies on synchronization through
>    per cpu data.
> 
> 
> The patch set works in a couple of stages:
> 
> I. Patch 1 adds the additional raw_cpu operations and raw_cpu_ptr().
>     Also converts the existing __this_cpu_xx_# primitive in the x86
>     code to raw_cpu_xx_#.
> 
> II. Patch 2-4 use the raw_cpu operations in places that would give
>      us false positives once they are enabled.
> 
> III. Patch 5 adds preemption checks to __this_cpu operations to allow
>     checking if preemption is properly disabled when these functions
>     are used.
> 
> IV. Patches 6-20 are patches that simply replace uses of __get_cpu_var
>    with this_cpu_ptr. They do not depend on any changes to the percpu
>    code. No preemption tests are skipped if they are applied.
> 
> V. Patches 21-46 are conversion patches that use this_cpu operations
>    in various kernel subsystems/drivers or arch code.

That all seems desirable.

> VI. Patches 47/48 remove no longer used functions (__this_cpu_ptr
>     and __get_cpu_var).  These should only be applied after all the
>     conversion patches have made it and after we have done additional
>     passes through the kernel to ensure that none of the uses of these
>     functions remain.

Yes, I'll skip those two.

In linux-next arch/arm/mach-msm/timer.c gets moved to
drivers/clocksource/qcom-timer.c, which I fixed up.  Apart from that it
all still merges OK...

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 31/48] uv: Replace __get_cpu_var
  2014-02-14 20:19 ` [PATCH 31/48] uv: Replace __get_cpu_var Christoph Lameter
@ 2014-03-04 23:02   ` Andrew Morton
  2014-03-04 23:42     ` Steven Rostedt
                       ` (2 more replies)
  0 siblings, 3 replies; 87+ messages in thread
From: Andrew Morton @ 2014-03-04 23:02 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Hedi Berriche, Mike Travis,
	Dimitri Sivanich

On Fri, 14 Feb 2014 14:19:12 -0600 Christoph Lameter <cl@linux.com> wrote:

> Use __this_cpu_read instead.
> 
> ...
>
> --- linux.orig/arch/x86/include/asm/uv/uv_hub.h	2014-02-03 14:16:53.987889372 -0600
> +++ linux/arch/x86/include/asm/uv/uv_hub.h	2014-02-03 14:16:53.987889372 -0600
> @@ -618,7 +618,7 @@
>  };
>  
>  DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
> -#define uv_cpu_nmi			(__get_cpu_var(__uv_cpu_nmi))
> +#define uv_cpu_nmi			__this_cpu_read(_uv_cpu_nmi)

arch/x86/platform/uv/uv_nmi.c: In function 'uv_check_nmi':
arch/x86/platform/uv/uv_nmi.c:218: error: '_uv_cpu_nmi' undeclared (first use in this function)
arch/x86/platform/uv/uv_nmi.c:218: error: (Each undeclared identifier is reported only once
arch/x86/platform/uv/uv_nmi.c:218: error: for each function it appears in.)


This?

--- a/arch/x86/include/asm/uv/uv_hub.h~uv-replace-__get_cpu_var-fix
+++ a/arch/x86/include/asm/uv/uv_hub.h
@@ -618,7 +618,7 @@ struct uv_cpu_nmi_s {
 };
 
 DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
-#define uv_cpu_nmi			__this_cpu_read(_uv_cpu_nmi)
+#define uv_cpu_nmi			(*this_cpu_ptr(&__uv_cpu_nmi))
 #define uv_hub_nmi			(uv_cpu_nmi.hub)
 #define uv_cpu_nmi_per(cpu)		(per_cpu(__uv_cpu_nmi, cpu))
 #define uv_hub_nmi_per(cpu)		(uv_cpu_nmi_per(cpu).hub)
_


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 05/48] percpu: Add preemption checks to __this_cpu ops
  2014-03-04 22:27   ` Andrew Morton
@ 2014-03-04 23:27     ` Steven Rostedt
  2014-03-05  3:27     ` Christoph Lameter
  1 sibling, 0 replies; 87+ messages in thread
From: Steven Rostedt @ 2014-03-04 23:27 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Tejun Heo, akpm, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner

On Tue, 4 Mar 2014 14:27:27 -0800
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Fri, 14 Feb 2014 14:18:46 -0600 Christoph Lameter <cl@linux.com> wrote:
> 
> > [Patch depends on another patch in this series that introduces raw_cpu_ops]
> > 
> > We define a check function in order to avoid trouble with the
> > include files. Then the higher level __this_cpu macros are
> > modified to invoke the preemption check.
> > 
> > --- linux.orig/lib/smp_processor_id.c	2014-01-30 14:40:50.936519233 -0600
> > +++ linux/lib/smp_processor_id.c	2014-01-30 14:40:50.936519233 -0600
> > @@ -7,7 +7,7 @@
> >  #include <linux/kallsyms.h>
> >  #include <linux/sched.h>
> >  
> > -notrace unsigned int debug_smp_processor_id(void)
> > +notrace static unsigned int check_preemption_disabled(char *what)
> >  {
> >  	int this_cpu = raw_smp_processor_id();
> >  
> > @@ -38,9 +38,9 @@
> >  	if (!printk_ratelimit())
> >  		goto out_enable;
> >  
> > -	printk(KERN_ERR "BUG: using smp_processor_id() in preemptible [%08x] "
> > -			"code: %s/%d\n",
> > -			preempt_count() - 1, current->comm, current->pid);
> > +	printk(KERN_ERR "BUG: using %s in preemptible [%08x] code: %s/%d\n",
> > +		what, preempt_count() - 1, current->comm, current->pid);
> > +
> >  	print_symbol("caller is %s\n", (long)__builtin_return_address(0));
> >  	dump_stack();
> 
> I wonder if there's any point in printing __builtin_return_address. 
> Doesn't dump_stack() tell us the same thing?

When frame pointers are enabled, sure. But without frame pointers, I'm
not so sure.

-- Steve

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 31/48] uv: Replace __get_cpu_var
  2014-03-04 23:02   ` Andrew Morton
@ 2014-03-04 23:42     ` Steven Rostedt
  2014-03-04 23:47       ` Andrew Morton
  2014-03-05  0:18     ` H. Peter Anvin
  2014-03-05  3:31     ` Christoph Lameter
  2 siblings, 1 reply; 87+ messages in thread
From: Steven Rostedt @ 2014-03-04 23:42 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Tejun Heo, akpm, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Hedi Berriche, Mike Travis,
	Dimitri Sivanich

On Tue, 4 Mar 2014 15:02:17 -0800
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Fri, 14 Feb 2014 14:19:12 -0600 Christoph Lameter <cl@linux.com> wrote:
> 
> > Use __this_cpu_read instead.
> > 
> > ...
> >
> > --- linux.orig/arch/x86/include/asm/uv/uv_hub.h	2014-02-03 14:16:53.987889372 -0600
> > +++ linux/arch/x86/include/asm/uv/uv_hub.h	2014-02-03 14:16:53.987889372 -0600
> > @@ -618,7 +618,7 @@
> >  };
> >  
> >  DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
> > -#define uv_cpu_nmi			(__get_cpu_var(__uv_cpu_nmi))
> > +#define uv_cpu_nmi			__this_cpu_read(_uv_cpu_nmi)
> 
> arch/x86/platform/uv/uv_nmi.c: In function 'uv_check_nmi':
> arch/x86/platform/uv/uv_nmi.c:218: error: '_uv_cpu_nmi' undeclared (first use in this function)
> arch/x86/platform/uv/uv_nmi.c:218: error: (Each undeclared identifier is reported only once
> arch/x86/platform/uv/uv_nmi.c:218: error: for each function it appears in.)
> 
> 
> This?
> 
> --- a/arch/x86/include/asm/uv/uv_hub.h~uv-replace-__get_cpu_var-fix
> +++ a/arch/x86/include/asm/uv/uv_hub.h
> @@ -618,7 +618,7 @@ struct uv_cpu_nmi_s {
>  };
>  
>  DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
> -#define uv_cpu_nmi			__this_cpu_read(_uv_cpu_nmi)
> +#define uv_cpu_nmi			(*this_cpu_ptr(&__uv_cpu_nmi))

Looks like an extra "_" was added.

-- Steve

>  #define uv_hub_nmi			(uv_cpu_nmi.hub)
>  #define uv_cpu_nmi_per(cpu)		(per_cpu(__uv_cpu_nmi, cpu))
>  #define uv_hub_nmi_per(cpu)		(uv_cpu_nmi_per(cpu).hub)
> _


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 31/48] uv: Replace __get_cpu_var
  2014-03-04 23:42     ` Steven Rostedt
@ 2014-03-04 23:47       ` Andrew Morton
  0 siblings, 0 replies; 87+ messages in thread
From: Andrew Morton @ 2014-03-04 23:47 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Christoph Lameter, Tejun Heo, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Hedi Berriche, Mike Travis,
	Dimitri Sivanich

On Tue, 4 Mar 2014 18:42:10 -0500 Steven Rostedt <rostedt@goodmis.org> wrote:

> > --- a/arch/x86/include/asm/uv/uv_hub.h~uv-replace-__get_cpu_var-fix
> > +++ a/arch/x86/include/asm/uv/uv_hub.h
> > @@ -618,7 +618,7 @@ struct uv_cpu_nmi_s {
> >  };
> >  
> >  DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
> > -#define uv_cpu_nmi			__this_cpu_read(_uv_cpu_nmi)
> > +#define uv_cpu_nmi			(*this_cpu_ptr(&__uv_cpu_nmi))
> 
> Looks like an extra "_" was added.

yes, there were two mistakes in that line.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 31/48] uv: Replace __get_cpu_var
  2014-03-04 23:02   ` Andrew Morton
  2014-03-04 23:42     ` Steven Rostedt
@ 2014-03-05  0:18     ` H. Peter Anvin
  2014-03-05  3:31     ` Christoph Lameter
  2 siblings, 0 replies; 87+ messages in thread
From: H. Peter Anvin @ 2014-03-05  0:18 UTC (permalink / raw)
  To: Andrew Morton, Christoph Lameter
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Hedi Berriche, Mike Travis,
	Dimitri Sivanich

On 03/04/2014 03:02 PM, Andrew Morton wrote:
> On Fri, 14 Feb 2014 14:19:12 -0600 Christoph Lameter <cl@linux.com> wrote:
> 
>> Use __this_cpu_read instead.
>>
>>
>> --- linux.orig/arch/x86/include/asm/uv/uv_hub.h	2014-02-03 14:16:53.987889372 -0600
>> +++ linux/arch/x86/include/asm/uv/uv_hub.h	2014-02-03 14:16:53.987889372 -0600
>> @@ -618,7 +618,7 @@
>>  };
>>  
>>  DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
>> -#define uv_cpu_nmi			(__get_cpu_var(__uv_cpu_nmi))
>> +#define uv_cpu_nmi			__this_cpu_read(_uv_cpu_nmi)
> 
> arch/x86/platform/uv/uv_nmi.c: In function 'uv_check_nmi':
> arch/x86/platform/uv/uv_nmi.c:218: error: '_uv_cpu_nmi' undeclared (first use in this function)
> arch/x86/platform/uv/uv_nmi.c:218: error: (Each undeclared identifier is reported only once
> arch/x86/platform/uv/uv_nmi.c:218: error: for each function it appears in.)
> 
> 
> This?
> 

More likely just add the missing second underscore.

	-hpa


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 05/48] percpu: Add preemption checks to __this_cpu ops
  2014-03-04 22:27   ` Andrew Morton
  2014-03-04 23:27     ` Steven Rostedt
@ 2014-03-05  3:27     ` Christoph Lameter
  2014-03-05 21:34       ` Andrew Morton
  1 sibling, 1 reply; 87+ messages in thread
From: Christoph Lameter @ 2014-03-05  3:27 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner

On Tue, 4 Mar 2014, Andrew Morton wrote:

> >  	print_symbol("caller is %s\n", (long)__builtin_return_address(0));
> >  	dump_stack();
>
> I wonder if there's any point in printing __builtin_return_address.
> Doesn't dump_stack() tell us the same thing?

Yes it does. However, it was there before and software may scan the logs
for it.

> > +notrace void __this_cpu_preempt_check(const char *op)
> > +{
> > +	char text[40];
> > +
> > +	snprintf(text, sizeof(text), "__this_cpu_%s()", op);
> > +	check_preemption_disabled(text);
> > +}
>
> I'd like to see a comment here telling scared readers why this can
> never overflow text[].

Ok. I can also add VM_BUG_ON(strlen(op) >= sizeof(text)) ?


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 00/48] percpu: Consistent per cpu operations V4
  2014-03-04 22:27 ` [PATCH 00/48] percpu: Consistent per cpu operations V4 Andrew Morton
@ 2014-03-05  3:29   ` Christoph Lameter
  0 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-03-05  3:29 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner

On Tue, 4 Mar 2014, Andrew Morton wrote:

> This series is structured as
>
> [patch 1]: make changes which trigger lots of runtime warnings
> [patch 2-n]: fix up those warnings
>
> yes?

Nope. The warning-causing things are eliminated before the checks are
introduced. The first patch adds new functionality. The following patches
fix warnings that would otherwise spew over the logs, and then we add the
checks.

> So we're proposing adding a 48-patch bisection hole in which scary
> warnings will be emitted.

That is not the case.

What could occur is that kernel configurations which have not been
tested so far trigger warnings.


> In linux-next arch/arm/mach-msm/timer.c gets moved to
> drivers/clocksource/qcom-timer.c, which I fixed up.  Apart from that it
> all still merges OK...

Great.


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 31/48] uv: Replace __get_cpu_var
  2014-03-04 23:02   ` Andrew Morton
  2014-03-04 23:42     ` Steven Rostedt
  2014-03-05  0:18     ` H. Peter Anvin
@ 2014-03-05  3:31     ` Christoph Lameter
  2014-03-05  4:00       ` Andrew Morton
  2 siblings, 1 reply; 87+ messages in thread
From: Christoph Lameter @ 2014-03-05  3:31 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Hedi Berriche, Mike Travis,
	Dimitri Sivanich

On Tue, 4 Mar 2014, Andrew Morton wrote:

> >
> > ...
> >
> > --- linux.orig/arch/x86/include/asm/uv/uv_hub.h	2014-02-03 14:16:53.987889372 -0600
> > +++ linux/arch/x86/include/asm/uv/uv_hub.h	2014-02-03 14:16:53.987889372 -0600
> > @@ -618,7 +618,7 @@
> >  };
> >
> >  DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
> > -#define uv_cpu_nmi			(__get_cpu_var(__uv_cpu_nmi))
> > +#define uv_cpu_nmi			__this_cpu_read(_uv_cpu_nmi)
>
> arch/x86/platform/uv/uv_nmi.c: In function 'uv_check_nmi':
> arch/x86/platform/uv/uv_nmi.c:218: error: '_uv_cpu_nmi' undeclared (first use in this function)
> arch/x86/platform/uv/uv_nmi.c:218: error: (Each undeclared identifier is reported only once
> arch/x86/platform/uv/uv_nmi.c:218: error: for each function it appears in.)
>
>
> This?

Nope. I missed an underscore.


> --- a/arch/x86/include/asm/uv/uv_hub.h~uv-replace-__get_cpu_var-fix
> +++ a/arch/x86/include/asm/uv/uv_hub.h
> @@ -618,7 +618,7 @@ struct uv_cpu_nmi_s {
>  };
>
>  DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
> -#define uv_cpu_nmi			__this_cpu_read(_uv_cpu_nmi)
> +#define uv_cpu_nmi			(*this_cpu_ptr(&__uv_cpu_nmi))

__this_cpu_read(__uv_cpu_nmi)

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 31/48] uv: Replace __get_cpu_var
  2014-03-05  3:31     ` Christoph Lameter
@ 2014-03-05  4:00       ` Andrew Morton
  2014-03-05 15:35         ` Christoph Lameter
  2014-03-05 21:57         ` Christoph Lameter
  0 siblings, 2 replies; 87+ messages in thread
From: Andrew Morton @ 2014-03-05  4:00 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Hedi Berriche, Mike Travis,
	Dimitri Sivanich

On Tue, 4 Mar 2014 21:31:12 -0600 (CST) Christoph Lameter <cl@linux.com> wrote:

> On Tue, 4 Mar 2014, Andrew Morton wrote:
> 
> > >
> > > ...
> > >
> > > --- linux.orig/arch/x86/include/asm/uv/uv_hub.h	2014-02-03 14:16:53.987889372 -0600
> > > +++ linux/arch/x86/include/asm/uv/uv_hub.h	2014-02-03 14:16:53.987889372 -0600
> > > @@ -618,7 +618,7 @@
> > >  };
> > >
> > >  DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
> > > -#define uv_cpu_nmi			(__get_cpu_var(__uv_cpu_nmi))
> > > +#define uv_cpu_nmi			__this_cpu_read(_uv_cpu_nmi)
> >
> > arch/x86/platform/uv/uv_nmi.c: In function 'uv_check_nmi':
> > arch/x86/platform/uv/uv_nmi.c:218: error: '_uv_cpu_nmi' undeclared (first use in this function)
> > arch/x86/platform/uv/uv_nmi.c:218: error: (Each undeclared identifier is reported only once
> > arch/x86/platform/uv/uv_nmi.c:218: error: for each function it appears in.)
> >
> >
> > This?
> 
> Nope. I missed an underscore.
> 
> 
> > --- a/arch/x86/include/asm/uv/uv_hub.h~uv-replace-__get_cpu_var-fix
> > +++ a/arch/x86/include/asm/uv/uv_hub.h
> > @@ -618,7 +618,7 @@ struct uv_cpu_nmi_s {
> >  };
> >
> >  DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
> > -#define uv_cpu_nmi			__this_cpu_read(_uv_cpu_nmi)
> > +#define uv_cpu_nmi			(*this_cpu_ptr(&__uv_cpu_nmi))
> 
> __this_cpu_read(__uv_cpu_nmi)

--- a/arch/x86/include/asm/uv/uv_hub.h~uv-replace-__get_cpu_var-fix
+++ a/arch/x86/include/asm/uv/uv_hub.h
@@ -618,7 +618,7 @@ struct uv_cpu_nmi_s {
 };
 
 DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
-#define uv_cpu_nmi			__this_cpu_read(_uv_cpu_nmi)
+#define uv_cpu_nmi			__this_cpu_read(__uv_cpu_nmi)
 #define uv_hub_nmi			(uv_cpu_nmi.hub)
 #define uv_cpu_nmi_per(cpu)		(per_cpu(__uv_cpu_nmi, cpu))
 #define uv_hub_nmi_per(cpu)		(uv_cpu_nmi_per(cpu).hub)

arch/x86/platform/uv/uv_nmi.c: In function 'uv_check_nmi':
arch/x86/platform/uv/uv_nmi.c:218: error: lvalue required as increment operand
arch/x86/platform/uv/uv_nmi.c: In function 'uv_nmi_wait':
arch/x86/platform/uv/uv_nmi.c:362: error: lvalue required as unary '&' operand
arch/x86/platform/uv/uv_nmi.c: In function 'uv_nmi_dump_state_cpu':
arch/x86/platform/uv/uv_nmi.c:422: error: lvalue required as unary '&' operand
arch/x86/platform/uv/uv_nmi.c: In function 'uv_nmi_dump_state':
arch/x86/platform/uv/uv_nmi.c:491: error: lvalue required as unary '&' operand
arch/x86/platform/uv/uv_nmi.c: In function 'uv_handle_nmi':
arch/x86/platform/uv/uv_nmi.c:618: error: lvalue required as unary '&' operand
arch/x86/platform/uv/uv_nmi.c:642: error: lvalue required as unary '&' operand
arch/x86/platform/uv/uv_nmi.c: In function 'uv_handle_nmi_ping':
arch/x86/platform/uv/uv_nmi.c:669: error: lvalue required as increment operand
arch/x86/platform/uv/uv_nmi.c:670: error: lvalue required as unary '&' operand
arch/x86/platform/uv/uv_nmi.c:675: error: lvalue required as increment operand
arch/x86/platform/uv/uv_nmi.c:678: error: lvalue required as unary '&' operand
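
Which follows from the shape of the macro: __this_cpu_read() yields a
plain value, so every place uv_nmi.c uses uv_cpu_nmi as an lvalue
(incrementing it, taking a member's address) cannot compile. A sketch
of the difference (field names are made up):

    /* reading gives an rvalue copy; it cannot be written through */
    #define uv_cpu_nmi      __this_cpu_read(__uv_cpu_nmi)
    uv_cpu_nmi.nmi_count++;                 /* error: lvalue required */
    atomic_inc(&uv_cpu_nmi.state);          /* error: '&' on a non-lvalue */

    /* dereferencing the percpu pointer keeps it an lvalue */
    #define uv_cpu_nmi      (*this_cpu_ptr(&__uv_cpu_nmi))
    uv_cpu_nmi.nmi_count++;                 /* fine: a real memory location */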


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 31/48] uv: Replace __get_cpu_var
  2014-03-05  4:00       ` Andrew Morton
@ 2014-03-05 15:35         ` Christoph Lameter
  2014-03-05 21:57         ` Christoph Lameter
  1 sibling, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-03-05 15:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Hedi Berriche, Mike Travis,
	Dimitri Sivanich

On Tue, 4 Mar 2014, Andrew Morton wrote:

> On Tue, 4 Mar 2014 21:31:12 -0600 (CST) Christoph Lameter <cl@linux.com> wrote:
>
> --- a/arch/x86/include/asm/uv/uv_hub.h~uv-replace-__get_cpu_var-fix
> +++ a/arch/x86/include/asm/uv/uv_hub.h
> @@ -618,7 +618,7 @@ struct uv_cpu_nmi_s {
>  };
>
>  DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
> -#define uv_cpu_nmi			__this_cpu_read(_uv_cpu_nmi)
> +#define uv_cpu_nmi			__this_cpu_read(__uv_cpu_nmi)
>  #define uv_hub_nmi			(uv_cpu_nmi.hub)
>  #define uv_cpu_nmi_per(cpu)		(per_cpu(__uv_cpu_nmi, cpu))
>  #define uv_hub_nmi_per(cpu)		(uv_cpu_nmi_per(cpu).hub)
>
> arch/x86/platform/uv/uv_nmi.c: In function 'uv_check_nmi':
> arch/x86/platform/uv/uv_nmi.c:218: error: lvalue required as increment operand

Uhh.. Let's drop this patch for now. This would mean more work is required.
Will submit a more extensive patch.



^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 05/48] percpu: Add preemption checks to __this_cpu ops
  2014-03-05  3:27     ` Christoph Lameter
@ 2014-03-05 21:34       ` Andrew Morton
  0 siblings, 0 replies; 87+ messages in thread
From: Andrew Morton @ 2014-03-05 21:34 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner

On Tue, 4 Mar 2014 21:27:14 -0600 (CST) Christoph Lameter <cl@linux.com> wrote:

> > > +notrace void __this_cpu_preempt_check(const char *op)
> > > +{
> > > +	char text[40];
> > > +
> > > +	snprintf(text, sizeof(text), "__this_cpu_%s()", op);
> > > +	check_preemption_disabled(text);
> > > +}
> >
> > I'd like to see a comment here telling scared readers why this can
> > never overflow text[].
> 
> Ok. I can also add VM_BUG_ON(strlen(op) >= sizeof(text)) ?

I misread the code - snprintf() will dtrt and we'll just end up with
truncated debug text.  Not worth worrying about.
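
That is, snprintf() never writes past the buffer; an over-long op name
just yields clipped text. For illustration (deliberately tiny buffer):

    char text[8];

    /* the 17-character result "__this_cpu_read()" does not fit;
     * snprintf() stores the first 7 characters plus a NUL and stops */
    snprintf(text, sizeof(text), "__this_cpu_%s()", "read");
    /* text now holds "__this_" - truncated, but no overflow */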

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 31/48] uv: Replace __get_cpu_var
  2014-03-05  4:00       ` Andrew Morton
  2014-03-05 15:35         ` Christoph Lameter
@ 2014-03-05 21:57         ` Christoph Lameter
  2014-03-06  2:53           ` Mike Travis
  1 sibling, 1 reply; 87+ messages in thread
From: Christoph Lameter @ 2014-03-05 21:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Hedi Berriche, Mike Travis,
	Dimitri Sivanich

The driver seems to use local64_t to define a single static instance of a
counter and then seems to think that it is safe to increment the counter
from multiple processors using local64_inc and friends. Common
misunderstanding and a reason why I wanted the this_cpu operations.

The counters seem to be exported via module parameters. So I guess we
need to define these per cpu and then sum them up when they need to be
displayed.

Dimitri?

Maybe let's move this outside of this patchset.
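
The usual shape of that conversion would be roughly as follows (a
sketch with made-up names):

    /* one counter per cpu; local increments need no atomic ops */
    static DEFINE_PER_CPU(u64, uv_nmi_count);

    static void uv_count_nmi(void)
    {
            this_cpu_inc(uv_nmi_count);
    }

    /* sum across cpus only when the value is actually displayed */
    static u64 uv_nmi_count_total(void)
    {
            u64 sum = 0;
            int cpu;

            for_each_possible_cpu(cpu)
                    sum += per_cpu(uv_nmi_count, cpu);
            return sum;
    }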


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 31/48] uv: Replace __get_cpu_var
  2014-03-05 21:57         ` Christoph Lameter
@ 2014-03-06  2:53           ` Mike Travis
  2014-03-07 18:16             ` Christoph Lameter
  0 siblings, 1 reply; 87+ messages in thread
From: Mike Travis @ 2014-03-06  2:53 UTC (permalink / raw)
  To: Christoph Lameter, Andrew Morton
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Hedi Berriche, Dimitri Sivanich



On 3/5/2014 1:57 PM, Christoph Lameter wrote:
> The driver seems to use local64_t to define a single static instance of a
> counter and then seems to think that it is safe to increment the counter
> from multiple processors using local64_inc and friends. Common
> misunderstanding and a reason why I wanted the this_cpu operations.
> 
> The counters seem to be exported via module parameters. So I guess we
> need to define these per cpu and then sum them up when they need to be
> displayed.
> 
> Dimitri?
> 
> Maybe let's move this outside of this patchset.
> 

Hi Christoph,

I haven't had much chance yet to look over your proposed changes but
FYI, the counters are strictly feedback to ensure that there are no
unhandled NMI events from the perf subsystem.  The exact count is
irrelevant.  IOW, counts in the double or triple digits are okay;
counts > 100,000 are definitely not okay (there are multiple millions
of perf events every 'perf top' refresh.)

I'm not sure if this alters how you want to approach the changes.

Thanks,
Mike

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH 31/48] uv: Replace __get_cpu_var
  2014-03-06  2:53           ` Mike Travis
@ 2014-03-07 18:16             ` Christoph Lameter
  0 siblings, 0 replies; 87+ messages in thread
From: Christoph Lameter @ 2014-03-07 18:16 UTC (permalink / raw)
  To: Mike Travis
  Cc: Andrew Morton, Tejun Heo, akpm, rostedt, linux-kernel,
	Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Hedi Berriche,
	Dimitri Sivanich

On Wed, 5 Mar 2014, Mike Travis wrote:

> I haven't had much chance yet to look over your proposed changes but
> FYI, the counters are strictly feedback to insure that there are not
> unhandled NMI events from the perf subsystem.  The exact count is
> irrelevant.  IOW, counts in the double or triple digits is okay,
> counts > 100,000 is definitely not okay (there are multiple millions
> of perf events every 'perf top' refresh.)
>
> I'm not sure if this alters how you want to approach the changes.

I've got a patch here that converts all the atomic per cpu counters to int,
but the local64_t definitions look very strange to me. I have never seen a
local64_t definition that is global and used for a counter. That can only
work if there is one and only one processor modifying the count.


^ permalink raw reply	[flat|nested] 87+ messages in thread

end of thread, other threads:[~2014-03-07 18:16 UTC | newest]

Thread overview: 87+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-14 20:18 [PATCH 00/48] percpu: Consistent per cpu operations V4 Christoph Lameter
2014-02-14 20:18 ` [PATCH 01/48] percpu: Add raw_cpu_ops Christoph Lameter
2014-02-14 20:18 ` [PATCH 02/48] mm: Use raw_cpu ops for determining current NUMA node Christoph Lameter
2014-02-14 20:18   ` Christoph Lameter
2014-02-14 20:18 ` [PATCH 03/48] modules: Use raw_cpu_write for initialization of per cpu refcount Christoph Lameter
2014-02-14 20:18 ` [PATCH 04/48] net: Replace __this_cpu_inc in route.c with raw_cpu_inc Christoph Lameter
2014-02-14 20:18 ` [PATCH 05/48] percpu: Add preemption checks to __this_cpu ops Christoph Lameter
2014-03-04 22:27   ` Andrew Morton
2014-03-04 23:27     ` Steven Rostedt
2014-03-05  3:27     ` Christoph Lameter
2014-03-05 21:34       ` Andrew Morton
2014-02-14 20:18 ` [PATCH 06/48] mm: Replace __get_cpu_var uses with this_cpu_ptr Christoph Lameter
2014-02-14 20:18   ` Christoph Lameter
2014-02-14 20:18 ` [PATCH 07/48] tracing: " Christoph Lameter
2014-02-14 20:18 ` [PATCH 08/48] percpu: Replace __get_cpu_var " Christoph Lameter
2014-02-14 20:18 ` [PATCH 09/48] kernel misc: Replace __get_cpu_var uses Christoph Lameter
2014-02-14 20:18 ` [PATCH 10/48] drivers/char/random: " Christoph Lameter
2014-02-14 20:18 ` [PATCH 11/48] drivers/cpuidle: Replace __get_cpu_var uses for address calculation Christoph Lameter
2014-02-14 20:18 ` [PATCH 12/48] drivers/oprofile: " Christoph Lameter
2014-02-14 20:18 ` [PATCH 13/48] drivers/leds: Replace __get_cpu_var use through this_cpu_ptr Christoph Lameter
2014-02-14 20:18 ` [PATCH 14/48] drivers/clocksource: Replace __get_cpu_var used for address calculation Christoph Lameter
2014-02-14 20:18 ` [PATCH 15/48] parisc: Replace __get_cpu_var uses " Christoph Lameter
2014-02-14 20:18   ` Christoph Lameter
2014-02-14 20:18 ` [PATCH 16/48] metag: " Christoph Lameter
2014-02-14 20:18 ` [PATCH 17/48] drivers/net/ethernet/tile: " Christoph Lameter
2014-02-14 20:18 ` [PATCH 18/48] drivers/net/ethernet/tile: __get_cpu_var call introduced in 3.14 Christoph Lameter
2014-02-14 20:19 ` [PATCH 19/48] tilegx: Another case of get_cpu_var Christoph Lameter
2014-02-14 20:19 ` [PATCH 20/48] time: Replace __get_cpu_var uses Christoph Lameter
2014-02-15 11:33   ` Thomas Gleixner
2014-02-14 20:19 ` [PATCH 21/48] scheduler: Replace __get_cpu_var with this_cpu_ptr Christoph Lameter
2014-02-14 20:19 ` [PATCH 22/48] tick-sched: Fix two new uses of __get_cpu_ptr Christoph Lameter
2014-02-15 11:33   ` Thomas Gleixner
2014-02-14 20:19 ` [PATCH 23/48] block: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
2014-02-14 20:19 ` [PATCH 24/48] rcu: Replace __this_cpu_ptr uses " Christoph Lameter
2014-02-16 16:17   ` Paul E. McKenney
2014-02-14 20:19 ` [PATCH 25/48] watchdog: Replace __raw_get_cpu_var uses Christoph Lameter
2014-02-14 20:19 ` [PATCH 26/48] net: Replace get_cpu_var through this_cpu_ptr Christoph Lameter
2014-02-14 20:19 ` [PATCH 27/48] md: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
2014-02-14 20:19 ` [PATCH 28/48] irqchips: Replace __this_cpu_ptr uses Christoph Lameter
2014-02-14 20:19 ` [PATCH 29/48] x86: Replace __get_cpu_var uses Christoph Lameter
2014-02-14 20:19 ` [PATCH 30/48] x86: Change __get_cpu_var calls introduced in 3.14 Christoph Lameter
2014-02-14 20:19 ` [PATCH 31/48] uv: Replace __get_cpu_var Christoph Lameter
2014-03-04 23:02   ` Andrew Morton
2014-03-04 23:42     ` Steven Rostedt
2014-03-04 23:47       ` Andrew Morton
2014-03-05  0:18     ` H. Peter Anvin
2014-03-05  3:31     ` Christoph Lameter
2014-03-05  4:00       ` Andrew Morton
2014-03-05 15:35         ` Christoph Lameter
2014-03-05 21:57         ` Christoph Lameter
2014-03-06  2:53           ` Mike Travis
2014-03-07 18:16             ` Christoph Lameter
2014-02-14 20:19 ` [PATCH 32/48] arm: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
2014-02-14 20:19 ` [PATCH 33/48] MIPS: Replace __get_cpu_var uses in FPU emulator Christoph Lameter
2014-02-14 20:19 ` [PATCH 34/48] mips: Replace __get_cpu_var uses Christoph Lameter
2014-02-14 20:19 ` [PATCH 35/48] s390: rename __this_cpu_ptr to raw_cpu_ptr Christoph Lameter
2014-02-14 20:19 ` [PATCH 36/48] s390: Replace __get_cpu_var uses Christoph Lameter
2014-02-14 20:19 ` [PATCH 37/48] s390: Handle new __get_cpu_var calls added in 3.14 Christoph Lameter
2014-02-14 20:19 ` [PATCH 38/48] ia64: Replace __get_cpu_var uses Christoph Lameter
2014-02-14 20:19   ` Christoph Lameter
2014-02-14 20:19 ` [PATCH 39/48] powerpc: " Christoph Lameter
2014-02-15  3:50   ` Benjamin Herrenschmidt
2014-02-15  4:26     ` Steven Rostedt
2014-02-15  7:54       ` Mike Galbraith
2014-02-15 10:00         ` Benjamin Herrenschmidt
2014-02-15 11:29           ` Mike Galbraith
2014-02-15  9:59       ` Benjamin Herrenschmidt
2014-02-15  9:42     ` Peter Zijlstra
2014-02-15 10:01       ` Benjamin Herrenschmidt
2014-02-15 12:07         ` Andreas Schwab
2014-02-15 15:45         ` Peter Zijlstra
2014-02-15 18:12           ` David Woodhouse
2014-02-15 20:32             ` Benjamin Herrenschmidt
2014-02-15 20:52               ` Benjamin Herrenschmidt
2014-02-14 20:19 ` [PATCH 40/48] powerpc: Handle new __get_cpu_var calls in 3.14 Christoph Lameter
2014-02-14 20:19 ` [PATCH 41/48] sparc: Replace __get_cpu_var uses Christoph Lameter
2014-02-14 20:19   ` Christoph Lameter
2014-02-14 20:19 ` [PATCH 42/48] tile: " Christoph Lameter
2014-02-14 20:19 ` [PATCH 43/48] blackfin: " Christoph Lameter
2014-02-14 20:19 ` [PATCH 44/48] avr32: Replace __get_cpu_var with __this_cpu_write Christoph Lameter
2014-02-14 20:19 ` [PATCH 45/48] alpha: Replace __get_cpu_var Christoph Lameter
2014-02-14 20:19 ` [PATCH 46/48] sh: Replace __get_cpu_var uses Christoph Lameter
2014-02-14 20:19   ` Christoph Lameter
2014-02-14 20:19 ` [PATCH 47/48] Remove __get_cpu_var and __raw_get_cpu_var macros [only in 3.16] Christoph Lameter
2014-02-14 20:19 ` [PATCH 48/48] percpu: Remove __this_cpu_ptr Christoph Lameter
2014-03-04 22:27 ` [PATCH 00/48] percpu: Consistent per cpu operations V4 Andrew Morton
2014-03-05  3:29   ` Christoph Lameter
