All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/35] percpu: Consistent per cpu operations V7
@ 2014-08-17 17:30 Christoph Lameter
  2014-08-17 17:30 ` [PATCH 01/35] [PATCH 02/36] kernel misc: Replace __get_cpu_var uses Christoph Lameter
                   ` (35 more replies)
  0 siblings, 36 replies; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

V6->V7
- Separate out __get_cpu_var accesses of cpumasks and create a special
  patch that handles the configuration dependent conversion to either
  this_cpu_read() or this_cpu_ptr(). If this is problematic
  then the last 2 patches of the series can be dropped top leave __get_cpu_var use
  for these cases intact.
- Rebase against 3.17-rc1
- Convert newly introduced uses of __get_cpu_var
- Use Feng's korg testing to validate patch in a series of configurations
- Remove portions of the overview that describe patches already merged.
- Patches available in git tree on
	korg https://git.kernel.org/cgit/linux/kernel/git/christoph/percpu.git/


V4->V5
- Remove patches that were merged
- Update patches against 3.16-rc1
- randconfig debugging on x86
- All patches can be merged individually now (asid from the last two
  that remove functions)
- Non x86 and powerpc architecture patches have minimal verification.

V3->V4:
- Rediff patches
- Put the patches first that define the new API.
- Add patches to convert the __get_cpu_var stuff added in 3.14

V2->V3:
- Rediff patches
- Fix breakage caused by mips patches. Add a mips patch to convert from local_t.
- Update some descriptions.

V1->V2:
- Move legacy definition for __this_cpu_ptr into include/asm-generic/percpu.h
  so that users bypassing include/linux/percpu.h do not break (affects
  tile and s390)
- Merge raw_cpu_ops core and the patch to rename x86 __this_cpu primitives
  into one. Otherwise breakage will occur since x86 __this_cpu ops will fall
  back to generic ops which is not tolerated well by the preempt hackery
  in x86.
- Add notes to each patch that depends on another to avoid mismerges.
  Add acks etc.
- Use quilt-0.61 with the bug fix that ensures all mailing lists
  receive the postings intended for them.


The patches are follow on patches that complete the conversion of
__get_cpu_var uses. Aside from the 4 last patches all of them
can be applied independently. The last 4 patches deal with removal
of macros that are no longer needed because the earlier patches
converted the usage.


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 01/35] [PATCH 02/36] kernel misc: Replace __get_cpu_var uses
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:04   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 02/35] [PATCH 03/36] time: " Christoph Lameter
                   ` (34 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, akpm

[-- Attachment #1: 0002-kernel-misc-Replace-__get_cpu_var-uses.patch --]
[-- Type: text/plain, Size: 1869 bytes --]

Replace uses of __get_cpu_var for address calculation with this_cpu_ptr.

Cc: akpm@linux-foundation.org
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 kernel/printk/printk.c | 4 ++--
 kernel/smp.c           | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

Index: linux/kernel/printk/printk.c
===================================================================
--- linux.orig/kernel/printk/printk.c
+++ linux/kernel/printk/printk.c
@@ -2628,7 +2628,7 @@ void wake_up_klogd(void)
 	preempt_disable();
 	if (waitqueue_active(&log_wait)) {
 		this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
-		irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
+		irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
 	}
 	preempt_enable();
 }
@@ -2644,7 +2644,7 @@ int printk_deferred(const char *fmt, ...
 	va_end(args);
 
 	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
-	irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
+	irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
 	preempt_enable();
 
 	return r;
Index: linux/kernel/smp.c
===================================================================
--- linux.orig/kernel/smp.c
+++ linux/kernel/smp.c
@@ -164,7 +164,7 @@ static int generic_exec_single(int cpu,
 	if (!csd) {
 		csd = &csd_stack;
 		if (!wait)
-			csd = &__get_cpu_var(csd_data);
+			csd = this_cpu_ptr(&csd_data);
 	}
 
 	csd_lock(csd);
@@ -229,7 +229,7 @@ static void flush_smp_call_function_queu
 
 	WARN_ON(!irqs_disabled());
 
-	head = &__get_cpu_var(call_single_queue);
+	head = this_cpu_ptr(&call_single_queue);
 	entry = llist_del_all(head);
 	entry = llist_reverse_order(entry);
 
@@ -419,7 +419,7 @@ void smp_call_function_many(const struct
 		return;
 	}
 
-	cfd = &__get_cpu_var(cfd_data);
+	cfd = this_cpu_ptr(&cfd_data);
 
 	cpumask_and(cfd->cpumask, mask, cpu_online_mask);
 	cpumask_clear_cpu(this_cpu, cfd->cpumask);


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 02/35] [PATCH 03/36] time: Replace __get_cpu_var uses
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
  2014-08-17 17:30 ` [PATCH 01/35] [PATCH 02/36] kernel misc: Replace __get_cpu_var uses Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:05   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 03/35] time: Convert a bunch of &__get_cpu_var introduced in the 3.16 merge period Christoph Lameter
                   ` (33 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: 0003-time-Replace-__get_cpu_var-uses.patch --]
[-- Type: text/plain, Size: 10791 bytes --]

Convert uses of __get_cpu_var for creating a address from a percpu
offset to this_cpu_ptr.

The two cases where get_cpu_var is used to actually access a percpu
variable are changed to use this_cpu_read/raw_cpu_read.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 drivers/clocksource/dummy_timer.c |  2 +-
 kernel/hrtimer.c                  | 22 +++++++++++-----------
 kernel/irq_work.c                 |  6 +++---
 kernel/sched/clock.c              |  2 +-
 kernel/softirq.c                  |  4 ++--
 kernel/time/tick-broadcast.c      |  2 +-
 kernel/time/tick-common.c         |  6 +++---
 kernel/time/tick-oneshot.c        |  2 +-
 kernel/time/tick-sched.c          | 22 +++++++++++-----------
 kernel/timer.c                    |  2 +-
 10 files changed, 35 insertions(+), 35 deletions(-)

Index: linux/drivers/clocksource/dummy_timer.c
===================================================================
--- linux.orig/drivers/clocksource/dummy_timer.c
+++ linux/drivers/clocksource/dummy_timer.c
@@ -28,7 +28,7 @@ static void dummy_timer_set_mode(enum cl
 static void dummy_timer_setup(void)
 {
 	int cpu = smp_processor_id();
-	struct clock_event_device *evt = __this_cpu_ptr(&dummy_timer_evt);
+	struct clock_event_device *evt = raw_cpu_ptr(&dummy_timer_evt);
 
 	evt->name	= "dummy_timer";
 	evt->features	= CLOCK_EVT_FEAT_PERIODIC |
Index: linux/kernel/irq_work.c
===================================================================
--- linux.orig/kernel/irq_work.c
+++ linux/kernel/irq_work.c
@@ -95,11 +95,11 @@ bool irq_work_queue(struct irq_work *wor
 
 	/* If the work is "lazy", handle it from next tick if any */
 	if (work->flags & IRQ_WORK_LAZY) {
-		if (llist_add(&work->llnode, &__get_cpu_var(lazy_list)) &&
+		if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) &&
 		    tick_nohz_tick_stopped())
 			arch_irq_work_raise();
 	} else {
-		if (llist_add(&work->llnode, &__get_cpu_var(raised_list)))
+		if (llist_add(&work->llnode, this_cpu_ptr(&raised_list)))
 			arch_irq_work_raise();
 	}
 
@@ -113,8 +113,8 @@ bool irq_work_needs_cpu(void)
 {
 	struct llist_head *raised, *lazy;
 
-	raised = &__get_cpu_var(raised_list);
-	lazy = &__get_cpu_var(lazy_list);
+	raised = this_cpu_ptr(&raised_list);
+	lazy = this_cpu_ptr(&lazy_list);
 	if (llist_empty(raised) && llist_empty(lazy))
 		return false;
 
@@ -166,8 +166,8 @@ static void irq_work_run_list(struct lli
  */
 void irq_work_run(void)
 {
-	irq_work_run_list(&__get_cpu_var(raised_list));
-	irq_work_run_list(&__get_cpu_var(lazy_list));
+	irq_work_run_list(this_cpu_ptr(&raised_list));
+	irq_work_run_list(this_cpu_ptr(&lazy_list));
 }
 EXPORT_SYMBOL_GPL(irq_work_run);
 
Index: linux/kernel/sched/clock.c
===================================================================
--- linux.orig/kernel/sched/clock.c
+++ linux/kernel/sched/clock.c
@@ -134,7 +134,7 @@ static DEFINE_PER_CPU_SHARED_ALIGNED(str
 
 static inline struct sched_clock_data *this_scd(void)
 {
-	return &__get_cpu_var(sched_clock_data);
+	return this_cpu_ptr(&sched_clock_data);
 }
 
 static inline struct sched_clock_data *cpu_sdc(int cpu)
Index: linux/kernel/softirq.c
===================================================================
--- linux.orig/kernel/softirq.c
+++ linux/kernel/softirq.c
@@ -485,7 +485,7 @@ static void tasklet_action(struct softir
 	local_irq_disable();
 	list = __this_cpu_read(tasklet_vec.head);
 	__this_cpu_write(tasklet_vec.head, NULL);
-	__this_cpu_write(tasklet_vec.tail, &__get_cpu_var(tasklet_vec).head);
+	__this_cpu_write(tasklet_vec.tail, this_cpu_ptr(&tasklet_vec.head));
 	local_irq_enable();
 
 	while (list) {
@@ -521,7 +521,7 @@ static void tasklet_hi_action(struct sof
 	local_irq_disable();
 	list = __this_cpu_read(tasklet_hi_vec.head);
 	__this_cpu_write(tasklet_hi_vec.head, NULL);
-	__this_cpu_write(tasklet_hi_vec.tail, &__get_cpu_var(tasklet_hi_vec).head);
+	__this_cpu_write(tasklet_hi_vec.tail, this_cpu_ptr(&tasklet_hi_vec.head));
 	local_irq_enable();
 
 	while (list) {
Index: linux/kernel/time/tick-broadcast.c
===================================================================
--- linux.orig/kernel/time/tick-broadcast.c
+++ linux/kernel/time/tick-broadcast.c
@@ -554,7 +554,7 @@ int tick_resume_broadcast_oneshot(struct
 void tick_check_oneshot_broadcast_this_cpu(void)
 {
 	if (cpumask_test_cpu(smp_processor_id(), tick_broadcast_oneshot_mask)) {
-		struct tick_device *td = &__get_cpu_var(tick_cpu_device);
+		struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
 
 		/*
 		 * We might be in the middle of switching over from
Index: linux/kernel/time/tick-common.c
===================================================================
--- linux.orig/kernel/time/tick-common.c
+++ linux/kernel/time/tick-common.c
@@ -224,7 +224,7 @@ static void tick_setup_device(struct tic
 
 void tick_install_replacement(struct clock_event_device *newdev)
 {
-	struct tick_device *td = &__get_cpu_var(tick_cpu_device);
+	struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
 	int cpu = smp_processor_id();
 
 	clockevents_exchange_device(td->evtdev, newdev);
@@ -374,14 +374,14 @@ void tick_shutdown(unsigned int *cpup)
 
 void tick_suspend(void)
 {
-	struct tick_device *td = &__get_cpu_var(tick_cpu_device);
+	struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
 
 	clockevents_shutdown(td->evtdev);
 }
 
 void tick_resume(void)
 {
-	struct tick_device *td = &__get_cpu_var(tick_cpu_device);
+	struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
 	int broadcast = tick_resume_broadcast();
 
 	clockevents_set_mode(td->evtdev, CLOCK_EVT_MODE_RESUME);
Index: linux/kernel/time/tick-oneshot.c
===================================================================
--- linux.orig/kernel/time/tick-oneshot.c
+++ linux/kernel/time/tick-oneshot.c
@@ -59,7 +59,7 @@ void tick_setup_oneshot(struct clock_eve
  */
 int tick_switch_to_oneshot(void (*handler)(struct clock_event_device *))
 {
-	struct tick_device *td = &__get_cpu_var(tick_cpu_device);
+	struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
 	struct clock_event_device *dev = td->evtdev;
 
 	if (!dev || !(dev->features & CLOCK_EVT_FEAT_ONESHOT) ||
Index: linux/kernel/time/tick-sched.c
===================================================================
--- linux.orig/kernel/time/tick-sched.c
+++ linux/kernel/time/tick-sched.c
@@ -205,7 +205,7 @@ static void tick_nohz_restart_sched_tick
  */
 void __tick_nohz_full_check(void)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 
 	if (tick_nohz_full_cpu(smp_processor_id())) {
 		if (ts->tick_stopped && !is_idle_task(current)) {
@@ -545,7 +545,7 @@ static ktime_t tick_nohz_stop_sched_tick
 	unsigned long seq, last_jiffies, next_jiffies, delta_jiffies;
 	ktime_t last_update, expires, ret = { .tv64 = 0 };
 	unsigned long rcu_delta_jiffies;
-	struct clock_event_device *dev = __get_cpu_var(tick_cpu_device).evtdev;
+	struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev);
 	u64 time_delta;
 
 	time_delta = timekeeping_max_deferment();
@@ -813,7 +813,7 @@ void tick_nohz_idle_enter(void)
 
 	local_irq_disable();
 
-	ts = &__get_cpu_var(tick_cpu_sched);
+	ts = this_cpu_ptr(&tick_cpu_sched);
 	ts->inidle = 1;
 	__tick_nohz_idle_enter(ts);
 
@@ -831,7 +831,7 @@ EXPORT_SYMBOL_GPL(tick_nohz_idle_enter);
  */
 void tick_nohz_irq_exit(void)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 
 	if (ts->inidle)
 		__tick_nohz_idle_enter(ts);
@@ -846,7 +846,7 @@ void tick_nohz_irq_exit(void)
  */
 ktime_t tick_nohz_get_sleep_length(void)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 
 	return ts->sleep_length;
 }
@@ -959,7 +959,7 @@ static int tick_nohz_reprogram(struct ti
  */
 static void tick_nohz_handler(struct clock_event_device *dev)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 	struct pt_regs *regs = get_irq_regs();
 	ktime_t now = ktime_get();
 
@@ -979,7 +979,7 @@ static void tick_nohz_handler(struct clo
  */
 static void tick_nohz_switch_to_nohz(void)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 	ktime_t next;
 
 	if (!tick_nohz_enabled)
@@ -1115,7 +1115,7 @@ early_param("skew_tick", skew_tick);
  */
 void tick_setup_sched_timer(void)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 	ktime_t now = ktime_get();
 
 	/*
@@ -1184,7 +1184,7 @@ void tick_clock_notify(void)
  */
 void tick_oneshot_notify(void)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 
 	set_bit(0, &ts->check_clocks);
 }
@@ -1199,7 +1199,7 @@ void tick_oneshot_notify(void)
  */
 int tick_check_oneshot_change(int allow_nohz)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 
 	if (!test_and_clear_bit(0, &ts->check_clocks))
 		return 0;
Index: linux/kernel/time/hrtimer.c
===================================================================
--- linux.orig/kernel/time/hrtimer.c
+++ linux/kernel/time/hrtimer.c
@@ -1144,7 +1144,7 @@ static void __hrtimer_init(struct hrtime
 
 	memset(timer, 0, sizeof(struct hrtimer));
 
-	cpu_base = &__raw_get_cpu_var(hrtimer_bases);
+	cpu_base = raw_cpu_ptr(&hrtimer_bases);
 
 	if (clock_id == CLOCK_REALTIME && mode != HRTIMER_MODE_ABS)
 		clock_id = CLOCK_MONOTONIC;
@@ -1187,7 +1187,7 @@ int hrtimer_get_res(const clockid_t whic
 	struct hrtimer_cpu_base *cpu_base;
 	int base = hrtimer_clockid_to_base(which_clock);
 
-	cpu_base = &__raw_get_cpu_var(hrtimer_bases);
+	cpu_base = raw_cpu_ptr(&hrtimer_bases);
 	*tp = ktime_to_timespec(cpu_base->clock_base[base].resolution);
 
 	return 0;
@@ -1376,7 +1376,7 @@ static void __hrtimer_peek_ahead_timers(
 	if (!hrtimer_hres_active())
 		return;
 
-	td = &__get_cpu_var(tick_cpu_device);
+	td = this_cpu_ptr(&tick_cpu_device);
 	if (td && td->evtdev)
 		hrtimer_interrupt(td->evtdev);
 }
Index: linux/kernel/time/timer.c
===================================================================
--- linux.orig/kernel/time/timer.c
+++ linux/kernel/time/timer.c
@@ -655,7 +655,7 @@ static inline void debug_assert_init(str
 static void do_init_timer(struct timer_list *timer, unsigned int flags,
 			  const char *name, struct lock_class_key *key)
 {
-	struct tvec_base *base = __raw_get_cpu_var(tvec_bases);
+	struct tvec_base *base = raw_cpu_read(tvec_bases);
 
 	timer->entry.next = NULL;
 	timer->base = (void *)((unsigned long)base | flags);


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 03/35] time: Convert a bunch of &__get_cpu_var introduced in the 3.16 merge period
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
  2014-08-17 17:30 ` [PATCH 01/35] [PATCH 02/36] kernel misc: Replace __get_cpu_var uses Christoph Lameter
  2014-08-17 17:30 ` [PATCH 02/35] [PATCH 03/36] time: " Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:05   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 04/35] [PATCH 04/36] scheduler: Replace __get_cpu_var with this_cpu_ptr Christoph Lameter
                   ` (32 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: fixhrtimer --]
[-- Type: text/plain, Size: 3020 bytes --]

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/kernel/time/hrtimer.c
===================================================================
--- linux.orig/kernel/time/hrtimer.c
+++ linux/kernel/time/hrtimer.c
@@ -558,7 +558,7 @@ hrtimer_force_reprogram(struct hrtimer_c
 static int hrtimer_reprogram(struct hrtimer *timer,
 			     struct hrtimer_clock_base *base)
 {
-	struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
+	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
 	ktime_t expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
 	int res;
 
@@ -629,7 +629,7 @@ static inline ktime_t hrtimer_update_bas
  */
 static void retrigger_next_event(void *arg)
 {
-	struct hrtimer_cpu_base *base = &__get_cpu_var(hrtimer_bases);
+	struct hrtimer_cpu_base *base = this_cpu_ptr(&hrtimer_bases);
 
 	if (!hrtimer_hres_active())
 		return;
@@ -903,7 +903,7 @@ remove_hrtimer(struct hrtimer *timer, st
 		 */
 		debug_deactivate(timer);
 		timer_stats_hrtimer_clear_start_info(timer);
-		reprogram = base->cpu_base == &__get_cpu_var(hrtimer_bases);
+		reprogram = base->cpu_base == this_cpu_ptr(&hrtimer_bases);
 		/*
 		 * We must preserve the CALLBACK state flag here,
 		 * otherwise we could move the timer base in
@@ -963,7 +963,7 @@ int __hrtimer_start_range_ns(struct hrti
 		 * on dynticks target.
 		 */
 		wake_up_nohz_cpu(new_base->cpu_base->cpu);
-	} else if (new_base->cpu_base == &__get_cpu_var(hrtimer_bases) &&
+	} else if (new_base->cpu_base == this_cpu_ptr(&hrtimer_bases) &&
 			hrtimer_reprogram(timer, new_base)) {
 		/*
 		 * Only allow reprogramming if the new base is on this CPU.
@@ -1103,7 +1103,7 @@ EXPORT_SYMBOL_GPL(hrtimer_get_remaining)
  */
 ktime_t hrtimer_get_next_event(void)
 {
-	struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
+	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
 	struct hrtimer_clock_base *base = cpu_base->clock_base;
 	ktime_t delta, mindelta = { .tv64 = KTIME_MAX };
 	unsigned long flags;
@@ -1242,7 +1242,7 @@ static void __run_hrtimer(struct hrtimer
  */
 void hrtimer_interrupt(struct clock_event_device *dev)
 {
-	struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
+	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
 	ktime_t expires_next, now, entry_time, delta;
 	int i, retries = 0;
 
@@ -1440,7 +1440,7 @@ void hrtimer_run_pending(void)
 void hrtimer_run_queues(void)
 {
 	struct timerqueue_node *node;
-	struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
+	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
 	struct hrtimer_clock_base *base;
 	int index, gettime = 1;
 
@@ -1679,7 +1679,7 @@ static void migrate_hrtimers(int scpu)
 
 	local_irq_disable();
 	old_base = &per_cpu(hrtimer_bases, scpu);
-	new_base = &__get_cpu_var(hrtimer_bases);
+	new_base = this_cpu_ptr(&hrtimer_bases);
 	/*
 	 * The caller is globally serialized and nobody else
 	 * takes two locks at once, deadlock is not possible.


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 04/35] [PATCH 04/36] scheduler: Replace __get_cpu_var with this_cpu_ptr
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (2 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 03/35] time: Convert a bunch of &__get_cpu_var introduced in the 3.16 merge period Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:05   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 05/35] [PATCH 05/36] block: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
                   ` (31 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: 0004-scheduler-Replace-__get_cpu_var-with-this_cpu_ptr.patch --]
[-- Type: text/plain, Size: 8593 bytes --]

Convert all uses of __get_cpu_var for address calculation to use
this_cpu_ptr instead.

[Uses of __get_cpu_var with cpumask_var_t are no longer
handled by this patch]

Cc: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 include/linux/kernel_stat.h   |  4 ++--
 kernel/events/callchain.c     |  4 ++--
 kernel/events/core.c          | 24 ++++++++++++------------
 kernel/sched/sched.h          |  4 ++--
 kernel/taskstats.c            |  2 +-
 kernel/time/tick-sched.c      |  4 ++--
 kernel/user-return-notifier.c |  4 ++--
 7 files changed, 23 insertions(+), 23 deletions(-)

Index: linux/include/linux/kernel_stat.h
===================================================================
--- linux.orig/include/linux/kernel_stat.h
+++ linux/include/linux/kernel_stat.h
@@ -44,8 +44,8 @@ DECLARE_PER_CPU(struct kernel_stat, ksta
 DECLARE_PER_CPU(struct kernel_cpustat, kernel_cpustat);
 
 /* Must have preemption disabled for this to be meaningful. */
-#define kstat_this_cpu (&__get_cpu_var(kstat))
-#define kcpustat_this_cpu (&__get_cpu_var(kernel_cpustat))
+#define kstat_this_cpu this_cpu_ptr(&kstat)
+#define kcpustat_this_cpu this_cpu_ptr(&kernel_cpustat)
 #define kstat_cpu(cpu) per_cpu(kstat, cpu)
 #define kcpustat_cpu(cpu) per_cpu(kernel_cpustat, cpu)
 
Index: linux/kernel/events/callchain.c
===================================================================
--- linux.orig/kernel/events/callchain.c
+++ linux/kernel/events/callchain.c
@@ -137,7 +137,7 @@ static struct perf_callchain_entry *get_
 	int cpu;
 	struct callchain_cpus_entries *entries;
 
-	*rctx = get_recursion_context(__get_cpu_var(callchain_recursion));
+	*rctx = get_recursion_context(this_cpu_ptr(callchain_recursion));
 	if (*rctx == -1)
 		return NULL;
 
@@ -153,7 +153,7 @@ static struct perf_callchain_entry *get_
 static void
 put_callchain_entry(int rctx)
 {
-	put_recursion_context(__get_cpu_var(callchain_recursion), rctx);
+	put_recursion_context(this_cpu_ptr(callchain_recursion), rctx);
 }
 
 struct perf_callchain_entry *
Index: linux/kernel/events/core.c
===================================================================
--- linux.orig/kernel/events/core.c
+++ linux/kernel/events/core.c
@@ -239,7 +239,7 @@ static void perf_duration_warn(struct ir
 	u64 avg_local_sample_len;
 	u64 local_samples_len;
 
-	local_samples_len = __get_cpu_var(running_sample_length);
+	local_samples_len = __this_cpu_read(running_sample_length);
 	avg_local_sample_len = local_samples_len/NR_ACCUMULATED_SAMPLES;
 
 	printk_ratelimited(KERN_WARNING
@@ -261,10 +261,10 @@ void perf_sample_event_took(u64 sample_l
 		return;
 
 	/* decay the counter by 1 average sample */
-	local_samples_len = __get_cpu_var(running_sample_length);
+	local_samples_len = __this_cpu_read(running_sample_length);
 	local_samples_len -= local_samples_len/NR_ACCUMULATED_SAMPLES;
 	local_samples_len += sample_len_ns;
-	__get_cpu_var(running_sample_length) = local_samples_len;
+	__this_cpu_write(running_sample_length, local_samples_len);
 
 	/*
 	 * note: this will be biased artifically low until we have
@@ -877,7 +877,7 @@ static DEFINE_PER_CPU(struct list_head,
 static void perf_pmu_rotate_start(struct pmu *pmu)
 {
 	struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
-	struct list_head *head = &__get_cpu_var(rotation_list);
+	struct list_head *head = this_cpu_ptr(&rotation_list);
 
 	WARN_ON(!irqs_disabled());
 
@@ -2389,7 +2389,7 @@ void __perf_event_task_sched_out(struct
 	 * to check if we have to switch out PMU state.
 	 * cgroup event are system-wide mode only
 	 */
-	if (atomic_read(&__get_cpu_var(perf_cgroup_events)))
+	if (atomic_read(this_cpu_ptr(&perf_cgroup_events)))
 		perf_cgroup_sched_out(task, next);
 }
 
@@ -2632,11 +2632,11 @@ void __perf_event_task_sched_in(struct t
 	 * to check if we have to switch in PMU state.
 	 * cgroup event are system-wide mode only
 	 */
-	if (atomic_read(&__get_cpu_var(perf_cgroup_events)))
+	if (atomic_read(this_cpu_ptr(&perf_cgroup_events)))
 		perf_cgroup_sched_in(prev, task);
 
 	/* check for system-wide branch_stack events */
-	if (atomic_read(&__get_cpu_var(perf_branch_stack_events)))
+	if (atomic_read(this_cpu_ptr(&perf_branch_stack_events)))
 		perf_branch_stack_sched_in(prev, task);
 }
 
@@ -2891,7 +2891,7 @@ bool perf_event_can_stop_tick(void)
 
 void perf_event_task_tick(void)
 {
-	struct list_head *head = &__get_cpu_var(rotation_list);
+	struct list_head *head = this_cpu_ptr(&rotation_list);
 	struct perf_cpu_context *cpuctx, *tmp;
 	struct perf_event_context *ctx;
 	int throttled;
@@ -5671,7 +5671,7 @@ static void do_perf_sw_event(enum perf_t
 				    struct perf_sample_data *data,
 				    struct pt_regs *regs)
 {
-	struct swevent_htable *swhash = &__get_cpu_var(swevent_htable);
+	struct swevent_htable *swhash = this_cpu_ptr(&swevent_htable);
 	struct perf_event *event;
 	struct hlist_head *head;
 
@@ -5690,7 +5690,7 @@ end:
 
 int perf_swevent_get_recursion_context(void)
 {
-	struct swevent_htable *swhash = &__get_cpu_var(swevent_htable);
+	struct swevent_htable *swhash = this_cpu_ptr(&swevent_htable);
 
 	return get_recursion_context(swhash->recursion);
 }
@@ -5698,7 +5698,7 @@ EXPORT_SYMBOL_GPL(perf_swevent_get_recur
 
 inline void perf_swevent_put_recursion_context(int rctx)
 {
-	struct swevent_htable *swhash = &__get_cpu_var(swevent_htable);
+	struct swevent_htable *swhash = this_cpu_ptr(&swevent_htable);
 
 	put_recursion_context(swhash->recursion, rctx);
 }
@@ -5727,7 +5727,7 @@ static void perf_swevent_read(struct per
 
 static int perf_swevent_add(struct perf_event *event, int flags)
 {
-	struct swevent_htable *swhash = &__get_cpu_var(swevent_htable);
+	struct swevent_htable *swhash = this_cpu_ptr(&swevent_htable);
 	struct hw_perf_event *hwc = &event->hw;
 	struct hlist_head *head;
 
Index: linux/kernel/sched/sched.h
===================================================================
--- linux.orig/kernel/sched/sched.h
+++ linux/kernel/sched/sched.h
@@ -650,10 +650,10 @@ static inline int cpu_of(struct rq *rq)
 DECLARE_PER_CPU(struct rq, runqueues);
 
 #define cpu_rq(cpu)		(&per_cpu(runqueues, (cpu)))
-#define this_rq()		(&__get_cpu_var(runqueues))
+#define this_rq()		this_cpu_ptr(&runqueues)
 #define task_rq(p)		cpu_rq(task_cpu(p))
 #define cpu_curr(cpu)		(cpu_rq(cpu)->curr)
-#define raw_rq()		(&__raw_get_cpu_var(runqueues))
+#define raw_rq()		raw_cpu_ptr(&runqueues)
 
 static inline u64 rq_clock(struct rq *rq)
 {
Index: linux/kernel/taskstats.c
===================================================================
--- linux.orig/kernel/taskstats.c
+++ linux/kernel/taskstats.c
@@ -638,7 +638,7 @@ void taskstats_exit(struct task_struct *
 		fill_tgid_exit(tsk);
 	}
 
-	listeners = __this_cpu_ptr(&listener_array);
+	listeners = raw_cpu_ptr(&listener_array);
 	if (list_empty(&listeners->list))
 		return;
 
Index: linux/kernel/time/tick-sched.c
===================================================================
--- linux.orig/kernel/time/tick-sched.c
+++ linux/kernel/time/tick-sched.c
@@ -924,7 +924,7 @@ static void tick_nohz_account_idle_ticks
  */
 void tick_nohz_idle_exit(void)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 	ktime_t now;
 
 	local_irq_disable();
@@ -1041,7 +1041,7 @@ static void tick_nohz_kick_tick(struct t
 
 static inline void tick_nohz_irq_enter(void)
 {
-	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 	ktime_t now;
 
 	if (!ts->idle_active && !ts->tick_stopped)
Index: linux/kernel/user-return-notifier.c
===================================================================
--- linux.orig/kernel/user-return-notifier.c
+++ linux/kernel/user-return-notifier.c
@@ -14,7 +14,7 @@ static DEFINE_PER_CPU(struct hlist_head,
 void user_return_notifier_register(struct user_return_notifier *urn)
 {
 	set_tsk_thread_flag(current, TIF_USER_RETURN_NOTIFY);
-	hlist_add_head(&urn->link, &__get_cpu_var(return_notifier_list));
+	hlist_add_head(&urn->link, this_cpu_ptr(&return_notifier_list));
 }
 EXPORT_SYMBOL_GPL(user_return_notifier_register);
 
@@ -25,7 +25,7 @@ EXPORT_SYMBOL_GPL(user_return_notifier_r
 void user_return_notifier_unregister(struct user_return_notifier *urn)
 {
 	hlist_del(&urn->link);
-	if (hlist_empty(&__get_cpu_var(return_notifier_list)))
+	if (hlist_empty(this_cpu_ptr(&return_notifier_list)))
 		clear_tsk_thread_flag(current, TIF_USER_RETURN_NOTIFY);
 }
 EXPORT_SYMBOL_GPL(user_return_notifier_unregister);


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 05/35] [PATCH 05/36] block: Replace __this_cpu_ptr with raw_cpu_ptr
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (3 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 04/35] [PATCH 04/36] scheduler: Replace __get_cpu_var with this_cpu_ptr Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:06   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 06/35] [PATCH 06/36] drivers/char/random: Replace __get_cpu_var uses Christoph Lameter
                   ` (30 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Jens Axboe

[-- Attachment #1: 0005-block-Replace-__this_cpu_ptr-with-raw_cpu_ptr.patch --]
[-- Type: text/plain, Size: 781 bytes --]

__this_cpu_ptr is being phased out use raw_cpu_ptr instead which was
introduced in 3.15-rc1.

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 fs/ext4/mballoc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/fs/ext4/mballoc.c
===================================================================
--- linux.orig/fs/ext4/mballoc.c
+++ linux/fs/ext4/mballoc.c
@@ -4129,7 +4129,7 @@ static void ext4_mb_group_or_file(struct
 	 * per cpu locality group is to reduce the contention between block
 	 * request from multiple CPUs.
 	 */
-	ac->ac_lg = __this_cpu_ptr(sbi->s_locality_groups);
+	ac->ac_lg = raw_cpu_ptr(sbi->s_locality_groups);
 
 	/* we're going to use group allocation */
 	ac->ac_flags |= EXT4_MB_HINT_GROUP_ALLOC;


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 06/35] [PATCH 06/36] drivers/char/random: Replace __get_cpu_var uses
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (4 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 05/35] [PATCH 05/36] block: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:07   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 07/35] [PATCH 07/36] drivers/cpuidle: Replace __get_cpu_var uses for address calculation Christoph Lameter
                   ` (29 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Arnd Bergmann, Greg Kroah-Hartman

[-- Attachment #1: 0006-drivers-char-random-Replace-__get_cpu_var-uses.patch --]
[-- Type: text/plain, Size: 842 bytes --]

A single case of using __get_cpu_var for address calculation.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 drivers/char/random.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/drivers/char/random.c
===================================================================
--- linux.orig/drivers/char/random.c
+++ linux/drivers/char/random.c
@@ -874,7 +874,7 @@ static __u32 get_reg(struct fast_pool *f
 void add_interrupt_randomness(int irq, int irq_flags)
 {
 	struct entropy_store	*r;
-	struct fast_pool	*fast_pool = &__get_cpu_var(irq_randomness);
+	struct fast_pool	*fast_pool = this_cpu_ptr(&irq_randomness);
 	struct pt_regs		*regs = get_irq_regs();
 	unsigned long		now = jiffies;
 	cycles_t		cycles = random_get_entropy();


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 07/35] [PATCH 07/36] drivers/cpuidle: Replace __get_cpu_var uses for address calculation
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (5 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 06/35] [PATCH 06/36] drivers/char/random: Replace __get_cpu_var uses Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:07   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 08/35] [PATCH 08/36] drivers/oprofile: " Christoph Lameter
                   ` (28 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Daniel Lezcano, linux-pm, Rafael J. Wysocki

[-- Attachment #1: 0007-drivers-cpuidle-Replace-__get_cpu_var-uses-for-addre.patch --]
[-- Type: text/plain, Size: 2625 bytes --]

All of these are for address calculation. Replace with
this_cpu_ptr().

Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-pm@vger.kernel.org
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
[cpufreq changes]
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 drivers/cpuidle/governors/ladder.c | 4 ++--
 drivers/cpuidle/governors/menu.c   | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

Index: linux/drivers/cpuidle/governors/ladder.c
===================================================================
--- linux.orig/drivers/cpuidle/governors/ladder.c
+++ linux/drivers/cpuidle/governors/ladder.c
@@ -66,7 +66,7 @@ static inline void ladder_do_selection(s
 static int ladder_select_state(struct cpuidle_driver *drv,
 				struct cpuidle_device *dev)
 {
-	struct ladder_device *ldev = &__get_cpu_var(ladder_devices);
+	struct ladder_device *ldev = this_cpu_ptr(&ladder_devices);
 	struct ladder_device_state *last_state;
 	int last_residency, last_idx = ldev->last_state_idx;
 	int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
@@ -170,7 +170,7 @@ static int ladder_enable_device(struct c
  */
 static void ladder_reflect(struct cpuidle_device *dev, int index)
 {
-	struct ladder_device *ldev = &__get_cpu_var(ladder_devices);
+	struct ladder_device *ldev = this_cpu_ptr(&ladder_devices);
 	if (index > 0)
 		ldev->last_state_idx = index;
 }
Index: linux/drivers/cpuidle/governors/menu.c
===================================================================
--- linux.orig/drivers/cpuidle/governors/menu.c
+++ linux/drivers/cpuidle/governors/menu.c
@@ -284,7 +284,7 @@ again:
  */
 static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 {
-	struct menu_device *data = &__get_cpu_var(menu_devices);
+	struct menu_device *data = this_cpu_ptr(&menu_devices);
 	int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
 	int i;
 	unsigned int interactivity_req;
@@ -369,7 +369,7 @@ static int menu_select(struct cpuidle_dr
  */
 static void menu_reflect(struct cpuidle_device *dev, int index)
 {
-	struct menu_device *data = &__get_cpu_var(menu_devices);
+	struct menu_device *data = this_cpu_ptr(&menu_devices);
 	data->last_state_idx = index;
 	if (index >= 0)
 		data->needs_update = 1;
@@ -382,7 +382,7 @@ static void menu_reflect(struct cpuidle_
  */
 static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 {
-	struct menu_device *data = &__get_cpu_var(menu_devices);
+	struct menu_device *data = this_cpu_ptr(&menu_devices);
 	int last_idx = data->last_state_idx;
 	struct cpuidle_state *target = &drv->states[last_idx];
 	unsigned int measured_us;


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 08/35] [PATCH 08/36] drivers/oprofile: Replace __get_cpu_var uses for address calculation
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (6 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 07/35] [PATCH 07/36] drivers/cpuidle: Replace __get_cpu_var uses for address calculation Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:08   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 09/35] [PATCH 09/36] drivers/clocksource: Replace __get_cpu_var used " Christoph Lameter
                   ` (27 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Robert Richter, oprofile-list

[-- Attachment #1: 0008-drivers-oprofile-Replace-__get_cpu_var-uses-for-addr.patch --]
[-- Type: text/plain, Size: 2499 bytes --]

Replace the uses of __get_cpu_var for address calculation with this_cpu_ptr.

Cc: Robert Richter <rric@kernel.org>
Cc: oprofile-list@lists.sf.net
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 drivers/oprofile/cpu_buffer.c | 10 +++++-----
 drivers/oprofile/timer_int.c  |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

Index: linux/drivers/oprofile/cpu_buffer.c
===================================================================
--- linux.orig/drivers/oprofile/cpu_buffer.c
+++ linux/drivers/oprofile/cpu_buffer.c
@@ -45,7 +45,7 @@ unsigned long oprofile_get_cpu_buffer_si
 
 void oprofile_cpu_buffer_inc_smpl_lost(void)
 {
-	struct oprofile_cpu_buffer *cpu_buf = &__get_cpu_var(op_cpu_buffer);
+	struct oprofile_cpu_buffer *cpu_buf = this_cpu_ptr(&op_cpu_buffer);
 
 	cpu_buf->sample_lost_overflow++;
 }
@@ -297,7 +297,7 @@ __oprofile_add_ext_sample(unsigned long
 			  unsigned long event, int is_kernel,
 			  struct task_struct *task)
 {
-	struct oprofile_cpu_buffer *cpu_buf = &__get_cpu_var(op_cpu_buffer);
+	struct oprofile_cpu_buffer *cpu_buf = this_cpu_ptr(&op_cpu_buffer);
 	unsigned long backtrace = oprofile_backtrace_depth;
 
 	/*
@@ -357,7 +357,7 @@ oprofile_write_reserve(struct op_entry *
 {
 	struct op_sample *sample;
 	int is_kernel = !user_mode(regs);
-	struct oprofile_cpu_buffer *cpu_buf = &__get_cpu_var(op_cpu_buffer);
+	struct oprofile_cpu_buffer *cpu_buf = this_cpu_ptr(&op_cpu_buffer);
 
 	cpu_buf->sample_received++;
 
@@ -412,13 +412,13 @@ int oprofile_write_commit(struct op_entr
 
 void oprofile_add_pc(unsigned long pc, int is_kernel, unsigned long event)
 {
-	struct oprofile_cpu_buffer *cpu_buf = &__get_cpu_var(op_cpu_buffer);
+	struct oprofile_cpu_buffer *cpu_buf = this_cpu_ptr(&op_cpu_buffer);
 	log_sample(cpu_buf, pc, 0, is_kernel, event, NULL);
 }
 
 void oprofile_add_trace(unsigned long pc)
 {
-	struct oprofile_cpu_buffer *cpu_buf = &__get_cpu_var(op_cpu_buffer);
+	struct oprofile_cpu_buffer *cpu_buf = this_cpu_ptr(&op_cpu_buffer);
 
 	if (!cpu_buf->tracing)
 		return;
Index: linux/drivers/oprofile/timer_int.c
===================================================================
--- linux.orig/drivers/oprofile/timer_int.c
+++ linux/drivers/oprofile/timer_int.c
@@ -32,7 +32,7 @@ static enum hrtimer_restart oprofile_hrt
 
 static void __oprofile_hrtimer_start(void *unused)
 {
-	struct hrtimer *hrtimer = &__get_cpu_var(oprofile_hrtimer);
+	struct hrtimer *hrtimer = this_cpu_ptr(&oprofile_hrtimer);
 
 	if (!ctr_running)
 		return;


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 09/35] [PATCH 09/36] drivers/clocksource: Replace __get_cpu_var used for address calculation
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (7 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 08/35] [PATCH 08/36] drivers/oprofile: " Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:08   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 10/35] [PATCH 10/36] drivers/net/ethernet/tile: Replace __get_cpu_var uses " Christoph Lameter
                   ` (26 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, James Hogan

[-- Attachment #1: 0009-drivers-clocksource-Replace-__get_cpu_var-used-for-a.patch --]
[-- Type: text/plain, Size: 773 bytes --]

Replace __get_cpu_var used for address calculation with this_cpu_ptr.

Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 drivers/clocksource/metag_generic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/drivers/clocksource/metag_generic.c
===================================================================
--- linux.orig/drivers/clocksource/metag_generic.c
+++ linux/drivers/clocksource/metag_generic.c
@@ -90,7 +90,7 @@ static struct clocksource clocksource_me
 
 static irqreturn_t metag_timer_interrupt(int irq, void *dummy)
 {
-	struct clock_event_device *evt = &__get_cpu_var(local_clockevent);
+	struct clock_event_device *evt = this_cpu_ptr(&local_clockevent);
 
 	evt->event_handler(evt);
 


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 10/35] [PATCH 10/36] drivers/net/ethernet/tile: Replace __get_cpu_var uses for address calculation
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (8 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 09/35] [PATCH 09/36] drivers/clocksource: Replace __get_cpu_var used " Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:08   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 11/35] [PATCH 11/36] watchdog: Replace __raw_get_cpu_var uses Christoph Lameter
                   ` (25 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Chris Metcalf

[-- Attachment #1: 0010-drivers-net-ethernet-tile-Replace-__get_cpu_var-uses.patch --]
[-- Type: text/plain, Size: 4658 bytes --]

Replace with this_cpu_ptr.

Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 drivers/net/ethernet/tile/tilegx.c  | 18 +++++++++---------
 drivers/net/ethernet/tile/tilepro.c |  8 ++++----
 2 files changed, 13 insertions(+), 13 deletions(-)

Index: linux/drivers/net/ethernet/tile/tilegx.c
===================================================================
--- linux.orig/drivers/net/ethernet/tile/tilegx.c
+++ linux/drivers/net/ethernet/tile/tilegx.c
@@ -423,7 +423,7 @@ static void tile_net_pop_all_buffers(int
 /* Provide linux buffers to mPIPE. */
 static void tile_net_provide_needed_buffers(void)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	int instance, kind;
 	for (instance = 0; instance < NR_MPIPE_MAX &&
 		     info->mpipe[instance].has_iqueue; instance++)	{
@@ -585,7 +585,7 @@ static void tile_net_receive_skb(struct
 /* Handle a packet.  Return true if "processed", false if "filtered". */
 static bool tile_net_handle_packet(int instance, gxio_mpipe_idesc_t *idesc)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	struct mpipe_data *md = &mpipe_data[instance];
 	struct net_device *dev = md->tile_net_devs_for_channel[idesc->channel];
 	uint8_t l2_offset;
@@ -651,7 +651,7 @@ drop:
  */
 static int tile_net_poll(struct napi_struct *napi, int budget)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	unsigned int work = 0;
 	gxio_mpipe_idesc_t *idesc;
 	int instance, i, n;
@@ -700,7 +700,7 @@ done:
 /* Handle an ingress interrupt from an instance on the current cpu. */
 static irqreturn_t tile_net_handle_ingress_irq(int irq, void *id)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	napi_schedule(&info->mpipe[(uint64_t)id].napi);
 	return IRQ_HANDLED;
 }
@@ -763,7 +763,7 @@ static enum hrtimer_restart tile_net_han
 /* Make sure the egress timer is scheduled. */
 static void tile_net_schedule_egress_timer(void)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 
 	if (!info->egress_timer_scheduled) {
 		hrtimer_start(&info->egress_timer,
@@ -780,7 +780,7 @@ static void tile_net_schedule_egress_tim
  */
 static enum hrtimer_restart tile_net_handle_egress_timer(struct hrtimer *t)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	unsigned long irqflags;
 	bool pending = false;
 	int i, instance;
@@ -1996,7 +1996,7 @@ static unsigned int tile_net_tx_frags(st
 /* Help the kernel transmit a packet. */
 static int tile_net_tx(struct sk_buff *skb, struct net_device *dev)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	struct tile_net_priv *priv = netdev_priv(dev);
 	int instance = priv->instance;
 	struct mpipe_data *md = &mpipe_data[instance];
@@ -2138,7 +2138,7 @@ static int tile_net_set_mac_address(stru
 static void tile_net_netpoll(struct net_device *dev)
 {
 	int instance = mpipe_instance(dev);
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	struct mpipe_data *md = &mpipe_data[instance];
 
 	disable_percpu_irq(md->ingress_irq);
@@ -2237,7 +2237,7 @@ static void tile_net_dev_init(const char
 /* Per-cpu module initialization. */
 static void tile_net_init_module_percpu(void *unused)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	int my_cpu = smp_processor_id();
 	int instance;
 
Index: linux/drivers/net/ethernet/tile/tilepro.c
===================================================================
--- linux.orig/drivers/net/ethernet/tile/tilepro.c
+++ linux/drivers/net/ethernet/tile/tilepro.c
@@ -996,13 +996,13 @@ static void tile_net_register(void *dev_
 	PDEBUG("tile_net_register(queue_id %d)\n", queue_id);
 
 	if (!strcmp(dev->name, "xgbe0"))
-		info = &__get_cpu_var(hv_xgbe0);
+		info = this_cpu_ptr(&hv_xgbe0);
 	else if (!strcmp(dev->name, "xgbe1"))
-		info = &__get_cpu_var(hv_xgbe1);
+		info = this_cpu_ptr(&hv_xgbe1);
 	else if (!strcmp(dev->name, "gbe0"))
-		info = &__get_cpu_var(hv_gbe0);
+		info = this_cpu_ptr(&hv_gbe0);
 	else if (!strcmp(dev->name, "gbe1"))
-		info = &__get_cpu_var(hv_gbe1);
+		info = this_cpu_ptr(&hv_gbe1);
 	else
 		BUG();
 


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 11/35] [PATCH 11/36] watchdog: Replace __raw_get_cpu_var uses
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (9 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 10/35] [PATCH 10/36] drivers/net/ethernet/tile: Replace __get_cpu_var uses " Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:08   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 12/35] [PATCH 12/36] net: Replace get_cpu_var through this_cpu_ptr Christoph Lameter
                   ` (24 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Wim Van Sebroeck, linux-watchdog

[-- Attachment #1: 0011-watchdog-Replace-__raw_get_cpu_var-uses.patch --]
[-- Type: text/plain, Size: 2106 bytes --]

Most of these are the uses of &__raw_get_cpu_var for address calculation.

touch_softlockup_watchdog_sync() uses __raw_get_cpu_var to write to
per cpu variables. Use __this_cpu_write instead.

Cc: Wim Van Sebroeck <wim@iguana.be>
Cc: linux-watchdog@vger.kernel.org
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 kernel/watchdog.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Index: linux/kernel/watchdog.c
===================================================================
--- linux.orig/kernel/watchdog.c
+++ linux/kernel/watchdog.c
@@ -185,7 +185,7 @@ void touch_nmi_watchdog(void)
 	 * case we shouldn't have to worry about the watchdog
 	 * going off.
 	 */
-	__raw_get_cpu_var(watchdog_nmi_touch) = true;
+	raw_cpu_write(watchdog_nmi_touch, true);
 	touch_softlockup_watchdog();
 }
 EXPORT_SYMBOL(touch_nmi_watchdog);
@@ -194,8 +194,8 @@ EXPORT_SYMBOL(touch_nmi_watchdog);
 
 void touch_softlockup_watchdog_sync(void)
 {
-	__raw_get_cpu_var(softlockup_touch_sync) = true;
-	__raw_get_cpu_var(watchdog_touch_ts) = 0;
+	__this_cpu_write(softlockup_touch_sync, true);
+	__this_cpu_write(watchdog_touch_ts, 0);
 }
 
 #ifdef CONFIG_HARDLOCKUP_DETECTOR
@@ -387,7 +387,7 @@ static void watchdog_set_prio(unsigned i
 
 static void watchdog_enable(unsigned int cpu)
 {
-	struct hrtimer *hrtimer = &__raw_get_cpu_var(watchdog_hrtimer);
+	struct hrtimer *hrtimer = raw_cpu_ptr(&watchdog_hrtimer);
 
 	/* kick off the timer for the hardlockup detector */
 	hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
@@ -407,7 +407,7 @@ static void watchdog_enable(unsigned int
 
 static void watchdog_disable(unsigned int cpu)
 {
-	struct hrtimer *hrtimer = &__raw_get_cpu_var(watchdog_hrtimer);
+	struct hrtimer *hrtimer = raw_cpu_ptr(&watchdog_hrtimer);
 
 	watchdog_set_prio(SCHED_NORMAL, 0);
 	hrtimer_cancel(hrtimer);
@@ -534,7 +534,7 @@ static struct smp_hotplug_thread watchdo
 
 static void restart_watchdog_hrtimer(void *info)
 {
-	struct hrtimer *hrtimer = &__raw_get_cpu_var(watchdog_hrtimer);
+	struct hrtimer *hrtimer = raw_cpu_ptr(&watchdog_hrtimer);
 	int ret;
 
 	/*


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 12/35] [PATCH 12/36] net: Replace get_cpu_var through this_cpu_ptr
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (10 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 11/35] [PATCH 11/36] watchdog: Replace __raw_get_cpu_var uses Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:09   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 13/35] [PATCH 13/36] md: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
                   ` (23 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, netdev, Eric Dumazet, David S. Miller

[-- Attachment #1: 0012-net-Replace-get_cpu_var-through-this_cpu_ptr.patch --]
[-- Type: text/plain, Size: 8378 bytes --]

Replace uses of get_cpu_var for address calculation through this_cpu_ptr.

Cc: netdev@vger.kernel.org
Cc: Eric Dumazet <edumazet@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 include/net/netfilter/nf_conntrack.h |  2 +-
 include/net/snmp.h                   |  6 +++---
 net/core/dev.c                       | 14 +++++++-------
 net/core/drop_monitor.c              |  2 +-
 net/core/skbuff.c                    |  2 +-
 net/ipv4/route.c                     |  4 ++--
 net/ipv4/syncookies.c                |  2 +-
 net/ipv4/tcp.c                       |  2 +-
 net/ipv4/tcp_output.c                |  2 +-
 net/ipv6/syncookies.c                |  2 +-
 net/rds/ib_rdma.c                    |  2 +-
 11 files changed, 20 insertions(+), 20 deletions(-)

Index: linux/include/net/netfilter/nf_conntrack.h
===================================================================
--- linux.orig/include/net/netfilter/nf_conntrack.h
+++ linux/include/net/netfilter/nf_conntrack.h
@@ -242,7 +242,7 @@ extern s32 (*nf_ct_nat_offset)(const str
 DECLARE_PER_CPU(struct nf_conn, nf_conntrack_untracked);
 static inline struct nf_conn *nf_ct_untracked_get(void)
 {
-	return &__raw_get_cpu_var(nf_conntrack_untracked);
+	return raw_cpu_ptr(&nf_conntrack_untracked);
 }
 void nf_ct_untracked_status_or(unsigned long bits);
 
Index: linux/include/net/snmp.h
===================================================================
--- linux.orig/include/net/snmp.h
+++ linux/include/net/snmp.h
@@ -168,7 +168,7 @@ struct linux_xfrm_mib {
 
 #define SNMP_ADD_STATS64_BH(mib, field, addend) 			\
 	do {								\
-		__typeof__(*mib) *ptr = __this_cpu_ptr(mib);		\
+		__typeof__(*mib) *ptr = raw_cpu_ptr(mib);		\
 		u64_stats_update_begin(&ptr->syncp);			\
 		ptr->mibs[field] += addend;				\
 		u64_stats_update_end(&ptr->syncp);			\
@@ -189,8 +189,8 @@ struct linux_xfrm_mib {
 #define SNMP_INC_STATS64(mib, field) SNMP_ADD_STATS64(mib, field, 1)
 #define SNMP_UPD_PO_STATS64_BH(mib, basefield, addend)			\
 	do {								\
-		__typeof__(*mib) *ptr;					\
-		ptr = __this_cpu_ptr(mib);				\
+		__typeof__(*mib) *ptr;				\
+		ptr = raw_cpu_ptr((mib));				\
 		u64_stats_update_begin(&ptr->syncp);			\
 		ptr->mibs[basefield##PKTS]++;				\
 		ptr->mibs[basefield##OCTETS] += addend;			\
Index: linux/net/core/dev.c
===================================================================
--- linux.orig/net/core/dev.c
+++ linux/net/core/dev.c
@@ -2153,7 +2153,7 @@ static inline void __netif_reschedule(st
 	unsigned long flags;
 
 	local_irq_save(flags);
-	sd = &__get_cpu_var(softnet_data);
+	sd = this_cpu_ptr(&softnet_data);
 	q->next_sched = NULL;
 	*sd->output_queue_tailp = q;
 	sd->output_queue_tailp = &q->next_sched;
@@ -3195,7 +3195,7 @@ static void rps_trigger_softirq(void *da
 static int rps_ipi_queued(struct softnet_data *sd)
 {
 #ifdef CONFIG_RPS
-	struct softnet_data *mysd = &__get_cpu_var(softnet_data);
+	struct softnet_data *mysd = this_cpu_ptr(&softnet_data);
 
 	if (sd != mysd) {
 		sd->rps_ipi_next = mysd->rps_ipi_list;
@@ -3222,7 +3222,7 @@ static bool skb_flow_limit(struct sk_buf
 	if (qlen < (netdev_max_backlog >> 1))
 		return false;
 
-	sd = &__get_cpu_var(softnet_data);
+	sd = this_cpu_ptr(&softnet_data);
 
 	rcu_read_lock();
 	fl = rcu_dereference(sd->flow_limit);
@@ -3369,7 +3369,7 @@ EXPORT_SYMBOL(netif_rx_ni);
 
 static void net_tx_action(struct softirq_action *h)
 {
-	struct softnet_data *sd = &__get_cpu_var(softnet_data);
+	struct softnet_data *sd = this_cpu_ptr(&softnet_data);
 
 	if (sd->completion_queue) {
 		struct sk_buff *clist;
@@ -3794,7 +3794,7 @@ EXPORT_SYMBOL(netif_receive_skb);
 static void flush_backlog(void *arg)
 {
 	struct net_device *dev = arg;
-	struct softnet_data *sd = &__get_cpu_var(softnet_data);
+	struct softnet_data *sd = this_cpu_ptr(&softnet_data);
 	struct sk_buff *skb, *tmp;
 
 	rps_lock(sd);
@@ -4301,7 +4301,7 @@ void __napi_schedule(struct napi_struct
 	unsigned long flags;
 
 	local_irq_save(flags);
-	____napi_schedule(&__get_cpu_var(softnet_data), n);
+	____napi_schedule(this_cpu_ptr(&softnet_data), n);
 	local_irq_restore(flags);
 }
 EXPORT_SYMBOL(__napi_schedule);
@@ -4422,7 +4422,7 @@ EXPORT_SYMBOL(netif_napi_del);
 
 static void net_rx_action(struct softirq_action *h)
 {
-	struct softnet_data *sd = &__get_cpu_var(softnet_data);
+	struct softnet_data *sd = this_cpu_ptr(&softnet_data);
 	unsigned long time_limit = jiffies + 2;
 	int budget = netdev_budget;
 	void *have;
Index: linux/net/core/drop_monitor.c
===================================================================
--- linux.orig/net/core/drop_monitor.c
+++ linux/net/core/drop_monitor.c
@@ -146,7 +146,7 @@ static void trace_drop_common(struct sk_
 	unsigned long flags;
 
 	local_irq_save(flags);
-	data = &__get_cpu_var(dm_cpu_data);
+	data = this_cpu_ptr(&dm_cpu_data);
 	spin_lock(&data->lock);
 	dskb = data->skb;
 
Index: linux/net/core/skbuff.c
===================================================================
--- linux.orig/net/core/skbuff.c
+++ linux/net/core/skbuff.c
@@ -345,7 +345,7 @@ static void *__netdev_alloc_frag(unsigne
 	unsigned long flags;
 
 	local_irq_save(flags);
-	nc = &__get_cpu_var(netdev_alloc_cache);
+	nc = this_cpu_ptr(&netdev_alloc_cache);
 	if (unlikely(!nc->frag.page)) {
 refill:
 		for (order = NETDEV_FRAG_PAGE_MAX_ORDER; ;) {
Index: linux/net/ipv4/route.c
===================================================================
--- linux.orig/net/ipv4/route.c
+++ linux/net/ipv4/route.c
@@ -1311,7 +1311,7 @@ static bool rt_cache_route(struct fib_nh
 	if (rt_is_input_route(rt)) {
 		p = (struct rtable **)&nh->nh_rth_input;
 	} else {
-		p = (struct rtable **)__this_cpu_ptr(nh->nh_pcpu_rth_output);
+		p = (struct rtable **)raw_cpu_ptr(nh->nh_pcpu_rth_output);
 	}
 	orig = *p;
 
@@ -1939,7 +1939,7 @@ static struct rtable *__mkroute_output(c
 				do_cache = false;
 				goto add;
 			}
-			prth = __this_cpu_ptr(nh->nh_pcpu_rth_output);
+			prth = raw_cpu_ptr(nh->nh_pcpu_rth_output);
 		}
 		rth = rcu_dereference(*prth);
 		if (rt_cache_valid(rth)) {
Index: linux/net/ipv4/syncookies.c
===================================================================
--- linux.orig/net/ipv4/syncookies.c
+++ linux/net/ipv4/syncookies.c
@@ -40,7 +40,7 @@ static u32 cookie_hash(__be32 saddr, __b
 
 	net_get_random_once(syncookie_secret, sizeof(syncookie_secret));
 
-	tmp  = __get_cpu_var(ipv4_cookie_scratch);
+	tmp  = this_cpu_ptr(ipv4_cookie_scratch);
 	memcpy(tmp + 4, syncookie_secret[c], sizeof(syncookie_secret[c]));
 	tmp[0] = (__force u32)saddr;
 	tmp[1] = (__force u32)daddr;
Index: linux/net/ipv4/tcp.c
===================================================================
--- linux.orig/net/ipv4/tcp.c
+++ linux/net/ipv4/tcp.c
@@ -3058,7 +3058,7 @@ struct tcp_md5sig_pool *tcp_get_md5sig_p
 	local_bh_disable();
 	p = ACCESS_ONCE(tcp_md5sig_pool);
 	if (p)
-		return __this_cpu_ptr(p);
+		return raw_cpu_ptr(p);
 
 	local_bh_enable();
 	return NULL;
Index: linux/net/ipv4/tcp_output.c
===================================================================
--- linux.orig/net/ipv4/tcp_output.c
+++ linux/net/ipv4/tcp_output.c
@@ -842,7 +842,7 @@ void tcp_wfree(struct sk_buff *skb)
 
 		/* queue this socket to tasklet queue */
 		local_irq_save(flags);
-		tsq = &__get_cpu_var(tsq_tasklet);
+		tsq = this_cpu_ptr(&tsq_tasklet);
 		list_add(&tp->tsq_node, &tsq->head);
 		tasklet_schedule(&tsq->tasklet);
 		local_irq_restore(flags);
Index: linux/net/ipv6/syncookies.c
===================================================================
--- linux.orig/net/ipv6/syncookies.c
+++ linux/net/ipv6/syncookies.c
@@ -67,7 +67,7 @@ static u32 cookie_hash(const struct in6_
 
 	net_get_random_once(syncookie6_secret, sizeof(syncookie6_secret));
 
-	tmp  = __get_cpu_var(ipv6_cookie_scratch);
+	tmp  = this_cpu_ptr(ipv6_cookie_scratch);
 
 	/*
 	 * we have 320 bits of information to hash, copy in the remaining
Index: linux/net/rds/ib_rdma.c
===================================================================
--- linux.orig/net/rds/ib_rdma.c
+++ linux/net/rds/ib_rdma.c
@@ -267,7 +267,7 @@ static inline struct rds_ib_mr *rds_ib_r
 	unsigned long *flag;
 
 	preempt_disable();
-	flag = &__get_cpu_var(clean_list_grace);
+	flag = this_cpu_ptr(&clean_list_grace);
 	set_bit(CLEAN_LIST_BUSY_BIT, flag);
 	ret = llist_del_first(&pool->clean_list);
 	if (ret)


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 13/35] [PATCH 13/36] md: Replace __this_cpu_ptr with raw_cpu_ptr
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (11 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 12/35] [PATCH 12/36] net: Replace get_cpu_var through this_cpu_ptr Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:09   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 14/35] [PATCH 14/36] metag: Replace __get_cpu_var uses for address calculation Christoph Lameter
                   ` (22 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: 0013-md-Replace-__this_cpu_ptr-with-raw_cpu_ptr.patch --]
[-- Type: text/plain, Size: 746 bytes --]

__this_cpu_ptr is being phased out.

Signed-off-by: Christoph Lameter <cl@linux.com>
---
 drivers/md/dm-stats.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/drivers/md/dm-stats.c
===================================================================
--- linux.orig/drivers/md/dm-stats.c
+++ linux/drivers/md/dm-stats.c
@@ -548,7 +548,7 @@ void dm_stats_account_io(struct dm_stats
 		 * A race condition can at worst result in the merged flag being
 		 * misrepresented, so we don't have to disable preemption here.
 		 */
-		last = __this_cpu_ptr(stats->last);
+		last = raw_cpu_ptr(stats->last);
 		stats_aux->merged =
 			(bi_sector == (ACCESS_ONCE(last->last_sector) &&
 				       ((bi_rw & (REQ_WRITE | REQ_DISCARD)) ==


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 14/35] [PATCH 14/36] metag: Replace __get_cpu_var uses for address calculation
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (12 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 13/35] [PATCH 13/36] md: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:09   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 15/35] [PATCH 15/36] drivers/net/ethernet/tile: __get_cpu_var call introduced in 3.14 Christoph Lameter
                   ` (21 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, James Hogan

[-- Attachment #1: 0014-metag-Replace-__get_cpu_var-uses-for-address-calcula.patch --]
[-- Type: text/plain, Size: 2732 bytes --]

Replace __get_cpu_var uses for address calculation with this_cpu_ptr().

Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/metag/kernel/perf/perf_event.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

Index: linux/arch/metag/kernel/perf/perf_event.c
===================================================================
--- linux.orig/arch/metag/kernel/perf/perf_event.c
+++ linux/arch/metag/kernel/perf/perf_event.c
@@ -258,7 +258,7 @@ int metag_pmu_event_set_period(struct pe
 
 static void metag_pmu_start(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
@@ -306,7 +306,7 @@ static void metag_pmu_stop(struct perf_e
 
 static int metag_pmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = 0, ret = 0;
 
@@ -348,7 +348,7 @@ out:
 
 static void metag_pmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
@@ -597,7 +597,7 @@ static int _hw_perf_event_init(struct pe
 
 static void metag_pmu_enable_counter(struct hw_perf_event *event, int idx)
 {
-	struct cpu_hw_events *events = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *events = this_cpu_ptr(&cpu_hw_events);
 	unsigned int config = event->config;
 	unsigned int tmp = config & 0xf0;
 	unsigned long flags;
@@ -670,7 +670,7 @@ unlock:
 
 static void metag_pmu_disable_counter(struct hw_perf_event *event, int idx)
 {
-	struct cpu_hw_events *events = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *events = this_cpu_ptr(&cpu_hw_events);
 	unsigned int tmp = 0;
 	unsigned long flags;
 
@@ -718,7 +718,7 @@ out:
 
 static void metag_pmu_write_counter(int idx, u32 val)
 {
-	struct cpu_hw_events *events = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *events = this_cpu_ptr(&cpu_hw_events);
 	u32 tmp = 0;
 	unsigned long flags;
 
@@ -751,7 +751,7 @@ static int metag_pmu_event_map(int idx)
 static irqreturn_t metag_pmu_counter_overflow(int irq, void *dev)
 {
 	int idx = (int)dev;
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 	struct perf_event *event = cpuhw->events[idx];
 	struct hw_perf_event *hwc = &event->hw;
 	struct pt_regs *regs = get_irq_regs();


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 15/35] [PATCH 15/36] drivers/net/ethernet/tile: __get_cpu_var call introduced in 3.14
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (13 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 14/35] [PATCH 14/36] metag: Replace __get_cpu_var uses for address calculation Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:09   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 16/35] [PATCH 16/36] irqchips: Replace __this_cpu_ptr uses Christoph Lameter
                   ` (20 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: 0015-drivers-net-ethernet-tile-__get_cpu_var-call-introdu.patch --]
[-- Type: text/plain, Size: 1175 bytes --]

Another case was merged for 3.14-rc1

Signed-off-by: Christoph Lameter <cl@linux.com>
---
 drivers/net/ethernet/tile/tilegx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux/drivers/net/ethernet/tile/tilegx.c
===================================================================
--- linux.orig/drivers/net/ethernet/tile/tilegx.c
+++ linux/drivers/net/ethernet/tile/tilegx.c
@@ -551,7 +551,7 @@ static inline bool filter_packet(struct
 static void tile_net_receive_skb(struct net_device *dev, struct sk_buff *skb,
 				 gxio_mpipe_idesc_t *idesc, unsigned long len)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	struct tile_net_priv *priv = netdev_priv(dev);
 	int instance = priv->instance;
 
@@ -1927,7 +1927,7 @@ static void tso_egress(struct net_device
  */
 static int tile_net_tx_tso(struct sk_buff *skb, struct net_device *dev)
 {
-	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	struct tile_net_info *info = this_cpu_ptr(&per_cpu_info);
 	struct tile_net_priv *priv = netdev_priv(dev);
 	int channel = priv->echannel;
 	int instance = priv->instance;


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 16/35] [PATCH 16/36] irqchips: Replace __this_cpu_ptr uses
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (14 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 15/35] [PATCH 15/36] drivers/net/ethernet/tile: __get_cpu_var call introduced in 3.14 Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:10   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 17/35] [PATCH 17/36] x86: Replace __get_cpu_var uses Christoph Lameter
                   ` (19 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, nicolas.pitre, Russell King

[-- Attachment #1: 0016-irqchips-Replace-__this_cpu_ptr-uses.patch --]
[-- Type: text/plain, Size: 2566 bytes --]

[ARM specific]

These are generally replaced with raw_cpu_ptr. However, in
gic_get_percpu_base() we immediately dereference the pointer. This is
equivalent to a raw_cpu_read. So use that operation there.

Cc: nicolas.pitre@linaro.org
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 drivers/irqchip/irq-gic.c | 10 +++++-----
 kernel/irq/chip.c         |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

Index: linux/drivers/irqchip/irq-gic.c
===================================================================
--- linux.orig/drivers/irqchip/irq-gic.c
+++ linux/drivers/irqchip/irq-gic.c
@@ -102,7 +102,7 @@ static struct gic_chip_data gic_data[MAX
 #ifdef CONFIG_GIC_NON_BANKED
 static void __iomem *gic_get_percpu_base(union gic_base *base)
 {
-	return *__this_cpu_ptr(base->percpu_base);
+	return raw_cpu_read(base->percpu_base);
 }
 
 static void __iomem *gic_get_common_base(union gic_base *base)
@@ -504,11 +504,11 @@ static void gic_cpu_save(unsigned int gi
 	if (!dist_base || !cpu_base)
 		return;
 
-	ptr = __this_cpu_ptr(gic_data[gic_nr].saved_ppi_enable);
+	ptr = raw_cpu_ptr(gic_data[gic_nr].saved_ppi_enable);
 	for (i = 0; i < DIV_ROUND_UP(32, 32); i++)
 		ptr[i] = readl_relaxed(dist_base + GIC_DIST_ENABLE_SET + i * 4);
 
-	ptr = __this_cpu_ptr(gic_data[gic_nr].saved_ppi_conf);
+	ptr = raw_cpu_ptr(gic_data[gic_nr].saved_ppi_conf);
 	for (i = 0; i < DIV_ROUND_UP(32, 16); i++)
 		ptr[i] = readl_relaxed(dist_base + GIC_DIST_CONFIG + i * 4);
 
@@ -530,11 +530,11 @@ static void gic_cpu_restore(unsigned int
 	if (!dist_base || !cpu_base)
 		return;
 
-	ptr = __this_cpu_ptr(gic_data[gic_nr].saved_ppi_enable);
+	ptr = raw_cpu_ptr(gic_data[gic_nr].saved_ppi_enable);
 	for (i = 0; i < DIV_ROUND_UP(32, 32); i++)
 		writel_relaxed(ptr[i], dist_base + GIC_DIST_ENABLE_SET + i * 4);
 
-	ptr = __this_cpu_ptr(gic_data[gic_nr].saved_ppi_conf);
+	ptr = raw_cpu_ptr(gic_data[gic_nr].saved_ppi_conf);
 	for (i = 0; i < DIV_ROUND_UP(32, 16); i++)
 		writel_relaxed(ptr[i], dist_base + GIC_DIST_CONFIG + i * 4);
 
Index: linux/kernel/irq/chip.c
===================================================================
--- linux.orig/kernel/irq/chip.c
+++ linux/kernel/irq/chip.c
@@ -669,7 +669,7 @@ void handle_percpu_devid_irq(unsigned in
 {
 	struct irq_chip *chip = irq_desc_get_chip(desc);
 	struct irqaction *action = desc->action;
-	void *dev_id = __this_cpu_ptr(action->percpu_dev_id);
+	void *dev_id = raw_cpu_ptr(action->percpu_dev_id);
 	irqreturn_t res;
 
 	kstat_incr_irqs_this_cpu(irq, desc);


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 17/35] [PATCH 17/36] x86: Replace __get_cpu_var uses
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (15 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 16/35] [PATCH 16/36] irqchips: Replace __this_cpu_ptr uses Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:10   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 18/35] [PATCH 18/36] uv: Replace __get_cpu_var Christoph Lameter
                   ` (18 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, x86, H. Peter Anvin

[-- Attachment #1: 0017-x86-Replace-__get_cpu_var-uses.patch --]
[-- Type: text/plain, Size: 50235 bytes --]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and less registers
are used when code is generated.

Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)


Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/x86/include/asm/debugreg.h             |  4 +--
 arch/x86/include/asm/uv/uv_hub.h            |  2 +-
 arch/x86/kernel/apb_timer.c                 |  4 +--
 arch/x86/kernel/apic/apic.c                 |  4 +--
 arch/x86/kernel/apic/x2apic_cluster.c       |  2 +-
 arch/x86/kernel/cpu/common.c                |  6 ++--
 arch/x86/kernel/cpu/mcheck/mce-inject.c     |  6 ++--
 arch/x86/kernel/cpu/mcheck/mce.c            | 46 ++++++++++++++---------------
 arch/x86/kernel/cpu/mcheck/mce_amd.c        |  2 +-
 arch/x86/kernel/cpu/mcheck/mce_intel.c      | 22 +++++++-------
 arch/x86/kernel/cpu/perf_event.c            | 22 +++++++-------
 arch/x86/kernel/cpu/perf_event_amd.c        |  4 +--
 arch/x86/kernel/cpu/perf_event_intel.c      | 18 +++++------
 arch/x86/kernel/cpu/perf_event_intel_ds.c   | 20 ++++++-------
 arch/x86/kernel/cpu/perf_event_intel_lbr.c  | 12 ++++----
 arch/x86/kernel/cpu/perf_event_intel_rapl.c | 12 ++++----
 arch/x86/kernel/cpu/perf_event_knc.c        |  2 +-
 arch/x86/kernel/cpu/perf_event_p4.c         |  6 ++--
 arch/x86/kernel/hw_breakpoint.c             |  8 ++---
 arch/x86/kernel/irq_64.c                    |  6 ++--
 arch/x86/kernel/kvm.c                       | 22 +++++++-------
 arch/x86/kvm/svm.c                          |  6 ++--
 arch/x86/kvm/vmx.c                          | 10 +++----
 arch/x86/kvm/x86.c                          |  2 +-
 arch/x86/mm/kmemcheck/kmemcheck.c           | 14 ++++-----
 arch/x86/oprofile/nmi_int.c                 |  8 ++---
 arch/x86/platform/uv/uv_time.c              |  2 +-
 arch/x86/xen/enlighten.c                    |  4 +--
 arch/x86/xen/multicalls.c                   |  8 ++---
 arch/x86/xen/spinlock.c                     |  2 +-
 arch/x86/xen/time.c                         | 10 +++----
 31 files changed, 148 insertions(+), 148 deletions(-)

Index: linux/arch/x86/include/asm/debugreg.h
===================================================================
--- linux.orig/arch/x86/include/asm/debugreg.h
+++ linux/arch/x86/include/asm/debugreg.h
@@ -97,11 +97,11 @@ extern void hw_breakpoint_restore(void);
 DECLARE_PER_CPU(int, debug_stack_usage);
 static inline void debug_stack_usage_inc(void)
 {
-	__get_cpu_var(debug_stack_usage)++;
+	__this_cpu_inc(debug_stack_usage);
 }
 static inline void debug_stack_usage_dec(void)
 {
-	__get_cpu_var(debug_stack_usage)--;
+	__this_cpu_dec(debug_stack_usage);
 }
 int is_debug_stack(unsigned long addr);
 void debug_stack_set_zero(void);
Index: linux/arch/x86/include/asm/uv/uv_hub.h
===================================================================
--- linux.orig/arch/x86/include/asm/uv/uv_hub.h
+++ linux/arch/x86/include/asm/uv/uv_hub.h
@@ -164,7 +164,7 @@ struct uv_hub_info_s {
 };
 
 DECLARE_PER_CPU(struct uv_hub_info_s, __uv_hub_info);
-#define uv_hub_info		(&__get_cpu_var(__uv_hub_info))
+#define uv_hub_info		this_cpu_ptr(&__uv_hub_info)
 #define uv_cpu_hub_info(cpu)	(&per_cpu(__uv_hub_info, cpu))
 
 /*
Index: linux/arch/x86/kernel/apb_timer.c
===================================================================
--- linux.orig/arch/x86/kernel/apb_timer.c
+++ linux/arch/x86/kernel/apb_timer.c
@@ -146,7 +146,7 @@ static inline int is_apbt_capable(void)
 static int __init apbt_clockevent_register(void)
 {
 	struct sfi_timer_table_entry *mtmr;
-	struct apbt_dev *adev = &__get_cpu_var(cpu_apbt_dev);
+	struct apbt_dev *adev = this_cpu_ptr(&cpu_apbt_dev);
 
 	mtmr = sfi_get_mtmr(APBT_CLOCKEVENT0_NUM);
 	if (mtmr == NULL) {
@@ -200,7 +200,7 @@ void apbt_setup_secondary_clock(void)
 	if (!cpu)
 		return;
 
-	adev = &__get_cpu_var(cpu_apbt_dev);
+	adev = this_cpu_ptr(&cpu_apbt_dev);
 	if (!adev->timer) {
 		adev->timer = dw_apb_clockevent_init(cpu, adev->name,
 			APBT_CLOCKEVENT_RATING, adev_virt_addr(adev),
Index: linux/arch/x86/kernel/apic/apic.c
===================================================================
--- linux.orig/arch/x86/kernel/apic/apic.c
+++ linux/arch/x86/kernel/apic/apic.c
@@ -561,7 +561,7 @@ static DEFINE_PER_CPU(struct clock_event
  */
 static void setup_APIC_timer(void)
 {
-	struct clock_event_device *levt = &__get_cpu_var(lapic_events);
+	struct clock_event_device *levt = this_cpu_ptr(&lapic_events);
 
 	if (this_cpu_has(X86_FEATURE_ARAT)) {
 		lapic_clockevent.features &= ~CLOCK_EVT_FEAT_C3STOP;
@@ -696,7 +696,7 @@ calibrate_by_pmtimer(long deltapm, long
 
 static int __init calibrate_APIC_clock(void)
 {
-	struct clock_event_device *levt = &__get_cpu_var(lapic_events);
+	struct clock_event_device *levt = this_cpu_ptr(&lapic_events);
 	void (*real_handler)(struct clock_event_device *dev);
 	unsigned long deltaj;
 	long delta, deltatsc;
Index: linux/arch/x86/kernel/cpu/common.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/common.c
+++ linux/arch/x86/kernel/cpu/common.c
@@ -1198,9 +1198,9 @@ DEFINE_PER_CPU(int, debug_stack_usage);
 
 int is_debug_stack(unsigned long addr)
 {
-	return __get_cpu_var(debug_stack_usage) ||
-		(addr <= __get_cpu_var(debug_stack_addr) &&
-		 addr > (__get_cpu_var(debug_stack_addr) - DEBUG_STKSZ));
+	return __this_cpu_read(debug_stack_usage) ||
+		(addr <= __this_cpu_read(debug_stack_addr) &&
+		 addr > (__this_cpu_read(debug_stack_addr) - DEBUG_STKSZ));
 }
 NOKPROBE_SYMBOL(is_debug_stack);
 
Index: linux/arch/x86/kernel/cpu/mcheck/mce-inject.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce-inject.c
+++ linux/arch/x86/kernel/cpu/mcheck/mce-inject.c
@@ -83,7 +83,7 @@ static DEFINE_MUTEX(mce_inject_mutex);
 static int mce_raise_notify(unsigned int cmd, struct pt_regs *regs)
 {
 	int cpu = smp_processor_id();
-	struct mce *m = &__get_cpu_var(injectm);
+	struct mce *m = this_cpu_ptr(&injectm);
 	if (!cpumask_test_cpu(cpu, mce_inject_cpumask))
 		return NMI_DONE;
 	cpumask_clear_cpu(cpu, mce_inject_cpumask);
@@ -97,7 +97,7 @@ static int mce_raise_notify(unsigned int
 static void mce_irq_ipi(void *info)
 {
 	int cpu = smp_processor_id();
-	struct mce *m = &__get_cpu_var(injectm);
+	struct mce *m = this_cpu_ptr(&injectm);
 
 	if (cpumask_test_cpu(cpu, mce_inject_cpumask) &&
 			m->inject_flags & MCJ_EXCEPTION) {
@@ -109,7 +109,7 @@ static void mce_irq_ipi(void *info)
 /* Inject mce on current CPU */
 static int raise_local(void)
 {
-	struct mce *m = &__get_cpu_var(injectm);
+	struct mce *m = this_cpu_ptr(&injectm);
 	int context = MCJ_CTX(m->inject_flags);
 	int ret = 0;
 	int cpu = m->extcpu;
Index: linux/arch/x86/kernel/cpu/mcheck/mce.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce.c
+++ linux/arch/x86/kernel/cpu/mcheck/mce.c
@@ -400,7 +400,7 @@ static u64 mce_rdmsrl(u32 msr)
 
 		if (offset < 0)
 			return 0;
-		return *(u64 *)((char *)&__get_cpu_var(injectm) + offset);
+		return *(u64 *)((char *)this_cpu_ptr(&injectm) + offset);
 	}
 
 	if (rdmsrl_safe(msr, &v)) {
@@ -422,7 +422,7 @@ static void mce_wrmsrl(u32 msr, u64 v)
 		int offset = msr_to_offset(msr);
 
 		if (offset >= 0)
-			*(u64 *)((char *)&__get_cpu_var(injectm) + offset) = v;
+			*(u64 *)((char *)this_cpu_ptr(&injectm) + offset) = v;
 		return;
 	}
 	wrmsrl(msr, v);
@@ -478,7 +478,7 @@ static DEFINE_PER_CPU(struct mce_ring, m
 /* Runs with CPU affinity in workqueue */
 static int mce_ring_empty(void)
 {
-	struct mce_ring *r = &__get_cpu_var(mce_ring);
+	struct mce_ring *r = this_cpu_ptr(&mce_ring);
 
 	return r->start == r->end;
 }
@@ -490,7 +490,7 @@ static int mce_ring_get(unsigned long *p
 
 	*pfn = 0;
 	get_cpu();
-	r = &__get_cpu_var(mce_ring);
+	r = this_cpu_ptr(&mce_ring);
 	if (r->start == r->end)
 		goto out;
 	*pfn = r->ring[r->start];
@@ -504,7 +504,7 @@ out:
 /* Always runs in MCE context with preempt off */
 static int mce_ring_add(unsigned long pfn)
 {
-	struct mce_ring *r = &__get_cpu_var(mce_ring);
+	struct mce_ring *r = this_cpu_ptr(&mce_ring);
 	unsigned next;
 
 	next = (r->end + 1) % MCE_RING_SIZE;
@@ -526,7 +526,7 @@ int mce_available(struct cpuinfo_x86 *c)
 static void mce_schedule_work(void)
 {
 	if (!mce_ring_empty())
-		schedule_work(&__get_cpu_var(mce_work));
+		schedule_work(this_cpu_ptr(&mce_work));
 }
 
 DEFINE_PER_CPU(struct irq_work, mce_irq_work);
@@ -551,7 +551,7 @@ static void mce_report_event(struct pt_r
 		return;
 	}
 
-	irq_work_queue(&__get_cpu_var(mce_irq_work));
+	irq_work_queue(this_cpu_ptr(&mce_irq_work));
 }
 
 /*
@@ -1045,7 +1045,7 @@ void do_machine_check(struct pt_regs *re
 
 	mce_gather_info(&m, regs);
 
-	final = &__get_cpu_var(mces_seen);
+	final = this_cpu_ptr(&mces_seen);
 	*final = m;
 
 	memset(valid_banks, 0, sizeof(valid_banks));
@@ -1278,22 +1278,22 @@ static unsigned long (*mce_adjust_timer)
 
 static int cmc_error_seen(void)
 {
-	unsigned long *v = &__get_cpu_var(mce_polled_error);
+	unsigned long *v = this_cpu_ptr(&mce_polled_error);
 
 	return test_and_clear_bit(0, v);
 }
 
 static void mce_timer_fn(unsigned long data)
 {
-	struct timer_list *t = &__get_cpu_var(mce_timer);
+	struct timer_list *t = this_cpu_ptr(&mce_timer);
 	unsigned long iv;
 	int notify;
 
 	WARN_ON(smp_processor_id() != data);
 
-	if (mce_available(__this_cpu_ptr(&cpu_info))) {
+	if (mce_available(this_cpu_ptr(&cpu_info))) {
 		machine_check_poll(MCP_TIMESTAMP,
-				&__get_cpu_var(mce_poll_banks));
+				this_cpu_ptr(&mce_poll_banks));
 		mce_intel_cmci_poll();
 	}
 
@@ -1323,7 +1323,7 @@ static void mce_timer_fn(unsigned long d
  */
 void mce_timer_kick(unsigned long interval)
 {
-	struct timer_list *t = &__get_cpu_var(mce_timer);
+	struct timer_list *t = this_cpu_ptr(&mce_timer);
 	unsigned long when = jiffies + interval;
 	unsigned long iv = __this_cpu_read(mce_next_interval);
 
@@ -1659,7 +1659,7 @@ static void mce_start_timer(unsigned int
 
 static void __mcheck_cpu_init_timer(void)
 {
-	struct timer_list *t = &__get_cpu_var(mce_timer);
+	struct timer_list *t = this_cpu_ptr(&mce_timer);
 	unsigned int cpu = smp_processor_id();
 
 	setup_timer(t, mce_timer_fn, cpu);
@@ -1702,8 +1702,8 @@ void mcheck_cpu_init(struct cpuinfo_x86
 	__mcheck_cpu_init_generic();
 	__mcheck_cpu_init_vendor(c);
 	__mcheck_cpu_init_timer();
-	INIT_WORK(&__get_cpu_var(mce_work), mce_process_work);
-	init_irq_work(&__get_cpu_var(mce_irq_work), &mce_irq_work_cb);
+	INIT_WORK(this_cpu_ptr(&mce_work), mce_process_work);
+	init_irq_work(this_cpu_ptr(&mce_irq_work), &mce_irq_work_cb);
 }
 
 /*
@@ -1955,7 +1955,7 @@ static struct miscdevice mce_chrdev_devi
 static void __mce_disable_bank(void *arg)
 {
 	int bank = *((int *)arg);
-	__clear_bit(bank, __get_cpu_var(mce_poll_banks));
+	__clear_bit(bank, this_cpu_ptr(mce_poll_banks));
 	cmci_disable_bank(bank);
 }
 
@@ -2065,7 +2065,7 @@ static void mce_syscore_shutdown(void)
 static void mce_syscore_resume(void)
 {
 	__mcheck_cpu_init_generic();
-	__mcheck_cpu_init_vendor(__this_cpu_ptr(&cpu_info));
+	__mcheck_cpu_init_vendor(raw_cpu_ptr(&cpu_info));
 }
 
 static struct syscore_ops mce_syscore_ops = {
@@ -2080,7 +2080,7 @@ static struct syscore_ops mce_syscore_op
 
 static void mce_cpu_restart(void *data)
 {
-	if (!mce_available(__this_cpu_ptr(&cpu_info)))
+	if (!mce_available(raw_cpu_ptr(&cpu_info)))
 		return;
 	__mcheck_cpu_init_generic();
 	__mcheck_cpu_init_timer();
@@ -2096,14 +2096,14 @@ static void mce_restart(void)
 /* Toggle features for corrected errors */
 static void mce_disable_cmci(void *data)
 {
-	if (!mce_available(__this_cpu_ptr(&cpu_info)))
+	if (!mce_available(raw_cpu_ptr(&cpu_info)))
 		return;
 	cmci_clear();
 }
 
 static void mce_enable_ce(void *all)
 {
-	if (!mce_available(__this_cpu_ptr(&cpu_info)))
+	if (!mce_available(raw_cpu_ptr(&cpu_info)))
 		return;
 	cmci_reenable();
 	cmci_recheck();
@@ -2336,7 +2336,7 @@ static void mce_disable_cpu(void *h)
 	unsigned long action = *(unsigned long *)h;
 	int i;
 
-	if (!mce_available(__this_cpu_ptr(&cpu_info)))
+	if (!mce_available(raw_cpu_ptr(&cpu_info)))
 		return;
 
 	if (!(action & CPU_TASKS_FROZEN))
@@ -2354,7 +2354,7 @@ static void mce_reenable_cpu(void *h)
 	unsigned long action = *(unsigned long *)h;
 	int i;
 
-	if (!mce_available(__this_cpu_ptr(&cpu_info)))
+	if (!mce_available(raw_cpu_ptr(&cpu_info)))
 		return;
 
 	if (!(action & CPU_TASKS_FROZEN))
Index: linux/arch/x86/kernel/cpu/mcheck/mce_amd.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ linux/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -310,7 +310,7 @@ static void amd_threshold_interrupt(void
 			 * event.
 			 */
 			machine_check_poll(MCP_TIMESTAMP,
-					&__get_cpu_var(mce_poll_banks));
+					this_cpu_ptr(&mce_poll_banks));
 
 			if (high & MASK_OVERFLOW_HI) {
 				rdmsrl(address, m.misc);
Index: linux/arch/x86/kernel/cpu/mcheck/mce_intel.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ linux/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -86,7 +86,7 @@ void mce_intel_cmci_poll(void)
 {
 	if (__this_cpu_read(cmci_storm_state) == CMCI_STORM_NONE)
 		return;
-	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
+	machine_check_poll(MCP_TIMESTAMP, this_cpu_ptr(&mce_banks_owned));
 }
 
 void mce_intel_hcpu_update(unsigned long cpu)
@@ -145,7 +145,7 @@ static void cmci_storm_disable_banks(voi
 	u64 val;
 
 	raw_spin_lock_irqsave(&cmci_discover_lock, flags);
-	owned = __get_cpu_var(mce_banks_owned);
+	owned = this_cpu_ptr(mce_banks_owned);
 	for_each_set_bit(bank, owned, MAX_NR_BANKS) {
 		rdmsrl(MSR_IA32_MCx_CTL2(bank), val);
 		val &= ~MCI_CTL2_CMCI_EN;
@@ -195,7 +195,7 @@ static void intel_threshold_interrupt(vo
 {
 	if (cmci_storm_detect())
 		return;
-	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
+	machine_check_poll(MCP_TIMESTAMP, this_cpu_ptr(&mce_banks_owned));
 	mce_notify_irq();
 }
 
@@ -206,7 +206,7 @@ static void intel_threshold_interrupt(vo
  */
 static void cmci_discover(int banks)
 {
-	unsigned long *owned = (void *)&__get_cpu_var(mce_banks_owned);
+	unsigned long *owned = (void *)this_cpu_ptr(&mce_banks_owned);
 	unsigned long flags;
 	int i;
 	int bios_wrong_thresh = 0;
@@ -228,7 +228,7 @@ static void cmci_discover(int banks)
 		/* Already owned by someone else? */
 		if (val & MCI_CTL2_CMCI_EN) {
 			clear_bit(i, owned);
-			__clear_bit(i, __get_cpu_var(mce_poll_banks));
+			__clear_bit(i, this_cpu_ptr(mce_poll_banks));
 			continue;
 		}
 
@@ -252,7 +252,7 @@ static void cmci_discover(int banks)
 		/* Did the enable bit stick? -- the bank supports CMCI */
 		if (val & MCI_CTL2_CMCI_EN) {
 			set_bit(i, owned);
-			__clear_bit(i, __get_cpu_var(mce_poll_banks));
+			__clear_bit(i, this_cpu_ptr(mce_poll_banks));
 			/*
 			 * We are able to set thresholds for some banks that
 			 * had a threshold of 0. This means the BIOS has not
@@ -263,7 +263,7 @@ static void cmci_discover(int banks)
 					(val & MCI_CTL2_CMCI_THRESHOLD_MASK))
 				bios_wrong_thresh = 1;
 		} else {
-			WARN_ON(!test_bit(i, __get_cpu_var(mce_poll_banks)));
+			WARN_ON(!test_bit(i, this_cpu_ptr(mce_poll_banks)));
 		}
 	}
 	raw_spin_unlock_irqrestore(&cmci_discover_lock, flags);
@@ -284,10 +284,10 @@ void cmci_recheck(void)
 	unsigned long flags;
 	int banks;
 
-	if (!mce_available(__this_cpu_ptr(&cpu_info)) || !cmci_supported(&banks))
+	if (!mce_available(raw_cpu_ptr(&cpu_info)) || !cmci_supported(&banks))
 		return;
 	local_irq_save(flags);
-	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
+	machine_check_poll(MCP_TIMESTAMP, this_cpu_ptr(&mce_banks_owned));
 	local_irq_restore(flags);
 }
 
@@ -296,12 +296,12 @@ static void __cmci_disable_bank(int bank
 {
 	u64 val;
 
-	if (!test_bit(bank, __get_cpu_var(mce_banks_owned)))
+	if (!test_bit(bank, this_cpu_ptr(mce_banks_owned)))
 		return;
 	rdmsrl(MSR_IA32_MCx_CTL2(bank), val);
 	val &= ~MCI_CTL2_CMCI_EN;
 	wrmsrl(MSR_IA32_MCx_CTL2(bank), val);
-	__clear_bit(bank, __get_cpu_var(mce_banks_owned));
+	__clear_bit(bank, this_cpu_ptr(mce_banks_owned));
 }
 
 /*
Index: linux/arch/x86/kernel/cpu/perf_event.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/perf_event.c
+++ linux/arch/x86/kernel/cpu/perf_event.c
@@ -487,7 +487,7 @@ static int __x86_pmu_event_init(struct p
 
 void x86_pmu_disable_all(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx;
 
 	for (idx = 0; idx < x86_pmu.num_counters; idx++) {
@@ -505,7 +505,7 @@ void x86_pmu_disable_all(void)
 
 static void x86_pmu_disable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (!x86_pmu_initialized())
 		return;
@@ -522,7 +522,7 @@ static void x86_pmu_disable(struct pmu *
 
 void x86_pmu_enable_all(int added)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx;
 
 	for (idx = 0; idx < x86_pmu.num_counters; idx++) {
@@ -869,7 +869,7 @@ static void x86_pmu_start(struct perf_ev
 
 static void x86_pmu_enable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_event *event;
 	struct hw_perf_event *hwc;
 	int i, added = cpuc->n_added;
@@ -1020,7 +1020,7 @@ void x86_pmu_enable_event(struct perf_ev
  */
 static int x86_pmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc;
 	int assign[X86_PMC_IDX_MAX];
 	int n, n0, ret;
@@ -1071,7 +1071,7 @@ out:
 
 static void x86_pmu_start(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx = event->hw.idx;
 
 	if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED)))
@@ -1150,7 +1150,7 @@ void perf_event_print_debug(void)
 
 void x86_pmu_stop(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 
 	if (__test_and_clear_bit(hwc->idx, cpuc->active_mask)) {
@@ -1172,7 +1172,7 @@ void x86_pmu_stop(struct perf_event *eve
 
 static void x86_pmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int i;
 
 	/*
@@ -1227,7 +1227,7 @@ int x86_pmu_handle_irq(struct pt_regs *r
 	int idx, handled = 0;
 	u64 val;
 
-	cpuc = &__get_cpu_var(cpu_hw_events);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	/*
 	 * Some chipsets need to unmask the LVTPC in a particular spot
@@ -1636,7 +1636,7 @@ static void x86_pmu_cancel_txn(struct pm
  */
 static int x86_pmu_commit_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int assign[X86_PMC_IDX_MAX];
 	int n, ret;
 
@@ -1995,7 +1995,7 @@ static unsigned long get_segment_base(un
 		if (idx > GDT_ENTRIES)
 			return 0;
 
-		desc = __this_cpu_ptr(&gdt_page.gdt[0]);
+		desc = raw_cpu_ptr(gdt_page.gdt);
 	}
 
 	return get_desc_base(desc + idx);
Index: linux/arch/x86/kernel/cpu/perf_event_amd.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/perf_event_amd.c
+++ linux/arch/x86/kernel/cpu/perf_event_amd.c
@@ -699,7 +699,7 @@ __init int amd_pmu_init(void)
 
 void amd_pmu_enable_virt(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	cpuc->perf_ctr_virt_mask = 0;
 
@@ -711,7 +711,7 @@ EXPORT_SYMBOL_GPL(amd_pmu_enable_virt);
 
 void amd_pmu_disable_virt(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	/*
 	 * We only mask out the Host-only bit so that host-only counting works
Index: linux/arch/x86/kernel/cpu/perf_event_intel.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/perf_event_intel.c
+++ linux/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1045,7 +1045,7 @@ static inline bool intel_pmu_needs_lbr_s
 
 static void intel_pmu_disable_all(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0);
 
@@ -1058,7 +1058,7 @@ static void intel_pmu_disable_all(void)
 
 static void intel_pmu_enable_all(int added)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	intel_pmu_pebs_enable_all();
 	intel_pmu_lbr_enable_all();
@@ -1092,7 +1092,7 @@ static void intel_pmu_enable_all(int add
  */
 static void intel_pmu_nhm_workaround(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	static const unsigned long nhm_magic[4] = {
 		0x4300B5,
 		0x4300D2,
@@ -1191,7 +1191,7 @@ static inline bool event_is_checkpointed
 static void intel_pmu_disable_event(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
 		intel_pmu_disable_bts();
@@ -1255,7 +1255,7 @@ static void intel_pmu_enable_fixed(struc
 static void intel_pmu_enable_event(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
 		if (!__this_cpu_read(cpu_hw_events.enabled))
@@ -1349,7 +1349,7 @@ static int intel_pmu_handle_irq(struct p
 	u64 status;
 	int handled;
 
-	cpuc = &__get_cpu_var(cpu_hw_events);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	/*
 	 * No known reason to not always do late ACK,
@@ -1781,7 +1781,7 @@ EXPORT_SYMBOL_GPL(perf_guest_get_msrs);
 
 static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
 
 	arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL;
@@ -1802,7 +1802,7 @@ static struct perf_guest_switch_msr *int
 
 static struct perf_guest_switch_msr *core_guest_get_msrs(int *nr)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
 	int idx;
 
@@ -1836,7 +1836,7 @@ static void core_pmu_enable_event(struct
 
 static void core_pmu_enable_all(int added)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx;
 
 	for (idx = 0; idx < x86_pmu.num_counters; idx++) {
Index: linux/arch/x86/kernel/cpu/perf_event_intel_ds.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ linux/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -475,7 +475,7 @@ void intel_pmu_enable_bts(u64 config)
 
 void intel_pmu_disable_bts(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	unsigned long debugctlmsr;
 
 	if (!cpuc->ds)
@@ -492,7 +492,7 @@ void intel_pmu_disable_bts(void)
 
 int intel_pmu_drain_bts_buffer(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct debug_store *ds = cpuc->ds;
 	struct bts_record {
 		u64	from;
@@ -712,7 +712,7 @@ struct event_constraint *intel_pebs_cons
 
 void intel_pmu_pebs_enable(struct perf_event *event)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 
 	hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
@@ -727,7 +727,7 @@ void intel_pmu_pebs_enable(struct perf_e
 
 void intel_pmu_pebs_disable(struct perf_event *event)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 
 	cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
@@ -745,7 +745,7 @@ void intel_pmu_pebs_disable(struct perf_
 
 void intel_pmu_pebs_enable_all(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (cpuc->pebs_enabled)
 		wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
@@ -753,7 +753,7 @@ void intel_pmu_pebs_enable_all(void)
 
 void intel_pmu_pebs_disable_all(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (cpuc->pebs_enabled)
 		wrmsrl(MSR_IA32_PEBS_ENABLE, 0);
@@ -761,7 +761,7 @@ void intel_pmu_pebs_disable_all(void)
 
 static int intel_pmu_pebs_fixup_ip(struct pt_regs *regs)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	unsigned long from = cpuc->lbr_entries[0].from;
 	unsigned long old_to, to = cpuc->lbr_entries[0].to;
 	unsigned long ip = regs->ip;
@@ -868,7 +868,7 @@ static void __intel_pmu_pebs_event(struc
 	 * We cast to the biggest pebs_record but are careful not to
 	 * unconditionally access the 'extra' entries.
 	 */
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct pebs_record_hsw *pebs = __pebs;
 	struct perf_sample_data data;
 	struct pt_regs regs;
@@ -957,7 +957,7 @@ static void __intel_pmu_pebs_event(struc
 
 static void intel_pmu_drain_pebs_core(struct pt_regs *iregs)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct debug_store *ds = cpuc->ds;
 	struct perf_event *event = cpuc->events[0]; /* PMC0 only */
 	struct pebs_record_core *at, *top;
@@ -998,7 +998,7 @@ static void intel_pmu_drain_pebs_core(st
 
 static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct debug_store *ds = cpuc->ds;
 	struct perf_event *event = NULL;
 	void *at, *top;
Index: linux/arch/x86/kernel/cpu/perf_event_intel_lbr.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ linux/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -133,7 +133,7 @@ static void intel_pmu_lbr_filter(struct
 static void __intel_pmu_lbr_enable(void)
 {
 	u64 debugctl;
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (cpuc->lbr_sel)
 		wrmsrl(MSR_LBR_SELECT, cpuc->lbr_sel->config);
@@ -183,7 +183,7 @@ void intel_pmu_lbr_reset(void)
 
 void intel_pmu_lbr_enable(struct perf_event *event)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (!x86_pmu.lbr_nr)
 		return;
@@ -203,7 +203,7 @@ void intel_pmu_lbr_enable(struct perf_ev
 
 void intel_pmu_lbr_disable(struct perf_event *event)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (!x86_pmu.lbr_nr)
 		return;
@@ -220,7 +220,7 @@ void intel_pmu_lbr_disable(struct perf_e
 
 void intel_pmu_lbr_enable_all(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (cpuc->lbr_users)
 		__intel_pmu_lbr_enable();
@@ -228,7 +228,7 @@ void intel_pmu_lbr_enable_all(void)
 
 void intel_pmu_lbr_disable_all(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (cpuc->lbr_users)
 		__intel_pmu_lbr_disable();
@@ -332,7 +332,7 @@ static void intel_pmu_lbr_read_64(struct
 
 void intel_pmu_lbr_read(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (!cpuc->lbr_users)
 		return;
Index: linux/arch/x86/kernel/cpu/perf_event_intel_rapl.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ linux/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -135,7 +135,7 @@ static inline u64 rapl_scale(u64 v)
 	 * or use ldexp(count, -32).
 	 * Watts = Joules/Time delta
 	 */
-	return v << (32 - __get_cpu_var(rapl_pmu)->hw_unit);
+	return v << (32 - __this_cpu_read(rapl_pmu->hw_unit));
 }
 
 static u64 rapl_event_update(struct perf_event *event)
@@ -187,7 +187,7 @@ static void rapl_stop_hrtimer(struct rap
 
 static enum hrtimer_restart rapl_hrtimer_handle(struct hrtimer *hrtimer)
 {
-	struct rapl_pmu *pmu = __get_cpu_var(rapl_pmu);
+	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
 	struct perf_event *event;
 	unsigned long flags;
 
@@ -234,7 +234,7 @@ static void __rapl_pmu_event_start(struc
 
 static void rapl_pmu_event_start(struct perf_event *event, int mode)
 {
-	struct rapl_pmu *pmu = __get_cpu_var(rapl_pmu);
+	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
 	unsigned long flags;
 
 	spin_lock_irqsave(&pmu->lock, flags);
@@ -244,7 +244,7 @@ static void rapl_pmu_event_start(struct
 
 static void rapl_pmu_event_stop(struct perf_event *event, int mode)
 {
-	struct rapl_pmu *pmu = __get_cpu_var(rapl_pmu);
+	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned long flags;
 
@@ -278,7 +278,7 @@ static void rapl_pmu_event_stop(struct p
 
 static int rapl_pmu_event_add(struct perf_event *event, int mode)
 {
-	struct rapl_pmu *pmu = __get_cpu_var(rapl_pmu);
+	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned long flags;
 
@@ -696,7 +696,7 @@ static int __init rapl_pmu_init(void)
 		return -1;
 	}
 
-	pmu = __get_cpu_var(rapl_pmu);
+	pmu = __this_cpu_read(rapl_pmu);
 
 	pr_info("RAPL PMU detected, hw unit 2^-%d Joules,"
 		" API unit is 2^-32 Joules,"
Index: linux/arch/x86/kernel/cpu/perf_event_knc.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/perf_event_knc.c
+++ linux/arch/x86/kernel/cpu/perf_event_knc.c
@@ -217,7 +217,7 @@ static int knc_pmu_handle_irq(struct pt_
 	int bit, loops;
 	u64 status;
 
-	cpuc = &__get_cpu_var(cpu_hw_events);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	knc_pmu_disable_all();
 
Index: linux/arch/x86/kernel/cpu/perf_event_p4.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/perf_event_p4.c
+++ linux/arch/x86/kernel/cpu/perf_event_p4.c
@@ -915,7 +915,7 @@ static inline void p4_pmu_disable_event(
 
 static void p4_pmu_disable_all(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx;
 
 	for (idx = 0; idx < x86_pmu.num_counters; idx++) {
@@ -984,7 +984,7 @@ static void p4_pmu_enable_event(struct p
 
 static void p4_pmu_enable_all(int added)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx;
 
 	for (idx = 0; idx < x86_pmu.num_counters; idx++) {
@@ -1004,7 +1004,7 @@ static int p4_pmu_handle_irq(struct pt_r
 	int idx, handled = 0;
 	u64 val;
 
-	cpuc = &__get_cpu_var(cpu_hw_events);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	for (idx = 0; idx < x86_pmu.num_counters; idx++) {
 		int overflow;
Index: linux/arch/x86/kernel/hw_breakpoint.c
===================================================================
--- linux.orig/arch/x86/kernel/hw_breakpoint.c
+++ linux/arch/x86/kernel/hw_breakpoint.c
@@ -108,7 +108,7 @@ int arch_install_hw_breakpoint(struct pe
 	int i;
 
 	for (i = 0; i < HBP_NUM; i++) {
-		struct perf_event **slot = &__get_cpu_var(bp_per_reg[i]);
+		struct perf_event **slot = this_cpu_ptr(&bp_per_reg[i]);
 
 		if (!*slot) {
 			*slot = bp;
@@ -122,7 +122,7 @@ int arch_install_hw_breakpoint(struct pe
 	set_debugreg(info->address, i);
 	__this_cpu_write(cpu_debugreg[i], info->address);
 
-	dr7 = &__get_cpu_var(cpu_dr7);
+	dr7 = this_cpu_ptr(&cpu_dr7);
 	*dr7 |= encode_dr7(i, info->len, info->type);
 
 	set_debugreg(*dr7, 7);
@@ -146,7 +146,7 @@ void arch_uninstall_hw_breakpoint(struct
 	int i;
 
 	for (i = 0; i < HBP_NUM; i++) {
-		struct perf_event **slot = &__get_cpu_var(bp_per_reg[i]);
+		struct perf_event **slot = this_cpu_ptr(&bp_per_reg[i]);
 
 		if (*slot == bp) {
 			*slot = NULL;
@@ -157,7 +157,7 @@ void arch_uninstall_hw_breakpoint(struct
 	if (WARN_ONCE(i == HBP_NUM, "Can't find any breakpoint slot"))
 		return;
 
-	dr7 = &__get_cpu_var(cpu_dr7);
+	dr7 = this_cpu_ptr(&cpu_dr7);
 	*dr7 &= ~__encode_dr7(i, info->len, info->type);
 
 	set_debugreg(*dr7, 7);
Index: linux/arch/x86/kernel/irq_64.c
===================================================================
--- linux.orig/arch/x86/kernel/irq_64.c
+++ linux/arch/x86/kernel/irq_64.c
@@ -52,13 +52,13 @@ static inline void stack_overflow_check(
 	    regs->sp <= curbase + THREAD_SIZE)
 		return;
 
-	irq_stack_top = (u64)__get_cpu_var(irq_stack_union.irq_stack) +
+	irq_stack_top = (u64)this_cpu_ptr(irq_stack_union.irq_stack) +
 			STACK_TOP_MARGIN;
-	irq_stack_bottom = (u64)__get_cpu_var(irq_stack_ptr);
+	irq_stack_bottom = (u64)__this_cpu_read(irq_stack_ptr);
 	if (regs->sp >= irq_stack_top && regs->sp <= irq_stack_bottom)
 		return;
 
-	oist = &__get_cpu_var(orig_ist);
+	oist = this_cpu_ptr(&orig_ist);
 	estack_top = (u64)oist->ist[0] - EXCEPTION_STKSZ + STACK_TOP_MARGIN;
 	estack_bottom = (u64)oist->ist[N_EXCEPTION_STACKS - 1];
 	if (regs->sp >= estack_top && regs->sp <= estack_bottom)
Index: linux/arch/x86/kernel/kvm.c
===================================================================
--- linux.orig/arch/x86/kernel/kvm.c
+++ linux/arch/x86/kernel/kvm.c
@@ -243,9 +243,9 @@ u32 kvm_read_and_reset_pf_reason(void)
 {
 	u32 reason = 0;
 
-	if (__get_cpu_var(apf_reason).enabled) {
-		reason = __get_cpu_var(apf_reason).reason;
-		__get_cpu_var(apf_reason).reason = 0;
+	if (__this_cpu_read(apf_reason.enabled)) {
+		reason = __this_cpu_read(apf_reason.reason);
+		__this_cpu_write(apf_reason.reason, 0);
 	}
 
 	return reason;
@@ -318,7 +318,7 @@ static void kvm_guest_apic_eoi_write(u32
 	 * there's no need for lock or memory barriers.
 	 * An optimization barrier is implied in apic write.
 	 */
-	if (__test_and_clear_bit(KVM_PV_EOI_BIT, &__get_cpu_var(kvm_apic_eoi)))
+	if (__test_and_clear_bit(KVM_PV_EOI_BIT, this_cpu_ptr(&kvm_apic_eoi)))
 		return;
 	apic_write(APIC_EOI, APIC_EOI_ACK);
 }
@@ -329,13 +329,13 @@ void kvm_guest_cpu_init(void)
 		return;
 
 	if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF) && kvmapf) {
-		u64 pa = slow_virt_to_phys(&__get_cpu_var(apf_reason));
+		u64 pa = slow_virt_to_phys(this_cpu_ptr(&apf_reason));
 
 #ifdef CONFIG_PREEMPT
 		pa |= KVM_ASYNC_PF_SEND_ALWAYS;
 #endif
 		wrmsrl(MSR_KVM_ASYNC_PF_EN, pa | KVM_ASYNC_PF_ENABLED);
-		__get_cpu_var(apf_reason).enabled = 1;
+		__this_cpu_write(apf_reason.enabled, 1);
 		printk(KERN_INFO"KVM setup async PF for cpu %d\n",
 		       smp_processor_id());
 	}
@@ -344,8 +344,8 @@ void kvm_guest_cpu_init(void)
 		unsigned long pa;
 		/* Size alignment is implied but just to make it explicit. */
 		BUILD_BUG_ON(__alignof__(kvm_apic_eoi) < 4);
-		__get_cpu_var(kvm_apic_eoi) = 0;
-		pa = slow_virt_to_phys(&__get_cpu_var(kvm_apic_eoi))
+		__this_cpu_write(kvm_apic_eoi, 0);
+		pa = slow_virt_to_phys(this_cpu_ptr(&kvm_apic_eoi))
 			| KVM_MSR_ENABLED;
 		wrmsrl(MSR_KVM_PV_EOI_EN, pa);
 	}
@@ -356,11 +356,11 @@ void kvm_guest_cpu_init(void)
 
 static void kvm_pv_disable_apf(void)
 {
-	if (!__get_cpu_var(apf_reason).enabled)
+	if (!__this_cpu_read(apf_reason.enabled))
 		return;
 
 	wrmsrl(MSR_KVM_ASYNC_PF_EN, 0);
-	__get_cpu_var(apf_reason).enabled = 0;
+	__this_cpu_write(apf_reason.enabled, 0);
 
 	printk(KERN_INFO"Unregister pv shared memory for cpu %d\n",
 	       smp_processor_id());
@@ -716,7 +716,7 @@ __visible void kvm_lock_spinning(struct
 	if (in_nmi())
 		return;
 
-	w = &__get_cpu_var(klock_waiting);
+	w = this_cpu_ptr(&klock_waiting);
 	cpu = smp_processor_id();
 	start = spin_time_start();
 
Index: linux/arch/x86/kvm/svm.c
===================================================================
--- linux.orig/arch/x86/kvm/svm.c
+++ linux/arch/x86/kvm/svm.c
@@ -670,7 +670,7 @@ static int svm_hardware_enable(void *gar
 
 	if (static_cpu_has(X86_FEATURE_TSCRATEMSR)) {
 		wrmsrl(MSR_AMD64_TSC_RATIO, TSC_RATIO_DEFAULT);
-		__get_cpu_var(current_tsc_ratio) = TSC_RATIO_DEFAULT;
+		__this_cpu_write(current_tsc_ratio, TSC_RATIO_DEFAULT);
 	}
 
 
@@ -1312,8 +1312,8 @@ static void svm_vcpu_load(struct kvm_vcp
 		rdmsrl(host_save_user_msrs[i], svm->host_user_msrs[i]);
 
 	if (static_cpu_has(X86_FEATURE_TSCRATEMSR) &&
-	    svm->tsc_ratio != __get_cpu_var(current_tsc_ratio)) {
-		__get_cpu_var(current_tsc_ratio) = svm->tsc_ratio;
+	    svm->tsc_ratio != __this_cpu_read(current_tsc_ratio)) {
+		__this_cpu_write(current_tsc_ratio, svm->tsc_ratio);
 		wrmsrl(MSR_AMD64_TSC_RATIO, svm->tsc_ratio);
 	}
 }
Index: linux/arch/x86/kvm/vmx.c
===================================================================
--- linux.orig/arch/x86/kvm/vmx.c
+++ linux/arch/x86/kvm/vmx.c
@@ -1601,7 +1601,7 @@ static void reload_tss(void)
 	/*
 	 * VT restores TR but not its size.  Useless.
 	 */
-	struct desc_ptr *gdt = &__get_cpu_var(host_gdt);
+	struct desc_ptr *gdt = this_cpu_ptr(&host_gdt);
 	struct desc_struct *descs;
 
 	descs = (void *)gdt->address;
@@ -1647,7 +1647,7 @@ static bool update_transition_efer(struc
 
 static unsigned long segment_base(u16 selector)
 {
-	struct desc_ptr *gdt = &__get_cpu_var(host_gdt);
+	struct desc_ptr *gdt = this_cpu_ptr(&host_gdt);
 	struct desc_struct *d;
 	unsigned long table_base;
 	unsigned long v;
@@ -1777,7 +1777,7 @@ static void __vmx_load_host_state(struct
 	 */
 	if (!user_has_fpu() && !vmx->vcpu.guest_fpu_loaded)
 		stts();
-	load_gdt(&__get_cpu_var(host_gdt));
+	load_gdt(this_cpu_ptr(&host_gdt));
 }
 
 static void vmx_load_host_state(struct vcpu_vmx *vmx)
@@ -1807,7 +1807,7 @@ static void vmx_vcpu_load(struct kvm_vcp
 	}
 
 	if (vmx->loaded_vmcs->cpu != cpu) {
-		struct desc_ptr *gdt = &__get_cpu_var(host_gdt);
+		struct desc_ptr *gdt = this_cpu_ptr(&host_gdt);
 		unsigned long sysenter_esp;
 
 		kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
@@ -2744,7 +2744,7 @@ static int hardware_enable(void *garbage
 		ept_sync_global();
 	}
 
-	native_store_gdt(&__get_cpu_var(host_gdt));
+	native_store_gdt(this_cpu_ptr(&host_gdt));
 
 	return 0;
 }
Index: linux/arch/x86/kvm/x86.c
===================================================================
--- linux.orig/arch/x86/kvm/x86.c
+++ linux/arch/x86/kvm/x86.c
@@ -1556,7 +1556,7 @@ static int kvm_guest_time_update(struct
 
 	/* Keep irq disabled to prevent changes to the clock */
 	local_irq_save(flags);
-	this_tsc_khz = __get_cpu_var(cpu_tsc_khz);
+	this_tsc_khz = __this_cpu_read(cpu_tsc_khz);
 	if (unlikely(this_tsc_khz == 0)) {
 		local_irq_restore(flags);
 		kvm_make_request(KVM_REQ_CLOCK_UPDATE, v);
Index: linux/arch/x86/mm/kmemcheck/kmemcheck.c
===================================================================
--- linux.orig/arch/x86/mm/kmemcheck/kmemcheck.c
+++ linux/arch/x86/mm/kmemcheck/kmemcheck.c
@@ -140,7 +140,7 @@ static DEFINE_PER_CPU(struct kmemcheck_c
 
 bool kmemcheck_active(struct pt_regs *regs)
 {
-	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+	struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
 
 	return data->balance > 0;
 }
@@ -148,7 +148,7 @@ bool kmemcheck_active(struct pt_regs *re
 /* Save an address that needs to be shown/hidden */
 static void kmemcheck_save_addr(unsigned long addr)
 {
-	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+	struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
 
 	BUG_ON(data->n_addrs >= ARRAY_SIZE(data->addr));
 	data->addr[data->n_addrs++] = addr;
@@ -156,7 +156,7 @@ static void kmemcheck_save_addr(unsigned
 
 static unsigned int kmemcheck_show_all(void)
 {
-	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+	struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
 	unsigned int i;
 	unsigned int n;
 
@@ -169,7 +169,7 @@ static unsigned int kmemcheck_show_all(v
 
 static unsigned int kmemcheck_hide_all(void)
 {
-	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+	struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
 	unsigned int i;
 	unsigned int n;
 
@@ -185,7 +185,7 @@ static unsigned int kmemcheck_hide_all(v
  */
 void kmemcheck_show(struct pt_regs *regs)
 {
-	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+	struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
 
 	BUG_ON(!irqs_disabled());
 
@@ -226,7 +226,7 @@ void kmemcheck_show(struct pt_regs *regs
  */
 void kmemcheck_hide(struct pt_regs *regs)
 {
-	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+	struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
 	int n;
 
 	BUG_ON(!irqs_disabled());
@@ -528,7 +528,7 @@ static void kmemcheck_access(struct pt_r
 	const uint8_t *insn_primary;
 	unsigned int size;
 
-	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+	struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
 
 	/* Recursive fault -- ouch. */
 	if (data->busy) {
Index: linux/arch/x86/oprofile/nmi_int.c
===================================================================
--- linux.orig/arch/x86/oprofile/nmi_int.c
+++ linux/arch/x86/oprofile/nmi_int.c
@@ -64,11 +64,11 @@ u64 op_x86_get_ctrl(struct op_x86_model_
 static int profile_exceptions_notify(unsigned int val, struct pt_regs *regs)
 {
 	if (ctr_running)
-		model->check_ctrs(regs, &__get_cpu_var(cpu_msrs));
+		model->check_ctrs(regs, this_cpu_ptr(&cpu_msrs));
 	else if (!nmi_enabled)
 		return NMI_DONE;
 	else
-		model->stop(&__get_cpu_var(cpu_msrs));
+		model->stop(this_cpu_ptr(&cpu_msrs));
 	return NMI_HANDLED;
 }
 
@@ -91,7 +91,7 @@ static void nmi_cpu_save_registers(struc
 
 static void nmi_cpu_start(void *dummy)
 {
-	struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs);
+	struct op_msrs const *msrs = this_cpu_ptr(&cpu_msrs);
 	if (!msrs->controls)
 		WARN_ON_ONCE(1);
 	else
@@ -111,7 +111,7 @@ static int nmi_start(void)
 
 static void nmi_cpu_stop(void *dummy)
 {
-	struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs);
+	struct op_msrs const *msrs = this_cpu_ptr(&cpu_msrs);
 	if (!msrs->controls)
 		WARN_ON_ONCE(1);
 	else
Index: linux/arch/x86/platform/uv/uv_time.c
===================================================================
--- linux.orig/arch/x86/platform/uv/uv_time.c
+++ linux/arch/x86/platform/uv/uv_time.c
@@ -365,7 +365,7 @@ __setup("uvrtcevt", uv_enable_evt_rtc);
 
 static __init void uv_rtc_register_clockevents(struct work_struct *dummy)
 {
-	struct clock_event_device *ced = &__get_cpu_var(cpu_ced);
+	struct clock_event_device *ced = this_cpu_ptr(&cpu_ced);
 
 	*ced = clock_event_device_uv;
 	ced->cpumask = cpumask_of(smp_processor_id());
Index: linux/arch/x86/xen/enlighten.c
===================================================================
--- linux.orig/arch/x86/xen/enlighten.c
+++ linux/arch/x86/xen/enlighten.c
@@ -821,7 +821,7 @@ static void xen_convert_trap_info(const
 
 void xen_copy_trap_info(struct trap_info *traps)
 {
-	const struct desc_ptr *desc = &__get_cpu_var(idt_desc);
+	const struct desc_ptr *desc = this_cpu_ptr(&idt_desc);
 
 	xen_convert_trap_info(desc, traps);
 }
@@ -838,7 +838,7 @@ static void xen_load_idt(const struct de
 
 	spin_lock(&lock);
 
-	__get_cpu_var(idt_desc) = *desc;
+	memcpy(this_cpu_ptr(&idt_desc), desc, sizeof(idt_desc));
 
 	xen_convert_trap_info(desc, traps);
 
Index: linux/arch/x86/xen/multicalls.c
===================================================================
--- linux.orig/arch/x86/xen/multicalls.c
+++ linux/arch/x86/xen/multicalls.c
@@ -54,7 +54,7 @@ DEFINE_PER_CPU(unsigned long, xen_mc_irq
 
 void xen_mc_flush(void)
 {
-	struct mc_buffer *b = &__get_cpu_var(mc_buffer);
+	struct mc_buffer *b = this_cpu_ptr(&mc_buffer);
 	struct multicall_entry *mc;
 	int ret = 0;
 	unsigned long flags;
@@ -131,7 +131,7 @@ void xen_mc_flush(void)
 
 struct multicall_space __xen_mc_entry(size_t args)
 {
-	struct mc_buffer *b = &__get_cpu_var(mc_buffer);
+	struct mc_buffer *b = this_cpu_ptr(&mc_buffer);
 	struct multicall_space ret;
 	unsigned argidx = roundup(b->argidx, sizeof(u64));
 
@@ -162,7 +162,7 @@ struct multicall_space __xen_mc_entry(si
 
 struct multicall_space xen_mc_extend_args(unsigned long op, size_t size)
 {
-	struct mc_buffer *b = &__get_cpu_var(mc_buffer);
+	struct mc_buffer *b = this_cpu_ptr(&mc_buffer);
 	struct multicall_space ret = { NULL, NULL };
 
 	BUG_ON(preemptible());
@@ -192,7 +192,7 @@ out:
 
 void xen_mc_callback(void (*fn)(void *), void *data)
 {
-	struct mc_buffer *b = &__get_cpu_var(mc_buffer);
+	struct mc_buffer *b = this_cpu_ptr(&mc_buffer);
 	struct callback *cb;
 
 	if (b->cbidx == MC_BATCH) {
Index: linux/arch/x86/xen/spinlock.c
===================================================================
--- linux.orig/arch/x86/xen/spinlock.c
+++ linux/arch/x86/xen/spinlock.c
@@ -109,7 +109,7 @@ static bool xen_pvspin = true;
 __visible void xen_lock_spinning(struct arch_spinlock *lock, __ticket_t want)
 {
 	int irq = __this_cpu_read(lock_kicker_irq);
-	struct xen_lock_waiting *w = &__get_cpu_var(lock_waiting);
+	struct xen_lock_waiting *w = this_cpu_ptr(&lock_waiting);
 	int cpu = smp_processor_id();
 	u64 start;
 	unsigned long flags;
Index: linux/arch/x86/xen/time.c
===================================================================
--- linux.orig/arch/x86/xen/time.c
+++ linux/arch/x86/xen/time.c
@@ -80,7 +80,7 @@ static void get_runstate_snapshot(struct
 
 	BUG_ON(preemptible());
 
-	state = &__get_cpu_var(xen_runstate);
+	state = this_cpu_ptr(&xen_runstate);
 
 	/*
 	 * The runstate info is always updated by the hypervisor on
@@ -123,7 +123,7 @@ static void do_stolen_accounting(void)
 
 	WARN_ON(state.state != RUNSTATE_running);
 
-	snap = &__get_cpu_var(xen_runstate_snapshot);
+	snap = this_cpu_ptr(&xen_runstate_snapshot);
 
 	/* work out how much time the VCPU has not been runn*ing*  */
 	runnable = state.time[RUNSTATE_runnable] - snap->time[RUNSTATE_runnable];
@@ -158,7 +158,7 @@ cycle_t xen_clocksource_read(void)
 	cycle_t ret;
 
 	preempt_disable_notrace();
-	src = &__get_cpu_var(xen_vcpu)->time;
+	src = this_cpu_ptr(&xen_vcpu->time);
 	ret = pvclock_clocksource_read(src);
 	preempt_enable_notrace();
 	return ret;
@@ -397,7 +397,7 @@ static DEFINE_PER_CPU(struct xen_clock_e
 
 static irqreturn_t xen_timer_interrupt(int irq, void *dev_id)
 {
-	struct clock_event_device *evt = &__get_cpu_var(xen_clock_events).evt;
+	struct clock_event_device *evt = this_cpu_ptr(&xen_clock_events.evt);
 	irqreturn_t ret;
 
 	ret = IRQ_NONE;
@@ -460,7 +460,7 @@ void xen_setup_cpu_clockevents(void)
 {
 	BUG_ON(preemptible());
 
-	clockevents_register_device(&__get_cpu_var(xen_clock_events).evt);
+	clockevents_register_device(this_cpu_ptr(&xen_clock_events.evt));
 }
 
 void xen_timer_resume(void)


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 18/35] [PATCH 18/36] uv: Replace __get_cpu_var
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (16 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 17/35] [PATCH 17/36] x86: Replace __get_cpu_var uses Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:11   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 19/35] [PATCH 19/36] arm: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
                   ` (17 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Hedi Berriche, Mike Travis, Dimitri Sivanich

[-- Attachment #1: 0018-uv-Replace-__get_cpu_var.patch --]
[-- Type: text/plain, Size: 5669 bytes --]

Use __this_cpu_read instead.

Cc: Hedi Berriche <hedi@sgi.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/x86/include/asm/uv/uv_hub.h | 10 +++++-----
 arch/x86/platform/uv/uv_nmi.c    | 40 ++++++++++++++++++++--------------------
 2 files changed, 25 insertions(+), 25 deletions(-)

Index: linux/arch/x86/include/asm/uv/uv_hub.h
===================================================================
--- linux.orig/arch/x86/include/asm/uv/uv_hub.h
+++ linux/arch/x86/include/asm/uv/uv_hub.h
@@ -601,16 +601,16 @@ struct uv_hub_nmi_s {
 
 struct uv_cpu_nmi_s {
 	struct uv_hub_nmi_s	*hub;
-	atomic_t		state;
-	atomic_t		pinging;
+	int			state;
+	int			pinging;
 	int			queries;
 	int			pings;
 };
 
-DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
-#define uv_cpu_nmi			(__get_cpu_var(__uv_cpu_nmi))
+DECLARE_PER_CPU(struct uv_cpu_nmi_s, uv_cpu_nmi);
+
 #define uv_hub_nmi			(uv_cpu_nmi.hub)
-#define uv_cpu_nmi_per(cpu)		(per_cpu(__uv_cpu_nmi, cpu))
+#define uv_cpu_nmi_per(cpu)		(per_cpu(uv_cpu_nmi, cpu))
 #define uv_hub_nmi_per(cpu)		(uv_cpu_nmi_per(cpu).hub)
 
 /* uv_cpu_nmi_states */
Index: linux/arch/x86/platform/uv/uv_nmi.c
===================================================================
--- linux.orig/arch/x86/platform/uv/uv_nmi.c
+++ linux/arch/x86/platform/uv/uv_nmi.c
@@ -63,8 +63,8 @@
 
 static struct uv_hub_nmi_s **uv_hub_nmi_list;
 
-DEFINE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
-EXPORT_PER_CPU_SYMBOL_GPL(__uv_cpu_nmi);
+DEFINE_PER_CPU(struct uv_cpu_nmi_s, uv_cpu_nmi);
+EXPORT_PER_CPU_SYMBOL_GPL(uv_cpu_nmi);
 
 static unsigned long nmi_mmr;
 static unsigned long nmi_mmr_clear;
@@ -215,7 +215,7 @@ static int uv_check_nmi(struct uv_hub_nm
 	int nmi = 0;
 
 	local64_inc(&uv_nmi_count);
-	uv_cpu_nmi.queries++;
+	this_cpu_inc(uv_cpu_nmi.queries);
 
 	do {
 		nmi = atomic_read(&hub_nmi->in_nmi);
@@ -293,7 +293,7 @@ static void uv_nmi_nr_cpus_ping(void)
 	int cpu;
 
 	for_each_cpu(cpu, uv_nmi_cpu_mask)
-		atomic_set(&uv_cpu_nmi_per(cpu).pinging, 1);
+		uv_cpu_nmi_per(cpu).pinging = 1;
 
 	apic->send_IPI_mask(uv_nmi_cpu_mask, APIC_DM_NMI);
 }
@@ -304,8 +304,8 @@ static void uv_nmi_cleanup_mask(void)
 	int cpu;
 
 	for_each_cpu(cpu, uv_nmi_cpu_mask) {
-		atomic_set(&uv_cpu_nmi_per(cpu).pinging, 0);
-		atomic_set(&uv_cpu_nmi_per(cpu).state, UV_NMI_STATE_OUT);
+		uv_cpu_nmi_per(cpu).pinging =  0;
+		uv_cpu_nmi_per(cpu).state = UV_NMI_STATE_OUT;
 		cpumask_clear_cpu(cpu, uv_nmi_cpu_mask);
 	}
 }
@@ -328,7 +328,7 @@ static int uv_nmi_wait_cpus(int first)
 		int loop_delay = uv_nmi_loop_delay;
 
 		for_each_cpu(j, uv_nmi_cpu_mask) {
-			if (atomic_read(&uv_cpu_nmi_per(j).state)) {
+			if (uv_cpu_nmi_per(j).state) {
 				cpumask_clear_cpu(j, uv_nmi_cpu_mask);
 				if (++k >= n)
 					break;
@@ -359,7 +359,7 @@ static int uv_nmi_wait_cpus(int first)
 static void uv_nmi_wait(int master)
 {
 	/* indicate this cpu is in */
-	atomic_set(&uv_cpu_nmi.state, UV_NMI_STATE_IN);
+	this_cpu_write(uv_cpu_nmi.state, UV_NMI_STATE_IN);
 
 	/* if not the first cpu in (the master), then we are a slave cpu */
 	if (!master)
@@ -419,7 +419,7 @@ static void uv_nmi_dump_state_cpu(int cp
 			"UV:%sNMI process trace for CPU %d\n", dots, cpu);
 		show_regs(regs);
 	}
-	atomic_set(&uv_cpu_nmi.state, UV_NMI_STATE_DUMP_DONE);
+	this_cpu_write(uv_cpu_nmi.state, UV_NMI_STATE_DUMP_DONE);
 }
 
 /* Trigger a slave cpu to dump it's state */
@@ -427,20 +427,20 @@ static void uv_nmi_trigger_dump(int cpu)
 {
 	int retry = uv_nmi_trigger_delay;
 
-	if (atomic_read(&uv_cpu_nmi_per(cpu).state) != UV_NMI_STATE_IN)
+	if (uv_cpu_nmi_per(cpu).state != UV_NMI_STATE_IN)
 		return;
 
-	atomic_set(&uv_cpu_nmi_per(cpu).state, UV_NMI_STATE_DUMP);
+	uv_cpu_nmi_per(cpu).state = UV_NMI_STATE_DUMP;
 	do {
 		cpu_relax();
 		udelay(10);
-		if (atomic_read(&uv_cpu_nmi_per(cpu).state)
+		if (uv_cpu_nmi_per(cpu).state
 				!= UV_NMI_STATE_DUMP)
 			return;
 	} while (--retry > 0);
 
 	pr_crit("UV: CPU %d stuck in process dump function\n", cpu);
-	atomic_set(&uv_cpu_nmi_per(cpu).state, UV_NMI_STATE_DUMP_DONE);
+	uv_cpu_nmi_per(cpu).state = UV_NMI_STATE_DUMP_DONE;
 }
 
 /* Wait until all cpus ready to exit */
@@ -488,7 +488,7 @@ static void uv_nmi_dump_state(int cpu, s
 	} else {
 		while (!atomic_read(&uv_nmi_slave_continue))
 			cpu_relax();
-		while (atomic_read(&uv_cpu_nmi.state) != UV_NMI_STATE_DUMP)
+		while (this_cpu_read(uv_cpu_nmi.state) != UV_NMI_STATE_DUMP)
 			cpu_relax();
 		uv_nmi_dump_state_cpu(cpu, regs);
 	}
@@ -615,7 +615,7 @@ int uv_handle_nmi(unsigned int reason, s
 	local_irq_save(flags);
 
 	/* If not a UV System NMI, ignore */
-	if (!atomic_read(&uv_cpu_nmi.pinging) && !uv_check_nmi(hub_nmi)) {
+	if (!this_cpu_read(uv_cpu_nmi.pinging) && !uv_check_nmi(hub_nmi)) {
 		local_irq_restore(flags);
 		return NMI_DONE;
 	}
@@ -639,7 +639,7 @@ int uv_handle_nmi(unsigned int reason, s
 		uv_call_kgdb_kdb(cpu, regs, master);
 
 	/* Clear per_cpu "in nmi" flag */
-	atomic_set(&uv_cpu_nmi.state, UV_NMI_STATE_OUT);
+	this_cpu_write(uv_cpu_nmi.state, UV_NMI_STATE_OUT);
 
 	/* Clear MMR NMI flag on each hub */
 	uv_clear_nmi(cpu);
@@ -666,16 +666,16 @@ static int uv_handle_nmi_ping(unsigned i
 {
 	int ret;
 
-	uv_cpu_nmi.queries++;
-	if (!atomic_read(&uv_cpu_nmi.pinging)) {
+	this_cpu_inc(uv_cpu_nmi.queries);
+	if (!this_cpu_read(uv_cpu_nmi.pinging)) {
 		local64_inc(&uv_nmi_ping_misses);
 		return NMI_DONE;
 	}
 
-	uv_cpu_nmi.pings++;
+	this_cpu_inc(uv_cpu_nmi.pings);
 	local64_inc(&uv_nmi_ping_count);
 	ret = uv_handle_nmi(reason, regs);
-	atomic_set(&uv_cpu_nmi.pinging, 0);
+	this_cpu_write(uv_cpu_nmi.pinging, 0);
 	return ret;
 }
 


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 19/35] [PATCH 19/36] arm: Replace __this_cpu_ptr with raw_cpu_ptr
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (17 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 18/35] [PATCH 18/36] uv: Replace __get_cpu_var Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-22 17:48   ` Will Deacon
  2014-08-26 18:11   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 20/35] [PATCH 20/36] MIPS: Replace __get_cpu_var uses in FPU emulator Christoph Lameter
                   ` (16 subsequent siblings)
  35 siblings, 2 replies; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Russell King, Catalin Marinas, Will Deacon

[-- Attachment #1: 0019-arm-Replace-__this_cpu_ptr-with-raw_cpu_ptr.patch --]
[-- Type: text/plain, Size: 2281 bytes --]

__this_cpu_ptr is being phased out. So replace with raw_cpu_ptr.

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
CC: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/arm/kernel/smp_twd.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Index: linux/arch/arm/kernel/smp_twd.c
===================================================================
--- linux.orig/arch/arm/kernel/smp_twd.c
+++ linux/arch/arm/kernel/smp_twd.c
@@ -92,7 +92,7 @@ static int twd_timer_ack(void)
 
 static void twd_timer_stop(void)
 {
-	struct clock_event_device *clk = __this_cpu_ptr(twd_evt);
+	struct clock_event_device *clk = raw_cpu_ptr(twd_evt);
 
 	twd_set_mode(CLOCK_EVT_MODE_UNUSED, clk);
 	disable_percpu_irq(clk->irq);
@@ -108,7 +108,7 @@ static void twd_update_frequency(void *n
 {
 	twd_timer_rate = *((unsigned long *) new_rate);
 
-	clockevents_update_freq(__this_cpu_ptr(twd_evt), twd_timer_rate);
+	clockevents_update_freq(raw_cpu_ptr(twd_evt), twd_timer_rate);
 }
 
 static int twd_rate_change(struct notifier_block *nb,
@@ -134,7 +134,7 @@ static struct notifier_block twd_clk_nb
 
 static int twd_clk_init(void)
 {
-	if (twd_evt && __this_cpu_ptr(twd_evt) && !IS_ERR(twd_clk))
+	if (twd_evt && raw_cpu_ptr(twd_evt) && !IS_ERR(twd_clk))
 		return clk_notifier_register(twd_clk, &twd_clk_nb);
 
 	return 0;
@@ -153,7 +153,7 @@ static void twd_update_frequency(void *d
 {
 	twd_timer_rate = clk_get_rate(twd_clk);
 
-	clockevents_update_freq(__this_cpu_ptr(twd_evt), twd_timer_rate);
+	clockevents_update_freq(raw_cpu_ptr(twd_evt), twd_timer_rate);
 }
 
 static int twd_cpufreq_transition(struct notifier_block *nb,
@@ -179,7 +179,7 @@ static struct notifier_block twd_cpufreq
 
 static int twd_cpufreq_init(void)
 {
-	if (twd_evt && __this_cpu_ptr(twd_evt) && !IS_ERR(twd_clk))
+	if (twd_evt && raw_cpu_ptr(twd_evt) && !IS_ERR(twd_clk))
 		return cpufreq_register_notifier(&twd_cpufreq_nb,
 			CPUFREQ_TRANSITION_NOTIFIER);
 
@@ -269,7 +269,7 @@ static void twd_get_clock(struct device_
  */
 static void twd_timer_setup(void)
 {
-	struct clock_event_device *clk = __this_cpu_ptr(twd_evt);
+	struct clock_event_device *clk = raw_cpu_ptr(twd_evt);
 	int cpu = smp_processor_id();
 
 	/*


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 20/35] [PATCH 20/36] MIPS: Replace __get_cpu_var uses in FPU emulator.
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (18 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 19/35] [PATCH 19/36] arm: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:12   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 21/35] [PATCH 21/36] mips: Replace __get_cpu_var uses Christoph Lameter
                   ` (15 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, David Daney

[-- Attachment #1: 0020-MIPS-Replace-__get_cpu_var-uses-in-FPU-emulator.patch --]
[-- Type: text/plain, Size: 1625 bytes --]

The use of __this_cpu_inc() requires a fundamental integer type, so
change the type of all the counters to unsigned long, which is the
same width they were before, but not wrapped in local_t.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/mips/include/asm/fpu_emulator.h | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

Index: linux/arch/mips/include/asm/fpu_emulator.h
===================================================================
--- linux.orig/arch/mips/include/asm/fpu_emulator.h
+++ linux/arch/mips/include/asm/fpu_emulator.h
@@ -33,17 +33,17 @@
 #ifdef CONFIG_DEBUG_FS
 
 struct mips_fpu_emulator_stats {
-	local_t emulated;
-	local_t loads;
-	local_t stores;
-	local_t cp1ops;
-	local_t cp1xops;
-	local_t errors;
-	local_t ieee754_inexact;
-	local_t ieee754_underflow;
-	local_t ieee754_overflow;
-	local_t ieee754_zerodiv;
-	local_t ieee754_invalidop;
+	unsigned long emulated;
+	unsigned long loads;
+	unsigned long stores;
+	unsigned long cp1ops;
+	unsigned long cp1xops;
+	unsigned long errors;
+	unsigned long ieee754_inexact;
+	unsigned long ieee754_underflow;
+	unsigned long ieee754_overflow;
+	unsigned long ieee754_zerodiv;
+	unsigned long ieee754_invalidop;
 };
 
 DECLARE_PER_CPU(struct mips_fpu_emulator_stats, fpuemustats);
@@ -51,7 +51,7 @@ DECLARE_PER_CPU(struct mips_fpu_emulator
 #define MIPS_FPU_EMU_INC_STATS(M)					\
 do {									\
 	preempt_disable();						\
-	__local_inc(&__get_cpu_var(fpuemustats).M);			\
+	__this_cpu_inc(fpuemustats.M);					\
 	preempt_enable();						\
 } while (0)
 


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 21/35] [PATCH 21/36] mips: Replace __get_cpu_var uses
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (19 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 20/35] [PATCH 20/36] MIPS: Replace __get_cpu_var uses in FPU emulator Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:12   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 22/35] [PATCH 22/36] s390: " Christoph Lameter
                   ` (14 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Ralf Baechle

[-- Attachment #1: 0021-mips-Replace-__get_cpu_var-uses.patch --]
[-- Type: text/plain, Size: 12241 bytes --]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and less registers
are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non -x86
arches as well in order to optimize per cpu access by f.e.  using a global
register that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)


Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/mips/cavium-octeon/octeon-irq.c | 30 +++++++++++++++---------------
 arch/mips/kernel/kprobes.c           |  6 +++---
 arch/mips/kernel/perf_event_mipsxx.c | 14 +++++++-------
 arch/mips/kernel/smp-bmips.c         |  2 +-
 arch/mips/loongson/loongson-3/smp.c  |  6 +++---
 5 files changed, 29 insertions(+), 29 deletions(-)

Index: linux/arch/mips/cavium-octeon/octeon-irq.c
===================================================================
--- linux.orig/arch/mips/cavium-octeon/octeon-irq.c
+++ linux/arch/mips/cavium-octeon/octeon-irq.c
@@ -264,13 +264,13 @@ static void octeon_irq_ciu_enable_local(
 	unsigned long *pen;
 	unsigned long flags;
 	union octeon_ciu_chip_data cd;
-	raw_spinlock_t *lock = &__get_cpu_var(octeon_irq_ciu_spinlock);
+	raw_spinlock_t *lock = this_cpu_ptr(&octeon_irq_ciu_spinlock);
 
 	cd.p = irq_data_get_irq_chip_data(data);
 
 	raw_spin_lock_irqsave(lock, flags);
 	if (cd.s.line == 0) {
-		pen = &__get_cpu_var(octeon_irq_ciu0_en_mirror);
+		pen = this_cpu_ptr(&octeon_irq_ciu0_en_mirror);
 		__set_bit(cd.s.bit, pen);
 		/*
 		 * Must be visible to octeon_irq_ip{2,3}_ciu() before
@@ -279,7 +279,7 @@ static void octeon_irq_ciu_enable_local(
 		wmb();
 		cvmx_write_csr(CVMX_CIU_INTX_EN0(cvmx_get_core_num() * 2), *pen);
 	} else {
-		pen = &__get_cpu_var(octeon_irq_ciu1_en_mirror);
+		pen = this_cpu_ptr(&octeon_irq_ciu1_en_mirror);
 		__set_bit(cd.s.bit, pen);
 		/*
 		 * Must be visible to octeon_irq_ip{2,3}_ciu() before
@@ -296,13 +296,13 @@ static void octeon_irq_ciu_disable_local
 	unsigned long *pen;
 	unsigned long flags;
 	union octeon_ciu_chip_data cd;
-	raw_spinlock_t *lock = &__get_cpu_var(octeon_irq_ciu_spinlock);
+	raw_spinlock_t *lock = this_cpu_ptr(&octeon_irq_ciu_spinlock);
 
 	cd.p = irq_data_get_irq_chip_data(data);
 
 	raw_spin_lock_irqsave(lock, flags);
 	if (cd.s.line == 0) {
-		pen = &__get_cpu_var(octeon_irq_ciu0_en_mirror);
+		pen = this_cpu_ptr(&octeon_irq_ciu0_en_mirror);
 		__clear_bit(cd.s.bit, pen);
 		/*
 		 * Must be visible to octeon_irq_ip{2,3}_ciu() before
@@ -311,7 +311,7 @@ static void octeon_irq_ciu_disable_local
 		wmb();
 		cvmx_write_csr(CVMX_CIU_INTX_EN0(cvmx_get_core_num() * 2), *pen);
 	} else {
-		pen = &__get_cpu_var(octeon_irq_ciu1_en_mirror);
+		pen = this_cpu_ptr(&octeon_irq_ciu1_en_mirror);
 		__clear_bit(cd.s.bit, pen);
 		/*
 		 * Must be visible to octeon_irq_ip{2,3}_ciu() before
@@ -431,11 +431,11 @@ static void octeon_irq_ciu_enable_local_
 
 	if (cd.s.line == 0) {
 		int index = cvmx_get_core_num() * 2;
-		set_bit(cd.s.bit, &__get_cpu_var(octeon_irq_ciu0_en_mirror));
+		set_bit(cd.s.bit, this_cpu_ptr(&octeon_irq_ciu0_en_mirror));
 		cvmx_write_csr(CVMX_CIU_INTX_EN0_W1S(index), mask);
 	} else {
 		int index = cvmx_get_core_num() * 2 + 1;
-		set_bit(cd.s.bit, &__get_cpu_var(octeon_irq_ciu1_en_mirror));
+		set_bit(cd.s.bit, this_cpu_ptr(&octeon_irq_ciu1_en_mirror));
 		cvmx_write_csr(CVMX_CIU_INTX_EN1_W1S(index), mask);
 	}
 }
@@ -450,11 +450,11 @@ static void octeon_irq_ciu_disable_local
 
 	if (cd.s.line == 0) {
 		int index = cvmx_get_core_num() * 2;
-		clear_bit(cd.s.bit, &__get_cpu_var(octeon_irq_ciu0_en_mirror));
+		clear_bit(cd.s.bit, this_cpu_ptr(&octeon_irq_ciu0_en_mirror));
 		cvmx_write_csr(CVMX_CIU_INTX_EN0_W1C(index), mask);
 	} else {
 		int index = cvmx_get_core_num() * 2 + 1;
-		clear_bit(cd.s.bit, &__get_cpu_var(octeon_irq_ciu1_en_mirror));
+		clear_bit(cd.s.bit, this_cpu_ptr(&octeon_irq_ciu1_en_mirror));
 		cvmx_write_csr(CVMX_CIU_INTX_EN1_W1C(index), mask);
 	}
 }
@@ -1063,7 +1063,7 @@ static void octeon_irq_ip2_ciu(void)
 	const unsigned long core_id = cvmx_get_core_num();
 	u64 ciu_sum = cvmx_read_csr(CVMX_CIU_INTX_SUM0(core_id * 2));
 
-	ciu_sum &= __get_cpu_var(octeon_irq_ciu0_en_mirror);
+	ciu_sum &= __this_cpu_read(octeon_irq_ciu0_en_mirror);
 	if (likely(ciu_sum)) {
 		int bit = fls64(ciu_sum) - 1;
 		int irq = octeon_irq_ciu_to_irq[0][bit];
@@ -1080,7 +1080,7 @@ static void octeon_irq_ip3_ciu(void)
 {
 	u64 ciu_sum = cvmx_read_csr(CVMX_CIU_INT_SUM1);
 
-	ciu_sum &= __get_cpu_var(octeon_irq_ciu1_en_mirror);
+	ciu_sum &= __this_cpu_read(octeon_irq_ciu1_en_mirror);
 	if (likely(ciu_sum)) {
 		int bit = fls64(ciu_sum) - 1;
 		int irq = octeon_irq_ciu_to_irq[1][bit];
@@ -1129,10 +1129,10 @@ static void octeon_irq_init_ciu_percpu(v
 	int coreid = cvmx_get_core_num();
 
 
-	__get_cpu_var(octeon_irq_ciu0_en_mirror) = 0;
-	__get_cpu_var(octeon_irq_ciu1_en_mirror) = 0;
+	__this_cpu_write(octeon_irq_ciu0_en_mirror, 0);
+	__this_cpu_write(octeon_irq_ciu1_en_mirror, 0);
 	wmb();
-	raw_spin_lock_init(&__get_cpu_var(octeon_irq_ciu_spinlock));
+	raw_spin_lock_init(this_cpu_ptr(&octeon_irq_ciu_spinlock));
 	/*
 	 * Disable All CIU Interrupts. The ones we need will be
 	 * enabled later.  Read the SUM register so we know the write
Index: linux/arch/mips/kernel/kprobes.c
===================================================================
--- linux.orig/arch/mips/kernel/kprobes.c
+++ linux/arch/mips/kernel/kprobes.c
@@ -224,7 +224,7 @@ static void save_previous_kprobe(struct
 
 static void restore_previous_kprobe(struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = kcb->prev_kprobe.kp;
+	__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
 	kcb->kprobe_status = kcb->prev_kprobe.status;
 	kcb->kprobe_old_SR = kcb->prev_kprobe.old_SR;
 	kcb->kprobe_saved_SR = kcb->prev_kprobe.saved_SR;
@@ -234,7 +234,7 @@ static void restore_previous_kprobe(stru
 static void set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
 			       struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = p;
+	__this_cpu_write(current_kprobe, p);
 	kcb->kprobe_saved_SR = kcb->kprobe_old_SR = (regs->cp0_status & ST0_IE);
 	kcb->kprobe_saved_epc = regs->cp0_epc;
 }
@@ -385,7 +385,7 @@ static int __kprobes kprobe_handler(stru
 				ret = 1;
 				goto no_kprobe;
 			}
-			p = __get_cpu_var(current_kprobe);
+			p = __this_cpu_read(current_kprobe);
 			if (p->break_handler && p->break_handler(p, regs))
 				goto ss_probe;
 		}
Index: linux/arch/mips/kernel/perf_event_mipsxx.c
===================================================================
--- linux.orig/arch/mips/kernel/perf_event_mipsxx.c
+++ linux/arch/mips/kernel/perf_event_mipsxx.c
@@ -340,7 +340,7 @@ static int mipsxx_pmu_alloc_counter(stru
 
 static void mipsxx_pmu_enable_event(struct hw_perf_event *evt, int idx)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	WARN_ON(idx < 0 || idx >= mipspmu.num_counters);
 
@@ -360,7 +360,7 @@ static void mipsxx_pmu_enable_event(stru
 
 static void mipsxx_pmu_disable_event(int idx)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	unsigned long flags;
 
 	WARN_ON(idx < 0 || idx >= mipspmu.num_counters);
@@ -460,7 +460,7 @@ static void mipspmu_stop(struct perf_eve
 
 static int mipspmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx;
 	int err = 0;
@@ -496,7 +496,7 @@ out:
 
 static void mipspmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
@@ -1275,7 +1275,7 @@ static int __hw_perf_event_init(struct p
 
 static void pause_local_counters(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int ctr = mipspmu.num_counters;
 	unsigned long flags;
 
@@ -1291,7 +1291,7 @@ static void pause_local_counters(void)
 
 static void resume_local_counters(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int ctr = mipspmu.num_counters;
 
 	do {
@@ -1302,7 +1302,7 @@ static void resume_local_counters(void)
 
 static int mipsxx_pmu_handle_shared_irq(void)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_sample_data data;
 	unsigned int counters = mipspmu.num_counters;
 	u64 counter;
Index: linux/arch/mips/kernel/smp-bmips.c
===================================================================
--- linux.orig/arch/mips/kernel/smp-bmips.c
+++ linux/arch/mips/kernel/smp-bmips.c
@@ -346,7 +346,7 @@ static irqreturn_t bmips43xx_ipi_interru
 	int action, cpu = irq - IPI0_IRQ;
 
 	spin_lock_irqsave(&ipi_lock, flags);
-	action = __get_cpu_var(ipi_action_mask);
+	action = __this_cpu_read(ipi_action_mask);
 	per_cpu(ipi_action_mask, cpu) = 0;
 	clear_c0_cause(cpu ? C_SW1 : C_SW0);
 	spin_unlock_irqrestore(&ipi_lock, flags);
Index: linux/arch/mips/loongson/loongson-3/smp.c
===================================================================
--- linux.orig/arch/mips/loongson/loongson-3/smp.c
+++ linux/arch/mips/loongson/loongson-3/smp.c
@@ -299,16 +299,16 @@ static void loongson3_init_secondary(voi
 	per_cpu(cpu_state, cpu) = CPU_ONLINE;
 
 	i = 0;
-	__get_cpu_var(core0_c0count) = 0;
+	__this_cpu_write(core0_c0count, 0);
 	loongson3_send_ipi_single(0, SMP_ASK_C0COUNT);
-	while (!__get_cpu_var(core0_c0count)) {
+	while (!__this_cpu_read(core0_c0count)) {
 		i++;
 		cpu_relax();
 	}
 
 	if (i > MAX_LOOPS)
 		i = MAX_LOOPS;
-	initcount = __get_cpu_var(core0_c0count) + i;
+	initcount = __this_cpu_read(core0_c0count) + i;
 	write_c0_count(initcount);
 }
 


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 22/35] [PATCH 22/36] s390: Replace __get_cpu_var uses
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (20 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 21/35] [PATCH 21/36] mips: Replace __get_cpu_var uses Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:13   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 23/35] [PATCH 23/36] s390: cio driver &__get_cpu_var replacements Christoph Lameter
                   ` (13 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Martin Schwidefsky, linux390, Heiko Carstens

[-- Attachment #1: 0022-s390-Replace-__get_cpu_var-uses.patch --]
[-- Type: text/plain, Size: 17923 bytes --]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and less registers
are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non -x86
arches as well in order to optimize per cpu access by f.e.  using a global
register that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	this_cpu_inc(y)

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
CC: linux390@de.ibm.com
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/s390/include/asm/cputime.h |  2 +-
 arch/s390/include/asm/irq.h     |  2 +-
 arch/s390/include/asm/percpu.h  | 16 ++++++++--------
 arch/s390/kernel/irq.c          |  2 +-
 arch/s390/kernel/kprobes.c      |  8 ++++----
 arch/s390/kernel/nmi.c          | 10 +++++++---
 arch/s390/kernel/perf_cpum_cf.c | 22 +++++++++++-----------
 arch/s390/kernel/perf_cpum_sf.c | 16 ++++++++--------
 arch/s390/kernel/processor.c    |  4 ++--
 arch/s390/kernel/time.c         |  6 +++---
 arch/s390/kernel/vtime.c        |  2 +-
 arch/s390/oprofile/hwsampler.c  |  2 +-
 12 files changed, 48 insertions(+), 44 deletions(-)

Index: linux/arch/s390/include/asm/cputime.h
===================================================================
--- linux.orig/arch/s390/include/asm/cputime.h
+++ linux/arch/s390/include/asm/cputime.h
@@ -184,7 +184,7 @@ cputime64_t s390_get_idle_time(int cpu);
 
 static inline int s390_nohz_delay(int cpu)
 {
-	return __get_cpu_var(s390_idle).nohz_delay != 0;
+	return __this_cpu_read(s390_idle.nohz_delay) != 0;
 }
 
 #define arch_needs_cpu(cpu) s390_nohz_delay(cpu)
Index: linux/arch/s390/include/asm/irq.h
===================================================================
--- linux.orig/arch/s390/include/asm/irq.h
+++ linux/arch/s390/include/asm/irq.h
@@ -81,7 +81,7 @@ DECLARE_PER_CPU_SHARED_ALIGNED(struct ir
 
 static __always_inline void inc_irq_stat(enum interruption_class irq)
 {
-	__get_cpu_var(irq_stat).irqs[irq]++;
+	__this_cpu_inc(irq_stat.irqs[irq]);
 }
 
 struct ext_code {
Index: linux/arch/s390/include/asm/percpu.h
===================================================================
--- linux.orig/arch/s390/include/asm/percpu.h
+++ linux/arch/s390/include/asm/percpu.h
@@ -31,7 +31,7 @@
 	pcp_op_T__ old__, new__, prev__;				\
 	pcp_op_T__ *ptr__;						\
 	preempt_disable();						\
-	ptr__ = __this_cpu_ptr(&(pcp));					\
+	ptr__ = raw_cpu_ptr(&(pcp));					\
 	prev__ = *ptr__;						\
 	do {								\
 		old__ = prev__;						\
@@ -70,7 +70,7 @@
 	pcp_op_T__ val__ = (val);					\
 	pcp_op_T__ old__, *ptr__;					\
 	preempt_disable();						\
-	ptr__ = __this_cpu_ptr(&(pcp)); 				\
+	ptr__ = raw_cpu_ptr(&(pcp)); 				\
 	if (__builtin_constant_p(val__) &&				\
 	    ((szcast)val__ > -129) && ((szcast)val__ < 128)) {		\
 		asm volatile(						\
@@ -97,7 +97,7 @@
 	pcp_op_T__ val__ = (val);					\
 	pcp_op_T__ old__, *ptr__;					\
 	preempt_disable();						\
-	ptr__ = __this_cpu_ptr(&(pcp)); 				\
+	ptr__ = raw_cpu_ptr(&(pcp));	 				\
 	asm volatile(							\
 		op "    %[old__],%[val__],%[ptr__]\n"			\
 		: [old__] "=d" (old__), [ptr__] "+Q" (*ptr__)		\
@@ -116,7 +116,7 @@
 	pcp_op_T__ val__ = (val);					\
 	pcp_op_T__ old__, *ptr__;					\
 	preempt_disable();						\
-	ptr__ = __this_cpu_ptr(&(pcp)); 				\
+	ptr__ = raw_cpu_ptr(&(pcp));	 				\
 	asm volatile(							\
 		op "    %[old__],%[val__],%[ptr__]\n"			\
 		: [old__] "=d" (old__), [ptr__] "+Q" (*ptr__)		\
@@ -138,7 +138,7 @@
 	pcp_op_T__ ret__;						\
 	pcp_op_T__ *ptr__;						\
 	preempt_disable();						\
-	ptr__ = __this_cpu_ptr(&(pcp));					\
+	ptr__ = raw_cpu_ptr(&(pcp));					\
 	ret__ = cmpxchg(ptr__, oval, nval);				\
 	preempt_enable();						\
 	ret__;								\
@@ -154,7 +154,7 @@
 	typeof(pcp) *ptr__;						\
 	typeof(pcp) ret__;						\
 	preempt_disable();						\
-	ptr__ = __this_cpu_ptr(&(pcp));					\
+	ptr__ = raw_cpu_ptr(&(pcp));					\
 	ret__ = xchg(ptr__, nval);					\
 	preempt_enable();						\
 	ret__;								\
@@ -173,8 +173,8 @@
 	typeof(pcp2) *p2__;						\
 	int ret__;							\
 	preempt_disable();						\
-	p1__ = __this_cpu_ptr(&(pcp1));					\
-	p2__ = __this_cpu_ptr(&(pcp2));					\
+	p1__ = raw_cpu_ptr(&(pcp1));					\
+	p2__ = raw_cpu_ptr(&(pcp2));					\
 	ret__ = __cmpxchg_double(p1__, p2__, o1__, o2__, n1__, n2__);	\
 	preempt_enable();						\
 	ret__;								\
Index: linux/arch/s390/kernel/irq.c
===================================================================
--- linux.orig/arch/s390/kernel/irq.c
+++ linux/arch/s390/kernel/irq.c
@@ -258,7 +258,7 @@ static irqreturn_t do_ext_interrupt(int
 
 	ext_code = *(struct ext_code *) &regs->int_code;
 	if (ext_code.code != EXT_IRQ_CLK_COMP)
-		__get_cpu_var(s390_idle).nohz_delay = 1;
+		__this_cpu_write(s390_idle.nohz_delay, 1);
 
 	index = ext_hash(ext_code.code);
 	rcu_read_lock();
Index: linux/arch/s390/kernel/kprobes.c
===================================================================
--- linux.orig/arch/s390/kernel/kprobes.c
+++ linux/arch/s390/kernel/kprobes.c
@@ -366,9 +366,9 @@ static void __kprobes disable_singlestep
  */
 static void __kprobes push_kprobe(struct kprobe_ctlblk *kcb, struct kprobe *p)
 {
-	kcb->prev_kprobe.kp = __get_cpu_var(current_kprobe);
+	kcb->prev_kprobe.kp = __this_cpu_read(current_kprobe);
 	kcb->prev_kprobe.status = kcb->kprobe_status;
-	__get_cpu_var(current_kprobe) = p;
+	__this_cpu_write(current_kprobe, p);
 }
 
 /*
@@ -378,7 +378,7 @@ static void __kprobes push_kprobe(struct
  */
 static void __kprobes pop_kprobe(struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = kcb->prev_kprobe.kp;
+	__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
 	kcb->kprobe_status = kcb->prev_kprobe.status;
 }
 
@@ -459,7 +459,7 @@ static int __kprobes kprobe_handler(stru
 		enable_singlestep(kcb, regs, (unsigned long) p->ainsn.insn);
 		return 1;
 	} else if (kprobe_running()) {
-		p = __get_cpu_var(current_kprobe);
+		p = __this_cpu_read(current_kprobe);
 		if (p->break_handler && p->break_handler(p, regs)) {
 			/*
 			 * Continuation after the jprobe completed and
Index: linux/arch/s390/kernel/nmi.c
===================================================================
--- linux.orig/arch/s390/kernel/nmi.c
+++ linux/arch/s390/kernel/nmi.c
@@ -53,8 +53,12 @@ void s390_handle_mcck(void)
 	 */
 	local_irq_save(flags);
 	local_mcck_disable();
-	mcck = __get_cpu_var(cpu_mcck);
-	memset(&__get_cpu_var(cpu_mcck), 0, sizeof(struct mcck_struct));
+	/*
+	 * Ummm... Does this make sense at all? Copying the percpu struct
+	 * and then zapping it one statement later?
+	 */
+	memcpy(&mcck, this_cpu_ptr(&cpu_mcck), sizeof(mcck));
+	memset(&mcck, 0, sizeof(struct mcck_struct));
 	clear_cpu_flag(CIF_MCCK_PENDING);
 	local_mcck_enable();
 	local_irq_restore(flags);
@@ -253,7 +257,7 @@ void notrace s390_do_machine_check(struc
 	nmi_enter();
 	inc_irq_stat(NMI_NMI);
 	mci = (struct mci *) &S390_lowcore.mcck_interruption_code;
-	mcck = &__get_cpu_var(cpu_mcck);
+	mcck = this_cpu_ptr(&cpu_mcck);
 	umode = user_mode(regs);
 
 	if (mci->sd) {
Index: linux/arch/s390/kernel/perf_cpum_cf.c
===================================================================
--- linux.orig/arch/s390/kernel/perf_cpum_cf.c
+++ linux/arch/s390/kernel/perf_cpum_cf.c
@@ -173,7 +173,7 @@ static int validate_ctr_auth(const struc
  */
 static void cpumf_pmu_enable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 	int err;
 
 	if (cpuhw->flags & PMU_F_ENABLED)
@@ -196,7 +196,7 @@ static void cpumf_pmu_enable(struct pmu
  */
 static void cpumf_pmu_disable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 	int err;
 	u64 inactive;
 
@@ -230,7 +230,7 @@ static void cpumf_measurement_alert(stru
 		return;
 
 	inc_irq_stat(IRQEXT_CMC);
-	cpuhw = &__get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	/* Measurement alerts are shared and might happen when the PMU
 	 * is not reserved.  Ignore these alerts in this case. */
@@ -250,7 +250,7 @@ static void cpumf_measurement_alert(stru
 #define PMC_RELEASE   1
 static void setup_pmc_cpu(void *flags)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	switch (*((int *) flags)) {
 	case PMC_INIT:
@@ -475,7 +475,7 @@ static void cpumf_pmu_read(struct perf_e
 
 static void cpumf_pmu_start(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 
 	if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED)))
@@ -506,7 +506,7 @@ static void cpumf_pmu_start(struct perf_
 
 static void cpumf_pmu_stop(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 
 	if (!(hwc->state & PERF_HES_STOPPED)) {
@@ -527,7 +527,7 @@ static void cpumf_pmu_stop(struct perf_e
 
 static int cpumf_pmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	/* Check authorization for the counter set to which this
 	 * counter belongs.
@@ -551,7 +551,7 @@ static int cpumf_pmu_add(struct perf_eve
 
 static void cpumf_pmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	cpumf_pmu_stop(event, PERF_EF_UPDATE);
 
@@ -575,7 +575,7 @@ static void cpumf_pmu_del(struct perf_ev
  */
 static void cpumf_pmu_start_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	perf_pmu_disable(pmu);
 	cpuhw->flags |= PERF_EVENT_TXN;
@@ -589,7 +589,7 @@ static void cpumf_pmu_start_txn(struct p
  */
 static void cpumf_pmu_cancel_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	WARN_ON(cpuhw->tx_state != cpuhw->state);
 
@@ -604,7 +604,7 @@ static void cpumf_pmu_cancel_txn(struct
  */
 static int cpumf_pmu_commit_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 	u64 state;
 
 	/* check if the updated state can be scheduled */
Index: linux/arch/s390/kernel/perf_cpum_sf.c
===================================================================
--- linux.orig/arch/s390/kernel/perf_cpum_sf.c
+++ linux/arch/s390/kernel/perf_cpum_sf.c
@@ -562,7 +562,7 @@ static DEFINE_MUTEX(pmc_reserve_mutex);
 static void setup_pmc_cpu(void *flags)
 {
 	int err;
-	struct cpu_hw_sf *cpusf = &__get_cpu_var(cpu_hw_sf);
+	struct cpu_hw_sf *cpusf = this_cpu_ptr(&cpu_hw_sf);
 
 	err = 0;
 	switch (*((int *) flags)) {
@@ -849,7 +849,7 @@ static int cpumsf_pmu_event_init(struct
 
 static void cpumsf_pmu_enable(struct pmu *pmu)
 {
-	struct cpu_hw_sf *cpuhw = &__get_cpu_var(cpu_hw_sf);
+	struct cpu_hw_sf *cpuhw = this_cpu_ptr(&cpu_hw_sf);
 	struct hw_perf_event *hwc;
 	int err;
 
@@ -898,7 +898,7 @@ static void cpumsf_pmu_enable(struct pmu
 
 static void cpumsf_pmu_disable(struct pmu *pmu)
 {
-	struct cpu_hw_sf *cpuhw = &__get_cpu_var(cpu_hw_sf);
+	struct cpu_hw_sf *cpuhw = this_cpu_ptr(&cpu_hw_sf);
 	struct hws_lsctl_request_block inactive;
 	struct hws_qsi_info_block si;
 	int err;
@@ -1306,7 +1306,7 @@ static void cpumsf_pmu_read(struct perf_
  */
 static void cpumsf_pmu_start(struct perf_event *event, int flags)
 {
-	struct cpu_hw_sf *cpuhw = &__get_cpu_var(cpu_hw_sf);
+	struct cpu_hw_sf *cpuhw = this_cpu_ptr(&cpu_hw_sf);
 
 	if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED)))
 		return;
@@ -1327,7 +1327,7 @@ static void cpumsf_pmu_start(struct perf
  */
 static void cpumsf_pmu_stop(struct perf_event *event, int flags)
 {
-	struct cpu_hw_sf *cpuhw = &__get_cpu_var(cpu_hw_sf);
+	struct cpu_hw_sf *cpuhw = this_cpu_ptr(&cpu_hw_sf);
 
 	if (event->hw.state & PERF_HES_STOPPED)
 		return;
@@ -1346,7 +1346,7 @@ static void cpumsf_pmu_stop(struct perf_
 
 static int cpumsf_pmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_sf *cpuhw = &__get_cpu_var(cpu_hw_sf);
+	struct cpu_hw_sf *cpuhw = this_cpu_ptr(&cpu_hw_sf);
 	int err;
 
 	if (cpuhw->flags & PMU_F_IN_USE)
@@ -1397,7 +1397,7 @@ out:
 
 static void cpumsf_pmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_sf *cpuhw = &__get_cpu_var(cpu_hw_sf);
+	struct cpu_hw_sf *cpuhw = this_cpu_ptr(&cpu_hw_sf);
 
 	perf_pmu_disable(event->pmu);
 	cpumsf_pmu_stop(event, PERF_EF_UPDATE);
@@ -1470,7 +1470,7 @@ static void cpumf_measurement_alert(stru
 	if (!(alert & CPU_MF_INT_SF_MASK))
 		return;
 	inc_irq_stat(IRQEXT_CMS);
-	cpuhw = &__get_cpu_var(cpu_hw_sf);
+	cpuhw = this_cpu_ptr(&cpu_hw_sf);
 
 	/* Measurement alerts are shared and might happen when the PMU
 	 * is not reserved.  Ignore these alerts in this case. */
Index: linux/arch/s390/kernel/processor.c
===================================================================
--- linux.orig/arch/s390/kernel/processor.c
+++ linux/arch/s390/kernel/processor.c
@@ -23,8 +23,8 @@ static DEFINE_PER_CPU(struct cpuid, cpu_
  */
 void cpu_init(void)
 {
-	struct s390_idle_data *idle = &__get_cpu_var(s390_idle);
-	struct cpuid *id = &__get_cpu_var(cpu_id);
+	struct s390_idle_data *idle = this_cpu_ptr(&s390_idle);
+	struct cpuid *id = this_cpu_ptr(&cpu_id);
 
 	get_cpu_id(id);
 	atomic_inc(&init_mm.mm_count);
Index: linux/arch/s390/kernel/time.c
===================================================================
--- linux.orig/arch/s390/kernel/time.c
+++ linux/arch/s390/kernel/time.c
@@ -92,7 +92,7 @@ void clock_comparator_work(void)
 	struct clock_event_device *cd;
 
 	S390_lowcore.clock_comparator = -1ULL;
-	cd = &__get_cpu_var(comparators);
+	cd = this_cpu_ptr(&comparators);
 	cd->event_handler(cd);
 }
 
@@ -360,7 +360,7 @@ EXPORT_SYMBOL(get_sync_clock);
  */
 static void disable_sync_clock(void *dummy)
 {
-	atomic_t *sw_ptr = &__get_cpu_var(clock_sync_word);
+	atomic_t *sw_ptr = this_cpu_ptr(&clock_sync_word);
 	/*
 	 * Clear the in-sync bit 2^31. All get_sync_clock calls will
 	 * fail until the sync bit is turned back on. In addition
@@ -377,7 +377,7 @@ static void disable_sync_clock(void *dum
  */
 static void enable_sync_clock(void)
 {
-	atomic_t *sw_ptr = &__get_cpu_var(clock_sync_word);
+	atomic_t *sw_ptr = this_cpu_ptr(&clock_sync_word);
 	atomic_set_mask(0x80000000, sw_ptr);
 }
 
Index: linux/arch/s390/kernel/vtime.c
===================================================================
--- linux.orig/arch/s390/kernel/vtime.c
+++ linux/arch/s390/kernel/vtime.c
@@ -154,7 +154,7 @@ EXPORT_SYMBOL_GPL(vtime_account_system);
 
 void __kprobes vtime_stop_cpu(void)
 {
-	struct s390_idle_data *idle = &__get_cpu_var(s390_idle);
+	struct s390_idle_data *idle = this_cpu_ptr(&s390_idle);
 	unsigned long long idle_time;
 	unsigned long psw_mask;
 
Index: linux/arch/s390/oprofile/hwsampler.c
===================================================================
--- linux.orig/arch/s390/oprofile/hwsampler.c
+++ linux/arch/s390/oprofile/hwsampler.c
@@ -178,7 +178,7 @@ static int smp_ctl_qsi(int cpu)
 static void hws_ext_handler(struct ext_code ext_code,
 			    unsigned int param32, unsigned long param64)
 {
-	struct hws_cpu_buffer *cb = &__get_cpu_var(sampler_cpu_buffer);
+	struct hws_cpu_buffer *cb = this_cpu_ptr(&sampler_cpu_buffer);
 
 	if (!(param32 & CPU_MF_INT_SF_MASK))
 		return;


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 23/35] [PATCH 23/36] s390: cio driver &__get_cpu_var replacements
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (21 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 22/35] [PATCH 22/36] s390: " Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:13   ` Tejun Heo
  2014-08-17 17:30   ` Christoph Lameter
                   ` (12 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: 0023-s390-cio-driver-__get_cpu_var-replacements.patch --]
[-- Type: text/plain, Size: 3679 bytes --]

Use this_cpu_ptr() instead of &__get_cpu_var()

Signed-off-by: Christoph Lameter <cl@linux.com>
---
 drivers/s390/cio/ccwreq.c     | 2 +-
 drivers/s390/cio/chsc_sch.c   | 2 +-
 drivers/s390/cio/cio.c        | 6 +++---
 drivers/s390/cio/device_fsm.c | 4 ++--
 drivers/s390/cio/eadm_sch.c   | 2 +-
 5 files changed, 8 insertions(+), 8 deletions(-)

Index: linux/drivers/s390/cio/ccwreq.c
===================================================================
--- linux.orig/drivers/s390/cio/ccwreq.c
+++ linux/drivers/s390/cio/ccwreq.c
@@ -252,7 +252,7 @@ static void ccwreq_log_status(struct ccw
  */
 void ccw_request_handler(struct ccw_device *cdev)
 {
-	struct irb *irb = &__get_cpu_var(cio_irb);
+	struct irb *irb = this_cpu_ptr(&cio_irb);
 	struct ccw_request *req = &cdev->private->req;
 	enum io_status status;
 	int rc = -EOPNOTSUPP;
Index: linux/drivers/s390/cio/chsc_sch.c
===================================================================
--- linux.orig/drivers/s390/cio/chsc_sch.c
+++ linux/drivers/s390/cio/chsc_sch.c
@@ -58,7 +58,7 @@ static void chsc_subchannel_irq(struct s
 {
 	struct chsc_private *private = dev_get_drvdata(&sch->dev);
 	struct chsc_request *request = private->request;
-	struct irb *irb = &__get_cpu_var(cio_irb);
+	struct irb *irb = this_cpu_ptr(&cio_irb);
 
 	CHSC_LOG(4, "irb");
 	CHSC_LOG_HEX(4, irb, sizeof(*irb));
Index: linux/drivers/s390/cio/cio.c
===================================================================
--- linux.orig/drivers/s390/cio/cio.c
+++ linux/drivers/s390/cio/cio.c
@@ -563,7 +563,7 @@ static irqreturn_t do_cio_interrupt(int
 
 	__this_cpu_write(s390_idle.nohz_delay, 1);
 	tpi_info = (struct tpi_info *) &get_irq_regs()->int_code;
-	irb = &__get_cpu_var(cio_irb);
+	irb = this_cpu_ptr(&cio_irb);
 	sch = (struct subchannel *)(unsigned long) tpi_info->intparm;
 	if (!sch) {
 		/* Clear pending interrupt condition. */
@@ -613,7 +613,7 @@ void cio_tsch(struct subchannel *sch)
 	struct irb *irb;
 	int irq_context;
 
-	irb = &__get_cpu_var(cio_irb);
+	irb = this_cpu_ptr(&cio_irb);
 	/* Store interrupt response block to lowcore. */
 	if (tsch(sch->schid, irb) != 0)
 		/* Not status pending or not operational. */
@@ -751,7 +751,7 @@ __clear_io_subchannel_easy(struct subcha
 		struct tpi_info ti;
 
 		if (tpi(&ti)) {
-			tsch(ti.schid, &__get_cpu_var(cio_irb));
+			tsch(ti.schid, this_cpu_ptr(&cio_irb));
 			if (schid_equal(&ti.schid, &schid))
 				return 0;
 		}
Index: linux/drivers/s390/cio/device_fsm.c
===================================================================
--- linux.orig/drivers/s390/cio/device_fsm.c
+++ linux/drivers/s390/cio/device_fsm.c
@@ -739,7 +739,7 @@ ccw_device_irq(struct ccw_device *cdev,
 	struct irb *irb;
 	int is_cmd;
 
-	irb = &__get_cpu_var(cio_irb);
+	irb = this_cpu_ptr(&cio_irb);
 	is_cmd = !scsw_is_tm(&irb->scsw);
 	/* Check for unsolicited interrupt. */
 	if (!scsw_is_solicited(&irb->scsw)) {
@@ -805,7 +805,7 @@ ccw_device_w4sense(struct ccw_device *cd
 {
 	struct irb *irb;
 
-	irb = &__get_cpu_var(cio_irb);
+	irb = this_cpu_ptr(&cio_irb);
 	/* Check for unsolicited interrupt. */
 	if (scsw_stctl(&irb->scsw) ==
 	    (SCSW_STCTL_STATUS_PEND | SCSW_STCTL_ALERT_STATUS)) {
Index: linux/drivers/s390/cio/eadm_sch.c
===================================================================
--- linux.orig/drivers/s390/cio/eadm_sch.c
+++ linux/drivers/s390/cio/eadm_sch.c
@@ -134,7 +134,7 @@ static void eadm_subchannel_irq(struct s
 {
 	struct eadm_private *private = get_eadm_private(sch);
 	struct eadm_scsw *scsw = &sch->schib.scsw.eadm;
-	struct irb *irb = &__get_cpu_var(cio_irb);
+	struct irb *irb = this_cpu_ptr(&cio_irb);
 	int error = 0;
 
 	EADM_LOG(6, "irq");


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 24/35] [PATCH 24/36] ia64: Replace __get_cpu_var uses
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
@ 2014-08-17 17:30   ` Christoph Lameter
  2014-08-17 17:30 ` [PATCH 02/35] [PATCH 03/36] time: " Christoph Lameter
                     ` (34 subsequent siblings)
  35 siblings, 0 replies; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Tony Luck, Fenghua Yu, linux-ia64

[-- Attachment #1: 0024-ia64-Replace-__get_cpu_var-uses.patch --]
[-- Type: text/plain, Size: 14566 bytes --]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and less registers
are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non -x86
arches as well in order to optimize per cpu access by f.e.  using a global
register that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)


Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64@vger.kernel.org
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/ia64/include/asm/hw_irq.h     |  2 +-
 arch/ia64/include/asm/sn/arch.h    |  4 ++--
 arch/ia64/include/asm/sn/nodepda.h |  2 +-
 arch/ia64/include/asm/switch_to.h  |  2 +-
 arch/ia64/include/asm/uv/uv_hub.h  |  2 +-
 arch/ia64/kernel/irq.c             |  2 +-
 arch/ia64/kernel/irq_ia64.c        |  4 ++--
 arch/ia64/kernel/kprobes.c         |  6 +++---
 arch/ia64/kernel/mca.c             | 16 ++++++++--------
 arch/ia64/kernel/process.c         |  6 +++---
 arch/ia64/kernel/traps.c           |  2 +-
 arch/ia64/sn/kernel/sn2/sn2_smp.c  | 28 ++++++++++++++--------------
 12 files changed, 38 insertions(+), 38 deletions(-)

Index: linux/arch/ia64/include/asm/hw_irq.h
===================================================================
--- linux.orig/arch/ia64/include/asm/hw_irq.h
+++ linux/arch/ia64/include/asm/hw_irq.h
@@ -159,7 +159,7 @@ static inline ia64_vector __ia64_irq_to_
 static inline unsigned int
 __ia64_local_vector_to_irq (ia64_vector vec)
 {
-	return __get_cpu_var(vector_irq)[vec];
+	return __this_cpu_read(vector_irq[vec]);
 }
 #endif
 
Index: linux/arch/ia64/include/asm/sn/arch.h
===================================================================
--- linux.orig/arch/ia64/include/asm/sn/arch.h
+++ linux/arch/ia64/include/asm/sn/arch.h
@@ -57,7 +57,7 @@ struct sn_hub_info_s {
 	u16 nasid_bitmask;
 };
 DECLARE_PER_CPU(struct sn_hub_info_s, __sn_hub_info);
-#define sn_hub_info 	(&__get_cpu_var(__sn_hub_info))
+#define sn_hub_info 	this_cpu_ptr(&__sn_hub_info)
 #define is_shub2()	(sn_hub_info->shub2)
 #define is_shub1()	(sn_hub_info->shub2 == 0)
 
@@ -72,7 +72,7 @@ DECLARE_PER_CPU(struct sn_hub_info_s, __
  * cpu.
  */
 DECLARE_PER_CPU(short, __sn_cnodeid_to_nasid[MAX_COMPACT_NODES]);
-#define sn_cnodeid_to_nasid	(&__get_cpu_var(__sn_cnodeid_to_nasid[0]))
+#define sn_cnodeid_to_nasid	this_cpu_ptr(&__sn_cnodeid_to_nasid[0])
 
 
 extern u8 sn_partition_id;
Index: linux/arch/ia64/include/asm/sn/nodepda.h
===================================================================
--- linux.orig/arch/ia64/include/asm/sn/nodepda.h
+++ linux/arch/ia64/include/asm/sn/nodepda.h
@@ -70,7 +70,7 @@ typedef struct nodepda_s nodepda_t;
  */
 
 DECLARE_PER_CPU(struct nodepda_s *, __sn_nodepda);
-#define sn_nodepda		(__get_cpu_var(__sn_nodepda))
+#define sn_nodepda		__this_cpu_read(__sn_nodepda)
 #define	NODEPDA(cnodeid)	(sn_nodepda->pernode_pdaindr[cnodeid])
 
 /*
Index: linux/arch/ia64/include/asm/switch_to.h
===================================================================
--- linux.orig/arch/ia64/include/asm/switch_to.h
+++ linux/arch/ia64/include/asm/switch_to.h
@@ -32,7 +32,7 @@ extern void ia64_load_extra (struct task
 
 #ifdef CONFIG_PERFMON
   DECLARE_PER_CPU(unsigned long, pfm_syst_info);
-# define PERFMON_IS_SYSWIDE() (__get_cpu_var(pfm_syst_info) & 0x1)
+# define PERFMON_IS_SYSWIDE() (__this_cpu_read(pfm_syst_info) & 0x1)
 #else
 # define PERFMON_IS_SYSWIDE() (0)
 #endif
Index: linux/arch/ia64/include/asm/uv/uv_hub.h
===================================================================
--- linux.orig/arch/ia64/include/asm/uv/uv_hub.h
+++ linux/arch/ia64/include/asm/uv/uv_hub.h
@@ -108,7 +108,7 @@ struct uv_hub_info_s {
 	unsigned char	n_val;
 };
 DECLARE_PER_CPU(struct uv_hub_info_s, __uv_hub_info);
-#define uv_hub_info 		(&__get_cpu_var(__uv_hub_info))
+#define uv_hub_info 		this_cpu_ptr(&__uv_hub_info)
 #define uv_cpu_hub_info(cpu)	(&per_cpu(__uv_hub_info, cpu))
 
 /*
Index: linux/arch/ia64/kernel/irq.c
===================================================================
--- linux.orig/arch/ia64/kernel/irq.c
+++ linux/arch/ia64/kernel/irq.c
@@ -42,7 +42,7 @@ ia64_vector __ia64_irq_to_vector(int irq
 
 unsigned int __ia64_local_vector_to_irq (ia64_vector vec)
 {
-	return __get_cpu_var(vector_irq)[vec];
+	return __this_cpu_read(vector_irq[vec]);
 }
 #endif
 
Index: linux/arch/ia64/kernel/irq_ia64.c
===================================================================
--- linux.orig/arch/ia64/kernel/irq_ia64.c
+++ linux/arch/ia64/kernel/irq_ia64.c
@@ -330,7 +330,7 @@ static irqreturn_t smp_irq_move_cleanup_
 		int irq;
 		struct irq_desc *desc;
 		struct irq_cfg *cfg;
-		irq = __get_cpu_var(vector_irq)[vector];
+		irq = __this_cpu_read(vector_irq[vector]);
 		if (irq < 0)
 			continue;
 
@@ -344,7 +344,7 @@ static irqreturn_t smp_irq_move_cleanup_
 			goto unlock;
 
 		spin_lock_irqsave(&vector_lock, flags);
-		__get_cpu_var(vector_irq)[vector] = -1;
+		__this_cpu_write(vector_irq[vector], -1);
 		cpu_clear(me, vector_table[vector]);
 		spin_unlock_irqrestore(&vector_lock, flags);
 		cfg->move_cleanup_count--;
Index: linux/arch/ia64/kernel/kprobes.c
===================================================================
--- linux.orig/arch/ia64/kernel/kprobes.c
+++ linux/arch/ia64/kernel/kprobes.c
@@ -396,7 +396,7 @@ static void __kprobes restore_previous_k
 {
 	unsigned int i;
 	i = atomic_read(&kcb->prev_kprobe_index);
-	__get_cpu_var(current_kprobe) = kcb->prev_kprobe[i-1].kp;
+	__this_cpu_write(current_kprobe, kcb->prev_kprobe[i-1].kp);
 	kcb->kprobe_status = kcb->prev_kprobe[i-1].status;
 	atomic_sub(1, &kcb->prev_kprobe_index);
 }
@@ -404,7 +404,7 @@ static void __kprobes restore_previous_k
 static void __kprobes set_current_kprobe(struct kprobe *p,
 			struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = p;
+	__this_cpu_write(current_kprobe, p);
 }
 
 static void kretprobe_trampoline(void)
@@ -823,7 +823,7 @@ static int __kprobes pre_kprobes_handler
 			/*
 			 * jprobe instrumented function just completed
 			 */
-			p = __get_cpu_var(current_kprobe);
+			p = __this_cpu_read(current_kprobe);
 			if (p->break_handler && p->break_handler(p, regs)) {
 				goto ss_probe;
 			}
Index: linux/arch/ia64/kernel/mca.c
===================================================================
--- linux.orig/arch/ia64/kernel/mca.c
+++ linux/arch/ia64/kernel/mca.c
@@ -1341,7 +1341,7 @@ ia64_mca_handler(struct pt_regs *regs, s
 		ia64_mlogbuf_finish(1);
 	}
 
-	if (__get_cpu_var(ia64_mca_tr_reload)) {
+	if (__this_cpu_read(ia64_mca_tr_reload)) {
 		mca_insert_tr(0x1); /*Reload dynamic itrs*/
 		mca_insert_tr(0x2); /*Reload dynamic itrs*/
 	}
@@ -1868,14 +1868,14 @@ ia64_mca_cpu_init(void *cpu_data)
 		"MCA", cpu);
 	format_mca_init_stack(data, offsetof(struct ia64_mca_cpu, init_stack),
 		"INIT", cpu);
-	__get_cpu_var(ia64_mca_data) = __per_cpu_mca[cpu] = __pa(data);
+	__this_cpu_write(ia64_mca_data, (__per_cpu_mca[cpu] = __pa(data)));
 
 	/*
 	 * Stash away a copy of the PTE needed to map the per-CPU page.
 	 * We may need it during MCA recovery.
 	 */
-	__get_cpu_var(ia64_mca_per_cpu_pte) =
-		pte_val(mk_pte_phys(__pa(cpu_data), PAGE_KERNEL));
+	__this_cpu_write(ia64_mca_per_cpu_pte,
+		pte_val(mk_pte_phys(__pa(cpu_data), PAGE_KERNEL)));
 
 	/*
 	 * Also, stash away a copy of the PAL address and the PTE
@@ -1884,10 +1884,10 @@ ia64_mca_cpu_init(void *cpu_data)
 	pal_vaddr = efi_get_pal_addr();
 	if (!pal_vaddr)
 		return;
-	__get_cpu_var(ia64_mca_pal_base) =
-		GRANULEROUNDDOWN((unsigned long) pal_vaddr);
-	__get_cpu_var(ia64_mca_pal_pte) = pte_val(mk_pte_phys(__pa(pal_vaddr),
-							      PAGE_KERNEL));
+	__this_cpu_write(ia64_mca_pal_base,
+		GRANULEROUNDDOWN((unsigned long) pal_vaddr));
+	__this_cpu_write(ia64_mca_pal_pte, pte_val(mk_pte_phys(__pa(pal_vaddr),
+							      PAGE_KERNEL)));
 }
 
 static void ia64_mca_cmc_vector_adjust(void *dummy)
Index: linux/arch/ia64/kernel/process.c
===================================================================
--- linux.orig/arch/ia64/kernel/process.c
+++ linux/arch/ia64/kernel/process.c
@@ -215,7 +215,7 @@ static inline void play_dead(void)
 	unsigned int this_cpu = smp_processor_id();
 
 	/* Ack it */
-	__get_cpu_var(cpu_state) = CPU_DEAD;
+	__this_cpu_write(cpu_state, CPU_DEAD);
 
 	max_xtp();
 	local_irq_disable();
@@ -273,7 +273,7 @@ ia64_save_extra (struct task_struct *tas
 	if ((task->thread.flags & IA64_THREAD_PM_VALID) != 0)
 		pfm_save_regs(task);
 
-	info = __get_cpu_var(pfm_syst_info);
+	info = __this_cpu_read(pfm_syst_info);
 	if (info & PFM_CPUINFO_SYST_WIDE)
 		pfm_syst_wide_update_task(task, info, 0);
 #endif
@@ -293,7 +293,7 @@ ia64_load_extra (struct task_struct *tas
 	if ((task->thread.flags & IA64_THREAD_PM_VALID) != 0)
 		pfm_load_regs(task);
 
-	info = __get_cpu_var(pfm_syst_info);
+	info = __this_cpu_read(pfm_syst_info);
 	if (info & PFM_CPUINFO_SYST_WIDE) 
 		pfm_syst_wide_update_task(task, info, 1);
 #endif
Index: linux/arch/ia64/kernel/traps.c
===================================================================
--- linux.orig/arch/ia64/kernel/traps.c
+++ linux/arch/ia64/kernel/traps.c
@@ -299,7 +299,7 @@ handle_fpu_swa (int fp_fault, struct pt_
 
 	if (!(current->thread.flags & IA64_THREAD_FPEMU_NOPRINT))  {
 		unsigned long count, current_jiffies = jiffies;
-		struct fpu_swa_msg *cp = &__get_cpu_var(cpulast);
+		struct fpu_swa_msg *cp = this_cpu_ptr(&cpulast);
 
 		if (unlikely(current_jiffies > cp->time))
 			cp->count = 0;
Index: linux/arch/ia64/sn/kernel/sn2/sn2_smp.c
===================================================================
--- linux.orig/arch/ia64/sn/kernel/sn2/sn2_smp.c
+++ linux/arch/ia64/sn/kernel/sn2/sn2_smp.c
@@ -134,8 +134,8 @@ sn2_ipi_flush_all_tlb(struct mm_struct *
 	itc = ia64_get_itc();
 	smp_flush_tlb_cpumask(*mm_cpumask(mm));
 	itc = ia64_get_itc() - itc;
-	__get_cpu_var(ptcstats).shub_ipi_flushes_itc_clocks += itc;
-	__get_cpu_var(ptcstats).shub_ipi_flushes++;
+	__this_cpu_add(ptcstats.shub_ipi_flushes_itc_clocks, itc);
+	__this_cpu_inc(ptcstats.shub_ipi_flushes);
 }
 
 /**
@@ -199,14 +199,14 @@ sn2_global_tlb_purge(struct mm_struct *m
 			start += (1UL << nbits);
 		} while (start < end);
 		ia64_srlz_i();
-		__get_cpu_var(ptcstats).ptc_l++;
+		__this_cpu_inc(ptcstats.ptc_l);
 		preempt_enable();
 		return;
 	}
 
 	if (atomic_read(&mm->mm_users) == 1 && mymm) {
 		flush_tlb_mm(mm);
-		__get_cpu_var(ptcstats).change_rid++;
+		__this_cpu_inc(ptcstats.change_rid);
 		preempt_enable();
 		return;
 	}
@@ -250,11 +250,11 @@ sn2_global_tlb_purge(struct mm_struct *m
 	spin_lock_irqsave(PTC_LOCK(shub1), flags);
 	itc2 = ia64_get_itc();
 
-	__get_cpu_var(ptcstats).lock_itc_clocks += itc2 - itc;
-	__get_cpu_var(ptcstats).shub_ptc_flushes++;
-	__get_cpu_var(ptcstats).nodes_flushed += nix;
+	__this_cpu_add(ptcstats.lock_itc_clocks, itc2 - itc);
+	__this_cpu_inc(ptcstats.shub_ptc_flushes);
+	__this_cpu_add(ptcstats.nodes_flushed, nix);
 	if (!mymm)
-		 __get_cpu_var(ptcstats).shub_ptc_flushes_not_my_mm++;
+		 __this_cpu_inc(ptcstats.shub_ptc_flushes_not_my_mm);
 
 	if (use_cpu_ptcga && !mymm) {
 		old_rr = ia64_get_rr(start);
@@ -299,9 +299,9 @@ sn2_global_tlb_purge(struct mm_struct *m
 
 done:
 	itc2 = ia64_get_itc() - itc2;
-	__get_cpu_var(ptcstats).shub_itc_clocks += itc2;
-	if (itc2 > __get_cpu_var(ptcstats).shub_itc_clocks_max)
-		__get_cpu_var(ptcstats).shub_itc_clocks_max = itc2;
+	__this_cpu_add(ptcstats.shub_itc_clocks, itc2);
+	if (itc2 > __this_cpu_read(ptcstats.shub_itc_clocks_max))
+		__this_cpu_write(ptcstats.shub_itc_clocks_max, itc2);
 
 	if (old_rr) {
 		ia64_set_rr(start, old_rr);
@@ -311,7 +311,7 @@ done:
 	spin_unlock_irqrestore(PTC_LOCK(shub1), flags);
 
 	if (flush_opt == 1 && deadlock) {
-		__get_cpu_var(ptcstats).deadlocks++;
+		__this_cpu_inc(ptcstats.deadlocks);
 		sn2_ipi_flush_all_tlb(mm);
 	}
 
@@ -334,7 +334,7 @@ sn2_ptc_deadlock_recovery(short *nasids,
 	short nasid, i;
 	unsigned long *piows, zeroval, n;
 
-	__get_cpu_var(ptcstats).deadlocks++;
+	__this_cpu_inc(ptcstats.deadlocks);
 
 	piows = (unsigned long *) pda->pio_write_status_addr;
 	zeroval = pda->pio_write_status_val;
@@ -349,7 +349,7 @@ sn2_ptc_deadlock_recovery(short *nasids,
 			ptc1 = CHANGE_NASID(nasid, ptc1);
 
 		n = sn2_ptc_deadlock_recovery_core(ptc0, data0, ptc1, data1, piows, zeroval);
-		__get_cpu_var(ptcstats).deadlocks2 += n;
+		__this_cpu_add(ptcstats.deadlocks2, n);
 	}
 
 }


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 24/35] [PATCH 24/36] ia64: Replace __get_cpu_var uses
@ 2014-08-17 17:30   ` Christoph Lameter
  0 siblings, 0 replies; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Tony Luck, Fenghua Yu, linux-ia64

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and less registers
are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non -x86
arches as well in order to optimize per cpu access by f.e.  using a global
register that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)


Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64@vger.kernel.org
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/ia64/include/asm/hw_irq.h     |  2 +-
 arch/ia64/include/asm/sn/arch.h    |  4 ++--
 arch/ia64/include/asm/sn/nodepda.h |  2 +-
 arch/ia64/include/asm/switch_to.h  |  2 +-
 arch/ia64/include/asm/uv/uv_hub.h  |  2 +-
 arch/ia64/kernel/irq.c             |  2 +-
 arch/ia64/kernel/irq_ia64.c        |  4 ++--
 arch/ia64/kernel/kprobes.c         |  6 +++---
 arch/ia64/kernel/mca.c             | 16 ++++++++--------
 arch/ia64/kernel/process.c         |  6 +++---
 arch/ia64/kernel/traps.c           |  2 +-
 arch/ia64/sn/kernel/sn2/sn2_smp.c  | 28 ++++++++++++++--------------
 12 files changed, 38 insertions(+), 38 deletions(-)

Index: linux/arch/ia64/include/asm/hw_irq.h
=================================--- linux.orig/arch/ia64/include/asm/hw_irq.h
+++ linux/arch/ia64/include/asm/hw_irq.h
@@ -159,7 +159,7 @@ static inline ia64_vector __ia64_irq_to_
 static inline unsigned int
 __ia64_local_vector_to_irq (ia64_vector vec)
 {
-	return __get_cpu_var(vector_irq)[vec];
+	return __this_cpu_read(vector_irq[vec]);
 }
 #endif
 
Index: linux/arch/ia64/include/asm/sn/arch.h
=================================--- linux.orig/arch/ia64/include/asm/sn/arch.h
+++ linux/arch/ia64/include/asm/sn/arch.h
@@ -57,7 +57,7 @@ struct sn_hub_info_s {
 	u16 nasid_bitmask;
 };
 DECLARE_PER_CPU(struct sn_hub_info_s, __sn_hub_info);
-#define sn_hub_info 	(&__get_cpu_var(__sn_hub_info))
+#define sn_hub_info 	this_cpu_ptr(&__sn_hub_info)
 #define is_shub2()	(sn_hub_info->shub2)
 #define is_shub1()	(sn_hub_info->shub2 = 0)
 
@@ -72,7 +72,7 @@ DECLARE_PER_CPU(struct sn_hub_info_s, __
  * cpu.
  */
 DECLARE_PER_CPU(short, __sn_cnodeid_to_nasid[MAX_COMPACT_NODES]);
-#define sn_cnodeid_to_nasid	(&__get_cpu_var(__sn_cnodeid_to_nasid[0]))
+#define sn_cnodeid_to_nasid	this_cpu_ptr(&__sn_cnodeid_to_nasid[0])
 
 
 extern u8 sn_partition_id;
Index: linux/arch/ia64/include/asm/sn/nodepda.h
=================================--- linux.orig/arch/ia64/include/asm/sn/nodepda.h
+++ linux/arch/ia64/include/asm/sn/nodepda.h
@@ -70,7 +70,7 @@ typedef struct nodepda_s nodepda_t;
  */
 
 DECLARE_PER_CPU(struct nodepda_s *, __sn_nodepda);
-#define sn_nodepda		(__get_cpu_var(__sn_nodepda))
+#define sn_nodepda		__this_cpu_read(__sn_nodepda)
 #define	NODEPDA(cnodeid)	(sn_nodepda->pernode_pdaindr[cnodeid])
 
 /*
Index: linux/arch/ia64/include/asm/switch_to.h
=================================--- linux.orig/arch/ia64/include/asm/switch_to.h
+++ linux/arch/ia64/include/asm/switch_to.h
@@ -32,7 +32,7 @@ extern void ia64_load_extra (struct task
 
 #ifdef CONFIG_PERFMON
   DECLARE_PER_CPU(unsigned long, pfm_syst_info);
-# define PERFMON_IS_SYSWIDE() (__get_cpu_var(pfm_syst_info) & 0x1)
+# define PERFMON_IS_SYSWIDE() (__this_cpu_read(pfm_syst_info) & 0x1)
 #else
 # define PERFMON_IS_SYSWIDE() (0)
 #endif
Index: linux/arch/ia64/include/asm/uv/uv_hub.h
=================================--- linux.orig/arch/ia64/include/asm/uv/uv_hub.h
+++ linux/arch/ia64/include/asm/uv/uv_hub.h
@@ -108,7 +108,7 @@ struct uv_hub_info_s {
 	unsigned char	n_val;
 };
 DECLARE_PER_CPU(struct uv_hub_info_s, __uv_hub_info);
-#define uv_hub_info 		(&__get_cpu_var(__uv_hub_info))
+#define uv_hub_info 		this_cpu_ptr(&__uv_hub_info)
 #define uv_cpu_hub_info(cpu)	(&per_cpu(__uv_hub_info, cpu))
 
 /*
Index: linux/arch/ia64/kernel/irq.c
=================================--- linux.orig/arch/ia64/kernel/irq.c
+++ linux/arch/ia64/kernel/irq.c
@@ -42,7 +42,7 @@ ia64_vector __ia64_irq_to_vector(int irq
 
 unsigned int __ia64_local_vector_to_irq (ia64_vector vec)
 {
-	return __get_cpu_var(vector_irq)[vec];
+	return __this_cpu_read(vector_irq[vec]);
 }
 #endif
 
Index: linux/arch/ia64/kernel/irq_ia64.c
=================================--- linux.orig/arch/ia64/kernel/irq_ia64.c
+++ linux/arch/ia64/kernel/irq_ia64.c
@@ -330,7 +330,7 @@ static irqreturn_t smp_irq_move_cleanup_
 		int irq;
 		struct irq_desc *desc;
 		struct irq_cfg *cfg;
-		irq = __get_cpu_var(vector_irq)[vector];
+		irq = __this_cpu_read(vector_irq[vector]);
 		if (irq < 0)
 			continue;
 
@@ -344,7 +344,7 @@ static irqreturn_t smp_irq_move_cleanup_
 			goto unlock;
 
 		spin_lock_irqsave(&vector_lock, flags);
-		__get_cpu_var(vector_irq)[vector] = -1;
+		__this_cpu_write(vector_irq[vector], -1);
 		cpu_clear(me, vector_table[vector]);
 		spin_unlock_irqrestore(&vector_lock, flags);
 		cfg->move_cleanup_count--;
Index: linux/arch/ia64/kernel/kprobes.c
=================================--- linux.orig/arch/ia64/kernel/kprobes.c
+++ linux/arch/ia64/kernel/kprobes.c
@@ -396,7 +396,7 @@ static void __kprobes restore_previous_k
 {
 	unsigned int i;
 	i = atomic_read(&kcb->prev_kprobe_index);
-	__get_cpu_var(current_kprobe) = kcb->prev_kprobe[i-1].kp;
+	__this_cpu_write(current_kprobe, kcb->prev_kprobe[i-1].kp);
 	kcb->kprobe_status = kcb->prev_kprobe[i-1].status;
 	atomic_sub(1, &kcb->prev_kprobe_index);
 }
@@ -404,7 +404,7 @@ static void __kprobes restore_previous_k
 static void __kprobes set_current_kprobe(struct kprobe *p,
 			struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = p;
+	__this_cpu_write(current_kprobe, p);
 }
 
 static void kretprobe_trampoline(void)
@@ -823,7 +823,7 @@ static int __kprobes pre_kprobes_handler
 			/*
 			 * jprobe instrumented function just completed
 			 */
-			p = __get_cpu_var(current_kprobe);
+			p = __this_cpu_read(current_kprobe);
 			if (p->break_handler && p->break_handler(p, regs)) {
 				goto ss_probe;
 			}
Index: linux/arch/ia64/kernel/mca.c
=================================--- linux.orig/arch/ia64/kernel/mca.c
+++ linux/arch/ia64/kernel/mca.c
@@ -1341,7 +1341,7 @@ ia64_mca_handler(struct pt_regs *regs, s
 		ia64_mlogbuf_finish(1);
 	}
 
-	if (__get_cpu_var(ia64_mca_tr_reload)) {
+	if (__this_cpu_read(ia64_mca_tr_reload)) {
 		mca_insert_tr(0x1); /*Reload dynamic itrs*/
 		mca_insert_tr(0x2); /*Reload dynamic itrs*/
 	}
@@ -1868,14 +1868,14 @@ ia64_mca_cpu_init(void *cpu_data)
 		"MCA", cpu);
 	format_mca_init_stack(data, offsetof(struct ia64_mca_cpu, init_stack),
 		"INIT", cpu);
-	__get_cpu_var(ia64_mca_data) = __per_cpu_mca[cpu] = __pa(data);
+	__this_cpu_write(ia64_mca_data, (__per_cpu_mca[cpu] = __pa(data)));
 
 	/*
 	 * Stash away a copy of the PTE needed to map the per-CPU page.
 	 * We may need it during MCA recovery.
 	 */
-	__get_cpu_var(ia64_mca_per_cpu_pte) -		pte_val(mk_pte_phys(__pa(cpu_data), PAGE_KERNEL));
+	__this_cpu_write(ia64_mca_per_cpu_pte,
+		pte_val(mk_pte_phys(__pa(cpu_data), PAGE_KERNEL)));
 
 	/*
 	 * Also, stash away a copy of the PAL address and the PTE
@@ -1884,10 +1884,10 @@ ia64_mca_cpu_init(void *cpu_data)
 	pal_vaddr = efi_get_pal_addr();
 	if (!pal_vaddr)
 		return;
-	__get_cpu_var(ia64_mca_pal_base) -		GRANULEROUNDDOWN((unsigned long) pal_vaddr);
-	__get_cpu_var(ia64_mca_pal_pte) = pte_val(mk_pte_phys(__pa(pal_vaddr),
-							      PAGE_KERNEL));
+	__this_cpu_write(ia64_mca_pal_base,
+		GRANULEROUNDDOWN((unsigned long) pal_vaddr));
+	__this_cpu_write(ia64_mca_pal_pte, pte_val(mk_pte_phys(__pa(pal_vaddr),
+							      PAGE_KERNEL)));
 }
 
 static void ia64_mca_cmc_vector_adjust(void *dummy)
Index: linux/arch/ia64/kernel/process.c
=================================--- linux.orig/arch/ia64/kernel/process.c
+++ linux/arch/ia64/kernel/process.c
@@ -215,7 +215,7 @@ static inline void play_dead(void)
 	unsigned int this_cpu = smp_processor_id();
 
 	/* Ack it */
-	__get_cpu_var(cpu_state) = CPU_DEAD;
+	__this_cpu_write(cpu_state, CPU_DEAD);
 
 	max_xtp();
 	local_irq_disable();
@@ -273,7 +273,7 @@ ia64_save_extra (struct task_struct *tas
 	if ((task->thread.flags & IA64_THREAD_PM_VALID) != 0)
 		pfm_save_regs(task);
 
-	info = __get_cpu_var(pfm_syst_info);
+	info = __this_cpu_read(pfm_syst_info);
 	if (info & PFM_CPUINFO_SYST_WIDE)
 		pfm_syst_wide_update_task(task, info, 0);
 #endif
@@ -293,7 +293,7 @@ ia64_load_extra (struct task_struct *tas
 	if ((task->thread.flags & IA64_THREAD_PM_VALID) != 0)
 		pfm_load_regs(task);
 
-	info = __get_cpu_var(pfm_syst_info);
+	info = __this_cpu_read(pfm_syst_info);
 	if (info & PFM_CPUINFO_SYST_WIDE) 
 		pfm_syst_wide_update_task(task, info, 1);
 #endif
Index: linux/arch/ia64/kernel/traps.c
=================================--- linux.orig/arch/ia64/kernel/traps.c
+++ linux/arch/ia64/kernel/traps.c
@@ -299,7 +299,7 @@ handle_fpu_swa (int fp_fault, struct pt_
 
 	if (!(current->thread.flags & IA64_THREAD_FPEMU_NOPRINT))  {
 		unsigned long count, current_jiffies = jiffies;
-		struct fpu_swa_msg *cp = &__get_cpu_var(cpulast);
+		struct fpu_swa_msg *cp = this_cpu_ptr(&cpulast);
 
 		if (unlikely(current_jiffies > cp->time))
 			cp->count = 0;
Index: linux/arch/ia64/sn/kernel/sn2/sn2_smp.c
=================================--- linux.orig/arch/ia64/sn/kernel/sn2/sn2_smp.c
+++ linux/arch/ia64/sn/kernel/sn2/sn2_smp.c
@@ -134,8 +134,8 @@ sn2_ipi_flush_all_tlb(struct mm_struct *
 	itc = ia64_get_itc();
 	smp_flush_tlb_cpumask(*mm_cpumask(mm));
 	itc = ia64_get_itc() - itc;
-	__get_cpu_var(ptcstats).shub_ipi_flushes_itc_clocks += itc;
-	__get_cpu_var(ptcstats).shub_ipi_flushes++;
+	__this_cpu_add(ptcstats.shub_ipi_flushes_itc_clocks, itc);
+	__this_cpu_inc(ptcstats.shub_ipi_flushes);
 }
 
 /**
@@ -199,14 +199,14 @@ sn2_global_tlb_purge(struct mm_struct *m
 			start += (1UL << nbits);
 		} while (start < end);
 		ia64_srlz_i();
-		__get_cpu_var(ptcstats).ptc_l++;
+		__this_cpu_inc(ptcstats.ptc_l);
 		preempt_enable();
 		return;
 	}
 
 	if (atomic_read(&mm->mm_users) = 1 && mymm) {
 		flush_tlb_mm(mm);
-		__get_cpu_var(ptcstats).change_rid++;
+		__this_cpu_inc(ptcstats.change_rid);
 		preempt_enable();
 		return;
 	}
@@ -250,11 +250,11 @@ sn2_global_tlb_purge(struct mm_struct *m
 	spin_lock_irqsave(PTC_LOCK(shub1), flags);
 	itc2 = ia64_get_itc();
 
-	__get_cpu_var(ptcstats).lock_itc_clocks += itc2 - itc;
-	__get_cpu_var(ptcstats).shub_ptc_flushes++;
-	__get_cpu_var(ptcstats).nodes_flushed += nix;
+	__this_cpu_add(ptcstats.lock_itc_clocks, itc2 - itc);
+	__this_cpu_inc(ptcstats.shub_ptc_flushes);
+	__this_cpu_add(ptcstats.nodes_flushed, nix);
 	if (!mymm)
-		 __get_cpu_var(ptcstats).shub_ptc_flushes_not_my_mm++;
+		 __this_cpu_inc(ptcstats.shub_ptc_flushes_not_my_mm);
 
 	if (use_cpu_ptcga && !mymm) {
 		old_rr = ia64_get_rr(start);
@@ -299,9 +299,9 @@ sn2_global_tlb_purge(struct mm_struct *m
 
 done:
 	itc2 = ia64_get_itc() - itc2;
-	__get_cpu_var(ptcstats).shub_itc_clocks += itc2;
-	if (itc2 > __get_cpu_var(ptcstats).shub_itc_clocks_max)
-		__get_cpu_var(ptcstats).shub_itc_clocks_max = itc2;
+	__this_cpu_add(ptcstats.shub_itc_clocks, itc2);
+	if (itc2 > __this_cpu_read(ptcstats.shub_itc_clocks_max))
+		__this_cpu_write(ptcstats.shub_itc_clocks_max, itc2);
 
 	if (old_rr) {
 		ia64_set_rr(start, old_rr);
@@ -311,7 +311,7 @@ done:
 	spin_unlock_irqrestore(PTC_LOCK(shub1), flags);
 
 	if (flush_opt = 1 && deadlock) {
-		__get_cpu_var(ptcstats).deadlocks++;
+		__this_cpu_inc(ptcstats.deadlocks);
 		sn2_ipi_flush_all_tlb(mm);
 	}
 
@@ -334,7 +334,7 @@ sn2_ptc_deadlock_recovery(short *nasids,
 	short nasid, i;
 	unsigned long *piows, zeroval, n;
 
-	__get_cpu_var(ptcstats).deadlocks++;
+	__this_cpu_inc(ptcstats.deadlocks);
 
 	piows = (unsigned long *) pda->pio_write_status_addr;
 	zeroval = pda->pio_write_status_val;
@@ -349,7 +349,7 @@ sn2_ptc_deadlock_recovery(short *nasids,
 			ptc1 = CHANGE_NASID(nasid, ptc1);
 
 		n = sn2_ptc_deadlock_recovery_core(ptc0, data0, ptc1, data1, piows, zeroval);
-		__get_cpu_var(ptcstats).deadlocks2 += n;
+		__this_cpu_add(ptcstats.deadlocks2, n);
 	}
 
 }


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 25/35] [PATCH 25/36] alpha: Replace __get_cpu_var
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (23 preceding siblings ...)
  2014-08-17 17:30   ` Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:14   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 26/35] [PATCH 26/36] powerpc: Replace __get_cpu_var uses Christoph Lameter
                   ` (10 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Ivan Kokshaysky, Matt Turner, Richard Henderson

[-- Attachment #1: 0025-alpha-Replace-__get_cpu_var.patch --]
[-- Type: text/plain, Size: 6872 bytes --]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and less registers
are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non -x86
arches as well in order to optimize per cpu access by f.e.  using a global
register that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)

CC: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/alpha/kernel/perf_event.c     | 16 ++++++++--------
 arch/alpha/kernel/time.c           |  6 +++---
 arch/powerpc/include/asm/cputime.h |  6 +++---
 3 files changed, 14 insertions(+), 14 deletions(-)

Index: linux/arch/alpha/kernel/perf_event.c
===================================================================
--- linux.orig/arch/alpha/kernel/perf_event.c
+++ linux/arch/alpha/kernel/perf_event.c
@@ -431,7 +431,7 @@ static void maybe_change_configuration(s
  */
 static int alpha_pmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int n0;
 	int ret;
@@ -483,7 +483,7 @@ static int alpha_pmu_add(struct perf_eve
  */
 static void alpha_pmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned long irq_flags;
 	int j;
@@ -531,7 +531,7 @@ static void alpha_pmu_read(struct perf_e
 static void alpha_pmu_stop(struct perf_event *event, int flags)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (!(hwc->state & PERF_HES_STOPPED)) {
 		cpuc->idx_mask &= ~(1UL<<hwc->idx);
@@ -551,7 +551,7 @@ static void alpha_pmu_stop(struct perf_e
 static void alpha_pmu_start(struct perf_event *event, int flags)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED)))
 		return;
@@ -724,7 +724,7 @@ static int alpha_pmu_event_init(struct p
  */
 static void alpha_pmu_enable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (cpuc->enabled)
 		return;
@@ -750,7 +750,7 @@ static void alpha_pmu_enable(struct pmu
 
 static void alpha_pmu_disable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (!cpuc->enabled)
 		return;
@@ -814,8 +814,8 @@ static void alpha_perf_event_irq_handler
 	struct hw_perf_event *hwc;
 	int idx, j;
 
-	__get_cpu_var(irq_pmi_count)++;
-	cpuc = &__get_cpu_var(cpu_hw_events);
+	__this_cpu_inc(irq_pmi_count);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	/* Completely counting through the PMC's period to trigger a new PMC
 	 * overflow interrupt while in this interrupt routine is utterly
Index: linux/arch/alpha/kernel/time.c
===================================================================
--- linux.orig/arch/alpha/kernel/time.c
+++ linux/arch/alpha/kernel/time.c
@@ -56,9 +56,9 @@ unsigned long est_cycle_freq;
 
 DEFINE_PER_CPU(u8, irq_work_pending);
 
-#define set_irq_work_pending_flag()  __get_cpu_var(irq_work_pending) = 1
-#define test_irq_work_pending()      __get_cpu_var(irq_work_pending)
-#define clear_irq_work_pending()     __get_cpu_var(irq_work_pending) = 0
+#define set_irq_work_pending_flag()  __this_cpu_write(irq_work_pending, 1)
+#define test_irq_work_pending()      __this_cpu_read(irq_work_pending)
+#define clear_irq_work_pending()     __this_cpu_write(irq_work_pending, 0)
 
 void arch_irq_work_raise(void)
 {
Index: linux/arch/powerpc/include/asm/cputime.h
===================================================================
--- linux.orig/arch/powerpc/include/asm/cputime.h
+++ linux/arch/powerpc/include/asm/cputime.h
@@ -56,10 +56,10 @@ static inline unsigned long cputime_to_j
 static inline cputime_t cputime_to_scaled(const cputime_t ct)
 {
 	if (cpu_has_feature(CPU_FTR_SPURR) &&
-	    __get_cpu_var(cputime_last_delta))
+	    __this_cpu_read(cputime_last_delta))
 		return (__force u64) ct *
-			__get_cpu_var(cputime_scaled_last_delta) /
-			__get_cpu_var(cputime_last_delta);
+			__this_cpu_read(cputime_scaled_last_delta) /
+			__this_cpu_read(cputime_last_delta);
 	return ct;
 }
 


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 26/35] [PATCH 26/36] powerpc: Replace __get_cpu_var uses
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (24 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 25/35] [PATCH 25/36] alpha: Replace __get_cpu_var Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-18  2:45   ` Christoph Lameter
  2014-08-26 18:14   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 27/35] [PATCH 27/36] tile: " Christoph Lameter
                   ` (9 subsequent siblings)
  35 siblings, 2 replies; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Benjamin Herrenschmidt, Paul Mackerras

[-- Attachment #1: 0026-powerpc-Replace-__get_cpu_var-uses.patch --]
[-- Type: text/plain, Size: 37799 bytes --]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and less registers
are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non -x86
arches as well in order to optimize per cpu access by f.e.  using a global
register that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)


Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/powerpc/include/asm/hardirq.h           |  4 +++-
 arch/powerpc/include/asm/tlbflush.h          |  4 ++--
 arch/powerpc/include/asm/xics.h              |  8 ++++----
 arch/powerpc/kernel/dbell.c                  |  2 +-
 arch/powerpc/kernel/hw_breakpoint.c          |  6 +++---
 arch/powerpc/kernel/iommu.c                  |  2 +-
 arch/powerpc/kernel/irq.c                    |  4 ++--
 arch/powerpc/kernel/kgdb.c                   |  2 +-
 arch/powerpc/kernel/kprobes.c                |  6 +++---
 arch/powerpc/kernel/mce.c                    | 24 ++++++++++++------------
 arch/powerpc/kernel/process.c                | 10 +++++-----
 arch/powerpc/kernel/smp.c                    |  6 +++---
 arch/powerpc/kernel/sysfs.c                  |  4 ++--
 arch/powerpc/kernel/time.c                   | 22 +++++++++++-----------
 arch/powerpc/kernel/traps.c                  |  6 +++---
 arch/powerpc/kvm/e500.c                      | 14 +++++++-------
 arch/powerpc/kvm/e500mc.c                    |  4 ++--
 arch/powerpc/mm/hash_native_64.c             |  2 +-
 arch/powerpc/mm/hash_utils_64.c              |  2 +-
 arch/powerpc/mm/hugetlbpage-book3e.c         |  6 +++---
 arch/powerpc/mm/hugetlbpage.c                |  2 +-
 arch/powerpc/mm/stab.c                       | 12 ++++++------
 arch/powerpc/perf/core-book3s.c              | 22 +++++++++++-----------
 arch/powerpc/perf/core-fsl-emb.c             |  6 +++---
 arch/powerpc/platforms/cell/interrupt.c      |  6 +++---
 arch/powerpc/platforms/ps3/interrupt.c       |  2 +-
 arch/powerpc/platforms/pseries/dtl.c         |  2 +-
 arch/powerpc/platforms/pseries/hvCall_inst.c |  4 ++--
 arch/powerpc/platforms/pseries/iommu.c       |  8 ++++----
 arch/powerpc/platforms/pseries/lpar.c        |  6 +++---
 arch/powerpc/platforms/pseries/ras.c         |  4 ++--
 arch/powerpc/sysdev/xics/xics-common.c       |  2 +-
 32 files changed, 108 insertions(+), 106 deletions(-)

Index: linux/arch/powerpc/include/asm/hardirq.h
===================================================================
--- linux.orig/arch/powerpc/include/asm/hardirq.h
+++ linux/arch/powerpc/include/asm/hardirq.h
@@ -21,7 +21,9 @@ DECLARE_PER_CPU_SHARED_ALIGNED(irq_cpust
 
 #define __ARCH_IRQ_STAT
 
-#define local_softirq_pending()	__get_cpu_var(irq_stat).__softirq_pending
+#define local_softirq_pending()	__this_cpu_read(irq_stat.__softirq_pending)
+#define set_softirq_pending(x) __this_cpu_write(irq_stat._softirq_pending, (x))
+#define or_softirq_pending(x) __this_cpu_or(irq_stat._softirq_pending, (x))
 
 static inline void ack_bad_irq(unsigned int irq)
 {
Index: linux/arch/powerpc/include/asm/tlbflush.h
===================================================================
--- linux.orig/arch/powerpc/include/asm/tlbflush.h
+++ linux/arch/powerpc/include/asm/tlbflush.h
@@ -107,14 +107,14 @@ extern void __flush_tlb_pending(struct p
 
 static inline void arch_enter_lazy_mmu_mode(void)
 {
-	struct ppc64_tlb_batch *batch = &__get_cpu_var(ppc64_tlb_batch);
+	struct ppc64_tlb_batch *batch = this_cpu_ptr(&ppc64_tlb_batch);
 
 	batch->active = 1;
 }
 
 static inline void arch_leave_lazy_mmu_mode(void)
 {
-	struct ppc64_tlb_batch *batch = &__get_cpu_var(ppc64_tlb_batch);
+	struct ppc64_tlb_batch *batch = this_cpu_ptr(&ppc64_tlb_batch);
 
 	if (batch->index)
 		__flush_tlb_pending(batch);
Index: linux/arch/powerpc/include/asm/xics.h
===================================================================
--- linux.orig/arch/powerpc/include/asm/xics.h
+++ linux/arch/powerpc/include/asm/xics.h
@@ -97,7 +97,7 @@ DECLARE_PER_CPU(struct xics_cppr, xics_c
 
 static inline void xics_push_cppr(unsigned int vec)
 {
-	struct xics_cppr *os_cppr = &__get_cpu_var(xics_cppr);
+	struct xics_cppr *os_cppr = this_cpu_ptr(&xics_cppr);
 
 	if (WARN_ON(os_cppr->index >= MAX_NUM_PRIORITIES - 1))
 		return;
@@ -110,7 +110,7 @@ static inline void xics_push_cppr(unsign
 
 static inline unsigned char xics_pop_cppr(void)
 {
-	struct xics_cppr *os_cppr = &__get_cpu_var(xics_cppr);
+	struct xics_cppr *os_cppr = this_cpu_ptr(&xics_cppr);
 
 	if (WARN_ON(os_cppr->index < 1))
 		return LOWEST_PRIORITY;
@@ -120,7 +120,7 @@ static inline unsigned char xics_pop_cpp
 
 static inline void xics_set_base_cppr(unsigned char cppr)
 {
-	struct xics_cppr *os_cppr = &__get_cpu_var(xics_cppr);
+	struct xics_cppr *os_cppr = this_cpu_ptr(&xics_cppr);
 
 	/* we only really want to set the priority when there's
 	 * just one cppr value on the stack
@@ -132,7 +132,7 @@ static inline void xics_set_base_cppr(un
 
 static inline unsigned char xics_cppr_top(void)
 {
-	struct xics_cppr *os_cppr = &__get_cpu_var(xics_cppr);
+	struct xics_cppr *os_cppr = this_cpu_ptr(&xics_cppr);
 	
 	return os_cppr->stack[os_cppr->index];
 }
Index: linux/arch/powerpc/kernel/dbell.c
===================================================================
--- linux.orig/arch/powerpc/kernel/dbell.c
+++ linux/arch/powerpc/kernel/dbell.c
@@ -41,7 +41,7 @@ void doorbell_exception(struct pt_regs *
 
 	may_hard_irq_enable();
 
-	__get_cpu_var(irq_stat).doorbell_irqs++;
+	__this_cpu_inc(irq_stat.doorbell_irqs);
 
 	smp_ipi_demux();
 
Index: linux/arch/powerpc/kernel/hw_breakpoint.c
===================================================================
--- linux.orig/arch/powerpc/kernel/hw_breakpoint.c
+++ linux/arch/powerpc/kernel/hw_breakpoint.c
@@ -63,7 +63,7 @@ int hw_breakpoint_slots(int type)
 int arch_install_hw_breakpoint(struct perf_event *bp)
 {
 	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
-	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
+	struct perf_event **slot = this_cpu_ptr(&bp_per_reg);
 
 	*slot = bp;
 
@@ -88,7 +88,7 @@ int arch_install_hw_breakpoint(struct pe
  */
 void arch_uninstall_hw_breakpoint(struct perf_event *bp)
 {
-	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
+	struct perf_event **slot = this_cpu_ptr(&bp_per_reg);
 
 	if (*slot != bp) {
 		WARN_ONCE(1, "Can't find the breakpoint");
@@ -226,7 +226,7 @@ int __kprobes hw_breakpoint_handler(stru
 	 */
 	rcu_read_lock();
 
-	bp = __get_cpu_var(bp_per_reg);
+	bp = __this_cpu_read(bp_per_reg);
 	if (!bp)
 		goto out;
 	info = counter_arch_bp(bp);
Index: linux/arch/powerpc/kernel/iommu.c
===================================================================
--- linux.orig/arch/powerpc/kernel/iommu.c
+++ linux/arch/powerpc/kernel/iommu.c
@@ -208,7 +208,7 @@ static unsigned long iommu_range_alloc(s
 	 * We don't need to disable preemption here because any CPU can
 	 * safely use any IOMMU pool.
 	 */
-	pool_nr = __raw_get_cpu_var(iommu_pool_hash) & (tbl->nr_pools - 1);
+	pool_nr = __this_cpu_read(iommu_pool_hash) & (tbl->nr_pools - 1);
 
 	if (largealloc)
 		pool = &(tbl->large_pool);
Index: linux/arch/powerpc/kernel/irq.c
===================================================================
--- linux.orig/arch/powerpc/kernel/irq.c
+++ linux/arch/powerpc/kernel/irq.c
@@ -114,7 +114,7 @@ static inline notrace void set_soft_enab
 static inline notrace int decrementer_check_overflow(void)
 {
  	u64 now = get_tb_or_rtc();
- 	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
+	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
  
 	return now >= *next_tb;
 }
@@ -499,7 +499,7 @@ void __do_irq(struct pt_regs *regs)
 
 	/* And finally process it */
 	if (unlikely(irq == NO_IRQ))
-		__get_cpu_var(irq_stat).spurious_irqs++;
+		__this_cpu_inc(irq_stat.spurious_irqs);
 	else
 		generic_handle_irq(irq);
 
Index: linux/arch/powerpc/kernel/kgdb.c
===================================================================
--- linux.orig/arch/powerpc/kernel/kgdb.c
+++ linux/arch/powerpc/kernel/kgdb.c
@@ -155,7 +155,7 @@ static int kgdb_singlestep(struct pt_reg
 {
 	struct thread_info *thread_info, *exception_thread_info;
 	struct thread_info *backup_current_thread_info =
-		&__get_cpu_var(kgdb_thread_info);
+		this_cpu_ptr(&kgdb_thread_info);
 
 	if (user_mode(regs))
 		return 0;
Index: linux/arch/powerpc/kernel/kprobes.c
===================================================================
--- linux.orig/arch/powerpc/kernel/kprobes.c
+++ linux/arch/powerpc/kernel/kprobes.c
@@ -119,7 +119,7 @@ static void __kprobes save_previous_kpro
 
 static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = kcb->prev_kprobe.kp;
+	__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
 	kcb->kprobe_status = kcb->prev_kprobe.status;
 	kcb->kprobe_saved_msr = kcb->prev_kprobe.saved_msr;
 }
@@ -127,7 +127,7 @@ static void __kprobes restore_previous_k
 static void __kprobes set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
 				struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = p;
+	__this_cpu_write(current_kprobe, p);
 	kcb->kprobe_saved_msr = regs->msr;
 }
 
@@ -192,7 +192,7 @@ static int __kprobes kprobe_handler(stru
 				ret = 1;
 				goto no_kprobe;
 			}
-			p = __get_cpu_var(current_kprobe);
+			p = __this_cpu_read(current_kprobe);
 			if (p->break_handler && p->break_handler(p, regs)) {
 				goto ss_probe;
 			}
Index: linux/arch/powerpc/kernel/mce.c
===================================================================
--- linux.orig/arch/powerpc/kernel/mce.c
+++ linux/arch/powerpc/kernel/mce.c
@@ -73,8 +73,8 @@ void save_mce_event(struct pt_regs *regs
 		    uint64_t nip, uint64_t addr)
 {
 	uint64_t srr1;
-	int index = __get_cpu_var(mce_nest_count)++;
-	struct machine_check_event *mce = &__get_cpu_var(mce_event[index]);
+	int index = __this_cpu_inc_return(mce_nest_count);
+	struct machine_check_event *mce = this_cpu_ptr(&mce_event[index]);
 
 	/*
 	 * Return if we don't have enough space to log mce event.
@@ -143,7 +143,7 @@ void save_mce_event(struct pt_regs *regs
  */
 int get_mce_event(struct machine_check_event *mce, bool release)
 {
-	int index = __get_cpu_var(mce_nest_count) - 1;
+	int index = __this_cpu_read(mce_nest_count) - 1;
 	struct machine_check_event *mc_evt;
 	int ret = 0;
 
@@ -153,7 +153,7 @@ int get_mce_event(struct machine_check_e
 
 	/* Check if we have MCE info to process. */
 	if (index < MAX_MC_EVT) {
-		mc_evt = &__get_cpu_var(mce_event[index]);
+		mc_evt = this_cpu_ptr(&mce_event[index]);
 		/* Copy the event structure and release the original */
 		if (mce)
 			*mce = *mc_evt;
@@ -163,7 +163,7 @@ int get_mce_event(struct machine_check_e
 	}
 	/* Decrement the count to free the slot. */
 	if (release)
-		__get_cpu_var(mce_nest_count)--;
+		__this_cpu_dec(mce_nest_count);
 
 	return ret;
 }
@@ -184,13 +184,13 @@ void machine_check_queue_event(void)
 	if (!get_mce_event(&evt, MCE_EVENT_RELEASE))
 		return;
 
-	index = __get_cpu_var(mce_queue_count)++;
+	index = __this_cpu_inc_return(mce_queue_count);
 	/* If queue is full, just return for now. */
 	if (index >= MAX_MC_EVT) {
-		__get_cpu_var(mce_queue_count)--;
+		__this_cpu_dec(mce_queue_count);
 		return;
 	}
-	__get_cpu_var(mce_event_queue[index]) = evt;
+	memcpy(this_cpu_ptr(&mce_event_queue[index]), &evt, sizeof(evt));
 
 	/* Queue irq work to process this event later. */
 	irq_work_queue(&mce_event_process_work);
@@ -208,11 +208,11 @@ static void machine_check_process_queued
 	 * For now just print it to console.
 	 * TODO: log this error event to FSP or nvram.
 	 */
-	while (__get_cpu_var(mce_queue_count) > 0) {
-		index = __get_cpu_var(mce_queue_count) - 1;
+	while (__this_cpu_read(mce_queue_count) > 0) {
+		index = __this_cpu_read(mce_queue_count) - 1;
 		machine_check_print_event_info(
-				&__get_cpu_var(mce_event_queue[index]));
-		__get_cpu_var(mce_queue_count)--;
+				this_cpu_ptr(&mce_event_queue[index]));
+		__this_cpu_dec(mce_queue_count);
 	}
 }
 
Index: linux/arch/powerpc/kernel/process.c
===================================================================
--- linux.orig/arch/powerpc/kernel/process.c
+++ linux/arch/powerpc/kernel/process.c
@@ -498,7 +498,7 @@ static inline int set_dawr(struct arch_h
 
 void __set_breakpoint(struct arch_hw_breakpoint *brk)
 {
-	__get_cpu_var(current_brk) = *brk;
+	__this_cpu_write(current_brk, *brk);
 
 	if (cpu_has_feature(CPU_FTR_DAWR))
 		set_dawr(brk);
@@ -841,7 +841,7 @@ struct task_struct *__switch_to(struct t
  * schedule DABR
  */
 #ifndef CONFIG_HAVE_HW_BREAKPOINT
-	if (unlikely(!hw_brk_match(&__get_cpu_var(current_brk), &new->thread.hw_brk)))
+	if (unlikely(!hw_brk_match(this_cpu_ptr(&current_brk), &new->thread.hw_brk)))
 		__set_breakpoint(&new->thread.hw_brk);
 #endif /* CONFIG_HAVE_HW_BREAKPOINT */
 #endif
@@ -855,7 +855,7 @@ struct task_struct *__switch_to(struct t
 	 * Collect processor utilization data per process
 	 */
 	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
-		struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array);
+		struct cpu_usage *cu = this_cpu_ptr(&cpu_usage_array);
 		long unsigned start_tb, current_tb;
 		start_tb = old_thread->start_tb;
 		cu->current_tb = current_tb = mfspr(SPRN_PURR);
@@ -865,7 +865,7 @@ struct task_struct *__switch_to(struct t
 #endif /* CONFIG_PPC64 */
 
 #ifdef CONFIG_PPC_BOOK3S_64
-	batch = &__get_cpu_var(ppc64_tlb_batch);
+	batch = this_cpu_ptr(&ppc64_tlb_batch);
 	if (batch->active) {
 		current_thread_info()->local_flags |= _TLF_LAZY_MMU;
 		if (batch->index)
@@ -888,7 +888,7 @@ struct task_struct *__switch_to(struct t
 #ifdef CONFIG_PPC_BOOK3S_64
 	if (current_thread_info()->local_flags & _TLF_LAZY_MMU) {
 		current_thread_info()->local_flags &= ~_TLF_LAZY_MMU;
-		batch = &__get_cpu_var(ppc64_tlb_batch);
+		batch = this_cpu_ptr(&ppc64_tlb_batch);
 		batch->active = 1;
 	}
 #endif /* CONFIG_PPC_BOOK3S_64 */
Index: linux/arch/powerpc/kernel/smp.c
===================================================================
--- linux.orig/arch/powerpc/kernel/smp.c
+++ linux/arch/powerpc/kernel/smp.c
@@ -242,7 +242,7 @@ void smp_muxed_ipi_message_pass(int cpu,
 
 irqreturn_t smp_ipi_demux(void)
 {
-	struct cpu_messages *info = &__get_cpu_var(ipi_message);
+	struct cpu_messages *info = this_cpu_ptr(&ipi_message);
 	unsigned int all;
 
 	mb();	/* order any irq clear */
@@ -438,9 +438,9 @@ void generic_mach_cpu_die(void)
 	idle_task_exit();
 	cpu = smp_processor_id();
 	printk(KERN_DEBUG "CPU%d offline\n", cpu);
-	__get_cpu_var(cpu_state) = CPU_DEAD;
+	__this_cpu_write(cpu_state, CPU_DEAD);
 	smp_wmb();
-	while (__get_cpu_var(cpu_state) != CPU_UP_PREPARE)
+	while (__this_cpu_read(cpu_state) != CPU_UP_PREPARE)
 		cpu_relax();
 }
 
Index: linux/arch/powerpc/kernel/sysfs.c
===================================================================
--- linux.orig/arch/powerpc/kernel/sysfs.c
+++ linux/arch/powerpc/kernel/sysfs.c
@@ -394,10 +394,10 @@ void ppc_enable_pmcs(void)
 	ppc_set_pmu_inuse(1);
 
 	/* Only need to enable them once */
-	if (__get_cpu_var(pmcs_enabled))
+	if (__this_cpu_read(pmcs_enabled))
 		return;
 
-	__get_cpu_var(pmcs_enabled) = 1;
+	__this_cpu_write(pmcs_enabled, 1);
 
 	if (ppc_md.enable_pmcs)
 		ppc_md.enable_pmcs();
Index: linux/arch/powerpc/kernel/time.c
===================================================================
--- linux.orig/arch/powerpc/kernel/time.c
+++ linux/arch/powerpc/kernel/time.c
@@ -458,9 +458,9 @@ static inline void clear_irq_work_pendin
 
 DEFINE_PER_CPU(u8, irq_work_pending);
 
-#define set_irq_work_pending_flag()	__get_cpu_var(irq_work_pending) = 1
-#define test_irq_work_pending()		__get_cpu_var(irq_work_pending)
-#define clear_irq_work_pending()	__get_cpu_var(irq_work_pending) = 0
+#define set_irq_work_pending_flag()	__this_cpu_write(irq_work_pending, 1)
+#define test_irq_work_pending()		__this_cpu_read(irq_work_pending)
+#define clear_irq_work_pending()	__this_cpu_write(irq_work_pending, 0)
 
 #endif /* 32 vs 64 bit */
 
@@ -482,8 +482,8 @@ void arch_irq_work_raise(void)
 void __timer_interrupt(void)
 {
 	struct pt_regs *regs = get_irq_regs();
-	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
-	struct clock_event_device *evt = &__get_cpu_var(decrementers);
+	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
+	struct clock_event_device *evt = this_cpu_ptr(&decrementers);
 	u64 now;
 
 	trace_timer_interrupt_entry(regs);
@@ -498,7 +498,7 @@ void __timer_interrupt(void)
 		*next_tb = ~(u64)0;
 		if (evt->event_handler)
 			evt->event_handler(evt);
-		__get_cpu_var(irq_stat).timer_irqs_event++;
+		__this_cpu_inc(irq_stat.timer_irqs_event);
 	} else {
 		now = *next_tb - now;
 		if (now <= DECREMENTER_MAX)
@@ -506,13 +506,13 @@ void __timer_interrupt(void)
 		/* We may have raced with new irq work */
 		if (test_irq_work_pending())
 			set_dec(1);
-		__get_cpu_var(irq_stat).timer_irqs_others++;
+		__this_cpu_inc(irq_stat.timer_irqs_others);
 	}
 
 #ifdef CONFIG_PPC64
 	/* collect purr register values often, for accurate calculations */
 	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
-		struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array);
+		struct cpu_usage *cu = this_cpu_ptr(&cpu_usage_array);
 		cu->current_tb = mfspr(SPRN_PURR);
 	}
 #endif
@@ -527,7 +527,7 @@ void __timer_interrupt(void)
 void timer_interrupt(struct pt_regs * regs)
 {
 	struct pt_regs *old_regs;
-	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
+	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
 
 	/* Ensure a positive value is written to the decrementer, or else
 	 * some CPUs will continue to take decrementer exceptions.
@@ -813,7 +813,7 @@ static void __init clocksource_init(void
 static int decrementer_set_next_event(unsigned long evt,
 				      struct clock_event_device *dev)
 {
-	__get_cpu_var(decrementers_next_tb) = get_tb_or_rtc() + evt;
+	__this_cpu_write(decrementers_next_tb, get_tb_or_rtc() + evt);
 	set_dec(evt);
 
 	/* We may have raced with new irq work */
@@ -833,7 +833,7 @@ static void decrementer_set_mode(enum cl
 /* Interrupt handler for the timer broadcast IPI */
 void tick_broadcast_ipi_handler(void)
 {
-	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
+	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
 
 	*next_tb = get_tb_or_rtc();
 	__timer_interrupt();
Index: linux/arch/powerpc/kernel/traps.c
===================================================================
--- linux.orig/arch/powerpc/kernel/traps.c
+++ linux/arch/powerpc/kernel/traps.c
@@ -295,7 +295,7 @@ long machine_check_early(struct pt_regs
 {
 	long handled = 0;
 
-	__get_cpu_var(irq_stat).mce_exceptions++;
+	__this_cpu_inc(irq_stat.mce_exceptions);
 
 	if (cur_cpu_spec && cur_cpu_spec->machine_check_early)
 		handled = cur_cpu_spec->machine_check_early(regs);
@@ -304,7 +304,7 @@ long machine_check_early(struct pt_regs
 
 long hmi_exception_realmode(struct pt_regs *regs)
 {
-	__get_cpu_var(irq_stat).hmi_exceptions++;
+	__this_cpu_inc(irq_stat.hmi_exceptions);
 
 	if (ppc_md.hmi_exception_early)
 		ppc_md.hmi_exception_early(regs);
@@ -700,7 +700,7 @@ void machine_check_exception(struct pt_r
 	enum ctx_state prev_state = exception_enter();
 	int recover = 0;
 
-	__get_cpu_var(irq_stat).mce_exceptions++;
+	__this_cpu_inc(irq_stat.mce_exceptions);
 
 	/* See if any machine dependent calls. In theory, we would want
 	 * to call the CPU first, and call the ppc_md. one if the CPU
@@ -1519,7 +1519,7 @@ void vsx_unavailable_tm(struct pt_regs *
 
 void performance_monitor_exception(struct pt_regs *regs)
 {
-	__get_cpu_var(irq_stat).pmu_irqs++;
+	__this_cpu_inc(irq_stat.pmu_irqs);
 
 	perf_irq(regs);
 }
Index: linux/arch/powerpc/kvm/e500.c
===================================================================
--- linux.orig/arch/powerpc/kvm/e500.c
+++ linux/arch/powerpc/kvm/e500.c
@@ -76,11 +76,11 @@ static inline int local_sid_setup_one(st
 	unsigned long sid;
 	int ret = -1;
 
-	sid = ++(__get_cpu_var(pcpu_last_used_sid));
+	sid = __this_cpu_inc_return(pcpu_last_used_sid);
 	if (sid < NUM_TIDS) {
-		__get_cpu_var(pcpu_sids).entry[sid] = entry;
+		__this_cpu_write(pcpu_sids)entry[sid], entry);
 		entry->val = sid;
-		entry->pentry = &__get_cpu_var(pcpu_sids).entry[sid];
+		entry->pentry = this_cpu_ptr(&pcpu_sids.entry[sid]);
 		ret = sid;
 	}
 
@@ -108,8 +108,8 @@ static inline int local_sid_setup_one(st
 static inline int local_sid_lookup(struct id *entry)
 {
 	if (entry && entry->val != 0 &&
-	    __get_cpu_var(pcpu_sids).entry[entry->val] == entry &&
-	    entry->pentry == &__get_cpu_var(pcpu_sids).entry[entry->val])
+	    __this_cpu_read(pcpu_sids.entry[entry->val]) == entry &&
+	    entry->pentry == this_cpu_ptr(&pcpu_sids.entry[entry->val]))
 		return entry->val;
 	return -1;
 }
@@ -117,8 +117,8 @@ static inline int local_sid_lookup(struc
 /* Invalidate all id mappings on local core -- call with preempt disabled */
 static inline void local_sid_destroy_all(void)
 {
-	__get_cpu_var(pcpu_last_used_sid) = 0;
-	memset(&__get_cpu_var(pcpu_sids), 0, sizeof(__get_cpu_var(pcpu_sids)));
+	__this_cpu_write(pcpu_last_used_sid, 0);
+	memset(this_cpu_ptr(&pcpu_sids), 0, sizeof(pcpu_sids));
 }
 
 static void *kvmppc_e500_id_table_alloc(struct kvmppc_vcpu_e500 *vcpu_e500)
Index: linux/arch/powerpc/kvm/e500mc.c
===================================================================
--- linux.orig/arch/powerpc/kvm/e500mc.c
+++ linux/arch/powerpc/kvm/e500mc.c
@@ -141,9 +141,9 @@ static void kvmppc_core_vcpu_load_e500mc
 	mtspr(SPRN_GESR, vcpu->arch.shared->esr);
 
 	if (vcpu->arch.oldpir != mfspr(SPRN_PIR) ||
-	    __get_cpu_var(last_vcpu_of_lpid)[vcpu->kvm->arch.lpid] != vcpu) {
+	    __this_cpu_read(last_vcpu_of_lpid[vcpu->kvm->arch.lpid]) != vcpu) {
 		kvmppc_e500_tlbil_all(vcpu_e500);
-		__get_cpu_var(last_vcpu_of_lpid)[vcpu->kvm->arch.lpid] = vcpu;
+		__this_cpu_write(last_vcpu_of_lpid[vcpu->kvm->arch.lpid], vcpu);
 	}
 
 	kvmppc_load_guest_fp(vcpu);
Index: linux/arch/powerpc/mm/hash_native_64.c
===================================================================
--- linux.orig/arch/powerpc/mm/hash_native_64.c
+++ linux/arch/powerpc/mm/hash_native_64.c
@@ -625,7 +625,7 @@ static void native_flush_hash_range(unsi
 	unsigned long want_v;
 	unsigned long flags;
 	real_pte_t pte;
-	struct ppc64_tlb_batch *batch = &__get_cpu_var(ppc64_tlb_batch);
+	struct ppc64_tlb_batch *batch = this_cpu_ptr(&ppc64_tlb_batch);
 	unsigned long psize = batch->psize;
 	int ssize = batch->ssize;
 	int i;
Index: linux/arch/powerpc/mm/hash_utils_64.c
===================================================================
--- linux.orig/arch/powerpc/mm/hash_utils_64.c
+++ linux/arch/powerpc/mm/hash_utils_64.c
@@ -1314,7 +1314,7 @@ void flush_hash_range(unsigned long numb
 	else {
 		int i;
 		struct ppc64_tlb_batch *batch =
-			&__get_cpu_var(ppc64_tlb_batch);
+			this_cpu_ptr(&ppc64_tlb_batch);
 
 		for (i = 0; i < number; i++)
 			flush_hash_page(batch->vpn[i], batch->pte[i],
Index: linux/arch/powerpc/mm/hugetlbpage-book3e.c
===================================================================
--- linux.orig/arch/powerpc/mm/hugetlbpage-book3e.c
+++ linux/arch/powerpc/mm/hugetlbpage-book3e.c
@@ -33,13 +33,13 @@ static inline int tlb1_next(void)
 
 	ncams = mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY;
 
-	index = __get_cpu_var(next_tlbcam_idx);
+	index = this_cpu_read(next_tlbcam_idx);
 
 	/* Just round-robin the entries and wrap when we hit the end */
 	if (unlikely(index == ncams - 1))
-		__get_cpu_var(next_tlbcam_idx) = tlbcam_index;
+		__this_cpu_write(next_tlbcam_idx, tlbcam_index);
 	else
-		__get_cpu_var(next_tlbcam_idx)++;
+		__this_cpu_inc(next_tlbcam_idx);
 
 	return index;
 }
Index: linux/arch/powerpc/mm/hugetlbpage.c
===================================================================
--- linux.orig/arch/powerpc/mm/hugetlbpage.c
+++ linux/arch/powerpc/mm/hugetlbpage.c
@@ -462,7 +462,7 @@ static void hugepd_free(struct mmu_gathe
 {
 	struct hugepd_freelist **batchp;
 
-	batchp = &get_cpu_var(hugepd_freelist_cur);
+	batchp = this_cpu_ptr(&hugepd_freelist_cur);
 
 	if (atomic_read(&tlb->mm->mm_users) < 2 ||
 	    cpumask_equal(mm_cpumask(tlb->mm),
Index: linux/arch/powerpc/perf/core-book3s.c
===================================================================
--- linux.orig/arch/powerpc/perf/core-book3s.c
+++ linux/arch/powerpc/perf/core-book3s.c
@@ -339,7 +339,7 @@ static void power_pmu_bhrb_reset(void)
 
 static void power_pmu_bhrb_enable(struct perf_event *event)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	if (!ppmu->bhrb_nr)
 		return;
@@ -354,7 +354,7 @@ static void power_pmu_bhrb_enable(struct
 
 static void power_pmu_bhrb_disable(struct perf_event *event)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	if (!ppmu->bhrb_nr)
 		return;
@@ -1144,7 +1144,7 @@ static void power_pmu_disable(struct pmu
 	if (!ppmu)
 		return;
 	local_irq_save(flags);
-	cpuhw = &__get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	if (!cpuhw->disabled) {
 		/*
@@ -1211,7 +1211,7 @@ static void power_pmu_enable(struct pmu
 		return;
 	local_irq_save(flags);
 
-	cpuhw = &__get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 	if (!cpuhw->disabled)
 		goto out;
 
@@ -1403,7 +1403,7 @@ static int power_pmu_add(struct perf_eve
 	 * Add the event to the list (if there is room)
 	 * and check whether the total set is still feasible.
 	 */
-	cpuhw = &__get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 	n0 = cpuhw->n_events;
 	if (n0 >= ppmu->n_counter)
 		goto out;
@@ -1469,7 +1469,7 @@ static void power_pmu_del(struct perf_ev
 
 	power_pmu_read(event);
 
-	cpuhw = &__get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 	for (i = 0; i < cpuhw->n_events; ++i) {
 		if (event == cpuhw->event[i]) {
 			while (++i < cpuhw->n_events) {
@@ -1575,7 +1575,7 @@ static void power_pmu_stop(struct perf_e
  */
 void power_pmu_start_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	perf_pmu_disable(pmu);
 	cpuhw->group_flag |= PERF_EVENT_TXN;
@@ -1589,7 +1589,7 @@ void power_pmu_start_txn(struct pmu *pmu
  */
 void power_pmu_cancel_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	cpuhw->group_flag &= ~PERF_EVENT_TXN;
 	perf_pmu_enable(pmu);
@@ -1607,7 +1607,7 @@ int power_pmu_commit_txn(struct pmu *pmu
 
 	if (!ppmu)
 		return -EAGAIN;
-	cpuhw = &__get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 	n = cpuhw->n_events;
 	if (check_excludes(cpuhw->event, cpuhw->flags, 0, n))
 		return -EAGAIN;
@@ -1964,7 +1964,7 @@ static void record_and_restart(struct pe
 
 		if (event->attr.sample_type & PERF_SAMPLE_BRANCH_STACK) {
 			struct cpu_hw_events *cpuhw;
-			cpuhw = &__get_cpu_var(cpu_hw_events);
+			cpuhw = this_cpu_ptr(&cpu_hw_events);
 			power_pmu_bhrb_read(cpuhw);
 			data.br_stack = &cpuhw->bhrb_stack;
 		}
@@ -2037,7 +2037,7 @@ static bool pmc_overflow(unsigned long v
 static void perf_event_interrupt(struct pt_regs *regs)
 {
 	int i, j;
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 	struct perf_event *event;
 	unsigned long val[8];
 	int found, active;
Index: linux/arch/powerpc/perf/core-fsl-emb.c
===================================================================
--- linux.orig/arch/powerpc/perf/core-fsl-emb.c
+++ linux/arch/powerpc/perf/core-fsl-emb.c
@@ -210,7 +210,7 @@ static void fsl_emb_pmu_disable(struct p
 	unsigned long flags;
 
 	local_irq_save(flags);
-	cpuhw = &__get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	if (!cpuhw->disabled) {
 		cpuhw->disabled = 1;
@@ -249,7 +249,7 @@ static void fsl_emb_pmu_enable(struct pm
 	unsigned long flags;
 
 	local_irq_save(flags);
-	cpuhw = &__get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 	if (!cpuhw->disabled)
 		goto out;
 
@@ -653,7 +653,7 @@ static void record_and_restart(struct pe
 static void perf_event_interrupt(struct pt_regs *regs)
 {
 	int i;
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 	struct perf_event *event;
 	unsigned long val;
 	int found = 0;
Index: linux/arch/powerpc/platforms/cell/interrupt.c
===================================================================
--- linux.orig/arch/powerpc/platforms/cell/interrupt.c
+++ linux/arch/powerpc/platforms/cell/interrupt.c
@@ -82,7 +82,7 @@ static void iic_unmask(struct irq_data *
 
 static void iic_eoi(struct irq_data *d)
 {
-	struct iic *iic = &__get_cpu_var(cpu_iic);
+	struct iic *iic = this_cpu_ptr(&cpu_iic);
 	out_be64(&iic->regs->prio, iic->eoi_stack[--iic->eoi_ptr]);
 	BUG_ON(iic->eoi_ptr < 0);
 }
@@ -148,7 +148,7 @@ static unsigned int iic_get_irq(void)
 	struct iic *iic;
 	unsigned int virq;
 
-	iic = &__get_cpu_var(cpu_iic);
+	iic = this_cpu_ptr(&cpu_iic);
 	*(unsigned long *) &pending =
 		in_be64((u64 __iomem *) &iic->regs->pending_destr);
 	if (!(pending.flags & CBE_IIC_IRQ_VALID))
@@ -163,7 +163,7 @@ static unsigned int iic_get_irq(void)
 
 void iic_setup_cpu(void)
 {
-	out_be64(&__get_cpu_var(cpu_iic).regs->prio, 0xff);
+	out_be64(this_cpu_ptr(&cpu_iic.regs->prio), 0xff);
 }
 
 u8 iic_get_target_id(int cpu)
Index: linux/arch/powerpc/platforms/ps3/interrupt.c
===================================================================
--- linux.orig/arch/powerpc/platforms/ps3/interrupt.c
+++ linux/arch/powerpc/platforms/ps3/interrupt.c
@@ -711,7 +711,7 @@ void __init ps3_register_ipi_irq(unsigne
 
 static unsigned int ps3_get_irq(void)
 {
-	struct ps3_private *pd = &__get_cpu_var(ps3_private);
+	struct ps3_private *pd = this_cpu_ptr(&ps3_private);
 	u64 x = (pd->bmp.status & pd->bmp.mask);
 	unsigned int plug;
 
Index: linux/arch/powerpc/platforms/pseries/dtl.c
===================================================================
--- linux.orig/arch/powerpc/platforms/pseries/dtl.c
+++ linux/arch/powerpc/platforms/pseries/dtl.c
@@ -75,7 +75,7 @@ static atomic_t dtl_count;
  */
 static void consume_dtle(struct dtl_entry *dtle, u64 index)
 {
-	struct dtl_ring *dtlr = &__get_cpu_var(dtl_rings);
+	struct dtl_ring *dtlr = this_cpu_ptr(&dtl_rings);
 	struct dtl_entry *wp = dtlr->write_ptr;
 	struct lppaca *vpa = local_paca->lppaca_ptr;
 
Index: linux/arch/powerpc/platforms/pseries/hvCall_inst.c
===================================================================
--- linux.orig/arch/powerpc/platforms/pseries/hvCall_inst.c
+++ linux/arch/powerpc/platforms/pseries/hvCall_inst.c
@@ -110,7 +110,7 @@ static void probe_hcall_entry(void *igno
 	if (opcode > MAX_HCALL_OPCODE)
 		return;
 
-	h = &__get_cpu_var(hcall_stats)[opcode / 4];
+	h = this_cpu_ptr(&hcall_stats[opcode / 4]);
 	h->tb_start = mftb();
 	h->purr_start = mfspr(SPRN_PURR);
 }
@@ -123,7 +123,7 @@ static void probe_hcall_exit(void *ignor
 	if (opcode > MAX_HCALL_OPCODE)
 		return;
 
-	h = &__get_cpu_var(hcall_stats)[opcode / 4];
+	h = this_cpu_ptr(&hcall_stats[opcode / 4]);
 	h->num_calls++;
 	h->tb_total += mftb() - h->tb_start;
 	h->purr_total += mfspr(SPRN_PURR) - h->purr_start;
Index: linux/arch/powerpc/platforms/pseries/iommu.c
===================================================================
--- linux.orig/arch/powerpc/platforms/pseries/iommu.c
+++ linux/arch/powerpc/platforms/pseries/iommu.c
@@ -200,7 +200,7 @@ static int tce_buildmulti_pSeriesLP(stru
 
 	local_irq_save(flags);	/* to protect tcep and the page behind it */
 
-	tcep = __get_cpu_var(tce_page);
+	tcep = __this_cpu_read(tce_page);
 
 	/* This is safe to do since interrupts are off when we're called
 	 * from iommu_alloc{,_sg}()
@@ -213,7 +213,7 @@ static int tce_buildmulti_pSeriesLP(stru
 			return tce_build_pSeriesLP(tbl, tcenum, npages, uaddr,
 					    direction, attrs);
 		}
-		__get_cpu_var(tce_page) = tcep;
+		__this_cpu_write(tce_page, tcep);
 	}
 
 	rpn = __pa(uaddr) >> TCE_SHIFT;
@@ -399,7 +399,7 @@ static int tce_setrange_multi_pSeriesLP(
 	long l, limit;
 
 	local_irq_disable();	/* to protect tcep and the page behind it */
-	tcep = __get_cpu_var(tce_page);
+	tcep = __this_cpu_read(tce_page);
 
 	if (!tcep) {
 		tcep = (__be64 *)__get_free_page(GFP_ATOMIC);
@@ -407,7 +407,7 @@ static int tce_setrange_multi_pSeriesLP(
 			local_irq_enable();
 			return -ENOMEM;
 		}
-		__get_cpu_var(tce_page) = tcep;
+		__this_cpu_write(tce_page, tcep);
 	}
 
 	proto_tce = TCE_PCI_READ | TCE_PCI_WRITE;
Index: linux/arch/powerpc/platforms/pseries/lpar.c
===================================================================
--- linux.orig/arch/powerpc/platforms/pseries/lpar.c
+++ linux/arch/powerpc/platforms/pseries/lpar.c
@@ -507,7 +507,7 @@ static void pSeries_lpar_flush_hash_rang
 	unsigned long vpn;
 	unsigned long i, pix, rc;
 	unsigned long flags = 0;
-	struct ppc64_tlb_batch *batch = &__get_cpu_var(ppc64_tlb_batch);
+	struct ppc64_tlb_batch *batch = this_cpu_ptr(&ppc64_tlb_batch);
 	int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE);
 	unsigned long param[9];
 	unsigned long hash, index, shift, hidx, slot;
@@ -697,7 +697,7 @@ void __trace_hcall_entry(unsigned long o
 
 	local_irq_save(flags);
 
-	depth = &__get_cpu_var(hcall_trace_depth);
+	depth = this_cpu_ptr(&hcall_trace_depth);
 
 	if (*depth)
 		goto out;
@@ -722,7 +722,7 @@ void __trace_hcall_exit(long opcode, uns
 
 	local_irq_save(flags);
 
-	depth = &__get_cpu_var(hcall_trace_depth);
+	depth = this_cpu_ptr(&hcall_trace_depth);
 
 	if (*depth)
 		goto out;
Index: linux/arch/powerpc/platforms/pseries/ras.c
===================================================================
--- linux.orig/arch/powerpc/platforms/pseries/ras.c
+++ linux/arch/powerpc/platforms/pseries/ras.c
@@ -302,8 +302,8 @@ static struct rtas_error_log *fwnmi_get_
 	/* If it isn't an extended log we can use the per cpu 64bit buffer */
 	h = (struct rtas_error_log *)&savep[1];
 	if (!rtas_error_extended(h)) {
-		memcpy(&__get_cpu_var(mce_data_buf), h, sizeof(__u64));
-		errhdr = (struct rtas_error_log *)&__get_cpu_var(mce_data_buf);
+		memcpy(this_cpu_ptr(&mce_data_buf), h, sizeof(__u64));
+		errhdr = (struct rtas_error_log *)this_cpu_ptr(&mce_data_buf);
 	} else {
 		int len, error_log_length;
 
Index: linux/arch/powerpc/sysdev/xics/xics-common.c
===================================================================
--- linux.orig/arch/powerpc/sysdev/xics/xics-common.c
+++ linux/arch/powerpc/sysdev/xics/xics-common.c
@@ -155,7 +155,7 @@ int __init xics_smp_probe(void)
 
 void xics_teardown_cpu(void)
 {
-	struct xics_cppr *os_cppr = &__get_cpu_var(xics_cppr);
+	struct xics_cppr *os_cppr = this_cpu_ptr(&xics_cppr);
 
 	/*
 	 * we have to reset the cppr index to 0 because we're
Index: linux/arch/powerpc/platforms/powernv/opal-tracepoints.c
===================================================================
--- linux.orig/arch/powerpc/platforms/powernv/opal-tracepoints.c
+++ linux/arch/powerpc/platforms/powernv/opal-tracepoints.c
@@ -48,7 +48,7 @@ void __trace_opal_entry(unsigned long op
 
 	local_irq_save(flags);
 
-	depth = &__get_cpu_var(opal_trace_depth);
+	depth = this_cpu_ptr(opal_trace_depth);
 
 	if (*depth)
 		goto out;
@@ -69,7 +69,7 @@ void __trace_opal_exit(long opcode, unsi
 
 	local_irq_save(flags);
 
-	depth = &__get_cpu_var(opal_trace_depth);
+	depth = this_cpu_ptr(opal_trace_depth);
 
 	if (*depth)
 		goto out;


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 27/35] [PATCH 27/36] tile: Replace __get_cpu_var uses
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (25 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 26/35] [PATCH 26/36] powerpc: Replace __get_cpu_var uses Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:11   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 28/35] [PATCH 28/36] tile: Use this_cpu_ptr() for hardware counters Christoph Lameter
                   ` (8 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Chris Metcalf

[-- Attachment #1: 0027-tile-Replace-__get_cpu_var-uses.patch --]
[-- Type: text/plain, Size: 13891 bytes --]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and less registers
are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non -x86
arches as well in order to optimize per cpu access by f.e.  using a global
register that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)


Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/tile/include/asm/irqflags.h    |  4 ++--
 arch/tile/include/asm/mmu_context.h |  6 +++---
 arch/tile/kernel/irq.c              | 14 +++++++-------
 arch/tile/kernel/messaging.c        |  4 ++--
 arch/tile/kernel/process.c          |  2 +-
 arch/tile/kernel/setup.c            |  3 ++-
 arch/tile/kernel/single_step.c      |  4 ++--
 arch/tile/kernel/smp.c              |  2 +-
 arch/tile/kernel/smpboot.c          |  6 +++---
 arch/tile/kernel/time.c             |  8 ++++----
 arch/tile/mm/highmem.c              |  2 +-
 arch/tile/mm/init.c                 |  4 ++--
 12 files changed, 30 insertions(+), 29 deletions(-)

Index: linux/arch/tile/include/asm/irqflags.h
===================================================================
--- linux.orig/arch/tile/include/asm/irqflags.h
+++ linux/arch/tile/include/asm/irqflags.h
@@ -140,12 +140,12 @@ extern unsigned int debug_smp_processor_
 
 /*
  * Read the set of maskable interrupts.
- * We avoid the preemption warning here via __this_cpu_ptr since even
+ * We avoid the preemption warning here via raw_cpu_ptr since even
  * if irqs are already enabled, it's harmless to read the wrong cpu's
  * enabled mask.
  */
 #define arch_local_irqs_enabled() \
-	(*__this_cpu_ptr(&interrupts_enabled_mask))
+	(*raw_cpu_ptr(&interrupts_enabled_mask))
 
 /* Re-enable all maskable interrupts. */
 #define arch_local_irq_enable() \
Index: linux/arch/tile/include/asm/mmu_context.h
===================================================================
--- linux.orig/arch/tile/include/asm/mmu_context.h
+++ linux/arch/tile/include/asm/mmu_context.h
@@ -84,7 +84,7 @@ static inline void enter_lazy_tlb(struct
 	 * clear any pending DMA interrupts.
 	 */
 	if (current->thread.tile_dma_state.enabled)
-		install_page_table(mm->pgd, __get_cpu_var(current_asid));
+		install_page_table(mm->pgd, __this_cpu_read(current_asid));
 #endif
 }
 
@@ -96,12 +96,12 @@ static inline void switch_mm(struct mm_s
 		int cpu = smp_processor_id();
 
 		/* Pick new ASID. */
-		int asid = __get_cpu_var(current_asid) + 1;
+		int asid = __this_cpu_read(current_asid) + 1;
 		if (asid > max_asid) {
 			asid = min_asid;
 			local_flush_tlb();
 		}
-		__get_cpu_var(current_asid) = asid;
+		__this_cpu_write(current_asid, asid);
 
 		/* Clear cpu from the old mm, and set it in the new one. */
 		cpumask_clear_cpu(cpu, mm_cpumask(prev));
Index: linux/arch/tile/kernel/irq.c
===================================================================
--- linux.orig/arch/tile/kernel/irq.c
+++ linux/arch/tile/kernel/irq.c
@@ -73,7 +73,7 @@ static DEFINE_PER_CPU(int, irq_depth);
  */
 void tile_dev_intr(struct pt_regs *regs, int intnum)
 {
-	int depth = __get_cpu_var(irq_depth)++;
+	int depth = __this_cpu_inc_return(irq_depth);
 	unsigned long original_irqs;
 	unsigned long remaining_irqs;
 	struct pt_regs *old_regs;
@@ -120,7 +120,7 @@ void tile_dev_intr(struct pt_regs *regs,
 
 		/* Count device irqs; Linux IPIs are counted elsewhere. */
 		if (irq != IRQ_RESCHEDULE)
-			__get_cpu_var(irq_stat).irq_dev_intr_count++;
+			__this_cpu_inc(irq_stat.irq_dev_intr_count);
 
 		generic_handle_irq(irq);
 	}
@@ -130,10 +130,10 @@ void tile_dev_intr(struct pt_regs *regs,
 	 * including any that were reenabled during interrupt
 	 * handling.
 	 */
-	if (depth == 0)
-		unmask_irqs(~__get_cpu_var(irq_disable_mask));
+	if (depth == 1)
+		unmask_irqs(~__this_cpu_read(irq_disable_mask));
 
-	__get_cpu_var(irq_depth)--;
+	__this_cpu_dec(irq_depth);
 
 	/*
 	 * Track time spent against the current process again and
@@ -151,7 +151,7 @@ void tile_dev_intr(struct pt_regs *regs,
 static void tile_irq_chip_enable(struct irq_data *d)
 {
 	get_cpu_var(irq_disable_mask) &= ~(1UL << d->irq);
-	if (__get_cpu_var(irq_depth) == 0)
+	if (__this_cpu_read(irq_depth) == 0)
 		unmask_irqs(1UL << d->irq);
 	put_cpu_var(irq_disable_mask);
 }
@@ -197,7 +197,7 @@ static void tile_irq_chip_ack(struct irq
  */
 static void tile_irq_chip_eoi(struct irq_data *d)
 {
-	if (!(__get_cpu_var(irq_disable_mask) & (1UL << d->irq)))
+	if (!(__this_cpu_read(irq_disable_mask) & (1UL << d->irq)))
 		unmask_irqs(1UL << d->irq);
 }
 
Index: linux/arch/tile/kernel/messaging.c
===================================================================
--- linux.orig/arch/tile/kernel/messaging.c
+++ linux/arch/tile/kernel/messaging.c
@@ -28,7 +28,7 @@ static DEFINE_PER_CPU(HV_MsgState, msg_s
 void init_messaging(void)
 {
 	/* Allocate storage for messages in kernel space */
-	HV_MsgState *state = &__get_cpu_var(msg_state);
+	HV_MsgState *state = this_cpu_ptr(&msg_state);
 	int rc = hv_register_message_state(state);
 	if (rc != HV_OK)
 		panic("hv_register_message_state: error %d", rc);
@@ -96,7 +96,7 @@ void hv_message_intr(struct pt_regs *reg
 			struct hv_driver_cb *cb =
 				(struct hv_driver_cb *)him->intarg;
 			cb->callback(cb, him->intdata);
-			__get_cpu_var(irq_stat).irq_hv_msg_count++;
+			__this_cpu_inc(irq_stat.irq_hv_msg_count);
 		}
 	}
 
Index: linux/arch/tile/kernel/process.c
===================================================================
--- linux.orig/arch/tile/kernel/process.c
+++ linux/arch/tile/kernel/process.c
@@ -64,7 +64,7 @@ early_param("idle", idle_setup);
 
 void arch_cpu_idle(void)
 {
-	__get_cpu_var(irq_stat).idle_timestamp = jiffies;
+	__this_cpu_write(irq_stat.idle_timestamp, jiffies);
 	_cpu_idle();
 }
 
Index: linux/arch/tile/kernel/setup.c
===================================================================
--- linux.orig/arch/tile/kernel/setup.c
+++ linux/arch/tile/kernel/setup.c
@@ -1218,7 +1218,8 @@ static void __init validate_hv(void)
 	 * various asid variables to their appropriate initial states.
 	 */
 	asid_range = hv_inquire_asid(0);
-	__get_cpu_var(current_asid) = min_asid = asid_range.start;
+	min_asid = asid_range.start;
+	__this_cpu_write(current_asid, min_asid);
 	max_asid = asid_range.start + asid_range.size - 1;
 
 	if (hv_confstr(HV_CONFSTR_CHIP_MODEL, (HV_VirtAddr)chip_model,
Index: linux/arch/tile/kernel/single_step.c
===================================================================
--- linux.orig/arch/tile/kernel/single_step.c
+++ linux/arch/tile/kernel/single_step.c
@@ -740,7 +740,7 @@ static DEFINE_PER_CPU(unsigned long, ss_
 
 void gx_singlestep_handle(struct pt_regs *regs, int fault_num)
 {
-	unsigned long *ss_pc = &__get_cpu_var(ss_saved_pc);
+	unsigned long *ss_pc = this_cpu_ptr(&ss_saved_pc);
 	struct thread_info *info = (void *)current_thread_info();
 	int is_single_step = test_ti_thread_flag(info, TIF_SINGLESTEP);
 	unsigned long control = __insn_mfspr(SPR_SINGLE_STEP_CONTROL_K);
@@ -766,7 +766,7 @@ void gx_singlestep_handle(struct pt_regs
 
 void single_step_once(struct pt_regs *regs)
 {
-	unsigned long *ss_pc = &__get_cpu_var(ss_saved_pc);
+	unsigned long *ss_pc = this_cpu_ptr(&ss_saved_pc);
 	unsigned long control = __insn_mfspr(SPR_SINGLE_STEP_CONTROL_K);
 
 	*ss_pc = regs->pc;
Index: linux/arch/tile/kernel/smp.c
===================================================================
--- linux.orig/arch/tile/kernel/smp.c
+++ linux/arch/tile/kernel/smp.c
@@ -188,7 +188,7 @@ void flush_icache_range(unsigned long st
 /* Called when smp_send_reschedule() triggers IRQ_RESCHEDULE. */
 static irqreturn_t handle_reschedule_ipi(int irq, void *token)
 {
-	__get_cpu_var(irq_stat).irq_resched_count++;
+	__this_cpu_inc(irq_stat.irq_resched_count);
 	scheduler_ipi();
 
 	return IRQ_HANDLED;
Index: linux/arch/tile/kernel/smpboot.c
===================================================================
--- linux.orig/arch/tile/kernel/smpboot.c
+++ linux/arch/tile/kernel/smpboot.c
@@ -41,7 +41,7 @@ void __init smp_prepare_boot_cpu(void)
 	int cpu = smp_processor_id();
 	set_cpu_online(cpu, 1);
 	set_cpu_present(cpu, 1);
-	__get_cpu_var(cpu_state) = CPU_ONLINE;
+	__this_cpu_write(cpu_state, CPU_ONLINE);
 
 	init_messaging();
 }
@@ -158,7 +158,7 @@ static void start_secondary(void)
 	/* printk(KERN_DEBUG "Initializing CPU#%d\n", cpuid); */
 
 	/* Initialize the current asid for our first page table. */
-	__get_cpu_var(current_asid) = min_asid;
+	__this_cpu_write(current_asid, min_asid);
 
 	/* Set up this thread as another owner of the init_mm */
 	atomic_inc(&init_mm.mm_count);
@@ -201,7 +201,7 @@ void online_secondary(void)
 	notify_cpu_starting(smp_processor_id());
 
 	set_cpu_online(smp_processor_id(), 1);
-	__get_cpu_var(cpu_state) = CPU_ONLINE;
+	__this_cpu_write(cpu_state, CPU_ONLINE);
 
 	/* Set up tile-specific state for this cpu. */
 	setup_cpu(0);
Index: linux/arch/tile/kernel/time.c
===================================================================
--- linux.orig/arch/tile/kernel/time.c
+++ linux/arch/tile/kernel/time.c
@@ -162,7 +162,7 @@ static DEFINE_PER_CPU(struct clock_event
 
 void setup_tile_timer(void)
 {
-	struct clock_event_device *evt = &__get_cpu_var(tile_timer);
+	struct clock_event_device *evt = this_cpu_ptr(&tile_timer);
 
 	/* Fill in fields that are speed-specific. */
 	clockevents_calc_mult_shift(evt, cycles_per_sec, TILE_MINSEC);
@@ -182,7 +182,7 @@ void setup_tile_timer(void)
 void do_timer_interrupt(struct pt_regs *regs, int fault_num)
 {
 	struct pt_regs *old_regs = set_irq_regs(regs);
-	struct clock_event_device *evt = &__get_cpu_var(tile_timer);
+	struct clock_event_device *evt = this_cpu_ptr(&tile_timer);
 
 	/*
 	 * Mask the timer interrupt here, since we are a oneshot timer
@@ -194,7 +194,7 @@ void do_timer_interrupt(struct pt_regs *
 	irq_enter();
 
 	/* Track interrupt count. */
-	__get_cpu_var(irq_stat).irq_timer_count++;
+	__this_cpu_inc(irq_stat.irq_timer_count);
 
 	/* Call the generic timer handler */
 	evt->event_handler(evt);
@@ -235,7 +235,7 @@ cycles_t ns2cycles(unsigned long nsecs)
 	 * We do not have to disable preemption here as each core has the same
 	 * clock frequency.
 	 */
-	struct clock_event_device *dev = &__raw_get_cpu_var(tile_timer);
+	struct clock_event_device *dev = raw_cpu_ptr(&tile_timer);
 
 	/*
 	 * as in clocksource.h and x86's timer.h, we split the calculation
Index: linux/arch/tile/mm/highmem.c
===================================================================
--- linux.orig/arch/tile/mm/highmem.c
+++ linux/arch/tile/mm/highmem.c
@@ -103,7 +103,7 @@ static void kmap_atomic_register(struct
 	spin_lock(&amp_lock);
 
 	/* With interrupts disabled, now fill in the per-cpu info. */
-	amp = &__get_cpu_var(amps).per_type[type];
+	amp = this_cpu_ptr(&amps.per_type[type]);
 	amp->page = page;
 	amp->cpu = smp_processor_id();
 	amp->va = va;
Index: linux/arch/tile/mm/init.c
===================================================================
--- linux.orig/arch/tile/mm/init.c
+++ linux/arch/tile/mm/init.c
@@ -593,14 +593,14 @@ static void __init kernel_physical_mappi
 	interrupt_mask_set_mask(-1ULL);
 	rc = flush_and_install_context(__pa(pgtables),
 				       init_pgprot((unsigned long)pgtables),
-				       __get_cpu_var(current_asid),
+				       __this_cpu_read(current_asid),
 				       cpumask_bits(my_cpu_mask));
 	interrupt_mask_restore_mask(irqmask);
 	BUG_ON(rc != 0);
 
 	/* Copy the page table back to the normal swapper_pg_dir. */
 	memcpy(pgd_base, pgtables, sizeof(pgtables));
-	__install_page_table(pgd_base, __get_cpu_var(current_asid),
+	__install_page_table(pgd_base, __this_cpu_read(current_asid),
 			     swapper_pgprot);
 
 	/*


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 28/35] [PATCH 28/36] tile: Use this_cpu_ptr() for hardware counters
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (26 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 27/35] [PATCH 27/36] tile: " Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:15   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 29/35] [PATCH 29/36] blackfin: Replace __get_cpu_var uses Christoph Lameter
                   ` (7 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: 0028-tile-Use-this_cpu_ptr-for-hardware-counters.patch --]
[-- Type: text/plain, Size: 2099 bytes --]

Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/tile/kernel/perf_event.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Index: linux/arch/tile/kernel/perf_event.c
===================================================================
--- linux.orig/arch/tile/kernel/perf_event.c
+++ linux/arch/tile/kernel/perf_event.c
@@ -590,7 +590,7 @@ static int tile_event_set_period(struct
  */
 static void tile_pmu_stop(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
@@ -616,7 +616,7 @@ static void tile_pmu_stop(struct perf_ev
  */
 static void tile_pmu_start(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx = event->hw.idx;
 
 	if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED)))
@@ -650,7 +650,7 @@ static void tile_pmu_start(struct perf_e
  */
 static int tile_pmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc;
 	unsigned long mask;
 	int b, max_cnt;
@@ -706,7 +706,7 @@ static int tile_pmu_add(struct perf_even
  */
 static void tile_pmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int i;
 
 	/*
@@ -880,14 +880,14 @@ static struct pmu tilera_pmu = {
 int tile_pmu_handle_irq(struct pt_regs *regs, int fault)
 {
 	struct perf_sample_data data;
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_event *event;
 	struct hw_perf_event *hwc;
 	u64 val;
 	unsigned long status;
 	int bit;
 
-	__get_cpu_var(perf_irqs)++;
+	__this_cpu_inc(perf_irqs);
 
 	if (!atomic_read(&tile_active_events))
 		return 0;


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 29/35] [PATCH 29/36] blackfin: Replace __get_cpu_var uses
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (27 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 28/35] [PATCH 28/36] tile: Use this_cpu_ptr() for hardware counters Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:15   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 30/35] [PATCH 30/36] avr32: Replace __get_cpu_var with __this_cpu_write Christoph Lameter
                   ` (6 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Mike Frysinger

[-- Attachment #1: 0029-blackfin-Replace-__get_cpu_var-uses.patch --]
[-- Type: text/plain, Size: 6628 bytes --]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and less registers
are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non -x86
arches as well in order to optimize per cpu access by f.e.  using a global
register that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)


CC: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/blackfin/include/asm/ipipe.h         |  2 +-
 arch/blackfin/kernel/perf_event.c         | 10 +++++-----
 arch/blackfin/mach-common/ints-priority.c |  8 ++++----
 arch/blackfin/mach-common/smp.c           |  2 +-
 4 files changed, 11 insertions(+), 11 deletions(-)

Index: linux/arch/blackfin/include/asm/ipipe.h
===================================================================
--- linux.orig/arch/blackfin/include/asm/ipipe.h
+++ linux/arch/blackfin/include/asm/ipipe.h
@@ -157,7 +157,7 @@ static inline unsigned long __ipipe_ffnz
 }
 
 #define __ipipe_do_root_xirq(ipd, irq)					\
-	((ipd)->irqs[irq].handler(irq, &__raw_get_cpu_var(__ipipe_tick_regs)))
+	((ipd)->irqs[irq].handler(irq, raw_cpu_ptr(&__ipipe_tick_regs)))
 
 #define __ipipe_run_irqtail(irq)  /* Must be a macro */			\
 	do {								\
Index: linux/arch/blackfin/kernel/perf_event.c
===================================================================
--- linux.orig/arch/blackfin/kernel/perf_event.c
+++ linux/arch/blackfin/kernel/perf_event.c
@@ -300,7 +300,7 @@ again:
 
 static void bfin_pmu_stop(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
@@ -318,7 +318,7 @@ static void bfin_pmu_stop(struct perf_ev
 
 static void bfin_pmu_start(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 
@@ -335,7 +335,7 @@ static void bfin_pmu_start(struct perf_e
 
 static void bfin_pmu_del(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	bfin_pmu_stop(event, PERF_EF_UPDATE);
 	__clear_bit(event->hw.idx, cpuc->used_mask);
@@ -345,7 +345,7 @@ static void bfin_pmu_del(struct perf_eve
 
 static int bfin_pmu_add(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
 	int ret = -EAGAIN;
@@ -421,7 +421,7 @@ static int bfin_pmu_event_init(struct pe
 
 static void bfin_pmu_enable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_event *event;
 	struct hw_perf_event *hwc;
 	int i;
Index: linux/arch/blackfin/mach-common/ints-priority.c
===================================================================
--- linux.orig/arch/blackfin/mach-common/ints-priority.c
+++ linux/arch/blackfin/mach-common/ints-priority.c
@@ -1309,12 +1309,12 @@ asmlinkage int __ipipe_grab_irq(int vec,
 		bfin_write_TIMER_STATUS(1); /* Latch TIMIL0 */
 #endif
 		/* This is basically what we need from the register frame. */
-		__raw_get_cpu_var(__ipipe_tick_regs).ipend = regs->ipend;
-		__raw_get_cpu_var(__ipipe_tick_regs).pc = regs->pc;
+		__this_cpu_write(__ipipe_tick_regs.ipend, regs->ipend);
+		__this_cpu_write(__ipipe_tick_regs.pc, regs->pc);
 		if (this_domain != ipipe_root_domain)
-			__raw_get_cpu_var(__ipipe_tick_regs).ipend &= ~0x10;
+			__this_cpu_and(__ipipe_tick_regs.ipend, ~0x10);
 		else
-			__raw_get_cpu_var(__ipipe_tick_regs).ipend |= 0x10;
+			__this_cpu_or(__ipipe_tick_regs.ipend, 0x10);
 	}
 
 	/*
Index: linux/arch/blackfin/mach-common/smp.c
===================================================================
--- linux.orig/arch/blackfin/mach-common/smp.c
+++ linux/arch/blackfin/mach-common/smp.c
@@ -146,7 +146,7 @@ static irqreturn_t ipi_handler_int1(int
 	platform_clear_ipi(cpu, IRQ_SUPPLE_1);
 
 	smp_rmb();
-	bfin_ipi_data = &__get_cpu_var(bfin_ipi);
+	bfin_ipi_data = this_cpu_ptr(&bfin_ipi);
 	while ((pending = atomic_xchg(&bfin_ipi_data->bits, 0)) != 0) {
 		msg = 0;
 		do {


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 30/35] [PATCH 30/36] avr32: Replace __get_cpu_var with __this_cpu_write
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (28 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 29/35] [PATCH 29/36] blackfin: Replace __get_cpu_var uses Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:15   ` Tejun Heo
  2014-08-17 17:30   ` Christoph Lameter
                   ` (5 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Haavard Skinnemoen, Hans-Christian Egtvedt

[-- Attachment #1: 0030-avr32-Replace-__get_cpu_var-with-__this_cpu_write.patch --]
[-- Type: text/plain, Size: 769 bytes --]

Replace the single use of __get_cpu_var in avr32 with
__this_cpu_write.

Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/avr32/kernel/kprobes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/arch/avr32/kernel/kprobes.c
===================================================================
--- linux.orig/arch/avr32/kernel/kprobes.c
+++ linux/arch/avr32/kernel/kprobes.c
@@ -104,7 +104,7 @@ static void __kprobes resume_execution(s
 
 static void __kprobes set_current_kprobe(struct kprobe *p)
 {
-	__get_cpu_var(current_kprobe) = p;
+	__this_cpu_write(current_kprobe, p);
 }
 
 static int __kprobes kprobe_handler(struct pt_regs *regs)


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 31/35] [PATCH 31/36] sparc: Replace __get_cpu_var uses
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
@ 2014-08-17 17:30   ` Christoph Lameter
  2014-08-17 17:30 ` [PATCH 02/35] [PATCH 03/36] time: " Christoph Lameter
                     ` (34 subsequent siblings)
  35 siblings, 0 replies; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, sparclinux, David S. Miller

[-- Attachment #1: 0031-sparc-Replace-__get_cpu_var-uses.patch --]
[-- Type: text/plain, Size: 14593 bytes --]

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and less registers
are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non -x86
arches as well in order to optimize per cpu access by f.e.  using a global
register that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)


Cc: sparclinux@vger.kernel.org
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/sparc/include/asm/cpudata_32.h |  2 +-
 arch/sparc/include/asm/cpudata_64.h |  2 +-
 arch/sparc/kernel/kprobes.c         |  6 +++---
 arch/sparc/kernel/leon_smp.c        |  2 +-
 arch/sparc/kernel/nmi.c             | 16 ++++++++--------
 arch/sparc/kernel/pci_sun4v.c       |  8 ++++----
 arch/sparc/kernel/perf_event.c      | 26 +++++++++++++-------------
 arch/sparc/kernel/sun4d_smp.c       |  2 +-
 arch/sparc/kernel/time_64.c         |  2 +-
 arch/sparc/mm/tlb.c                 |  4 ++--
 10 files changed, 35 insertions(+), 35 deletions(-)

Index: linux/arch/sparc/include/asm/cpudata_32.h
===================================================================
--- linux.orig/arch/sparc/include/asm/cpudata_32.h
+++ linux/arch/sparc/include/asm/cpudata_32.h
@@ -26,6 +26,6 @@ typedef struct {
 
 DECLARE_PER_CPU(cpuinfo_sparc, __cpu_data);
 #define cpu_data(__cpu) per_cpu(__cpu_data, (__cpu))
-#define local_cpu_data() __get_cpu_var(__cpu_data)
+#define local_cpu_data() (*this_cpu_ptr(&__cpu_data))
 
 #endif /* _SPARC_CPUDATA_H */
Index: linux/arch/sparc/include/asm/cpudata_64.h
===================================================================
--- linux.orig/arch/sparc/include/asm/cpudata_64.h
+++ linux/arch/sparc/include/asm/cpudata_64.h
@@ -30,7 +30,7 @@ typedef struct {
 
 DECLARE_PER_CPU(cpuinfo_sparc, __cpu_data);
 #define cpu_data(__cpu)		per_cpu(__cpu_data, (__cpu))
-#define local_cpu_data()	__get_cpu_var(__cpu_data)
+#define local_cpu_data()	(*this_cpu_ptr(&__cpu_data))
 
 #endif /* !(__ASSEMBLY__) */
 
Index: linux/arch/sparc/kernel/kprobes.c
===================================================================
--- linux.orig/arch/sparc/kernel/kprobes.c
+++ linux/arch/sparc/kernel/kprobes.c
@@ -83,7 +83,7 @@ static void __kprobes save_previous_kpro
 
 static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = kcb->prev_kprobe.kp;
+	__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
 	kcb->kprobe_status = kcb->prev_kprobe.status;
 	kcb->kprobe_orig_tnpc = kcb->prev_kprobe.orig_tnpc;
 	kcb->kprobe_orig_tstate_pil = kcb->prev_kprobe.orig_tstate_pil;
@@ -92,7 +92,7 @@ static void __kprobes restore_previous_k
 static void __kprobes set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
 				struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = p;
+	__this_cpu_write(current_kprobe, p);
 	kcb->kprobe_orig_tnpc = regs->tnpc;
 	kcb->kprobe_orig_tstate_pil = (regs->tstate & TSTATE_PIL);
 }
@@ -155,7 +155,7 @@ static int __kprobes kprobe_handler(stru
 				ret = 1;
 				goto no_kprobe;
 			}
-			p = __get_cpu_var(current_kprobe);
+			p = __this_cpu_read(current_kprobe);
 			if (p->break_handler && p->break_handler(p, regs))
 				goto ss_probe;
 		}
Index: linux/arch/sparc/kernel/leon_smp.c
===================================================================
--- linux.orig/arch/sparc/kernel/leon_smp.c
+++ linux/arch/sparc/kernel/leon_smp.c
@@ -343,7 +343,7 @@ static void leon_ipi_resched(int cpu)
 
 void leonsmp_ipi_interrupt(void)
 {
-	struct leon_ipi_work *work = &__get_cpu_var(leon_ipi_work);
+	struct leon_ipi_work *work = this_cpu_ptr(&leon_ipi_work);
 
 	if (work->single) {
 		work->single = 0;
Index: linux/arch/sparc/kernel/nmi.c
===================================================================
--- linux.orig/arch/sparc/kernel/nmi.c
+++ linux/arch/sparc/kernel/nmi.c
@@ -100,20 +100,20 @@ notrace __kprobes void perfctr_irq(int i
 		pcr_ops->write_pcr(0, pcr_ops->pcr_nmi_disable);
 
 	sum = local_cpu_data().irq0_irqs;
-	if (__get_cpu_var(nmi_touch)) {
-		__get_cpu_var(nmi_touch) = 0;
+	if (__this_cpu_read(nmi_touch)) {
+		__this_cpu_write(nmi_touch, 0);
 		touched = 1;
 	}
-	if (!touched && __get_cpu_var(last_irq_sum) == sum) {
+	if (!touched && __this_cpu_read(last_irq_sum) == sum) {
 		__this_cpu_inc(alert_counter);
 		if (__this_cpu_read(alert_counter) == 30 * nmi_hz)
 			die_nmi("BUG: NMI Watchdog detected LOCKUP",
 				regs, panic_on_timeout);
 	} else {
-		__get_cpu_var(last_irq_sum) = sum;
+		__this_cpu_write(last_irq_sum, sum);
 		__this_cpu_write(alert_counter, 0);
 	}
-	if (__get_cpu_var(wd_enabled)) {
+	if (__this_cpu_read(wd_enabled)) {
 		pcr_ops->write_pic(0, pcr_ops->nmi_picl_value(nmi_hz));
 		pcr_ops->write_pcr(0, pcr_ops->pcr_nmi_enable);
 	}
@@ -154,7 +154,7 @@ static void report_broken_nmi(int cpu, i
 void stop_nmi_watchdog(void *unused)
 {
 	pcr_ops->write_pcr(0, pcr_ops->pcr_nmi_disable);
-	__get_cpu_var(wd_enabled) = 0;
+	__this_cpu_write(wd_enabled, 0);
 	atomic_dec(&nmi_active);
 }
 
@@ -207,7 +207,7 @@ error:
 
 void start_nmi_watchdog(void *unused)
 {
-	__get_cpu_var(wd_enabled) = 1;
+	__this_cpu_write(wd_enabled, 1);
 	atomic_inc(&nmi_active);
 
 	pcr_ops->write_pcr(0, pcr_ops->pcr_nmi_disable);
@@ -218,7 +218,7 @@ void start_nmi_watchdog(void *unused)
 
 static void nmi_adjust_hz_one(void *unused)
 {
-	if (!__get_cpu_var(wd_enabled))
+	if (!__this_cpu_read(wd_enabled))
 		return;
 
 	pcr_ops->write_pcr(0, pcr_ops->pcr_nmi_disable);
Index: linux/arch/sparc/kernel/pci_sun4v.c
===================================================================
--- linux.orig/arch/sparc/kernel/pci_sun4v.c
+++ linux/arch/sparc/kernel/pci_sun4v.c
@@ -48,7 +48,7 @@ static int iommu_batch_initialized;
 /* Interrupts must be disabled.  */
 static inline void iommu_batch_start(struct device *dev, unsigned long prot, unsigned long entry)
 {
-	struct iommu_batch *p = &__get_cpu_var(iommu_batch);
+	struct iommu_batch *p = this_cpu_ptr(&iommu_batch);
 
 	p->dev		= dev;
 	p->prot		= prot;
@@ -94,7 +94,7 @@ static long iommu_batch_flush(struct iom
 
 static inline void iommu_batch_new_entry(unsigned long entry)
 {
-	struct iommu_batch *p = &__get_cpu_var(iommu_batch);
+	struct iommu_batch *p = this_cpu_ptr(&iommu_batch);
 
 	if (p->entry + p->npages == entry)
 		return;
@@ -106,7 +106,7 @@ static inline void iommu_batch_new_entry
 /* Interrupts must be disabled.  */
 static inline long iommu_batch_add(u64 phys_page)
 {
-	struct iommu_batch *p = &__get_cpu_var(iommu_batch);
+	struct iommu_batch *p = this_cpu_ptr(&iommu_batch);
 
 	BUG_ON(p->npages >= PGLIST_NENTS);
 
@@ -120,7 +120,7 @@ static inline long iommu_batch_add(u64 p
 /* Interrupts must be disabled.  */
 static inline long iommu_batch_end(void)
 {
-	struct iommu_batch *p = &__get_cpu_var(iommu_batch);
+	struct iommu_batch *p = this_cpu_ptr(&iommu_batch);
 
 	BUG_ON(p->npages >= PGLIST_NENTS);
 
Index: linux/arch/sparc/kernel/perf_event.c
===================================================================
--- linux.orig/arch/sparc/kernel/perf_event.c
+++ linux/arch/sparc/kernel/perf_event.c
@@ -1013,7 +1013,7 @@ static void update_pcrs_for_enable(struc
 
 static void sparc_pmu_enable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int i;
 
 	if (cpuc->enabled)
@@ -1031,7 +1031,7 @@ static void sparc_pmu_enable(struct pmu
 
 static void sparc_pmu_disable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int i;
 
 	if (!cpuc->enabled)
@@ -1065,7 +1065,7 @@ static int active_event_index(struct cpu
 
 static void sparc_pmu_start(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx = active_event_index(cpuc, event);
 
 	if (flags & PERF_EF_RELOAD) {
@@ -1080,7 +1080,7 @@ static void sparc_pmu_start(struct perf_
 
 static void sparc_pmu_stop(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx = active_event_index(cpuc, event);
 
 	if (!(event->hw.state & PERF_HES_STOPPED)) {
@@ -1096,7 +1096,7 @@ static void sparc_pmu_stop(struct perf_e
 
 static void sparc_pmu_del(struct perf_event *event, int _flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	unsigned long flags;
 	int i;
 
@@ -1133,7 +1133,7 @@ static void sparc_pmu_del(struct perf_ev
 
 static void sparc_pmu_read(struct perf_event *event)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx = active_event_index(cpuc, event);
 	struct hw_perf_event *hwc = &event->hw;
 
@@ -1145,7 +1145,7 @@ static DEFINE_MUTEX(pmc_grab_mutex);
 
 static void perf_stop_nmi_watchdog(void *unused)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int i;
 
 	stop_nmi_watchdog(NULL);
@@ -1356,7 +1356,7 @@ static int collect_events(struct perf_ev
 
 static int sparc_pmu_add(struct perf_event *event, int ef_flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int n0, ret = -EAGAIN;
 	unsigned long flags;
 
@@ -1498,7 +1498,7 @@ static int sparc_pmu_event_init(struct p
  */
 static void sparc_pmu_start_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	perf_pmu_disable(pmu);
 	cpuhw->group_flag |= PERF_EVENT_TXN;
@@ -1511,7 +1511,7 @@ static void sparc_pmu_start_txn(struct p
  */
 static void sparc_pmu_cancel_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	cpuhw->group_flag &= ~PERF_EVENT_TXN;
 	perf_pmu_enable(pmu);
@@ -1524,13 +1524,13 @@ static void sparc_pmu_cancel_txn(struct
  */
 static int sparc_pmu_commit_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int n;
 
 	if (!sparc_pmu)
 		return -EINVAL;
 
-	cpuc = &__get_cpu_var(cpu_hw_events);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
 	n = cpuc->n_events;
 	if (check_excludes(cpuc->event, 0, n))
 		return -EINVAL;
@@ -1601,7 +1601,7 @@ static int __kprobes perf_event_nmi_hand
 
 	regs = args->regs;
 
-	cpuc = &__get_cpu_var(cpu_hw_events);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	/* If the PMU has the TOE IRQ enable bits, we need to do a
 	 * dummy write to the %pcr to clear the overflow bits and thus
Index: linux/arch/sparc/kernel/sun4d_smp.c
===================================================================
--- linux.orig/arch/sparc/kernel/sun4d_smp.c
+++ linux/arch/sparc/kernel/sun4d_smp.c
@@ -204,7 +204,7 @@ static void __init smp4d_ipi_init(void)
 
 void sun4d_ipi_interrupt(void)
 {
-	struct sun4d_ipi_work *work = &__get_cpu_var(sun4d_ipi_work);
+	struct sun4d_ipi_work *work = this_cpu_ptr(&sun4d_ipi_work);
 
 	if (work->single) {
 		work->single = 0;
Index: linux/arch/sparc/kernel/time_64.c
===================================================================
--- linux.orig/arch/sparc/kernel/time_64.c
+++ linux/arch/sparc/kernel/time_64.c
@@ -765,7 +765,7 @@ void setup_sparc64_timer(void)
 			     : /* no outputs */
 			     : "r" (pstate));
 
-	sevt = &__get_cpu_var(sparc64_events);
+	sevt = this_cpu_ptr(&sparc64_events);
 
 	memcpy(sevt, &sparc64_clockevent, sizeof(*sevt));
 	sevt->cpumask = cpumask_of(smp_processor_id());
Index: linux/arch/sparc/mm/tlb.c
===================================================================
--- linux.orig/arch/sparc/mm/tlb.c
+++ linux/arch/sparc/mm/tlb.c
@@ -52,14 +52,14 @@ out:
 
 void arch_enter_lazy_mmu_mode(void)
 {
-	struct tlb_batch *tb = &__get_cpu_var(tlb_batch);
+	struct tlb_batch *tb = this_cpu_ptr(&tlb_batch);
 
 	tb->active = 1;
 }
 
 void arch_leave_lazy_mmu_mode(void)
 {
-	struct tlb_batch *tb = &__get_cpu_var(tlb_batch);
+	struct tlb_batch *tb = this_cpu_ptr(&tlb_batch);
 
 	if (tb->tlb_nr)
 		flush_tlb_pending();


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 31/35] [PATCH 31/36] sparc: Replace __get_cpu_var uses
@ 2014-08-17 17:30   ` Christoph Lameter
  0 siblings, 0 replies; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, sparclinux, David S. Miller

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :


#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))



__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.


This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and less registers
are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non -x86
arches as well in order to optimize per cpu access by f.e.  using a global
register that may be set to the per cpu base.




Transformations done to __get_cpu_var()


1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);


2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);


3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);


4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));


5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);


6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)


Cc: sparclinux@vger.kernel.org
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/sparc/include/asm/cpudata_32.h |  2 +-
 arch/sparc/include/asm/cpudata_64.h |  2 +-
 arch/sparc/kernel/kprobes.c         |  6 +++---
 arch/sparc/kernel/leon_smp.c        |  2 +-
 arch/sparc/kernel/nmi.c             | 16 ++++++++--------
 arch/sparc/kernel/pci_sun4v.c       |  8 ++++----
 arch/sparc/kernel/perf_event.c      | 26 +++++++++++++-------------
 arch/sparc/kernel/sun4d_smp.c       |  2 +-
 arch/sparc/kernel/time_64.c         |  2 +-
 arch/sparc/mm/tlb.c                 |  4 ++--
 10 files changed, 35 insertions(+), 35 deletions(-)

Index: linux/arch/sparc/include/asm/cpudata_32.h
=================================--- linux.orig/arch/sparc/include/asm/cpudata_32.h
+++ linux/arch/sparc/include/asm/cpudata_32.h
@@ -26,6 +26,6 @@ typedef struct {
 
 DECLARE_PER_CPU(cpuinfo_sparc, __cpu_data);
 #define cpu_data(__cpu) per_cpu(__cpu_data, (__cpu))
-#define local_cpu_data() __get_cpu_var(__cpu_data)
+#define local_cpu_data() (*this_cpu_ptr(&__cpu_data))
 
 #endif /* _SPARC_CPUDATA_H */
Index: linux/arch/sparc/include/asm/cpudata_64.h
=================================--- linux.orig/arch/sparc/include/asm/cpudata_64.h
+++ linux/arch/sparc/include/asm/cpudata_64.h
@@ -30,7 +30,7 @@ typedef struct {
 
 DECLARE_PER_CPU(cpuinfo_sparc, __cpu_data);
 #define cpu_data(__cpu)		per_cpu(__cpu_data, (__cpu))
-#define local_cpu_data()	__get_cpu_var(__cpu_data)
+#define local_cpu_data()	(*this_cpu_ptr(&__cpu_data))
 
 #endif /* !(__ASSEMBLY__) */
 
Index: linux/arch/sparc/kernel/kprobes.c
=================================--- linux.orig/arch/sparc/kernel/kprobes.c
+++ linux/arch/sparc/kernel/kprobes.c
@@ -83,7 +83,7 @@ static void __kprobes save_previous_kpro
 
 static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = kcb->prev_kprobe.kp;
+	__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
 	kcb->kprobe_status = kcb->prev_kprobe.status;
 	kcb->kprobe_orig_tnpc = kcb->prev_kprobe.orig_tnpc;
 	kcb->kprobe_orig_tstate_pil = kcb->prev_kprobe.orig_tstate_pil;
@@ -92,7 +92,7 @@ static void __kprobes restore_previous_k
 static void __kprobes set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
 				struct kprobe_ctlblk *kcb)
 {
-	__get_cpu_var(current_kprobe) = p;
+	__this_cpu_write(current_kprobe, p);
 	kcb->kprobe_orig_tnpc = regs->tnpc;
 	kcb->kprobe_orig_tstate_pil = (regs->tstate & TSTATE_PIL);
 }
@@ -155,7 +155,7 @@ static int __kprobes kprobe_handler(stru
 				ret = 1;
 				goto no_kprobe;
 			}
-			p = __get_cpu_var(current_kprobe);
+			p = __this_cpu_read(current_kprobe);
 			if (p->break_handler && p->break_handler(p, regs))
 				goto ss_probe;
 		}
Index: linux/arch/sparc/kernel/leon_smp.c
=================================--- linux.orig/arch/sparc/kernel/leon_smp.c
+++ linux/arch/sparc/kernel/leon_smp.c
@@ -343,7 +343,7 @@ static void leon_ipi_resched(int cpu)
 
 void leonsmp_ipi_interrupt(void)
 {
-	struct leon_ipi_work *work = &__get_cpu_var(leon_ipi_work);
+	struct leon_ipi_work *work = this_cpu_ptr(&leon_ipi_work);
 
 	if (work->single) {
 		work->single = 0;
Index: linux/arch/sparc/kernel/nmi.c
=================================--- linux.orig/arch/sparc/kernel/nmi.c
+++ linux/arch/sparc/kernel/nmi.c
@@ -100,20 +100,20 @@ notrace __kprobes void perfctr_irq(int i
 		pcr_ops->write_pcr(0, pcr_ops->pcr_nmi_disable);
 
 	sum = local_cpu_data().irq0_irqs;
-	if (__get_cpu_var(nmi_touch)) {
-		__get_cpu_var(nmi_touch) = 0;
+	if (__this_cpu_read(nmi_touch)) {
+		__this_cpu_write(nmi_touch, 0);
 		touched = 1;
 	}
-	if (!touched && __get_cpu_var(last_irq_sum) = sum) {
+	if (!touched && __this_cpu_read(last_irq_sum) = sum) {
 		__this_cpu_inc(alert_counter);
 		if (__this_cpu_read(alert_counter) = 30 * nmi_hz)
 			die_nmi("BUG: NMI Watchdog detected LOCKUP",
 				regs, panic_on_timeout);
 	} else {
-		__get_cpu_var(last_irq_sum) = sum;
+		__this_cpu_write(last_irq_sum, sum);
 		__this_cpu_write(alert_counter, 0);
 	}
-	if (__get_cpu_var(wd_enabled)) {
+	if (__this_cpu_read(wd_enabled)) {
 		pcr_ops->write_pic(0, pcr_ops->nmi_picl_value(nmi_hz));
 		pcr_ops->write_pcr(0, pcr_ops->pcr_nmi_enable);
 	}
@@ -154,7 +154,7 @@ static void report_broken_nmi(int cpu, i
 void stop_nmi_watchdog(void *unused)
 {
 	pcr_ops->write_pcr(0, pcr_ops->pcr_nmi_disable);
-	__get_cpu_var(wd_enabled) = 0;
+	__this_cpu_write(wd_enabled, 0);
 	atomic_dec(&nmi_active);
 }
 
@@ -207,7 +207,7 @@ error:
 
 void start_nmi_watchdog(void *unused)
 {
-	__get_cpu_var(wd_enabled) = 1;
+	__this_cpu_write(wd_enabled, 1);
 	atomic_inc(&nmi_active);
 
 	pcr_ops->write_pcr(0, pcr_ops->pcr_nmi_disable);
@@ -218,7 +218,7 @@ void start_nmi_watchdog(void *unused)
 
 static void nmi_adjust_hz_one(void *unused)
 {
-	if (!__get_cpu_var(wd_enabled))
+	if (!__this_cpu_read(wd_enabled))
 		return;
 
 	pcr_ops->write_pcr(0, pcr_ops->pcr_nmi_disable);
Index: linux/arch/sparc/kernel/pci_sun4v.c
=================================--- linux.orig/arch/sparc/kernel/pci_sun4v.c
+++ linux/arch/sparc/kernel/pci_sun4v.c
@@ -48,7 +48,7 @@ static int iommu_batch_initialized;
 /* Interrupts must be disabled.  */
 static inline void iommu_batch_start(struct device *dev, unsigned long prot, unsigned long entry)
 {
-	struct iommu_batch *p = &__get_cpu_var(iommu_batch);
+	struct iommu_batch *p = this_cpu_ptr(&iommu_batch);
 
 	p->dev		= dev;
 	p->prot		= prot;
@@ -94,7 +94,7 @@ static long iommu_batch_flush(struct iom
 
 static inline void iommu_batch_new_entry(unsigned long entry)
 {
-	struct iommu_batch *p = &__get_cpu_var(iommu_batch);
+	struct iommu_batch *p = this_cpu_ptr(&iommu_batch);
 
 	if (p->entry + p->npages = entry)
 		return;
@@ -106,7 +106,7 @@ static inline void iommu_batch_new_entry
 /* Interrupts must be disabled.  */
 static inline long iommu_batch_add(u64 phys_page)
 {
-	struct iommu_batch *p = &__get_cpu_var(iommu_batch);
+	struct iommu_batch *p = this_cpu_ptr(&iommu_batch);
 
 	BUG_ON(p->npages >= PGLIST_NENTS);
 
@@ -120,7 +120,7 @@ static inline long iommu_batch_add(u64 p
 /* Interrupts must be disabled.  */
 static inline long iommu_batch_end(void)
 {
-	struct iommu_batch *p = &__get_cpu_var(iommu_batch);
+	struct iommu_batch *p = this_cpu_ptr(&iommu_batch);
 
 	BUG_ON(p->npages >= PGLIST_NENTS);
 
Index: linux/arch/sparc/kernel/perf_event.c
=================================--- linux.orig/arch/sparc/kernel/perf_event.c
+++ linux/arch/sparc/kernel/perf_event.c
@@ -1013,7 +1013,7 @@ static void update_pcrs_for_enable(struc
 
 static void sparc_pmu_enable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int i;
 
 	if (cpuc->enabled)
@@ -1031,7 +1031,7 @@ static void sparc_pmu_enable(struct pmu
 
 static void sparc_pmu_disable(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int i;
 
 	if (!cpuc->enabled)
@@ -1065,7 +1065,7 @@ static int active_event_index(struct cpu
 
 static void sparc_pmu_start(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx = active_event_index(cpuc, event);
 
 	if (flags & PERF_EF_RELOAD) {
@@ -1080,7 +1080,7 @@ static void sparc_pmu_start(struct perf_
 
 static void sparc_pmu_stop(struct perf_event *event, int flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx = active_event_index(cpuc, event);
 
 	if (!(event->hw.state & PERF_HES_STOPPED)) {
@@ -1096,7 +1096,7 @@ static void sparc_pmu_stop(struct perf_e
 
 static void sparc_pmu_del(struct perf_event *event, int _flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	unsigned long flags;
 	int i;
 
@@ -1133,7 +1133,7 @@ static void sparc_pmu_del(struct perf_ev
 
 static void sparc_pmu_read(struct perf_event *event)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int idx = active_event_index(cpuc, event);
 	struct hw_perf_event *hwc = &event->hw;
 
@@ -1145,7 +1145,7 @@ static DEFINE_MUTEX(pmc_grab_mutex);
 
 static void perf_stop_nmi_watchdog(void *unused)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int i;
 
 	stop_nmi_watchdog(NULL);
@@ -1356,7 +1356,7 @@ static int collect_events(struct perf_ev
 
 static int sparc_pmu_add(struct perf_event *event, int ef_flags)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int n0, ret = -EAGAIN;
 	unsigned long flags;
 
@@ -1498,7 +1498,7 @@ static int sparc_pmu_event_init(struct p
  */
 static void sparc_pmu_start_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	perf_pmu_disable(pmu);
 	cpuhw->group_flag |= PERF_EVENT_TXN;
@@ -1511,7 +1511,7 @@ static void sparc_pmu_start_txn(struct p
  */
 static void sparc_pmu_cancel_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
 	cpuhw->group_flag &= ~PERF_EVENT_TXN;
 	perf_pmu_enable(pmu);
@@ -1524,13 +1524,13 @@ static void sparc_pmu_cancel_txn(struct
  */
 static int sparc_pmu_commit_txn(struct pmu *pmu)
 {
-	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int n;
 
 	if (!sparc_pmu)
 		return -EINVAL;
 
-	cpuc = &__get_cpu_var(cpu_hw_events);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
 	n = cpuc->n_events;
 	if (check_excludes(cpuc->event, 0, n))
 		return -EINVAL;
@@ -1601,7 +1601,7 @@ static int __kprobes perf_event_nmi_hand
 
 	regs = args->regs;
 
-	cpuc = &__get_cpu_var(cpu_hw_events);
+	cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	/* If the PMU has the TOE IRQ enable bits, we need to do a
 	 * dummy write to the %pcr to clear the overflow bits and thus
Index: linux/arch/sparc/kernel/sun4d_smp.c
=================================--- linux.orig/arch/sparc/kernel/sun4d_smp.c
+++ linux/arch/sparc/kernel/sun4d_smp.c
@@ -204,7 +204,7 @@ static void __init smp4d_ipi_init(void)
 
 void sun4d_ipi_interrupt(void)
 {
-	struct sun4d_ipi_work *work = &__get_cpu_var(sun4d_ipi_work);
+	struct sun4d_ipi_work *work = this_cpu_ptr(&sun4d_ipi_work);
 
 	if (work->single) {
 		work->single = 0;
Index: linux/arch/sparc/kernel/time_64.c
=================================--- linux.orig/arch/sparc/kernel/time_64.c
+++ linux/arch/sparc/kernel/time_64.c
@@ -765,7 +765,7 @@ void setup_sparc64_timer(void)
 			     : /* no outputs */
 			     : "r" (pstate));
 
-	sevt = &__get_cpu_var(sparc64_events);
+	sevt = this_cpu_ptr(&sparc64_events);
 
 	memcpy(sevt, &sparc64_clockevent, sizeof(*sevt));
 	sevt->cpumask = cpumask_of(smp_processor_id());
Index: linux/arch/sparc/mm/tlb.c
=================================--- linux.orig/arch/sparc/mm/tlb.c
+++ linux/arch/sparc/mm/tlb.c
@@ -52,14 +52,14 @@ out:
 
 void arch_enter_lazy_mmu_mode(void)
 {
-	struct tlb_batch *tb = &__get_cpu_var(tlb_batch);
+	struct tlb_batch *tb = this_cpu_ptr(&tlb_batch);
 
 	tb->active = 1;
 }
 
 void arch_leave_lazy_mmu_mode(void)
 {
-	struct tlb_batch *tb = &__get_cpu_var(tlb_batch);
+	struct tlb_batch *tb = this_cpu_ptr(&tlb_batch);
 
 	if (tb->tlb_nr)
 		flush_tlb_pending();


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 32/35] [PATCH 32/36] clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (30 preceding siblings ...)
  2014-08-17 17:30   ` Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:16   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 33/35] [PATCH 34/36] percpu: Remove __this_cpu_ptr Christoph Lameter
                   ` (3 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: 0032-clocksource-Replace-__this_cpu_ptr-with-raw_cpu_ptr.patch --]
[-- Type: text/plain, Size: 643 bytes --]

One newly introduced __this_cpu_ptr should be raw_cpu_ptr.

Signed-off-by: Christoph Lameter <cl@linux.com>
---
 drivers/clocksource/qcom-timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/drivers/clocksource/qcom-timer.c
===================================================================
--- linux.orig/drivers/clocksource/qcom-timer.c
+++ linux/drivers/clocksource/qcom-timer.c
@@ -219,7 +219,7 @@ static void __init msm_timer_init(u32 dg
 		}
 
 		/* Immediately configure the timer on the boot CPU */
-		msm_local_timer_setup(__this_cpu_ptr(msm_evt));
+		msm_local_timer_setup(raw_cpu_ptr(msm_evt));
 	}
 
 err:


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 33/35] [PATCH 34/36] percpu: Remove __this_cpu_ptr
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (31 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 32/35] [PATCH 32/36] clocksource: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:16   ` Tejun Heo
  2014-08-17 17:30 ` [PATCH 34/35] [PATCH 01/36] __get_cpu_var/cpumask_var_t: Resolve ambiguities Christoph Lameter
                   ` (2 subsequent siblings)
  35 siblings, 1 reply; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: 0034-percpu-Remove-__this_cpu_ptr.patch --]
[-- Type: text/plain, Size: 738 bytes --]

The __this_cpu_ptr macro is no longer in use so drop it.

Signed-off-by: Christoph Lameter <cl@linux.com>
---
 include/asm-generic/percpu.h | 3 ---
 1 file changed, 3 deletions(-)

Index: linux/include/linux/percpu-defs.h
===================================================================
--- linux.orig/include/linux/percpu-defs.h
+++ linux/include/linux/percpu-defs.h
@@ -257,9 +257,6 @@ do {									\
 #define __raw_get_cpu_var(var)	(*raw_cpu_ptr(&(var)))
 #define __get_cpu_var(var)	(*this_cpu_ptr(&(var)))
 
-/* keep until we have removed all uses of __this_cpu_ptr */
-#define __this_cpu_ptr(ptr)	raw_cpu_ptr(ptr)
-
 /*
  * Must be an lvalue. Since @var must be a simple identifier,
  * we force a syntax error here if it isn't.


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 34/35] [PATCH 01/36] __get_cpu_var/cpumask_var_t: Resolve ambiguities
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (32 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 33/35] [PATCH 34/36] percpu: Remove __this_cpu_ptr Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-17 17:30 ` [PATCH 35/35] [PATCH 33/36] Remove __get_cpu_var and __raw_get_cpu_var macros [only in 3.17] Christoph Lameter
  2014-08-26 18:17 ` [PATCH 00/35] percpu: Consistent per cpu operations V7 Tejun Heo
  35 siblings, 0 replies; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: 0001-__get_cpu_var-cpumask_var_t-Resolve-ambiguities.patch --]
[-- Type: text/plain, Size: 5570 bytes --]

__get_cpu_var can paper over differences in the definitions
of cpumask_var_t and either use the address of the cpumask
variable directly or perform a fetch of the address of the
struct cpumask allocated elsewhere. This is important
particularly when using per cpu cpumask_var_t declarations
because in one case we have an offset into a per cpu area
to handle and in the other case we need to fetch a pointer
from the offset.

This patch introduces a new macro

this_cpu_cpumask_var_t_ptr()

that is defined where cpumask_var_t is defined and performs
the proper actions. All use cases where __get_cpu_var
is used with cpumask_var_t are converted to the use
of this_cpu_cpumask_var_t_ptr().

Signed-off-by: Christoph Lameter <cl@linux.com>
---
 arch/x86/include/asm/perf_event_p4.h |  2 +-
 arch/x86/oprofile/op_model_p4.c      |  2 +-
 include/linux/cpumask.h              | 11 +++++++++++
 kernel/sched/deadline.c              |  2 +-
 kernel/sched/fair.c                  |  2 +-
 kernel/sched/rt.c                    |  2 +-
 6 files changed, 16 insertions(+), 5 deletions(-)

Index: linux/arch/x86/include/asm/perf_event_p4.h
===================================================================
--- linux.orig/arch/x86/include/asm/perf_event_p4.h
+++ linux/arch/x86/include/asm/perf_event_p4.h
@@ -189,7 +189,7 @@ static inline int p4_ht_thread(int cpu)
 {
 #ifdef CONFIG_SMP
 	if (smp_num_siblings == 2)
-		return cpu != cpumask_first(__get_cpu_var(cpu_sibling_map));
+		return cpu != cpumask_first(this_cpu_cpumask_var_t_ptr(cpu_sibling_map));
 #endif
 	return 0;
 }
Index: linux/arch/x86/oprofile/op_model_p4.c
===================================================================
--- linux.orig/arch/x86/oprofile/op_model_p4.c
+++ linux/arch/x86/oprofile/op_model_p4.c
@@ -372,7 +372,7 @@ static unsigned int get_stagger(void)
 {
 #ifdef CONFIG_SMP
 	int cpu = smp_processor_id();
-	return cpu != cpumask_first(__get_cpu_var(cpu_sibling_map));
+	return cpu != cpumask_first(this_cpu_cpumask_var_t_ptr(cpu_sibling_map));
 #endif
 	return 0;
 }
Index: linux/include/linux/cpumask.h
===================================================================
--- linux.orig/include/linux/cpumask.h
+++ linux/include/linux/cpumask.h
@@ -666,10 +666,19 @@ static inline size_t cpumask_size(void)
  *
  * This code makes NR_CPUS length memcopy and brings to a memory corruption.
  * cpumask_copy() provide safe copy functionality.
+ *
+ * Note that there is another evil here: If you define a cpumask_var_t
+ * as a percpu variable then the way to obtain the address of the cpumask
+ * structure differently influences what this_cpu_* operation needs to be
+ * used. Please use this_cpu_cpumask_var_t in those cases. The direct use
+ * of this_cpu_ptr() or this_cpu_read() will lead to failures when the
+ * other type of cpumask_var_t implementation is configured.
  */
 #ifdef CONFIG_CPUMASK_OFFSTACK
 typedef struct cpumask *cpumask_var_t;
 
+#define this_cpu_cpumask_var_t_ptr(x) this_cpu_read(x)
+
 bool alloc_cpumask_var_node(cpumask_var_t *mask, gfp_t flags, int node);
 bool alloc_cpumask_var(cpumask_var_t *mask, gfp_t flags);
 bool zalloc_cpumask_var_node(cpumask_var_t *mask, gfp_t flags, int node);
@@ -681,6 +690,8 @@ void free_bootmem_cpumask_var(cpumask_va
 #else
 typedef struct cpumask cpumask_var_t[1];
 
+#define this_cpu_cpumask_var_t_ptr(x) this_cpu_ptr(x)
+
 static inline bool alloc_cpumask_var(cpumask_var_t *mask, gfp_t flags)
 {
 	return true;
Index: linux/kernel/sched/deadline.c
===================================================================
--- linux.orig/kernel/sched/deadline.c
+++ linux/kernel/sched/deadline.c
@@ -1158,7 +1158,7 @@ static DEFINE_PER_CPU(cpumask_var_t, loc
 static int find_later_rq(struct task_struct *task)
 {
 	struct sched_domain *sd;
-	struct cpumask *later_mask = __get_cpu_var(local_cpu_mask_dl);
+	struct cpumask *later_mask = this_cpu_cpumask_var_t_ptr(local_cpu_mask_dl);
 	int this_cpu = smp_processor_id();
 	int best_cpu, cpu = task_cpu(task);
 
Index: linux/kernel/sched/fair.c
===================================================================
--- linux.orig/kernel/sched/fair.c
+++ linux/kernel/sched/fair.c
@@ -6539,7 +6539,7 @@ static int load_balance(int this_cpu, st
 	struct sched_group *group;
 	struct rq *busiest;
 	unsigned long flags;
-	struct cpumask *cpus = __get_cpu_var(load_balance_mask);
+	struct cpumask *cpus = this_cpu_cpumask_var_t_ptr(load_balance_mask);
 
 	struct lb_env env = {
 		.sd		= sd,
Index: linux/kernel/sched/rt.c
===================================================================
--- linux.orig/kernel/sched/rt.c
+++ linux/kernel/sched/rt.c
@@ -1526,7 +1526,7 @@ static DEFINE_PER_CPU(cpumask_var_t, loc
 static int find_lowest_rq(struct task_struct *task)
 {
 	struct sched_domain *sd;
-	struct cpumask *lowest_mask = __get_cpu_var(local_cpu_mask);
+	struct cpumask *lowest_mask = this_cpu_cpumask_var_t_ptr(local_cpu_mask);
 	int this_cpu = smp_processor_id();
 	int cpu      = task_cpu(task);
 
Index: linux/arch/x86/kernel/apic/x2apic_cluster.c
===================================================================
--- linux.orig/arch/x86/kernel/apic/x2apic_cluster.c
+++ linux/arch/x86/kernel/apic/x2apic_cluster.c
@@ -42,8 +42,7 @@ __x2apic_send_IPI_mask(const struct cpum
 	 * We are to modify mask, so we need an own copy
 	 * and be sure it's manipulated with irq off.
 	 */
-	ipi_mask_ptr = __raw_get_cpu_var(ipi_mask);
-	cpumask_copy(ipi_mask_ptr, mask);
+	ipi_mask_ptr = this_cpu_cpumask_var_t_ptr(ipi_mask);
 
 	/*
 	 * The idea is to send one IPI per cluster.


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 35/35] [PATCH 33/36] Remove __get_cpu_var and __raw_get_cpu_var macros [only in 3.17]
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (33 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 34/35] [PATCH 01/36] __get_cpu_var/cpumask_var_t: Resolve ambiguities Christoph Lameter
@ 2014-08-17 17:30 ` Christoph Lameter
  2014-08-26 18:17 ` [PATCH 00/35] percpu: Consistent per cpu operations V7 Tejun Heo
  35 siblings, 0 replies; 81+ messages in thread
From: Christoph Lameter @ 2014-08-17 17:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

[-- Attachment #1: 0033-Remove-__get_cpu_var-and-__raw_get_cpu_var-macros-on.patch --]
[-- Type: text/plain, Size: 827 bytes --]

No user is left in the kernel source tree. Therefore we can
drop the definitions.

[Patch should not be merged until all the replacement patches have been
merged. Probably this means hold until the 3.17 merge window]

Signed-off-by: Christoph Lameter <cl@linux.com>
---
 include/asm-generic/percpu.h | 5 -----
 1 file changed, 5 deletions(-)

Index: linux/include/linux/percpu-defs.h
===================================================================
--- linux.orig/include/linux/percpu-defs.h
+++ linux/include/linux/percpu-defs.h
@@ -254,8 +254,6 @@ do {									\
 #endif	/* CONFIG_SMP */
 
 #define per_cpu(var, cpu)	(*per_cpu_ptr(&(var), cpu))
-#define __raw_get_cpu_var(var)	(*raw_cpu_ptr(&(var)))
-#define __get_cpu_var(var)	(*this_cpu_ptr(&(var)))
 
 /*
  * Must be an lvalue. Since @var must be a simple identifier,


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 26/35] [PATCH 26/36] powerpc: Replace __get_cpu_var uses
  2014-08-17 17:30 ` [PATCH 26/35] [PATCH 26/36] powerpc: Replace __get_cpu_var uses Christoph Lameter
@ 2014-08-18  2:45   ` Christoph Lameter
  2014-08-26 18:14   ` Tejun Heo
  1 sibling, 0 replies; 81+ messages in thread
From: Christoph Lameter @ 2014-08-18  2:45 UTC (permalink / raw)
  To: Tejun Heo
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Benjamin Herrenschmidt, Paul Mackerras

korg tester found an issue:


From: Christoph Lameter <cl@liknux.com>
Subject: powerpc: Fix reference to opal_trace_depth.

depth is an address and not a scalar. Use & to determine the address.

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/arch/powerpc/platforms/powernv/opal-tracepoints.c
===================================================================
--- linux.orig/arch/powerpc/platforms/powernv/opal-tracepoints.c
+++ linux/arch/powerpc/platforms/powernv/opal-tracepoints.c
@@ -48,7 +48,7 @@ void __trace_opal_entry(unsigned long op

 	local_irq_save(flags);

-	depth = this_cpu_ptr(opal_trace_depth);
+	depth = this_cpu_ptr(&opal_trace_depth);

 	if (*depth)
 		goto out;
@@ -69,7 +69,7 @@ void __trace_opal_exit(long opcode, unsi

 	local_irq_save(flags);

-	depth = this_cpu_ptr(opal_trace_depth);
+	depth = this_cpu_ptr(&opal_trace_depth);

 	if (*depth)
 		goto out;

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 19/35] [PATCH 19/36] arm: Replace __this_cpu_ptr with raw_cpu_ptr
  2014-08-17 17:30 ` [PATCH 19/35] [PATCH 19/36] arm: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
@ 2014-08-22 17:48   ` Will Deacon
  2014-08-26 18:11   ` Tejun Heo
  1 sibling, 0 replies; 81+ messages in thread
From: Will Deacon @ 2014-08-22 17:48 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Tejun Heo, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Russell King, Catalin Marinas

On Sun, Aug 17, 2014 at 06:30:42PM +0100, Christoph Lameter wrote:
> __this_cpu_ptr is being phased out. So replace with raw_cpu_ptr.
> 
> Cc: Russell King <linux@arm.linux.org.uk>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> CC: Will Deacon <will.deacon@arm.com>
> Signed-off-by: Christoph Lameter <cl@linux.com>
> ---
>  arch/arm/kernel/smp_twd.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)

Looks good to me:

  Acked-by: Will Deacon <will.deacon@arm.com>

Will

> Index: linux/arch/arm/kernel/smp_twd.c
> ===================================================================
> --- linux.orig/arch/arm/kernel/smp_twd.c
> +++ linux/arch/arm/kernel/smp_twd.c
> @@ -92,7 +92,7 @@ static int twd_timer_ack(void)
>  
>  static void twd_timer_stop(void)
>  {
> -	struct clock_event_device *clk = __this_cpu_ptr(twd_evt);
> +	struct clock_event_device *clk = raw_cpu_ptr(twd_evt);
>  
>  	twd_set_mode(CLOCK_EVT_MODE_UNUSED, clk);
>  	disable_percpu_irq(clk->irq);
> @@ -108,7 +108,7 @@ static void twd_update_frequency(void *n
>  {
>  	twd_timer_rate = *((unsigned long *) new_rate);
>  
> -	clockevents_update_freq(__this_cpu_ptr(twd_evt), twd_timer_rate);
> +	clockevents_update_freq(raw_cpu_ptr(twd_evt), twd_timer_rate);
>  }
>  
>  static int twd_rate_change(struct notifier_block *nb,
> @@ -134,7 +134,7 @@ static struct notifier_block twd_clk_nb
>  
>  static int twd_clk_init(void)
>  {
> -	if (twd_evt && __this_cpu_ptr(twd_evt) && !IS_ERR(twd_clk))
> +	if (twd_evt && raw_cpu_ptr(twd_evt) && !IS_ERR(twd_clk))
>  		return clk_notifier_register(twd_clk, &twd_clk_nb);
>  
>  	return 0;
> @@ -153,7 +153,7 @@ static void twd_update_frequency(void *d
>  {
>  	twd_timer_rate = clk_get_rate(twd_clk);
>  
> -	clockevents_update_freq(__this_cpu_ptr(twd_evt), twd_timer_rate);
> +	clockevents_update_freq(raw_cpu_ptr(twd_evt), twd_timer_rate);
>  }
>  
>  static int twd_cpufreq_transition(struct notifier_block *nb,
> @@ -179,7 +179,7 @@ static struct notifier_block twd_cpufreq
>  
>  static int twd_cpufreq_init(void)
>  {
> -	if (twd_evt && __this_cpu_ptr(twd_evt) && !IS_ERR(twd_clk))
> +	if (twd_evt && raw_cpu_ptr(twd_evt) && !IS_ERR(twd_clk))
>  		return cpufreq_register_notifier(&twd_cpufreq_nb,
>  			CPUFREQ_TRANSITION_NOTIFIER);
>  
> @@ -269,7 +269,7 @@ static void twd_get_clock(struct device_
>   */
>  static void twd_timer_setup(void)
>  {
> -	struct clock_event_device *clk = __this_cpu_ptr(twd_evt);
> +	struct clock_event_device *clk = raw_cpu_ptr(twd_evt);
>  	int cpu = smp_processor_id();
>  
>  	/*
> 
> 

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 01/35] [PATCH 02/36] kernel misc: Replace __get_cpu_var uses
  2014-08-17 17:30 ` [PATCH 01/35] [PATCH 02/36] kernel misc: Replace __get_cpu_var uses Christoph Lameter
@ 2014-08-26 18:04   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:04 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, akpm

On Sun, Aug 17, 2014 at 12:30:24PM -0500, Christoph Lameter wrote:
> Replace uses of __get_cpu_var for address calculation with this_cpu_ptr.
> 
> Cc: akpm@linux-foundation.org
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 02/35] [PATCH 03/36] time: Replace __get_cpu_var uses
  2014-08-17 17:30 ` [PATCH 02/35] [PATCH 03/36] time: " Christoph Lameter
@ 2014-08-26 18:05   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:05 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

On Sun, Aug 17, 2014 at 12:30:25PM -0500, Christoph Lameter wrote:
> Convert uses of __get_cpu_var for creating a address from a percpu
> offset to this_cpu_ptr.
> 
> The two cases where get_cpu_var is used to actually access a percpu
> variable are changed to use this_cpu_read/raw_cpu_read.
> 
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 03/35] time: Convert a bunch of &__get_cpu_var introduced in the 3.16 merge period
  2014-08-17 17:30 ` [PATCH 03/35] time: Convert a bunch of &__get_cpu_var introduced in the 3.16 merge period Christoph Lameter
@ 2014-08-26 18:05   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:05 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

On Sun, Aug 17, 2014 at 12:30:26PM -0500, Christoph Lameter wrote:
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 04/35] [PATCH 04/36] scheduler: Replace __get_cpu_var with this_cpu_ptr
  2014-08-17 17:30 ` [PATCH 04/35] [PATCH 04/36] scheduler: Replace __get_cpu_var with this_cpu_ptr Christoph Lameter
@ 2014-08-26 18:05   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:05 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

On Sun, Aug 17, 2014 at 12:30:27PM -0500, Christoph Lameter wrote:
> Convert all uses of __get_cpu_var for address calculation to use
> this_cpu_ptr instead.
> 
> [Uses of __get_cpu_var with cpumask_var_t are no longer
> handled by this patch]
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Acked-by: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 05/35] [PATCH 05/36] block: Replace __this_cpu_ptr with raw_cpu_ptr
  2014-08-17 17:30 ` [PATCH 05/35] [PATCH 05/36] block: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
@ 2014-08-26 18:06   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:06 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Jens Axboe

On Sun, Aug 17, 2014 at 12:30:28PM -0500, Christoph Lameter wrote:
> __this_cpu_ptr is being phased out use raw_cpu_ptr instead which was
> introduced in 3.15-rc1.
> 
> Cc: Jens Axboe <axboe@kernel.dk>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 06/35] [PATCH 06/36] drivers/char/random: Replace __get_cpu_var uses
  2014-08-17 17:30 ` [PATCH 06/35] [PATCH 06/36] drivers/char/random: Replace __get_cpu_var uses Christoph Lameter
@ 2014-08-26 18:07   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:07 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Arnd Bergmann, Greg Kroah-Hartman

On Sun, Aug 17, 2014 at 12:30:29PM -0500, Christoph Lameter wrote:
> A single case of using __get_cpu_var for address calculation.
> 
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 07/35] [PATCH 07/36] drivers/cpuidle: Replace __get_cpu_var uses for address calculation
  2014-08-17 17:30 ` [PATCH 07/35] [PATCH 07/36] drivers/cpuidle: Replace __get_cpu_var uses for address calculation Christoph Lameter
@ 2014-08-26 18:07   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:07 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Daniel Lezcano, linux-pm, Rafael J. Wysocki

On Sun, Aug 17, 2014 at 12:30:30PM -0500, Christoph Lameter wrote:
> All of these are for address calculation. Replace with
> this_cpu_ptr().
> 
> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> Cc: linux-pm@vger.kernel.org
> Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
> [cpufreq changes]
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 08/35] [PATCH 08/36] drivers/oprofile: Replace __get_cpu_var uses for address calculation
  2014-08-17 17:30 ` [PATCH 08/35] [PATCH 08/36] drivers/oprofile: " Christoph Lameter
@ 2014-08-26 18:08   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:08 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Robert Richter, oprofile-list

On Sun, Aug 17, 2014 at 12:30:31PM -0500, Christoph Lameter wrote:
> Replace the uses of __get_cpu_var for address calculation with this_cpu_ptr.
> 
> Cc: Robert Richter <rric@kernel.org>
> Cc: oprofile-list@lists.sf.net
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 09/35] [PATCH 09/36] drivers/clocksource: Replace __get_cpu_var used for address calculation
  2014-08-17 17:30 ` [PATCH 09/35] [PATCH 09/36] drivers/clocksource: Replace __get_cpu_var used " Christoph Lameter
@ 2014-08-26 18:08   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:08 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, James Hogan

On Sun, Aug 17, 2014 at 12:30:32PM -0500, Christoph Lameter wrote:
> Replace __get_cpu_var used for address calculation with this_cpu_ptr.
> 
> Acked-by: James Hogan <james.hogan@imgtec.com>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 10/35] [PATCH 10/36] drivers/net/ethernet/tile: Replace __get_cpu_var uses for address calculation
  2014-08-17 17:30 ` [PATCH 10/35] [PATCH 10/36] drivers/net/ethernet/tile: Replace __get_cpu_var uses " Christoph Lameter
@ 2014-08-26 18:08   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:08 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Chris Metcalf

On Sun, Aug 17, 2014 at 12:30:33PM -0500, Christoph Lameter wrote:
> Replace with this_cpu_ptr.
> 
> Acked-by: Chris Metcalf <cmetcalf@tilera.com>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 11/35] [PATCH 11/36] watchdog: Replace __raw_get_cpu_var uses
  2014-08-17 17:30 ` [PATCH 11/35] [PATCH 11/36] watchdog: Replace __raw_get_cpu_var uses Christoph Lameter
@ 2014-08-26 18:08   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:08 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Wim Van Sebroeck, linux-watchdog

On Sun, Aug 17, 2014 at 12:30:34PM -0500, Christoph Lameter wrote:
> Most of these are the uses of &__raw_get_cpu_var for address calculation.
> 
> touch_softlockup_watchdog_sync() uses __raw_get_cpu_var to write to
> per cpu variables. Use __this_cpu_write instead.
> 
> Cc: Wim Van Sebroeck <wim@iguana.be>
> Cc: linux-watchdog@vger.kernel.org
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 12/35] [PATCH 12/36] net: Replace get_cpu_var through this_cpu_ptr
  2014-08-17 17:30 ` [PATCH 12/35] [PATCH 12/36] net: Replace get_cpu_var through this_cpu_ptr Christoph Lameter
@ 2014-08-26 18:09   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:09 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, netdev, Eric Dumazet, David S. Miller

On Sun, Aug 17, 2014 at 12:30:35PM -0500, Christoph Lameter wrote:
> Replace uses of get_cpu_var for address calculation through this_cpu_ptr.
> 
> Cc: netdev@vger.kernel.org
> Cc: Eric Dumazet <edumazet@google.com>
> Acked-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 13/35] [PATCH 13/36] md: Replace __this_cpu_ptr with raw_cpu_ptr
  2014-08-17 17:30 ` [PATCH 13/35] [PATCH 13/36] md: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
@ 2014-08-26 18:09   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:09 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

On Sun, Aug 17, 2014 at 12:30:36PM -0500, Christoph Lameter wrote:
> __this_cpu_ptr is being phased out.
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 14/35] [PATCH 14/36] metag: Replace __get_cpu_var uses for address calculation
  2014-08-17 17:30 ` [PATCH 14/35] [PATCH 14/36] metag: Replace __get_cpu_var uses for address calculation Christoph Lameter
@ 2014-08-26 18:09   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:09 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, James Hogan

On Sun, Aug 17, 2014 at 12:30:37PM -0500, Christoph Lameter wrote:
> Replace __get_cpu_var uses for address calculation with this_cpu_ptr().
> 
> Acked-by: James Hogan <james.hogan@imgtec.com>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 15/35] [PATCH 15/36] drivers/net/ethernet/tile: __get_cpu_var call introduced in 3.14
  2014-08-17 17:30 ` [PATCH 15/35] [PATCH 15/36] drivers/net/ethernet/tile: __get_cpu_var call introduced in 3.14 Christoph Lameter
@ 2014-08-26 18:09   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:09 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

On Sun, Aug 17, 2014 at 12:30:38PM -0500, Christoph Lameter wrote:
> Another case was merged for 3.14-rc1
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 16/35] [PATCH 16/36] irqchips: Replace __this_cpu_ptr uses
  2014-08-17 17:30 ` [PATCH 16/35] [PATCH 16/36] irqchips: Replace __this_cpu_ptr uses Christoph Lameter
@ 2014-08-26 18:10   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:10 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, nicolas.pitre, Russell King

On Sun, Aug 17, 2014 at 12:30:39PM -0500, Christoph Lameter wrote:
> [ARM specific]
> 
> These are generally replaced with raw_cpu_ptr. However, in
> gic_get_percpu_base() we immediately dereference the pointer. This is
> equivalent to a raw_cpu_read. So use that operation there.
> 
> Cc: nicolas.pitre@linaro.org
> Cc: Russell King <rmk+kernel@arm.linux.org.uk>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 17/35] [PATCH 17/36] x86: Replace __get_cpu_var uses
  2014-08-17 17:30 ` [PATCH 17/35] [PATCH 17/36] x86: Replace __get_cpu_var uses Christoph Lameter
@ 2014-08-26 18:10   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:10 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, x86, H. Peter Anvin

On Sun, Aug 17, 2014 at 12:30:40PM -0500, Christoph Lameter wrote:
> __get_cpu_var() is used for multiple purposes in the kernel source. One of
> them is address calculation via the form &__get_cpu_var(x).  This calculates
> the address for the instance of the percpu variable of the current processor
> based on an offset.
> 
> Other use cases are for storing and retrieving data from the current
> processors percpu area.  __get_cpu_var() can be used as an lvalue when
> writing data or on the right side of an assignment.
...
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: x86@kernel.org
> Acked-by: H. Peter Anvin <hpa@linux.intel.com>
> Acked-by: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 18/35] [PATCH 18/36] uv: Replace __get_cpu_var
  2014-08-17 17:30 ` [PATCH 18/35] [PATCH 18/36] uv: Replace __get_cpu_var Christoph Lameter
@ 2014-08-26 18:11   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:11 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Hedi Berriche, Mike Travis, Dimitri Sivanich

On Sun, Aug 17, 2014 at 12:30:41PM -0500, Christoph Lameter wrote:
> Use __this_cpu_read instead.
> 
> Cc: Hedi Berriche <hedi@sgi.com>
> Cc: Mike Travis <travis@sgi.com>
> Cc: Dimitri Sivanich <sivanich@sgi.com>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 19/35] [PATCH 19/36] arm: Replace __this_cpu_ptr with raw_cpu_ptr
  2014-08-17 17:30 ` [PATCH 19/35] [PATCH 19/36] arm: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
  2014-08-22 17:48   ` Will Deacon
@ 2014-08-26 18:11   ` Tejun Heo
  2014-08-26 18:16     ` Will Deacon
  1 sibling, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:11 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Russell King, Catalin Marinas, Will Deacon

On Sun, Aug 17, 2014 at 12:30:42PM -0500, Christoph Lameter wrote:
> __this_cpu_ptr is being phased out. So replace with raw_cpu_ptr.
> 
> Cc: Russell King <linux@arm.linux.org.uk>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> CC: Will Deacon <will.deacon@arm.com>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops with Will's ack added.
Please let me know if this patch should be routed differently.  Note
that this patch was to be applied to percpu/for-3.17 but delayed due
to build issues caused by cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 27/35] [PATCH 27/36] tile: Replace __get_cpu_var uses
  2014-08-17 17:30 ` [PATCH 27/35] [PATCH 27/36] tile: " Christoph Lameter
@ 2014-08-26 18:11   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:11 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Chris Metcalf

On Sun, Aug 17, 2014 at 12:30:50PM -0500, Christoph Lameter wrote:
> __get_cpu_var() is used for multiple purposes in the kernel source. One of
> them is address calculation via the form &__get_cpu_var(x).  This calculates
> the address for the instance of the percpu variable of the current processor
> based on an offset.
> 
> Other use cases are for storing and retrieving data from the current
> processors percpu area.  __get_cpu_var() can be used as an lvalue when
> writing data or on the right side of an assignment.
...
> Acked-by: Chris Metcalf <cmetcalf@tilera.com>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 20/35] [PATCH 20/36] MIPS: Replace __get_cpu_var uses in FPU emulator.
  2014-08-17 17:30 ` [PATCH 20/35] [PATCH 20/36] MIPS: Replace __get_cpu_var uses in FPU emulator Christoph Lameter
@ 2014-08-26 18:12   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:12 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, David Daney

On Sun, Aug 17, 2014 at 12:30:43PM -0500, Christoph Lameter wrote:
> The use of __this_cpu_inc() requires a fundamental integer type, so
> change the type of all the counters to unsigned long, which is the
> same width they were before, but not wrapped in local_t.
> 
> Signed-off-by: David Daney <david.daney@cavium.com>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 21/35] [PATCH 21/36] mips: Replace __get_cpu_var uses
  2014-08-17 17:30 ` [PATCH 21/35] [PATCH 21/36] mips: Replace __get_cpu_var uses Christoph Lameter
@ 2014-08-26 18:12   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:12 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Ralf Baechle

On Sun, Aug 17, 2014 at 12:30:44PM -0500, Christoph Lameter wrote:
> __get_cpu_var() is used for multiple purposes in the kernel source. One of
> them is address calculation via the form &__get_cpu_var(x).  This calculates
> the address for the instance of the percpu variable of the current processor
> based on an offset.
> 
> Other use cases are for storing and retrieving data from the current
> processors percpu area.  __get_cpu_var() can be used as an lvalue when
> writing data or on the right side of an assignment.
...
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 22/35] [PATCH 22/36] s390: Replace __get_cpu_var uses
  2014-08-17 17:30 ` [PATCH 22/35] [PATCH 22/36] s390: " Christoph Lameter
@ 2014-08-26 18:13   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:13 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Martin Schwidefsky, linux390, Heiko Carstens

On Sun, Aug 17, 2014 at 12:30:45PM -0500, Christoph Lameter wrote:
> __get_cpu_var() is used for multiple purposes in the kernel source. One of
> them is address calculation via the form &__get_cpu_var(x).  This calculates
> the address for the instance of the percpu variable of the current processor
> based on an offset.
> 
> Other use cases are for storing and retrieving data from the current
> processors percpu area.  __get_cpu_var() can be used as an lvalue when
> writing data or on the right side of an assignment.
...
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> CC: linux390@de.ibm.com
> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 23/35] [PATCH 23/36] s390: cio driver &__get_cpu_var replacements
  2014-08-17 17:30 ` [PATCH 23/35] [PATCH 23/36] s390: cio driver &__get_cpu_var replacements Christoph Lameter
@ 2014-08-26 18:13   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:13 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

On Sun, Aug 17, 2014 at 12:30:46PM -0500, Christoph Lameter wrote:
> Use this_cpu_ptr() instead of &__get_cpu_var()
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 24/35] [PATCH 24/36] ia64: Replace __get_cpu_var uses
  2014-08-17 17:30   ` Christoph Lameter
@ 2014-08-26 18:13     ` Tejun Heo
  -1 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:13 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Tony Luck, Fenghua Yu, linux-ia64

On Sun, Aug 17, 2014 at 12:30:47PM -0500, Christoph Lameter wrote:
> __get_cpu_var() is used for multiple purposes in the kernel source. One of
> them is address calculation via the form &__get_cpu_var(x).  This calculates
> the address for the instance of the percpu variable of the current processor
> based on an offset.
> 
> Other use cases are for storing and retrieving data from the current
> processors percpu area.  __get_cpu_var() can be used as an lvalue when
> writing data or on the right side of an assignment.
...
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: linux-ia64@vger.kernel.org
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 24/35] [PATCH 24/36] ia64: Replace __get_cpu_var uses
@ 2014-08-26 18:13     ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:13 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Tony Luck, Fenghua Yu, linux-ia64

On Sun, Aug 17, 2014 at 12:30:47PM -0500, Christoph Lameter wrote:
> __get_cpu_var() is used for multiple purposes in the kernel source. One of
> them is address calculation via the form &__get_cpu_var(x).  This calculates
> the address for the instance of the percpu variable of the current processor
> based on an offset.
> 
> Other use cases are for storing and retrieving data from the current
> processors percpu area.  __get_cpu_var() can be used as an lvalue when
> writing data or on the right side of an assignment.
...
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: linux-ia64@vger.kernel.org
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 25/35] [PATCH 25/36] alpha: Replace __get_cpu_var
  2014-08-17 17:30 ` [PATCH 25/35] [PATCH 25/36] alpha: Replace __get_cpu_var Christoph Lameter
@ 2014-08-26 18:14   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:14 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Ivan Kokshaysky, Matt Turner, Richard Henderson

On Sun, Aug 17, 2014 at 12:30:48PM -0500, Christoph Lameter wrote:
> __get_cpu_var() is used for multiple purposes in the kernel source. One of
> them is address calculation via the form &__get_cpu_var(x).  This calculates
> the address for the instance of the percpu variable of the current processor
> based on an offset.
> 
> Other use cases are for storing and retrieving data from the current
> processors percpu area.  __get_cpu_var() can be used as an lvalue when
> writing data or on the right side of an assignment.
...
> CC: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
> Cc: Matt Turner <mattst88@gmail.com>
> Acked-by: Richard Henderson <rth@twiddle.net>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 26/35] [PATCH 26/36] powerpc: Replace __get_cpu_var uses
  2014-08-17 17:30 ` [PATCH 26/35] [PATCH 26/36] powerpc: Replace __get_cpu_var uses Christoph Lameter
  2014-08-18  2:45   ` Christoph Lameter
@ 2014-08-26 18:14   ` Tejun Heo
  2014-12-16 22:07     ` Alexander Graf
  1 sibling, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:14 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Benjamin Herrenschmidt, Paul Mackerras

On Sun, Aug 17, 2014 at 12:30:49PM -0500, Christoph Lameter wrote:
> __get_cpu_var() is used for multiple purposes in the kernel source. One of
> them is address calculation via the form &__get_cpu_var(x).  This calculates
> the address for the instance of the percpu variable of the current processor
> based on an offset.
> 
> Other use cases are for storing and retrieving data from the current
> processors percpu area.  __get_cpu_var() can be used as an lvalue when
> writing data or on the right side of an assignment.
...
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> CC: Paul Mackerras <paulus@samba.org>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops with the fix up patch rolled
in.  Please let me know if this patch should be routed differently.
Note that this patch was to be applied to percpu/for-3.17 but delayed
due to build issues caused by cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 28/35] [PATCH 28/36] tile: Use this_cpu_ptr() for hardware counters
  2014-08-17 17:30 ` [PATCH 28/35] [PATCH 28/36] tile: Use this_cpu_ptr() for hardware counters Christoph Lameter
@ 2014-08-26 18:15   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:15 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

On Sun, Aug 17, 2014 at 12:30:51PM -0500, Christoph Lameter wrote:
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 29/35] [PATCH 29/36] blackfin: Replace __get_cpu_var uses
  2014-08-17 17:30 ` [PATCH 29/35] [PATCH 29/36] blackfin: Replace __get_cpu_var uses Christoph Lameter
@ 2014-08-26 18:15   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:15 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Mike Frysinger

On Sun, Aug 17, 2014 at 12:30:52PM -0500, Christoph Lameter wrote:
> __get_cpu_var() is used for multiple purposes in the kernel source. One of
> them is address calculation via the form &__get_cpu_var(x).  This calculates
> the address for the instance of the percpu variable of the current processor
> based on an offset.
> 
> Other use cases are for storing and retrieving data from the current
> processors percpu area.  __get_cpu_var() can be used as an lvalue when
> writing data or on the right side of an assignment.
...
> CC: Mike Frysinger <vapier@gentoo.org>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 30/35] [PATCH 30/36] avr32: Replace __get_cpu_var with __this_cpu_write
  2014-08-17 17:30 ` [PATCH 30/35] [PATCH 30/36] avr32: Replace __get_cpu_var with __this_cpu_write Christoph Lameter
@ 2014-08-26 18:15   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:15 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Haavard Skinnemoen, Hans-Christian Egtvedt

On Sun, Aug 17, 2014 at 12:30:53PM -0500, Christoph Lameter wrote:
> Replace the single use of __get_cpu_var in avr32 with
> __this_cpu_write.
> 
> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
> Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 19/35] [PATCH 19/36] arm: Replace __this_cpu_ptr with raw_cpu_ptr
  2014-08-26 18:11   ` Tejun Heo
@ 2014-08-26 18:16     ` Will Deacon
  2014-08-26 18:19       ` Tejun Heo
  0 siblings, 1 reply; 81+ messages in thread
From: Will Deacon @ 2014-08-26 18:16 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Christoph Lameter, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Russell King, Catalin Marinas

On Tue, Aug 26, 2014 at 07:11:31PM +0100, Tejun Heo wrote:
> On Sun, Aug 17, 2014 at 12:30:42PM -0500, Christoph Lameter wrote:
> > __this_cpu_ptr is being phased out. So replace with raw_cpu_ptr.
> > 
> > Cc: Russell King <linux@arm.linux.org.uk>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > CC: Will Deacon <will.deacon@arm.com>
> > Signed-off-by: Christoph Lameter <cl@linux.com>
> 
> (Please disregard the ones I posted for v1 of the patch series)
> 
> Applied to percpu/for-3.18-consistent-ops with Will's ack added.
> Please let me know if this patch should be routed differently.  Note
> that this patch was to be applied to percpu/for-3.17 but delayed due
> to build issues caused by cpumask_var_t.

I was going to send it via RMK, but I'm happy either way.

Will

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 31/35] [PATCH 31/36] sparc: Replace __get_cpu_var uses
  2014-08-17 17:30   ` Christoph Lameter
@ 2014-08-26 18:16     ` Tejun Heo
  -1 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:16 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, sparclinux, David S. Miller

On Sun, Aug 17, 2014 at 12:30:54PM -0500, Christoph Lameter wrote:
> __get_cpu_var() is used for multiple purposes in the kernel source. One of
> them is address calculation via the form &__get_cpu_var(x).  This calculates
> the address for the instance of the percpu variable of the current processor
> based on an offset.
> 
> Other use cases are for storing and retrieving data from the current
> processors percpu area.  __get_cpu_var() can be used as an lvalue when
> writing data or on the right side of an assignment.
...
> Cc: sparclinux@vger.kernel.org
> Acked-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 31/35] [PATCH 31/36] sparc: Replace __get_cpu_var uses
@ 2014-08-26 18:16     ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:16 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, sparclinux, David S. Miller

On Sun, Aug 17, 2014 at 12:30:54PM -0500, Christoph Lameter wrote:
> __get_cpu_var() is used for multiple purposes in the kernel source. One of
> them is address calculation via the form &__get_cpu_var(x).  This calculates
> the address for the instance of the percpu variable of the current processor
> based on an offset.
> 
> Other use cases are for storing and retrieving data from the current
> processors percpu area.  __get_cpu_var() can be used as an lvalue when
> writing data or on the right side of an assignment.
...
> Cc: sparclinux@vger.kernel.org
> Acked-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 32/35] [PATCH 32/36] clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
  2014-08-17 17:30 ` [PATCH 32/35] [PATCH 32/36] clocksource: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
@ 2014-08-26 18:16   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:16 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

On Sun, Aug 17, 2014 at 12:30:55PM -0500, Christoph Lameter wrote:
> One newly introduced __this_cpu_ptr should be raw_cpu_ptr.
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 33/35] [PATCH 34/36] percpu: Remove __this_cpu_ptr
  2014-08-17 17:30 ` [PATCH 33/35] [PATCH 34/36] percpu: Remove __this_cpu_ptr Christoph Lameter
@ 2014-08-26 18:16   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:16 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

On Sun, Aug 17, 2014 at 12:30:56PM -0500, Christoph Lameter wrote:
> The __this_cpu_ptr macro is no longer in use so drop it.
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>

(Please disregard the ones I posted for v1 of the patch series)

Applied to percpu/for-3.18-consistent-ops.  Please let me know if this
patch should be routed differently.  Note that this patch was to be
applied to percpu/for-3.17 but delayed due to build issues caused by
cpumask_var_t.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 00/35] percpu: Consistent per cpu operations V7
  2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
                   ` (34 preceding siblings ...)
  2014-08-17 17:30 ` [PATCH 35/35] [PATCH 33/36] Remove __get_cpu_var and __raw_get_cpu_var macros [only in 3.17] Christoph Lameter
@ 2014-08-26 18:17 ` Tejun Heo
  35 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:17 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

1-33 applied to percpu/for-3.18-consistent-ops.  Will wait for the
updated patches for the cpumask_var_t issue.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 19/35] [PATCH 19/36] arm: Replace __this_cpu_ptr with raw_cpu_ptr
  2014-08-26 18:16     ` Will Deacon
@ 2014-08-26 18:19       ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2014-08-26 18:19 UTC (permalink / raw)
  To: Will Deacon
  Cc: Christoph Lameter, akpm, rostedt, linux-kernel, Ingo Molnar,
	Peter Zijlstra, Thomas Gleixner, Russell King, Catalin Marinas

On Tue, Aug 26, 2014 at 07:16:14PM +0100, Will Deacon wrote:
> > Applied to percpu/for-3.18-consistent-ops with Will's ack added.
> > Please let me know if this patch should be routed differently.  Note
> > that this patch was to be applied to percpu/for-3.17 but delayed due
> > to build issues caused by cpumask_var_t.
> 
> I was going to send it via RMK, but I'm happy either way.

Yeah, I think it'd be easier to route this through percpu tree.  There
are too many of these and a dependent change follows afterwards.

Thanks!

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 26/35] [PATCH 26/36] powerpc: Replace __get_cpu_var uses
  2014-08-26 18:14   ` Tejun Heo
@ 2014-12-16 22:07     ` Alexander Graf
  2014-12-17  9:43       ` Paolo Bonzini
  0 siblings, 1 reply; 81+ messages in thread
From: Alexander Graf @ 2014-12-16 22:07 UTC (permalink / raw)
  To: Tejun Heo, Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Scott Wood, Paolo Bonzini

On 26.08.14 20:14, Tejun Heo wrote:
> On Sun, Aug 17, 2014 at 12:30:49PM -0500, Christoph Lameter wrote:
>> __get_cpu_var() is used for multiple purposes in the kernel source. One of
>> them is address calculation via the form &__get_cpu_var(x).  This calculates
>> the address for the instance of the percpu variable of the current processor
>> based on an offset.
>>
>> Other use cases are for storing and retrieving data from the current
>> processors percpu area.  __get_cpu_var() can be used as an lvalue when
>> writing data or on the right side of an assignment.
> ...
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> CC: Paul Mackerras <paulus@samba.org>
>> Signed-off-by: Christoph Lameter <cl@linux.com>
> 
> (Please disregard the ones I posted for v1 of the patch series)
> 
> Applied to percpu/for-3.18-consistent-ops with the fix up patch rolled
> in.  Please let me know if this patch should be routed differently.
> Note that this patch was to be applied to percpu/for-3.17 but delayed
> due to build issues caused by cpumask_var_t.

Unfortunately this commit breaks e500v2 builds with KVM enabled:

  arch/powerpc/kvm/e500.c: In function "local_sid_setup_one":
  arch/powerpc/kvm/e500.c:81:29: error: macro "__this_cpu_write"
requires 2 arguments, but only 1 given
  arch/powerpc/kvm/e500.c:81:3: error: unknown type name "__this_cpu_write"
  arch/powerpc/kvm/e500.c:81:47: error: expected "=", ",", ";", "asm" or
"__attribute__" before ")" token
  arch/powerpc/kvm/e500.c:82:8: error: request for member "val" in
something not a structure or union
  arch/powerpc/kvm/e500.c:83:8: error: request for member "pentry" in
something not a structure or union

Looking at the offending line of code I'm not surprised:

-               __get_cpu_var(pcpu_sids).entry[sid] = entry;
+               __this_cpu_write(pcpu_sids)entry[sid], entry);

This doesn't quite look like valid C to me. The patch below fixes the
build error for me. Please check whether it's correct and if so apply it
directly to the tree.


Alex


From: Alexander Graf <agraf@suse.de>
Date: Tue, 16 Dec 2014 23:04:01 +0100
Subject: [PATCH] KVM: PPC: E500: Compile fix in this_cpu_write

Commit 69111bac42f5 (powerpc: Replace __get_cpu_var uses) introduced compile
breakage to the e500 target by introducing invalid automatically created C
syntax.

Fix up the breakage and make the code compile again.

Signed-off-by: Alexander Graf <agraf@suse.de>

diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c
index 1609584..e1cb588 100644
--- a/arch/powerpc/kvm/e500.c
+++ b/arch/powerpc/kvm/e500.c
@@ -78,7 +78,7 @@ static inline int local_sid_setup_one(struct id *entry)

 	sid = __this_cpu_inc_return(pcpu_last_used_sid);
 	if (sid < NUM_TIDS) {
-		__this_cpu_write(pcpu_sids)entry[sid], entry);
+		__this_cpu_write(pcpu_sids.entry[sid], entry);
 		entry->val = sid;
 		entry->pentry = this_cpu_ptr(&pcpu_sids.entry[sid]);
 		ret = sid;

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* Re: [PATCH 26/35] [PATCH 26/36] powerpc: Replace __get_cpu_var uses
  2014-12-16 22:07     ` Alexander Graf
@ 2014-12-17  9:43       ` Paolo Bonzini
  2014-12-17 15:31         ` Christoph Lameter
  0 siblings, 1 reply; 81+ messages in thread
From: Paolo Bonzini @ 2014-12-17  9:43 UTC (permalink / raw)
  To: Alexander Graf, Tejun Heo, Christoph Lameter
  Cc: akpm, rostedt, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Scott Wood



On 16/12/2014 23:07, Alexander Graf wrote:
> On 26.08.14 20:14, Tejun Heo wrote:
>> On Sun, Aug 17, 2014 at 12:30:49PM -0500, Christoph Lameter wrote:
>>> __get_cpu_var() is used for multiple purposes in the kernel source. One of
>>> them is address calculation via the form &__get_cpu_var(x).  This calculates
>>> the address for the instance of the percpu variable of the current processor
>>> based on an offset.
>>>
>>> Other use cases are for storing and retrieving data from the current
>>> processors percpu area.  __get_cpu_var() can be used as an lvalue when
>>> writing data or on the right side of an assignment.
>> ...
>>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>> CC: Paul Mackerras <paulus@samba.org>
>>> Signed-off-by: Christoph Lameter <cl@linux.com>
>>
>> (Please disregard the ones I posted for v1 of the patch series)
>>
>> Applied to percpu/for-3.18-consistent-ops with the fix up patch rolled
>> in.  Please let me know if this patch should be routed differently.
>> Note that this patch was to be applied to percpu/for-3.17 but delayed
>> due to build issues caused by cpumask_var_t.
> 
> Unfortunately this commit breaks e500v2 builds with KVM enabled:
> 
>   arch/powerpc/kvm/e500.c: In function "local_sid_setup_one":
>   arch/powerpc/kvm/e500.c:81:29: error: macro "__this_cpu_write"
> requires 2 arguments, but only 1 given
>   arch/powerpc/kvm/e500.c:81:3: error: unknown type name "__this_cpu_write"
>   arch/powerpc/kvm/e500.c:81:47: error: expected "=", ",", ";", "asm" or
> "__attribute__" before ")" token
>   arch/powerpc/kvm/e500.c:82:8: error: request for member "val" in
> something not a structure or union
>   arch/powerpc/kvm/e500.c:83:8: error: request for member "pentry" in
> something not a structure or union
> 
> Looking at the offending line of code I'm not surprised:
> 
> -               __get_cpu_var(pcpu_sids).entry[sid] = entry;
> +               __this_cpu_write(pcpu_sids)entry[sid], entry);
> 
> This doesn't quite look like valid C to me. The patch below fixes the
> build error for me. Please check whether it's correct and if so apply it
> directly to the tree.

I can also ask Linus to squash it in the merge.

Paolo

> 
> 
> Alex
> 
> 
> From: Alexander Graf <agraf@suse.de>
> Date: Tue, 16 Dec 2014 23:04:01 +0100
> Subject: [PATCH] KVM: PPC: E500: Compile fix in this_cpu_write
> 
> Commit 69111bac42f5 (powerpc: Replace __get_cpu_var uses) introduced compile
> breakage to the e500 target by introducing invalid automatically created C
> syntax.
> 
> Fix up the breakage and make the code compile again.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>
> 
> diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c
> index 1609584..e1cb588 100644
> --- a/arch/powerpc/kvm/e500.c
> +++ b/arch/powerpc/kvm/e500.c
> @@ -78,7 +78,7 @@ static inline int local_sid_setup_one(struct id *entry)
> 
>  	sid = __this_cpu_inc_return(pcpu_last_used_sid);
>  	if (sid < NUM_TIDS) {
> -		__this_cpu_write(pcpu_sids)entry[sid], entry);
> +		__this_cpu_write(pcpu_sids.entry[sid], entry);
>  		entry->val = sid;
>  		entry->pentry = this_cpu_ptr(&pcpu_sids.entry[sid]);
>  		ret = sid;
> 

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 26/35] [PATCH 26/36] powerpc: Replace __get_cpu_var uses
  2014-12-17  9:43       ` Paolo Bonzini
@ 2014-12-17 15:31         ` Christoph Lameter
  0 siblings, 0 replies; 81+ messages in thread
From: Christoph Lameter @ 2014-12-17 15:31 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Alexander Graf, Tejun Heo, akpm, rostedt, linux-kernel,
	Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Scott Wood

On Wed, 17 Dec 2014, Paolo Bonzini wrote:

> I can also ask Linus to squash it in the merge.

Merge the fix please. Further patches were merged that remove
the __get_cpu_var macro. Reversion of the patch will cause more breakage.


^ permalink raw reply	[flat|nested] 81+ messages in thread

end of thread, other threads:[~2014-12-17 15:31 UTC | newest]

Thread overview: 81+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-17 17:30 [PATCH 00/35] percpu: Consistent per cpu operations V7 Christoph Lameter
2014-08-17 17:30 ` [PATCH 01/35] [PATCH 02/36] kernel misc: Replace __get_cpu_var uses Christoph Lameter
2014-08-26 18:04   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 02/35] [PATCH 03/36] time: " Christoph Lameter
2014-08-26 18:05   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 03/35] time: Convert a bunch of &__get_cpu_var introduced in the 3.16 merge period Christoph Lameter
2014-08-26 18:05   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 04/35] [PATCH 04/36] scheduler: Replace __get_cpu_var with this_cpu_ptr Christoph Lameter
2014-08-26 18:05   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 05/35] [PATCH 05/36] block: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
2014-08-26 18:06   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 06/35] [PATCH 06/36] drivers/char/random: Replace __get_cpu_var uses Christoph Lameter
2014-08-26 18:07   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 07/35] [PATCH 07/36] drivers/cpuidle: Replace __get_cpu_var uses for address calculation Christoph Lameter
2014-08-26 18:07   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 08/35] [PATCH 08/36] drivers/oprofile: " Christoph Lameter
2014-08-26 18:08   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 09/35] [PATCH 09/36] drivers/clocksource: Replace __get_cpu_var used " Christoph Lameter
2014-08-26 18:08   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 10/35] [PATCH 10/36] drivers/net/ethernet/tile: Replace __get_cpu_var uses " Christoph Lameter
2014-08-26 18:08   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 11/35] [PATCH 11/36] watchdog: Replace __raw_get_cpu_var uses Christoph Lameter
2014-08-26 18:08   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 12/35] [PATCH 12/36] net: Replace get_cpu_var through this_cpu_ptr Christoph Lameter
2014-08-26 18:09   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 13/35] [PATCH 13/36] md: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
2014-08-26 18:09   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 14/35] [PATCH 14/36] metag: Replace __get_cpu_var uses for address calculation Christoph Lameter
2014-08-26 18:09   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 15/35] [PATCH 15/36] drivers/net/ethernet/tile: __get_cpu_var call introduced in 3.14 Christoph Lameter
2014-08-26 18:09   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 16/35] [PATCH 16/36] irqchips: Replace __this_cpu_ptr uses Christoph Lameter
2014-08-26 18:10   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 17/35] [PATCH 17/36] x86: Replace __get_cpu_var uses Christoph Lameter
2014-08-26 18:10   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 18/35] [PATCH 18/36] uv: Replace __get_cpu_var Christoph Lameter
2014-08-26 18:11   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 19/35] [PATCH 19/36] arm: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
2014-08-22 17:48   ` Will Deacon
2014-08-26 18:11   ` Tejun Heo
2014-08-26 18:16     ` Will Deacon
2014-08-26 18:19       ` Tejun Heo
2014-08-17 17:30 ` [PATCH 20/35] [PATCH 20/36] MIPS: Replace __get_cpu_var uses in FPU emulator Christoph Lameter
2014-08-26 18:12   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 21/35] [PATCH 21/36] mips: Replace __get_cpu_var uses Christoph Lameter
2014-08-26 18:12   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 22/35] [PATCH 22/36] s390: " Christoph Lameter
2014-08-26 18:13   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 23/35] [PATCH 23/36] s390: cio driver &__get_cpu_var replacements Christoph Lameter
2014-08-26 18:13   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 24/35] [PATCH 24/36] ia64: Replace __get_cpu_var uses Christoph Lameter
2014-08-17 17:30   ` Christoph Lameter
2014-08-26 18:13   ` Tejun Heo
2014-08-26 18:13     ` Tejun Heo
2014-08-17 17:30 ` [PATCH 25/35] [PATCH 25/36] alpha: Replace __get_cpu_var Christoph Lameter
2014-08-26 18:14   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 26/35] [PATCH 26/36] powerpc: Replace __get_cpu_var uses Christoph Lameter
2014-08-18  2:45   ` Christoph Lameter
2014-08-26 18:14   ` Tejun Heo
2014-12-16 22:07     ` Alexander Graf
2014-12-17  9:43       ` Paolo Bonzini
2014-12-17 15:31         ` Christoph Lameter
2014-08-17 17:30 ` [PATCH 27/35] [PATCH 27/36] tile: " Christoph Lameter
2014-08-26 18:11   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 28/35] [PATCH 28/36] tile: Use this_cpu_ptr() for hardware counters Christoph Lameter
2014-08-26 18:15   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 29/35] [PATCH 29/36] blackfin: Replace __get_cpu_var uses Christoph Lameter
2014-08-26 18:15   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 30/35] [PATCH 30/36] avr32: Replace __get_cpu_var with __this_cpu_write Christoph Lameter
2014-08-26 18:15   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 31/35] [PATCH 31/36] sparc: Replace __get_cpu_var uses Christoph Lameter
2014-08-17 17:30   ` Christoph Lameter
2014-08-26 18:16   ` Tejun Heo
2014-08-26 18:16     ` Tejun Heo
2014-08-17 17:30 ` [PATCH 32/35] [PATCH 32/36] clocksource: Replace __this_cpu_ptr with raw_cpu_ptr Christoph Lameter
2014-08-26 18:16   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 33/35] [PATCH 34/36] percpu: Remove __this_cpu_ptr Christoph Lameter
2014-08-26 18:16   ` Tejun Heo
2014-08-17 17:30 ` [PATCH 34/35] [PATCH 01/36] __get_cpu_var/cpumask_var_t: Resolve ambiguities Christoph Lameter
2014-08-17 17:30 ` [PATCH 35/35] [PATCH 33/36] Remove __get_cpu_var and __raw_get_cpu_var macros [only in 3.17] Christoph Lameter
2014-08-26 18:17 ` [PATCH 00/35] percpu: Consistent per cpu operations V7 Tejun Heo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.