All of lore.kernel.org
* [patch 00/13] sched/treewide: Clean up various racy task affinity issues
@ 2017-04-12 20:07 Thomas Gleixner
  2017-04-12 20:07 ` [patch 01/13] ia64/topology: Remove cpus_allowed manipulation Thomas Gleixner
                   ` (13 more replies)
  0 siblings, 14 replies; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-12 20:07 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar

While dealing with the fallout of the scheduler cleanups on RT, we found
several racy usage sites of the following scheme:

	cpumask_copy(&save_cpus_allowed, &current->cpus_allowed);
	set_cpus_allowed_ptr(current, cpumask_of(cpu));
	do_stuff();
	set_cpus_allowed_ptr(current, &save_cpus_allowed);

That's racy in two aspects:

1) Nothing prevents the CPU from being unplugged after the temporary
   affinity setting is in place. This results in code being executed on the
   wrong CPU(s).

2) Nothing prevents a concurrent affinity setting from user space. That
   also results in code being executed on the wrong CPU(s) and the restore
   of the previous affinity setting overwrites the new one.

Various variants of cleanups:

 - Removal, because the calling thread is already guaranteed to run on the
   correct CPU.

 - Conversion to smp function calls (simple register read/write)

 - Conversion to work_on_cpu(). There were even files containing comments
   to that effect.

 - The rest needs separate hotplug protection for work_on_cpu(). To avoid open
   coding the

	get_online_cpus();
	if (cpu_online(cpu))
		ret = do_stuff();
	else
		ret = -ENODEV;
	put_online_cpus();

   scheme, this series provides a new helper function, work_on_cpu_safe(),
   which implements the above.

Aside from fixing these races, this allows a follow-up series to restrict
access to current->cpus_allowed.

Thanks,

	tglx
---
 arch/ia64/kernel/salinfo.c           |   31 ++++-------
 arch/ia64/kernel/topology.c          |    6 --
 arch/ia64/sn/kernel/sn2/sn_hwperf.c  |   16 +++---
 arch/powerpc/kernel/smp.c            |   26 ++++------
 arch/sparc/kernel/sysfs.c            |   36 +++----------
 drivers/acpi/processor_driver.c      |   10 +++
 drivers/acpi/processor_throttling.c  |   31 +++++------
 drivers/cpufreq/ia64-acpi-cpufreq.c  |   91 ++++++++++++++---------------------
 drivers/cpufreq/sh-cpufreq.c         |   45 ++++++++++-------
 drivers/cpufreq/sparc-us2e-cpufreq.c |   45 ++++++++---------
 drivers/cpufreq/sparc-us3-cpufreq.c  |   46 ++++++-----------
 drivers/crypto/n2_core.c             |   31 ++++++-----
 include/linux/workqueue.h            |    5 +
 kernel/workqueue.c                   |   23 ++++++++
 14 files changed, 209 insertions(+), 233 deletions(-)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [patch 01/13] ia64/topology: Remove cpus_allowed manipulation
  2017-04-12 20:07 [patch 00/13] sched/treewide: Clean up various racy task affinity issues Thomas Gleixner
@ 2017-04-12 20:07 ` Thomas Gleixner
  2017-04-15 14:15   ` [tip:sched/core] " tip-bot for Thomas Gleixner
  2017-04-12 20:07 ` [patch 02/13] workqueue: Provide work_on_cpu_safe() Thomas Gleixner
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-12 20:07 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, linux-ia64

[-- Attachment #1: ia64-topology--Remove-cpus_allowed-hackery.patch --]
[-- Type: text/plain, Size: 964 bytes --]

The CPU hotplug callback fiddles with the cpus_allowed pointer to pin the
calling thread on the plugged CPU. That's already guaranteed by the hotplug
core code.

Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64@vger.kernel.org
---
 arch/ia64/kernel/topology.c |    6 ------
 1 file changed, 6 deletions(-)

--- a/arch/ia64/kernel/topology.c
+++ b/arch/ia64/kernel/topology.c
@@ -355,18 +355,12 @@ static int cache_add_dev(unsigned int cp
 	unsigned long i, j;
 	struct cache_info *this_object;
 	int retval = 0;
-	cpumask_t oldmask;
 
 	if (all_cpu_cache_info[cpu].kobj.parent)
 		return 0;
 
-	oldmask = current->cpus_allowed;
-	retval = set_cpus_allowed_ptr(current, cpumask_of(cpu));
-	if (unlikely(retval))
-		return retval;
 
 	retval = cpu_cache_sysfs_init(cpu);
-	set_cpus_allowed_ptr(current, &oldmask);
 	if (unlikely(retval < 0))
 		return retval;
 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [patch 02/13] workqueue: Provide work_on_cpu_safe()
  2017-04-12 20:07 [patch 00/13] sched/treewide: Clean up various racy task affinity issues Thomas Gleixner
  2017-04-12 20:07 ` [patch 01/13] ia64/topology: Remove cpus_allowed manipulation Thomas Gleixner
@ 2017-04-12 20:07 ` Thomas Gleixner
  2017-04-13 11:11   ` Dou Liyang
                     ` (3 more replies)
  2017-04-12 20:07 ` [patch 03/13] ia64/salinfo: Replace racy task affinity logic Thomas Gleixner
                   ` (11 subsequent siblings)
  13 siblings, 4 replies; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-12 20:07 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar

[-- Attachment #1: workqueue--Provide-work_on_cpu_safe--.patch --]
[-- Type: text/plain, Size: 1942 bytes --]

work_on_cpu() is not protected against CPU hotplug. Code which must either
be executed on an online CPU or fail if the CPU is not available would have
to protect the call site against CPU hotplug itself.

Provide a function which does get/put_online_cpus() around the call to
work_on_cpu() and fails the call with -ENODEV if the target CPU is not
online.

Preparatory patch to convert several racy task affinity manipulations.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
---
 include/linux/workqueue.h |    5 +++++
 kernel/workqueue.c        |   23 +++++++++++++++++++++++
 2 files changed, 28 insertions(+)

--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -608,8 +608,13 @@ static inline long work_on_cpu(int cpu,
 {
 	return fn(arg);
 }
+static inline long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg)
+{
+	return fn(arg);
+}
 #else
 long work_on_cpu(int cpu, long (*fn)(void *), void *arg);
+long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg);
 #endif /* CONFIG_SMP */
 
 #ifdef CONFIG_FREEZER
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4735,6 +4735,29 @@ long work_on_cpu(int cpu, long (*fn)(voi
 	return wfc.ret;
 }
 EXPORT_SYMBOL_GPL(work_on_cpu);
+
+/**
+ * work_on_cpu_safe - run a function in thread context on a particular cpu
+ * @cpu: the cpu to run on
+ * @fn:  the function to run
+ * @arg: the function argument
+ *
+ * Disables CPU hotplug and calls work_on_cpu(). The caller must not hold
+ * any locks which would prevent @fn from completing.
+ *
+ * Return: The value @fn returns or -ENODEV if @cpu is not online.
+ */
+long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg)
+{
+	long ret = -ENODEV;
+
+	get_online_cpus();
+	if (cpu_online(cpu))
+		ret = work_on_cpu(cpu, fn, arg);
+	put_online_cpus();
+	return ret;
+}
+EXPORT_SYMBOL_GPL(work_on_cpu_safe);
 #endif /* CONFIG_SMP */
 
 #ifdef CONFIG_FREEZER

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [patch 03/13] ia64/salinfo: Replace racy task affinity logic
  2017-04-12 20:07 [patch 00/13] sched/treewide: Clean up various racy task affinity issues Thomas Gleixner
  2017-04-12 20:07 ` [patch 01/13] ia64/topology: Remove cpus_allowed manipulation Thomas Gleixner
  2017-04-12 20:07 ` [patch 02/13] workqueue: Provide work_on_cpu_safe() Thomas Gleixner
@ 2017-04-12 20:07 ` Thomas Gleixner
  2017-04-15 14:17   ` [tip:sched/core] " tip-bot for Thomas Gleixner
  2017-04-12 20:07 ` [patch 04/13] ia64/sn/hwperf: " Thomas Gleixner
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-12 20:07 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, linux-ia64

[-- Attachment #1: ia64-salinfo--Use-work_on_cpu_safe--.patch --]
[-- Type: text/plain, Size: 3516 bytes --]

Some of the file operations in /proc/sal require running code on the
requested CPU. This is achieved by temporarily setting the affinity of the
calling user space thread to the requested CPU and resetting it to the
original affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread, resulting in code executing on the wrong CPU and in the restore
overwriting the new affinity setting.

Replace it with work_on_cpu_safe(), which guarantees to run the code on the
requested CPU or to fail if the CPU is offline.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64@vger.kernel.org
---
 arch/ia64/kernel/salinfo.c |   31 ++++++++++++-------------------
 1 file changed, 12 insertions(+), 19 deletions(-)

--- a/arch/ia64/kernel/salinfo.c
+++ b/arch/ia64/kernel/salinfo.c
@@ -179,14 +179,14 @@ struct salinfo_platform_oemdata_parms {
 	const u8 *efi_guid;
 	u8 **oemdata;
 	u64 *oemdata_size;
-	int ret;
 };
 
-static void
+static long
 salinfo_platform_oemdata_cpu(void *context)
 {
 	struct salinfo_platform_oemdata_parms *parms = context;
-	parms->ret = salinfo_platform_oemdata(parms->efi_guid, parms->oemdata, parms->oemdata_size);
+
+	return salinfo_platform_oemdata(parms->efi_guid, parms->oemdata, parms->oemdata_size);
 }
 
 static void
@@ -380,16 +380,7 @@ salinfo_log_release(struct inode *inode,
 	return 0;
 }
 
-static void
-call_on_cpu(int cpu, void (*fn)(void *), void *arg)
-{
-	cpumask_t save_cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
-	(*fn)(arg);
-	set_cpus_allowed_ptr(current, &save_cpus_allowed);
-}
-
-static void
+static long
 salinfo_log_read_cpu(void *context)
 {
 	struct salinfo_data *data = context;
@@ -399,6 +390,7 @@ salinfo_log_read_cpu(void *context)
 	/* Clear corrected errors as they are read from SAL */
 	if (rh->severity == sal_log_severity_corrected)
 		ia64_sal_clear_state_info(data->type);
+	return 0;
 }
 
 static void
@@ -430,7 +422,7 @@ salinfo_log_new_read(int cpu, struct sal
 	spin_unlock_irqrestore(&data_saved_lock, flags);
 
 	if (!data->saved_num)
-		call_on_cpu(cpu, salinfo_log_read_cpu, data);
+		work_on_cpu_safe(cpu, salinfo_log_read_cpu, data);
 	if (!data->log_size) {
 		data->state = STATE_NO_DATA;
 		cpumask_clear_cpu(cpu, &data->cpu_event);
@@ -459,11 +451,13 @@ salinfo_log_read(struct file *file, char
 	return simple_read_from_buffer(buffer, count, ppos, buf, bufsize);
 }
 
-static void
+static long
 salinfo_log_clear_cpu(void *context)
 {
 	struct salinfo_data *data = context;
+
 	ia64_sal_clear_state_info(data->type);
+	return 0;
 }
 
 static int
@@ -486,7 +480,7 @@ salinfo_log_clear(struct salinfo_data *d
 	rh = (sal_log_record_header_t *)(data->log_buffer);
 	/* Corrected errors have already been cleared from SAL */
 	if (rh->severity != sal_log_severity_corrected)
-		call_on_cpu(cpu, salinfo_log_clear_cpu, data);
+		work_on_cpu_safe(cpu, salinfo_log_clear_cpu, data);
 	/* clearing a record may make a new record visible */
 	salinfo_log_new_read(cpu, data);
 	if (data->state == STATE_LOG_RECORD) {
@@ -531,9 +525,8 @@ salinfo_log_write(struct file *file, con
 				.oemdata = &data->oemdata,
 				.oemdata_size = &data->oemdata_size
 			};
-			call_on_cpu(cpu, salinfo_platform_oemdata_cpu, &parms);
-			if (parms.ret)
-				count = parms.ret;
+			count = work_on_cpu_safe(cpu, salinfo_platform_oemdata_cpu,
+						 &parms);
 		} else
 			data->oemdata_size = 0;
 	} else

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [patch 04/13] ia64/sn/hwperf: Replace racy task affinity logic
  2017-04-12 20:07 [patch 00/13] sched/treewide: Clean up various racy task affinity issues Thomas Gleixner
                   ` (2 preceding siblings ...)
  2017-04-12 20:07 ` [patch 03/13] ia64/salinfo: Replace racy task affinity logic Thomas Gleixner
@ 2017-04-12 20:07 ` Thomas Gleixner
  2017-04-12 20:53   ` [patch V 2 " Thomas Gleixner
  2017-04-12 20:07   ` Thomas Gleixner
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-12 20:07 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, linux-ia64

[-- Attachment #1: ia64-sn-hwperf--Use-work_on_cpu_safe--.patch --]
[-- Type: text/plain, Size: 1724 bytes --]

sn_hwperf_op_cpu(), which is invoked from an ioctl, requires running code
on the requested CPU. This is achieved by temporarily setting the affinity
of the calling user space thread to the requested CPU and resetting it to
the original affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread, resulting in code executing on the wrong CPU and in the restore
overwriting the new affinity setting.

Replace it with work_on_cpu_safe(), which guarantees to run the code on the
requested CPU or to fail if the CPU is offline.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64@vger.kernel.org
---
 arch/ia64/sn/kernel/sn2/sn_hwperf.c |   16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

--- a/arch/ia64/sn/kernel/sn2/sn_hwperf.c
+++ b/arch/ia64/sn/kernel/sn2/sn_hwperf.c
@@ -598,6 +598,12 @@ static void sn_hwperf_call_sal(void *inf
 	op_info->ret = r;
 }
 
+static long sn_hwperf_call_sal_work(void *info)
+{
+	sn_hwperf_call_sal(info);
+	return 0;
+}
+
 static int sn_hwperf_op_cpu(struct sn_hwperf_op_info *op_info)
 {
 	u32 cpu;
@@ -629,13 +635,9 @@ static int sn_hwperf_op_cpu(struct sn_hw
 			/* use an interprocessor interrupt to call SAL */
 			smp_call_function_single(cpu, sn_hwperf_call_sal,
 				op_info, 1);
-		}
-		else {
-			/* migrate the task before calling SAL */ 
-			save_allowed = current->cpus_allowed;
-			set_cpus_allowed_ptr(current, cpumask_of(cpu));
-			sn_hwperf_call_sal(op_info);
-			set_cpus_allowed_ptr(current, &save_allowed);
+		} else {
+			/* Call on the target CPU */
+			work_on_cpu_safe(cpu, sn_hwperf_call_sal_work, op_info);
 		}
 	}
 	r = op_info->ret;

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [patch 05/13] powerpc/smp: Replace open coded task affinity logic
  2017-04-12 20:07 [patch 00/13] sched/treewide: Clean up various racy task affinity issues Thomas Gleixner
@ 2017-04-12 20:07   ` Thomas Gleixner
  2017-04-12 20:07 ` [patch 02/13] workqueue: Provide work_on_cpu_safe() Thomas Gleixner
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-12 20:07 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, Paul Mackerras, linuxppc-dev

[-- Attachment #1: powerpc-smp--Use-work-function-instead-of-affinity-games.patch --]
[-- Type: text/plain, Size: 2437 bytes --]

The init task invokes smp_ops->setup_cpu() from smp_cpus_done(). At this
point the init task can run on any online CPU, but the setup_cpu() callback
must be invoked on the boot CPU. This is achieved by temporarily setting
the affinity of the calling thread to the boot CPU and resetting it to the
original affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread, resulting in code executing on the wrong CPU and in the restore
overwriting the new affinity setting.

That's actually not a problem in this context as neither CPU hotplug nor
affinity settings can happen at this point, but access to
task_struct::cpus_allowed is about to be restricted.

Replace it with a call to work_on_cpu_safe() which achieves the same result.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/smp.c |   26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -787,24 +787,21 @@ static struct sched_domain_topology_leve
 	{ NULL, },
 };
 
-void __init smp_cpus_done(unsigned int max_cpus)
+static __init long smp_setup_cpu_workfn(void *data __always_unused)
 {
-	cpumask_var_t old_mask;
+	smp_ops->setup_cpu(boot_cpuid);
+	return 0;
+}
 
-	/* We want the setup_cpu() here to be called from CPU 0, but our
-	 * init thread may have been "borrowed" by another CPU in the meantime
-	 * se we pin us down to CPU 0 for a short while
+void __init smp_cpus_done(unsigned int max_cpus)
+{
+	/*
+	 * We want the setup_cpu() here to be called on the boot CPU, but
+	 * init might run on any CPU, so make sure it's invoked on the boot
+	 * CPU.
 	 */
-	alloc_cpumask_var(&old_mask, GFP_NOWAIT);
-	cpumask_copy(old_mask, &current->cpus_allowed);
-	set_cpus_allowed_ptr(current, cpumask_of(boot_cpuid));
-	
 	if (smp_ops && smp_ops->setup_cpu)
-		smp_ops->setup_cpu(boot_cpuid);
-
-	set_cpus_allowed_ptr(current, old_mask);
-
-	free_cpumask_var(old_mask);
+		work_on_cpu_safe(boot_cpuid, smp_setup_cpu_workfn, NULL);
 
 	if (smp_ops && smp_ops->bringup_done)
 		smp_ops->bringup_done();
@@ -812,7 +809,6 @@ void __init smp_cpus_done(unsigned int m
 	dump_numa_cpu_topology();
 
 	set_sched_topology(powerpc_topology);
-
 }
 
 #ifdef CONFIG_HOTPLUG_CPU

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [patch 06/13] sparc/sysfs: Replace racy task affinity logic
  2017-04-12 20:07 [patch 00/13] sched/treewide: Clean up various racy task affinity issues Thomas Gleixner
@ 2017-04-12 20:07   ` Thomas Gleixner
  2017-04-12 20:07 ` [patch 02/13] workqueue: Provide work_on_cpu_safe() Thomas Gleixner
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-12 20:07 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, sparclinux

[-- Attachment #1: sparc-sysfs--Use-work-instead-of-affinity-games.patch --]
[-- Type: text/plain, Size: 2924 bytes --]

The mmustat_enable sysfs file accessor functions must run code on the
target CPU. This is achieved by temporarily setting the affinity of the
calling user space thread to the requested CPU and resetting it to the
original affinity afterwards.

That's racy vs. concurrent affinity settings for that thread, resulting in
code executing on the wrong CPU and in the restore overwriting the new
affinity setting.

Replace it with work_on_cpu(), which guarantees to run the code on the
requested CPU.

Protection against CPU hotplug is not required, as the open sysfs file
already prevents removal from the CPU offline callback. Using the
hotplug-protected version would actually be wrong, because it would
deadlock against an in-progress CPU hotplug operation on the CPU associated
with the sysfs file.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
---
 arch/sparc/kernel/sysfs.c |   36 +++++++++---------------------------
 1 file changed, 9 insertions(+), 27 deletions(-)

--- a/arch/sparc/kernel/sysfs.c
+++ b/arch/sparc/kernel/sysfs.c
@@ -98,27 +98,7 @@ static struct attribute_group mmu_stat_g
 	.name = "mmu_stats",
 };
 
-/* XXX convert to rusty's on_one_cpu */
-static unsigned long run_on_cpu(unsigned long cpu,
-			        unsigned long (*func)(unsigned long),
-				unsigned long arg)
-{
-	cpumask_t old_affinity;
-	unsigned long ret;
-
-	cpumask_copy(&old_affinity, &current->cpus_allowed);
-	/* should return -EINVAL to userspace */
-	if (set_cpus_allowed_ptr(current, cpumask_of(cpu)))
-		return 0;
-
-	ret = func(arg);
-
-	set_cpus_allowed_ptr(current, &old_affinity);
-
-	return ret;
-}
-
-static unsigned long read_mmustat_enable(unsigned long junk)
+static long read_mmustat_enable(void *data __maybe_unused)
 {
 	unsigned long ra = 0;
 
@@ -127,11 +107,11 @@ static unsigned long read_mmustat_enable
 	return ra != 0;
 }
 
-static unsigned long write_mmustat_enable(unsigned long val)
+static long write_mmustat_enable(void *data)
 {
-	unsigned long ra, orig_ra;
+	unsigned long ra, orig_ra, *val = data;
 
-	if (val)
+	if (*val)
 		ra = __pa(&per_cpu(mmu_stats, smp_processor_id()));
 	else
 		ra = 0UL;
@@ -142,7 +122,8 @@ static unsigned long write_mmustat_enabl
 static ssize_t show_mmustat_enable(struct device *s,
 				struct device_attribute *attr, char *buf)
 {
-	unsigned long val = run_on_cpu(s->id, read_mmustat_enable, 0);
+	long val = work_on_cpu(s->id, read_mmustat_enable, 0);
+
 	return sprintf(buf, "%lx\n", val);
 }
 
@@ -150,13 +131,14 @@ static ssize_t store_mmustat_enable(stru
 			struct device_attribute *attr, const char *buf,
 			size_t count)
 {
-	unsigned long val, err;
-	int ret = sscanf(buf, "%lu", &val);
+	unsigned long val;
+	long err;
+	int ret = sscanf(buf, "%lu", &val);
 
 	if (ret != 1)
 		return -EINVAL;
 
-	err = run_on_cpu(s->id, write_mmustat_enable, val);
+	err = work_on_cpu(s->id, write_mmustat_enable, &val);
 	if (err)
 		return -EIO;
 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [patch 07/13] ACPI/processor: Fix error handling in __acpi_processor_start()
  2017-04-12 20:07 [patch 00/13] sched/treewide: Clean up various racy task affinity issues Thomas Gleixner
                   ` (5 preceding siblings ...)
  2017-04-12 20:07   ` Thomas Gleixner
@ 2017-04-12 20:07 ` Thomas Gleixner
  2017-04-15 14:19   ` [tip:sched/core] " tip-bot for Thomas Gleixner
  2017-04-12 20:07 ` [patch 08/13] ACPI/processor: Replace racy task affinity logic Thomas Gleixner
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-12 20:07 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, linux-acpi

[-- Attachment #1: ACPI-processor--Fix-error-handling-in-__acpi_processor_start--.patch --]
[-- Type: text/plain, Size: 781 bytes --]

When acpi_install_notify_handler() fails, the cooling device stays
registered, the sysfs files created via acpi_pss_perf_init() are leaked,
and the function returns success nevertheless.

Undo acpi_pss_perf_init() and return a proper error code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: linux-acpi@vger.kernel.org
---
 drivers/acpi/processor_driver.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -251,6 +251,9 @@ static int __acpi_processor_start(struct
 	if (ACPI_SUCCESS(status))
 		return 0;
 
+	result = -ENODEV;
+	acpi_pss_perf_exit(pr, device);
+
 err_power_exit:
 	acpi_processor_power_exit(pr);
 	return result;



^ permalink raw reply	[flat|nested] 56+ messages in thread

* [patch 08/13] ACPI/processor: Replace racy task affinity logic.
  2017-04-12 20:07 [patch 00/13] sched/treewide: Clean up various racy task affinity issues Thomas Gleixner
                   ` (6 preceding siblings ...)
  2017-04-12 20:07 ` [patch 07/13] ACPI/processor: Fix error handling in __acpi_processor_start() Thomas Gleixner
@ 2017-04-12 20:07 ` Thomas Gleixner
  2017-04-13 11:39   ` Peter Zijlstra
  2017-04-15 14:19   ` [tip:sched/core] " tip-bot for Thomas Gleixner
  2017-04-12 20:07 ` [patch 09/13] cpufreq/ia64: " Thomas Gleixner
                   ` (5 subsequent siblings)
  13 siblings, 2 replies; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-12 20:07 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, linux-acpi

[-- Attachment #1: ACPI-processor--Use-work_on_cpu--.patch --]
[-- Type: text/plain, Size: 3257 bytes --]

acpi_processor_get_throttling() requires invoking the getter function on
the target CPU. This is achieved by temporarily setting the affinity of the
calling user space thread to the requested CPU and resetting it to the
original affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread, resulting in code executing on the wrong CPU and in the restore
overwriting the new affinity setting.

acpi_processor_get_throttling() is invoked in two ways:

1) The CPU online callback, which is already running on the target CPU and
   obviously protected against hotplug and not affected by affinity
   settings.

2) The ACPI driver probe function, which is not protected against hotplug
   during modprobe.

Switch it over to work_on_cpu() and protect the probe function against CPU
hotplug.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: linux-acpi@vger.kernel.org
---
 drivers/acpi/processor_driver.c     |    7 ++++++-
 drivers/acpi/processor_throttling.c |   31 +++++++++++++------------------
 2 files changed, 19 insertions(+), 19 deletions(-)

--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -262,11 +262,16 @@ static int __acpi_processor_start(struct
 static int acpi_processor_start(struct device *dev)
 {
 	struct acpi_device *device = ACPI_COMPANION(dev);
+	int ret;
 
 	if (!device)
 		return -ENODEV;
 
-	return __acpi_processor_start(device);
+	/* Protect against concurrent CPU hotplug operations */
+	get_online_cpus();
+	ret = __acpi_processor_start(device);
+	put_online_cpus();
+	return ret;
 }
 
 static int acpi_processor_stop(struct device *dev)
--- a/drivers/acpi/processor_throttling.c
+++ b/drivers/acpi/processor_throttling.c
@@ -901,36 +901,31 @@ static int acpi_processor_get_throttling
 	return 0;
 }
 
-static int acpi_processor_get_throttling(struct acpi_processor *pr)
+static long __acpi_processor_get_throttling(void *data)
 {
-	cpumask_var_t saved_mask;
-	int ret;
+	struct acpi_processor *pr = data;
+
+	return pr->throttling.acpi_processor_get_throttling(pr);
+}
 
+static int acpi_processor_get_throttling(struct acpi_processor *pr)
+{
 	if (!pr)
 		return -EINVAL;
 
 	if (!pr->flags.throttling)
 		return -ENODEV;
 
-	if (!alloc_cpumask_var(&saved_mask, GFP_KERNEL))
-		return -ENOMEM;
-
 	/*
-	 * Migrate task to the cpu pointed by pr.
+	 * This is either called from the CPU hotplug callback of
+	 * processor_driver or via the ACPI probe function. In the latter
+	 * case the CPU is not guaranteed to be online. Both call sites are
+	 * protected against CPU hotplug.
 	 */
-	cpumask_copy(saved_mask, &current->cpus_allowed);
-	/* FIXME: use work_on_cpu() */
-	if (set_cpus_allowed_ptr(current, cpumask_of(pr->id))) {
-		/* Can't migrate to the target pr->id CPU. Exit */
-		free_cpumask_var(saved_mask);
+	if (!cpu_online(pr->id))
 		return -ENODEV;
-	}
-	ret = pr->throttling.acpi_processor_get_throttling(pr);
-	/* restore the previous state */
-	set_cpus_allowed_ptr(current, saved_mask);
-	free_cpumask_var(saved_mask);
 
-	return ret;
+	return work_on_cpu(pr->id, __acpi_processor_get_throttling, pr);
 }
 
 static int acpi_processor_get_fadt_info(struct acpi_processor *pr)




* [patch 09/13] cpufreq/ia64: Replace racy task affinity logic
  2017-04-12 20:07 [patch 00/13] sched/treewide: Clean up various racy task affinity issues Thomas Gleixner
                   ` (7 preceding siblings ...)
  2017-04-12 20:07 ` [patch 08/13] ACPI/processor: Replace racy task affinity logic Thomas Gleixner
@ 2017-04-12 20:07 ` Thomas Gleixner
  2017-04-12 20:55   ` [patch V2 " Thomas Gleixner
  2017-04-13  2:42   ` [patch 09/13] " Viresh Kumar
  2017-04-12 20:07 ` [patch 10/13] cpufreq/sh: " Thomas Gleixner
                   ` (4 subsequent siblings)
  13 siblings, 2 replies; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-12 20:07 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, linux-pm

[-- Attachment #1: cpufreq-ia64--Use-work_on_cpu--.patch --]
[-- Type: text/plain, Size: 4697 bytes --]

The get() and target() callbacks must run on the affected CPU. This is
achieved by temporarily setting the affinity of the calling thread to the
requested CPU and resetting it to the original affinity afterwards.

That's racy vs. concurrent affinity settings for that thread, resulting in
code executing on the wrong CPU and overwriting the new affinity setting.

Replace it with work_on_cpu(). All call paths which invoke the callbacks
are already protected against CPU hotplug.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-pm@vger.kernel.org
---
 drivers/cpufreq/ia64-acpi-cpufreq.c |   91 +++++++++++++++---------------------
 1 file changed, 38 insertions(+), 53 deletions(-)

--- a/drivers/cpufreq/ia64-acpi-cpufreq.c
+++ b/drivers/cpufreq/ia64-acpi-cpufreq.c
@@ -34,6 +34,11 @@ struct cpufreq_acpi_io {
 	unsigned int				resume;
 };
 
+struct cpufreq_acpi_req {
+	unsigned int		cpu;
+	unsigned int		state;
+};
+
 static struct cpufreq_acpi_io	*acpi_io_data[NR_CPUS];
 
 static struct cpufreq_driver acpi_cpufreq_driver;
@@ -83,8 +88,7 @@ processor_get_pstate (
 static unsigned
 extract_clock (
 	struct cpufreq_acpi_io *data,
-	unsigned value,
-	unsigned int cpu)
+	unsigned value)
 {
 	unsigned long i;
 
@@ -98,60 +102,43 @@
 }
 
 
-static unsigned int
+static long
 processor_get_freq (
-	struct cpufreq_acpi_io	*data,
-	unsigned int		cpu)
+	void *arg)
 {
-	int			ret = 0;
-	u32			value = 0;
-	cpumask_t		saved_mask;
-	unsigned long 		clock_freq;
+	struct cpufreq_acpi_req *req = arg;
+	unsigned int		cpu = req->cpu;
+	struct cpufreq_acpi_io	*data = acpi_io_data[cpu];
+	u32			value;
+	int			ret;
 
 	pr_debug("processor_get_freq\n");
-
-	saved_mask = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
 	if (smp_processor_id() != cpu)
-		goto migrate_end;
+		return -EAGAIN;
 
 	/* processor_get_pstate gets the instantaneous frequency */
 	ret = processor_get_pstate(&value);
-
 	if (ret) {
-		set_cpus_allowed_ptr(current, &saved_mask);
 		pr_warn("get performance failed with error %d\n", ret);
-		ret = 0;
-		goto migrate_end;
+		return ret;
 	}
-	clock_freq = extract_clock(data, value, cpu);
-	ret = (clock_freq*1000);
-
-migrate_end:
-	set_cpus_allowed_ptr(current, &saved_mask);
-	return ret;
+	return 1000 * extract_clock(data, value);
 }
 
 
-static int
+static long
 processor_set_freq (
-	struct cpufreq_acpi_io	*data,
-	struct cpufreq_policy   *policy,
-	int			state)
+	void *arg)
 {
-	int			ret = 0;
-	u32			value = 0;
-	cpumask_t		saved_mask;
-	int			retval;
+	struct cpufreq_acpi_req *req = arg;
+	unsigned int		cpu = req->cpu;
+	struct cpufreq_acpi_io	*data = acpi_io_data[cpu];
+	int			ret, state = req->state;
+	u32			value;
 
 	pr_debug("processor_set_freq\n");
-
-	saved_mask = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, cpumask_of(policy->cpu));
-	if (smp_processor_id() != policy->cpu) {
-		retval = -EAGAIN;
-		goto migrate_end;
-	}
+	if (smp_processor_id() != cpu)
+		return -EAGAIN;
 
 	if (state == data->acpi_data.state) {
 		if (unlikely(data->resume)) {
@@ -159,8 +145,7 @@ processor_set_freq (
 			data->resume = 0;
 		} else {
 			pr_debug("Already at target state (P%d)\n", state);
-			retval = 0;
-			goto migrate_end;
+			return 0;
 		}
 	}
 
@@ -171,7 +156,6 @@ processor_set_freq (
 	 * First we write the target state's 'control' value to the
 	 * control_register.
 	 */
-
 	value = (u32) data->acpi_data.states[state].control;
 
 	pr_debug("Transitioning to state: 0x%08x\n", value);
@@ -179,17 +163,11 @@ processor_set_freq (
 	ret = processor_set_pstate(value);
 	if (ret) {
 		pr_warn("Transition failed with error %d\n", ret);
-		retval = -ENODEV;
-		goto migrate_end;
+		return -ENODEV;
 	}
 
 	data->acpi_data.state = state;
-
-	retval = 0;
-
-migrate_end:
-	set_cpus_allowed_ptr(current, &saved_mask);
-	return (retval);
+	return 0;
 }
 
 
@@ -197,11 +175,13 @@ static unsigned int
 acpi_cpufreq_get (
 	unsigned int		cpu)
 {
-	struct cpufreq_acpi_io *data = acpi_io_data[cpu];
+	struct cpufreq_acpi_req req;
+	long ret;
 
-	pr_debug("acpi_cpufreq_get\n");
+	req.cpu = cpu;
+	ret = work_on_cpu(cpu, processor_get_freq, &req);
 
-	return processor_get_freq(data, cpu);
+	return ret > 0 ? (unsigned int) ret : 0;
 }
 
 
@@ -210,7 +190,12 @@ acpi_cpufreq_target (
 	struct cpufreq_policy   *policy,
 	unsigned int index)
 {
-	return processor_set_freq(acpi_io_data[policy->cpu], policy, index);
+	struct cpufreq_acpi_req req;
+
+	req.cpu = policy->cpu;
+	req.state = index;
+
+	return work_on_cpu(req.cpu, processor_set_freq, &req);
 }
 
 static int


* [patch 10/13] cpufreq/sh: Replace racy task affinity logic
  2017-04-12 20:07 [patch 00/13] sched/treewide: Clean up various racy task affinity issues Thomas Gleixner
                   ` (8 preceding siblings ...)
  2017-04-12 20:07 ` [patch 09/13] cpufreq/ia64: " Thomas Gleixner
@ 2017-04-12 20:07 ` Thomas Gleixner
  2017-04-13  2:46   ` Viresh Kumar
  2017-04-15 14:20   ` [tip:sched/core] " tip-bot for Thomas Gleixner
  2017-04-12 20:07 ` [patch 11/13] cpufreq/sparc-us3: " Thomas Gleixner
                   ` (3 subsequent siblings)
  13 siblings, 2 replies; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-12 20:07 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, linux-pm

[-- Attachment #1: cpufreq-sh--Use-work_on_cpu--.patch --]
[-- Type: text/plain, Size: 3158 bytes --]

The target() callback must run on the affected CPU. This is achieved by
temporarily setting the affinity of the calling thread to the requested CPU
and resetting it to the original affinity afterwards.

That's racy vs. concurrent affinity settings for that thread, resulting in
code executing on the wrong CPU.

Replace it with work_on_cpu(). All call paths which invoke the callbacks
are already protected against CPU hotplug.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linux-pm@vger.kernel.org
---
 drivers/cpufreq/sh-cpufreq.c |   45 +++++++++++++++++++++++++------------------
 1 file changed, 27 insertions(+), 18 deletions(-)

--- a/drivers/cpufreq/sh-cpufreq.c
+++ b/drivers/cpufreq/sh-cpufreq.c
@@ -30,54 +30,63 @@
 
 static DEFINE_PER_CPU(struct clk, sh_cpuclk);
 
+struct cpufreq_target {
+	struct cpufreq_policy	*policy;
+	unsigned int		freq;
+};
+
 static unsigned int sh_cpufreq_get(unsigned int cpu)
 {
 	return (clk_get_rate(&per_cpu(sh_cpuclk, cpu)) + 500) / 1000;
 }
 
-/*
- * Here we notify other drivers of the proposed change and the final change.
- */
-static int sh_cpufreq_target(struct cpufreq_policy *policy,
-			     unsigned int target_freq,
-			     unsigned int relation)
+static long __sh_cpufreq_target(void *arg)
 {
-	unsigned int cpu = policy->cpu;
+	struct cpufreq_target *target = arg;
+	struct cpufreq_policy *policy = target->policy;
+	int cpu = policy->cpu;
 	struct clk *cpuclk = &per_cpu(sh_cpuclk, cpu);
-	cpumask_t cpus_allowed;
 	struct cpufreq_freqs freqs;
 	struct device *dev;
 	long freq;
 
-	cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
-
-	BUG_ON(smp_processor_id() != cpu);
+	if (smp_processor_id() != cpu)
+		return -ENODEV;
 
 	dev = get_cpu_device(cpu);
 
 	/* Convert target_freq from kHz to Hz */
-	freq = clk_round_rate(cpuclk, target_freq * 1000);
+	freq = clk_round_rate(cpuclk, target->freq * 1000);
 
 	if (freq < (policy->min * 1000) || freq > (policy->max * 1000))
 		return -EINVAL;
 
-	dev_dbg(dev, "requested frequency %u Hz\n", target_freq * 1000);
+	dev_dbg(dev, "requested frequency %u Hz\n", target->freq * 1000);
 
 	freqs.old	= sh_cpufreq_get(cpu);
 	freqs.new	= (freq + 500) / 1000;
 	freqs.flags	= 0;
 
-	cpufreq_freq_transition_begin(policy, &freqs);
-	set_cpus_allowed_ptr(current, &cpus_allowed);
+	cpufreq_freq_transition_begin(target->policy, &freqs);
 	clk_set_rate(cpuclk, freq);
-	cpufreq_freq_transition_end(policy, &freqs, 0);
+	cpufreq_freq_transition_end(target->policy, &freqs, 0);
 
 	dev_dbg(dev, "set frequency %lu Hz\n", freq);
-
 	return 0;
 }
 
+/*
+ * Here we notify other drivers of the proposed change and the final change.
+ */
+static int sh_cpufreq_target(struct cpufreq_policy *policy,
+			     unsigned int target_freq,
+			     unsigned int relation)
+{
+	struct cpufreq_target data = { .policy = policy, .freq = target_freq };
+
+	return work_on_cpu(policy->cpu, __sh_cpufreq_target, &data);
+}
+
 static int sh_cpufreq_verify(struct cpufreq_policy *policy)
 {
 	struct clk *cpuclk = &per_cpu(sh_cpuclk, policy->cpu);


* [patch 11/13] cpufreq/sparc-us3: Replace racy task affinity logic
  2017-04-12 20:07 [patch 00/13] sched/treewide: Clean up various racy task affinity issues Thomas Gleixner
                   ` (9 preceding siblings ...)
  2017-04-12 20:07 ` [patch 10/13] cpufreq/sh: " Thomas Gleixner
@ 2017-04-12 20:07 ` Thomas Gleixner
  2017-04-13  2:48   ` Viresh Kumar
  2017-04-15 14:21   ` [tip:sched/core] " tip-bot for Thomas Gleixner
  2017-04-12 20:07 ` [patch 12/13] cpufreq/sparc-us2e: " Thomas Gleixner
                   ` (2 subsequent siblings)
  13 siblings, 2 replies; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-12 20:07 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, linux-pm

[-- Attachment #1: cpufreq-sparc-us3--Remove-affinity-games.patch --]
[-- Type: text/plain, Size: 3119 bytes --]

The access to the Safari config register in the CPU frequency functions
must be executed on the target CPU. This is achieved by temporarily setting
the affinity of the calling user space thread to the requested CPU and
resetting it to the original affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread, resulting in code executing on the wrong CPU and overwriting the
new affinity setting.

Replace it with a straightforward SMP function call.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-pm@vger.kernel.org
---
 drivers/cpufreq/sparc-us3-cpufreq.c |   46 ++++++++++++------------------------
 1 file changed, 16 insertions(+), 30 deletions(-)

--- a/drivers/cpufreq/sparc-us3-cpufreq.c
+++ b/drivers/cpufreq/sparc-us3-cpufreq.c
@@ -35,22 +35,28 @@ static struct us3_freq_percpu_info *us3_
 #define SAFARI_CFG_DIV_32	0x0000000080000000UL
 #define SAFARI_CFG_DIV_MASK	0x00000000C0000000UL
 
-static unsigned long read_safari_cfg(void)
+static void read_safari_cfg(void *arg)
 {
-	unsigned long ret;
+	unsigned long ret, *val = arg;
 
 	__asm__ __volatile__("ldxa	[%%g0] %1, %0"
 			     : "=&r" (ret)
 			     : "i" (ASI_SAFARI_CONFIG));
-	return ret;
+	*val = ret;
 }
 
-static void write_safari_cfg(unsigned long val)
+static void update_safari_cfg(void *arg)
 {
+	unsigned long reg, *new_bits = arg;
+
+	read_safari_cfg(&reg);
+	reg &= ~SAFARI_CFG_DIV_MASK;
+	reg |= *new_bits;
+
 	__asm__ __volatile__("stxa	%0, [%%g0] %1\n\t"
 			     "membar	#Sync"
 			     : /* no outputs */
-			     : "r" (val), "i" (ASI_SAFARI_CONFIG)
+			     : "r" (reg), "i" (ASI_SAFARI_CONFIG)
 			     : "memory");
 }
 
@@ -78,29 +84,17 @@ static unsigned long get_current_freq(un
 
 static unsigned int us3_freq_get(unsigned int cpu)
 {
-	cpumask_t cpus_allowed;
 	unsigned long reg;
-	unsigned int ret;
-
-	cpumask_copy(&cpus_allowed, &current->cpus_allowed);
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
-
-	reg = read_safari_cfg();
-	ret = get_current_freq(cpu, reg);
 
-	set_cpus_allowed_ptr(current, &cpus_allowed);
-
-	return ret;
+	if (smp_call_function_single(cpu, read_safari_cfg, &reg, 1))
+		return 0;
+	return get_current_freq(cpu, reg);
 }
 
 static int us3_freq_target(struct cpufreq_policy *policy, unsigned int index)
 {
 	unsigned int cpu = policy->cpu;
-	unsigned long new_bits, new_freq, reg;
-	cpumask_t cpus_allowed;
-
-	cpumask_copy(&cpus_allowed, &current->cpus_allowed);
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
+	unsigned long new_bits, new_freq;
 
 	new_freq = sparc64_get_clock_tick(cpu) / 1000;
 	switch (index) {
@@ -121,15 +115,7 @@ static int us3_freq_target(struct cpufre
 		BUG();
 	}
 
-	reg = read_safari_cfg();
-
-	reg &= ~SAFARI_CFG_DIV_MASK;
-	reg |= new_bits;
-	write_safari_cfg(reg);
-
-	set_cpus_allowed_ptr(current, &cpus_allowed);
-
-	return 0;
+	return smp_call_function_single(cpu, update_safari_cfg, &new_bits, 1);
 }
 
 static int __init us3_freq_cpu_init(struct cpufreq_policy *policy)


* [patch 12/13] cpufreq/sparc-us2e: Replace racy task affinity logic
  2017-04-12 20:07 [patch 00/13] sched/treewide: Clean up various racy task affinity issues Thomas Gleixner
                   ` (10 preceding siblings ...)
  2017-04-12 20:07 ` [patch 11/13] cpufreq/sparc-us3: " Thomas Gleixner
@ 2017-04-12 20:07 ` Thomas Gleixner
  2017-04-13  2:50   ` Viresh Kumar
  2017-04-13  8:19   ` [patch V2 " Thomas Gleixner
  2017-04-12 20:07 ` [patch 13/13] crypto: n2 - " Thomas Gleixner
  2017-04-13  9:02 ` [patch 00/13] sched/treewide: Clean up various racy task affinity issues Peter Zijlstra
  13 siblings, 2 replies; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-12 20:07 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, linux-pm

[-- Attachment #1: cpufreq-sparc-us2e--Remove-affinity-games.patch --]
[-- Type: text/plain, Size: 3412 bytes --]

The access to the HBIRD_ESTAR_MODE register in the CPU frequency control
functions must happen on the target CPU. This is achieved by temporarily
setting the affinity of the calling user space thread to the requested CPU
and resetting it to the original affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread, resulting in code executing on the wrong CPU and overwriting the
new affinity setting.

Replace it with a straightforward SMP function call.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linux-pm@vger.kernel.org
---
 drivers/cpufreq/sparc-us2e-cpufreq.c |   45 ++++++++++++++++-------------------
 1 file changed, 21 insertions(+), 24 deletions(-)

--- a/drivers/cpufreq/sparc-us2e-cpufreq.c
+++ b/drivers/cpufreq/sparc-us2e-cpufreq.c
@@ -118,10 +118,6 @@ static void us2e_transition(unsigned lon
 			    unsigned long clock_tick,
 			    unsigned long old_divisor, unsigned long divisor)
 {
-	unsigned long flags;
-
-	local_irq_save(flags);
-
 	estar &= ~ESTAR_MODE_DIV_MASK;
 
 	/* This is based upon the state transition diagram in the IIe manual.  */
@@ -152,8 +148,6 @@ static void us2e_transition(unsigned lon
 	} else {
 		BUG();
 	}
-
-	local_irq_restore(flags);
 }
 
 static unsigned long index_to_estar_mode(unsigned int index)
@@ -229,48 +223,51 @@ static unsigned long estar_to_divisor(un
 	return ret;
 }
 
+static void __us2e_freq_get(void *arg)
+{
+	unsigned long *estar = arg;
+
+	*estar = read_hbreg(HBIRD_ESTAR_MODE_ADDR);
+}
+
 static unsigned int us2e_freq_get(unsigned int cpu)
 {
-	cpumask_t cpus_allowed;
 	unsigned long clock_tick, estar;
 
-	cpumask_copy(&cpus_allowed, &current->cpus_allowed);
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
-
 	clock_tick = sparc64_get_clock_tick(cpu) / 1000;
-	estar = read_hbreg(HBIRD_ESTAR_MODE_ADDR);
-
-	set_cpus_allowed_ptr(current, &cpus_allowed);
+	if (smp_call_function_single(cpu, __us2e_freq_get, &estar, 1))
+		return 0;
 
 	return clock_tick / estar_to_divisor(estar);
 }
 
-static int us2e_freq_target(struct cpufreq_policy *policy, unsigned int index)
+static void __us2e_freq_target(void *arg)
 {
-	unsigned int cpu = policy->cpu;
+	unsigned int cpu = smp_processor_id();
+	unsigned int *index = arg;
 	unsigned long new_bits, new_freq;
 	unsigned long clock_tick, divisor, old_divisor, estar;
-	cpumask_t cpus_allowed;
-
-	cpumask_copy(&cpus_allowed, &current->cpus_allowed);
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
 
 	new_freq = clock_tick = sparc64_get_clock_tick(cpu) / 1000;
-	new_bits = index_to_estar_mode(index);
-	divisor = index_to_divisor(index);
+	new_bits = index_to_estar_mode(*index);
+	divisor = index_to_divisor(*index);
 	new_freq /= divisor;
 
 	estar = read_hbreg(HBIRD_ESTAR_MODE_ADDR);
 
 	old_divisor = estar_to_divisor(estar);
 
-	if (old_divisor != divisor)
+	if (old_divisor != divisor) {
 		us2e_transition(estar, new_bits, clock_tick * 1000,
 				old_divisor, divisor);
+	}
+}
 
-	set_cpus_allowed_ptr(current, &cpus_allowed);
+static int us2e_freq_target(struct cpufreq_policy *policy, unsigned int index)
+{
+	unsigned int cpu = policy->cpu;
 
-	return 0;
+	return smp_call_function_single(cpu, __us2e_freq_target, &index, 1);
 }
 
 static int __init us2e_freq_cpu_init(struct cpufreq_policy *policy)


* [patch 13/13] crypto: n2 - Replace racy task affinity logic
  2017-04-12 20:07 [patch 00/13] sched/treewide: Clean up various racy task affinity issues Thomas Gleixner
                   ` (11 preceding siblings ...)
  2017-04-12 20:07 ` [patch 12/13] cpufreq/sparc-us2e: " Thomas Gleixner
@ 2017-04-12 20:07 ` Thomas Gleixner
  2017-04-13  4:56   ` Herbert Xu
  2017-04-13  8:20   ` [patch V2 " Thomas Gleixner
  2017-04-13  9:02 ` [patch 00/13] sched/treewide: Clean up various racy task affinity issues Peter Zijlstra
  13 siblings, 2 replies; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-12 20:07 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, linux-crypto

[-- Attachment #1: crypto--n2---Use-work_on_cpu---instead-of-affinity-games.patch --]
[-- Type: text/plain, Size: 2324 bytes --]

spu_queue_register() needs to invoke setup functions on a particular
CPU. This is achieved by temporarily setting the affinity of the
calling user space thread to the requested CPU and resetting it to the
original affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread, resulting in code executing on the wrong CPU and overwriting the
new affinity setting.

Replace it with work_on_cpu_safe(), which guarantees to run the code on
the requested CPU or to fail in case the CPU is offline.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-crypto@vger.kernel.org
---
 drivers/crypto/n2_core.c |   31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

--- a/drivers/crypto/n2_core.c
+++ b/drivers/crypto/n2_core.c
@@ -65,6 +65,11 @@ struct spu_queue {
 	struct list_head	list;
 };
 
+struct spu_qreg {
+	struct spu_queue	*queue;
+	unsigned long		type;
+};
+
 static struct spu_queue **cpu_to_cwq;
 static struct spu_queue **cpu_to_mau;
 
@@ -1631,31 +1636,27 @@ static void queue_cache_destroy(void)
 	kmem_cache_destroy(queue_cache[HV_NCS_QTYPE_CWQ - 1]);
 }
 
-static int spu_queue_register(struct spu_queue *p, unsigned long q_type)
+static long spu_queue_register_workfn(void *arg)
 {
-	cpumask_var_t old_allowed;
+	struct spu_qreg *qr = arg;
+	struct spu_queue *p = qr->queue;
+	unsigned long q_type = qr->type;
 	unsigned long hv_ret;
 
-	if (cpumask_empty(&p->sharing))
-		return -EINVAL;
-
-	if (!alloc_cpumask_var(&old_allowed, GFP_KERNEL))
-		return -ENOMEM;
-
-	cpumask_copy(old_allowed, &current->cpus_allowed);
-
-	set_cpus_allowed_ptr(current, &p->sharing);
-
 	hv_ret = sun4v_ncs_qconf(q_type, __pa(p->q),
 				 CWQ_NUM_ENTRIES, &p->qhandle);
 	if (!hv_ret)
 		sun4v_ncs_sethead_marker(p->qhandle, 0);
 
-	set_cpus_allowed_ptr(current, old_allowed);
+	return hv_ret ? -EINVAL : 0;
+}
 
-	free_cpumask_var(old_allowed);
+static int spu_queue_register(struct spu_queue *p, unsigned long q_type)
+{
+	int cpu = cpumask_any_and(&p->sharing, cpu_online_mask);
+	struct spu_qreg qr = { .queue = p, .type = q_type };
 
-	return (hv_ret ? -EINVAL : 0);
+	return work_on_cpu_safe(cpu, spu_queue_register_workfn, &qr);
 }
 
 static int spu_queue_setup(struct spu_queue *p)


* [patch V2 04/13] ia64/sn/hwperf: Replace racy task affinity logic
  2017-04-12 20:07 ` [patch 04/13] ia64/sn/hwperf: " Thomas Gleixner
@ 2017-04-12 20:53   ` Thomas Gleixner
  2017-04-15 14:17     ` [tip:sched/core] " tip-bot for Thomas Gleixner
  0 siblings, 1 reply; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-12 20:53 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, linux-ia64

Subject: ia64/sn/hwperf: Replace racy task affinity logic
From: Thomas Gleixner <tglx@linutronix.de>
Date: Thu, 06 Apr 2017 14:56:18 +0200

sn_hwperf_op_cpu(), which is invoked from an ioctl, requires running code
on the requested CPU. This is achieved by temporarily setting the affinity
of the calling user space thread to the requested CPU and resetting it to
the original affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread, resulting in code executing on the wrong CPU and overwriting the
new affinity setting.

Replace it with work_on_cpu_safe(), which guarantees to run the code on
the requested CPU or to fail in case the CPU is offline.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64@vger.kernel.org
---

V2: Use the work function for real and remove the unused variable (reported
    by kbuild robot)

 arch/ia64/sn/kernel/sn2/sn_hwperf.c |   17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

Index: b/arch/ia64/sn/kernel/sn2/sn_hwperf.c
===================================================================
--- a/arch/ia64/sn/kernel/sn2/sn_hwperf.c
+++ b/arch/ia64/sn/kernel/sn2/sn_hwperf.c
@@ -598,12 +598,17 @@ static void sn_hwperf_call_sal(void *inf
 	op_info->ret = r;
 }
 
+static long sn_hwperf_call_sal_work(void *info)
+{
+	sn_hwperf_call_sal(info);
+	return 0;
+}
+
 static int sn_hwperf_op_cpu(struct sn_hwperf_op_info *op_info)
 {
 	u32 cpu;
 	u32 use_ipi;
 	int r = 0;
-	cpumask_t save_allowed;
 	
 	cpu = (op_info->a->arg & SN_HWPERF_ARG_CPU_MASK) >> 32;
 	use_ipi = op_info->a->arg & SN_HWPERF_ARG_USE_IPI_MASK;
@@ -629,13 +634,9 @@ static int sn_hwperf_op_cpu(struct sn_hw
 			/* use an interprocessor interrupt to call SAL */
 			smp_call_function_single(cpu, sn_hwperf_call_sal,
 				op_info, 1);
-		}
-		else {
-			/* migrate the task before calling SAL */ 
-			save_allowed = current->cpus_allowed;
-			set_cpus_allowed_ptr(current, cpumask_of(cpu));
-			sn_hwperf_call_sal(op_info);
-			set_cpus_allowed_ptr(current, &save_allowed);
+		} else {
+			/* Call on the target CPU */
+			work_on_cpu_safe(cpu, sn_hwperf_call_sal_work, op_info);
 		}
 	}
 	r = op_info->ret;


* [patch V2 09/13] cpufreq/ia64: Replace racy task affinity logic
  2017-04-12 20:07 ` [patch 09/13] cpufreq/ia64: " Thomas Gleixner
@ 2017-04-12 20:55   ` Thomas Gleixner
  2017-04-15 14:20     ` [tip:sched/core] " tip-bot for Thomas Gleixner
  2017-04-13  2:42   ` [patch 09/13] " Viresh Kumar
  1 sibling, 1 reply; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-12 20:55 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, linux-pm

The get() and target() callbacks must run on the affected CPU. This is
achieved by temporarily setting the affinity of the calling thread to the
requested CPU and resetting it to the original affinity afterwards.

That's racy vs. concurrent affinity settings for that thread, resulting in
code executing on the wrong CPU and overwriting the new affinity setting.

Replace it with work_on_cpu(). All call paths which invoke the callbacks
are already protected against CPU hotplug.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-pm@vger.kernel.org
---

V2: Fixup kbuild-robot complaints

 drivers/cpufreq/ia64-acpi-cpufreq.c |   92 +++++++++++++++---------------------
 1 file changed, 39 insertions(+), 53 deletions(-)

Index: b/drivers/cpufreq/ia64-acpi-cpufreq.c
===================================================================
--- a/drivers/cpufreq/ia64-acpi-cpufreq.c
+++ b/drivers/cpufreq/ia64-acpi-cpufreq.c
@@ -34,6 +34,11 @@ struct cpufreq_acpi_io {
 	unsigned int				resume;
 };
 
+struct cpufreq_acpi_req {
+	unsigned int		cpu;
+	unsigned int		state;
+};
+
 static struct cpufreq_acpi_io	*acpi_io_data[NR_CPUS];
 
 static struct cpufreq_driver acpi_cpufreq_driver;
@@ -83,8 +88,7 @@ processor_get_pstate (
 static unsigned
 extract_clock (
 	struct cpufreq_acpi_io *data,
-	unsigned value,
-	unsigned int cpu)
+	unsigned value)
 {
 	unsigned long i;
 
@@ -98,60 +102,43 @@ extract_clock (
 }
 
 
-static unsigned int
+static long
 processor_get_freq (
-	struct cpufreq_acpi_io	*data,
-	unsigned int		cpu)
+	void *arg)
 {
-	int			ret = 0;
-	u32			value = 0;
-	cpumask_t		saved_mask;
-	unsigned long 		clock_freq;
+	struct cpufreq_acpi_req *req = arg;
+	unsigned int		cpu = req->cpu;
+	struct cpufreq_acpi_io	*data = acpi_io_data[cpu];
+	u32			value;
+	int			ret;
 
 	pr_debug("processor_get_freq\n");
-
-	saved_mask = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
 	if (smp_processor_id() != cpu)
-		goto migrate_end;
+		return -EAGAIN;
 
 	/* processor_get_pstate gets the instantaneous frequency */
 	ret = processor_get_pstate(&value);
-
 	if (ret) {
-		set_cpus_allowed_ptr(current, &saved_mask);
 		pr_warn("get performance failed with error %d\n", ret);
-		ret = 0;
-		goto migrate_end;
+		return ret;
 	}
-	clock_freq = extract_clock(data, value, cpu);
-	ret = (clock_freq*1000);
-
-migrate_end:
-	set_cpus_allowed_ptr(current, &saved_mask);
-	return ret;
+	return 1000 * extract_clock(data, value);
 }
 
 
-static int
+static long
 processor_set_freq (
-	struct cpufreq_acpi_io	*data,
-	struct cpufreq_policy   *policy,
-	int			state)
+	void *arg)
 {
-	int			ret = 0;
-	u32			value = 0;
-	cpumask_t		saved_mask;
-	int			retval;
+	struct cpufreq_acpi_req *req = arg;
+	unsigned int		cpu = req->cpu;
+	struct cpufreq_acpi_io	*data = acpi_io_data[cpu];
+	int			ret, state = req->state;
+	u32			value;
 
 	pr_debug("processor_set_freq\n");
-
-	saved_mask = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, cpumask_of(policy->cpu));
-	if (smp_processor_id() != policy->cpu) {
-		retval = -EAGAIN;
-		goto migrate_end;
-	}
+	if (smp_processor_id() != cpu)
+		return -EAGAIN;
 
 	if (state == data->acpi_data.state) {
 		if (unlikely(data->resume)) {
@@ -159,8 +146,7 @@ processor_set_freq (
 			data->resume = 0;
 		} else {
 			pr_debug("Already at target state (P%d)\n", state);
-			retval = 0;
-			goto migrate_end;
+			return 0;
 		}
 	}
 
@@ -171,7 +157,6 @@ processor_set_freq (
 	 * First we write the target state's 'control' value to the
 	 * control_register.
 	 */
-
 	value = (u32) data->acpi_data.states[state].control;
 
 	pr_debug("Transitioning to state: 0x%08x\n", value);
@@ -179,17 +164,11 @@ processor_set_freq (
 	ret = processor_set_pstate(value);
 	if (ret) {
 		pr_warn("Transition failed with error %d\n", ret);
-		retval = -ENODEV;
-		goto migrate_end;
+		return -ENODEV;
 	}
 
 	data->acpi_data.state = state;
-
-	retval = 0;
-
-migrate_end:
-	set_cpus_allowed_ptr(current, &saved_mask);
-	return (retval);
+	return 0;
 }
 
 
@@ -197,11 +176,13 @@ static unsigned int
 acpi_cpufreq_get (
 	unsigned int		cpu)
 {
-	struct cpufreq_acpi_io *data = acpi_io_data[cpu];
+	struct cpufreq_acpi_req req;
+	long ret;
 
-	pr_debug("acpi_cpufreq_get\n");
+	req.cpu = cpu;
+	ret = work_on_cpu(cpu, processor_get_freq, &req);
 
-	return processor_get_freq(data, cpu);
+	return ret > 0 ? (unsigned int) ret : 0;
 }
 
 
@@ -210,7 +191,12 @@ acpi_cpufreq_target (
 	struct cpufreq_policy   *policy,
 	unsigned int index)
 {
-	return processor_set_freq(acpi_io_data[policy->cpu], policy, index);
+	struct cpufreq_acpi_req req;
+
+	req.cpu = policy->cpu;
+	req.state = index;
+
+	return work_on_cpu(req.cpu, processor_set_freq, &req);
 }
 
 static int

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [patch 06/13] sparc/sysfs: Replace racy task affinity logic
  2017-04-12 20:07   ` Thomas Gleixner
@ 2017-04-13  1:52     ` David Miller
  -1 siblings, 0 replies; 56+ messages in thread
From: David Miller @ 2017-04-13  1:52 UTC (permalink / raw)
  To: tglx
  Cc: linux-kernel, peterz, mingo, bigeasy, benh, fenghua.yu, herbert,
	jiangshanlai, lenb, mpe, rjw, tj, tony.luck, viresh.kumar,
	sparclinux

From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 12 Apr 2017 22:07:32 +0200

> @@ -142,7 +122,8 @@ static unsigned long write_mmustat_enabl
>  static ssize_t show_mmustat_enable(struct device *s,
>  				struct device_attribute *attr, char *buf)
>  {
> -	unsigned long val = run_on_cpu(s->id, read_mmustat_enable, 0);
> +	long val = work_on_cpu(s->id, read_mmustat_enable, 0);

Last argument to work_on_cpu() should be NULL since it's a pointer
I guess.

But otherwise I have no problems with this:

Acked-by: David S. Miller <davem@davemloft.net>

* Re: [patch 09/13] cpufreq/ia64: Replace racy task affinity logic
  2017-04-12 20:07 ` [patch 09/13] cpufreq/ia64: " Thomas Gleixner
  2017-04-12 20:55   ` [patch V2 " Thomas Gleixner
@ 2017-04-13  2:42   ` Viresh Kumar
  1 sibling, 0 replies; 56+ messages in thread
From: Viresh Kumar @ 2017-04-13  2:42 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, linux-pm

On 12-04-17, 22:07, Thomas Gleixner wrote:
> The get() and target() callbacks must run on the affected cpu. This is
> achieved by temporarily setting the affinity of the calling thread to the
> requested CPU and resetting it to the original affinity afterwards.
> 
> That's racy vs. concurrent affinity settings for that thread resulting in
> code executing on the wrong CPU and overwriting the new affinity setting.
> 
> Replace it by work_on_cpu(). All call paths which invoke the callbacks are
> already protected against CPU hotplug.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: linux-pm@vger.kernel.org
> ---
>  drivers/cpufreq/ia64-acpi-cpufreq.c |   91 +++++++++++++++---------------------
>  1 file changed, 38 insertions(+), 53 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

* Re: [patch 10/13] cpufreq/sh: Replace racy task affinity logic
  2017-04-12 20:07 ` [patch 10/13] cpufreq/sh: " Thomas Gleixner
@ 2017-04-13  2:46   ` Viresh Kumar
  2017-04-15 14:20   ` [tip:sched/core] " tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 56+ messages in thread
From: Viresh Kumar @ 2017-04-13  2:46 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, linux-pm

On 12-04-17, 22:07, Thomas Gleixner wrote:
> The target() callback must run on the affected cpu. This is achieved by
> temporarily setting the affinity of the calling thread to the requested CPU
> and resetting it to the original affinity afterwards.
> 
> That's racy vs. concurrent affinity settings for that thread resulting in
> code executing on the wrong CPU.
> 
> Replace it by work_on_cpu(). All call paths which invoke the callbacks are
> already protected against CPU hotplug.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: linux-pm@vger.kernel.org
> ---
>  drivers/cpufreq/sh-cpufreq.c |   45 +++++++++++++++++++++++++------------------
>  1 file changed, 27 insertions(+), 18 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

* Re: [patch 11/13] cpufreq/sparc-us3: Replace racy task affinity logic
  2017-04-12 20:07 ` [patch 11/13] cpufreq/sparc-us3: " Thomas Gleixner
@ 2017-04-13  2:48   ` Viresh Kumar
  2017-04-15 14:21   ` [tip:sched/core] " tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 56+ messages in thread
From: Viresh Kumar @ 2017-04-13  2:48 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, linux-pm

On 12-04-17, 22:07, Thomas Gleixner wrote:
> The access to the safari config register in the CPU frequency functions
> must be executed on the target CPU. This is achieved by temporarily setting
> the affinity of the calling user space thread to the requested CPU and
> resetting it to the original affinity afterwards.
> 
> That's racy vs. CPU hotplug and concurrent affinity settings for that
> thread resulting in code executing on the wrong CPU and overwriting the
> new affinity setting.
> 
> Replace it with a straightforward SMP function call.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: linux-pm@vger.kernel.org
> ---
>  drivers/cpufreq/sparc-us3-cpufreq.c |   46 ++++++++++++------------------------
>  1 file changed, 16 insertions(+), 30 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

* Re: [patch 12/13] cpufreq/sparc-us2e: Replace racy task affinity logic
  2017-04-12 20:07 ` [patch 12/13] cpufreq/sparc-us2e: " Thomas Gleixner
@ 2017-04-13  2:50   ` Viresh Kumar
  2017-04-13  8:19   ` [patch V2 " Thomas Gleixner
  1 sibling, 0 replies; 56+ messages in thread
From: Viresh Kumar @ 2017-04-13  2:50 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, linux-pm

On 12-04-17, 22:07, Thomas Gleixner wrote:
> The access to the HBIRD_ESTAR_MODE register in the cpu frequency control
> functions must happen on the target CPU. This is achieved by temporarily
> setting the affinity of the calling user space thread to the requested CPU
> and resetting it to the original affinity afterwards.
> 
> That's racy vs. CPU hotplug and concurrent affinity settings for that
> thread resulting in code executing on the wrong CPU and overwriting the
> new affinity setting.
> 
> Replace it with a straightforward SMP function call.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: linux-pm@vger.kernel.org
> ---
>  drivers/cpufreq/sparc-us2e-cpufreq.c |   45 ++++++++++++++++-------------------
>  1 file changed, 21 insertions(+), 24 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

* Re: [patch 13/13] crypto: n2 - Replace racy task affinity logic
  2017-04-12 20:07 ` [patch 13/13] crypto: n2 - " Thomas Gleixner
@ 2017-04-13  4:56   ` Herbert Xu
  2017-04-13  8:20   ` [patch V2 " Thomas Gleixner
  1 sibling, 0 replies; 56+ messages in thread
From: Herbert Xu @ 2017-04-13  4:56 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, linux-crypto

On Wed, Apr 12, 2017 at 10:07:39PM +0200, Thomas Gleixner wrote:
> spu_queue_register() needs to invoke setup functions on a particular
> CPU. This is achieved by temporarily setting the affinity of the
> calling user space thread to the requested CPU and resetting it to the original
> affinity afterwards.
> 
> That's racy vs. CPU hotplug and concurrent affinity settings for that
> thread resulting in code executing on the wrong CPU and overwriting the
> new affinity setting.
> 
> Replace it by using work_on_cpu_safe() which guarantees to run the code on
> the requested CPU or to fail in case the CPU is offline.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: linux-crypto@vger.kernel.org

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: [patch 05/13] powerpc/smp: Replace open coded task affinity logic
  2017-04-12 20:07   ` Thomas Gleixner
  (?)
@ 2017-04-13  5:47   ` Michael Ellerman
  -1 siblings, 0 replies; 56+ messages in thread
From: Michael Ellerman @ 2017-04-13  5:47 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Rafael J. Wysocki, Tejun Heo,
	Tony Luck, Viresh Kumar, Paul Mackerras, linuxppc-dev

Thomas Gleixner <tglx@linutronix.de> writes:

> Init task invokes smp_ops->setup_cpu() from smp_cpus_done(). Init task can
> run on any online CPU at this point, but the setup_cpu() callback requires
> to be invoked on the boot CPU. This is achieved by temporarily setting the
> affinity of the calling user space thread to the requested CPU and resetting it
> to the original affinity afterwards.
>
> That's racy vs. CPU hotplug and concurrent affinity settings for that
> thread resulting in code executing on the wrong CPU and overwriting the
> new affinity setting.
>
> That's actually not a problem in this context as neither CPU hotplug nor
> affinity settings can happen, but the access to task_struct::cpus_allowed
> is about to be restricted.
>
> Replace it with a call to work_on_cpu_safe() which achieves the same result.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: linuxppc-dev@lists.ozlabs.org
> ---
>  arch/powerpc/kernel/smp.c |   26 +++++++++++---------------
>  1 file changed, 11 insertions(+), 15 deletions(-)

LGTM.

Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)

cheers

* [patch V2 06/13] sparc/sysfs: Replace racy task affinity logic
  2017-04-13  1:52     ` David Miller
@ 2017-04-13  8:17       ` Thomas Gleixner
  -1 siblings, 0 replies; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-13  8:17 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, peterz, mingo, bigeasy, benh, fenghua.yu, herbert,
	jiangshanlai, lenb, mpe, rjw, tj, tony.luck, viresh.kumar,
	sparclinux

The mmustat_enable sysfs file accessor functions must run code on the
target CPU. This is achieved by temporarily setting the affinity of the
calling user space thread to the requested CPU and resetting it to the original
affinity afterwards.

That's racy vs. concurrent affinity settings for that thread resulting in
code executing on the wrong CPU and overwriting the new affinity setting.

Replace it by using work_on_cpu() which guarantees to run the code on the
requested CPU.

Protection against CPU hotplug is not required as the open sysfs file
already prevents the removal from the CPU offline callback. Using the
hotplug protected version would actually be wrong because it would deadlock
against a CPU hotplug operation of the CPU associated with the sysfs file in
progress.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
---

V2: Use NULL instead of 0 (David), fix compile breakage (build-bot)

 arch/sparc/kernel/sysfs.c |   39 +++++++++++----------------------------
 1 file changed, 11 insertions(+), 28 deletions(-)

Index: b/arch/sparc/kernel/sysfs.c
===================================================================
--- a/arch/sparc/kernel/sysfs.c
+++ b/arch/sparc/kernel/sysfs.c
@@ -98,27 +98,7 @@ static struct attribute_group mmu_stat_g
 	.name = "mmu_stats",
 };
 
-/* XXX convert to rusty's on_one_cpu */
-static unsigned long run_on_cpu(unsigned long cpu,
-			        unsigned long (*func)(unsigned long),
-				unsigned long arg)
-{
-	cpumask_t old_affinity;
-	unsigned long ret;
-
-	cpumask_copy(&old_affinity, &current->cpus_allowed);
-	/* should return -EINVAL to userspace */
-	if (set_cpus_allowed_ptr(current, cpumask_of(cpu)))
-		return 0;
-
-	ret = func(arg);
-
-	set_cpus_allowed_ptr(current, &old_affinity);
-
-	return ret;
-}
-
-static unsigned long read_mmustat_enable(unsigned long junk)
+static long read_mmustat_enable(void *data __maybe_unused)
 {
 	unsigned long ra = 0;
 
@@ -127,11 +107,11 @@ static unsigned long read_mmustat_enable
 	return ra != 0;
 }
 
-static unsigned long write_mmustat_enable(unsigned long val)
+static long write_mmustat_enable(void *data)
 {
-	unsigned long ra, orig_ra;
+	unsigned long ra, orig_ra, *val = data;
 
-	if (val)
+	if (*val)
 		ra = __pa(&per_cpu(mmu_stats, smp_processor_id()));
 	else
 		ra = 0UL;
@@ -142,7 +122,8 @@ static unsigned long write_mmustat_enabl
 static ssize_t show_mmustat_enable(struct device *s,
 				struct device_attribute *attr, char *buf)
 {
-	unsigned long val = run_on_cpu(s->id, read_mmustat_enable, 0);
+	long val = work_on_cpu(s->id, read_mmustat_enable, NULL);
+
 	return sprintf(buf, "%lx\n", val);
 }
 
@@ -150,13 +131,15 @@ static ssize_t store_mmustat_enable(stru
 			struct device_attribute *attr, const char *buf,
 			size_t count)
 {
-	unsigned long val, err;
-	int ret = sscanf(buf, "%lu", &val);
+	unsigned long val;
+	long err;
+	int ret;
 
+	ret = sscanf(buf, "%lu", &val);
 	if (ret != 1)
 		return -EINVAL;
 
-	err = run_on_cpu(s->id, write_mmustat_enable, val);
+	err = work_on_cpu(s->id, write_mmustat_enable, &val);
 	if (err)
 		return -EIO;
 

* [patch V2 12/13] cpufreq/sparc-us2e: Replace racy task affinity logic
  2017-04-12 20:07 ` [patch 12/13] cpufreq/sparc-us2e: " Thomas Gleixner
  2017-04-13  2:50   ` Viresh Kumar
@ 2017-04-13  8:19   ` Thomas Gleixner
  2017-04-13  8:22     ` [patch V3 " Thomas Gleixner
  2017-04-13 14:50     ` [patch V2 12/13] " David Miller
  1 sibling, 2 replies; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-13  8:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, linux-pm

spu_queue_register() needs to invoke setup functions on a particular
CPU. This is achieved by temporarily setting the affinity of the
calling user space thread to the requested CPU and resetting it to the original
affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread resulting in code executing on the wrong CPU and overwriting the
new affinity setting.

Replace it by using work_on_cpu_safe() which guarantees to run the code on
the requested CPU or to fail in case the CPU is offline.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-crypto@vger.kernel.org
---

V2: Fixup build-bot complaint

 drivers/crypto/n2_core.c |   31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

--- a/drivers/crypto/n2_core.c
+++ b/drivers/crypto/n2_core.c
@@ -65,6 +65,11 @@ struct spu_queue {
 	struct list_head	list;
 };
 
+struct spu_qreg {
+	struct spu_queue	*queue;
+	unsigned long		type;
+};
+
 static struct spu_queue **cpu_to_cwq;
 static struct spu_queue **cpu_to_mau;
 
@@ -1631,31 +1636,27 @@ static void queue_cache_destroy(void)
 	kmem_cache_destroy(queue_cache[HV_NCS_QTYPE_CWQ - 1]);
 }
 
-static int spu_queue_register(struct spu_queue *p, unsigned long q_type)
+static long spu_queue_register_workfn(void *arg)
 {
-	cpumask_var_t old_allowed;
+	struct spu_qreg *qr = arg;
+	struct spu_queue *p = qr->queue;
+	unsigned long q_type = qr->type;
 	unsigned long hv_ret;
 
-	if (cpumask_empty(&p->sharing))
-		return -EINVAL;
-
-	if (!alloc_cpumask_var(&old_allowed, GFP_KERNEL))
-		return -ENOMEM;
-
-	cpumask_copy(old_allowed, &current->cpus_allowed);
-
-	set_cpus_allowed_ptr(current, &p->sharing);
-
 	hv_ret = sun4v_ncs_qconf(q_type, __pa(p->q),
 				 CWQ_NUM_ENTRIES, &p->qhandle);
 	if (!hv_ret)
 		sun4v_ncs_sethead_marker(p->qhandle, 0);
 
-	set_cpus_allowed_ptr(current, old_allowed);
+	return hv_ret ? -EINVAL : 0;
+}
 
-	free_cpumask_var(old_allowed);
+static int spu_queue_register(struct spu_queue *p, unsigned long q_type)
+{
+	int cpu = cpumask_any_and(&p->sharing, cpu_online_mask);
+	struct spu_qreg qr = { .queue = p, .type = q_type };
 
-	return (hv_ret ? -EINVAL : 0);
+	return work_on_cpu_safe(cpu, spu_queue_register_workfn, &qr);
 }
 
 static int spu_queue_setup(struct spu_queue *p)

* [patch V2 13/13] crypto: n2 - Replace racy task affinity logic
  2017-04-12 20:07 ` [patch 13/13] crypto: n2 - " Thomas Gleixner
  2017-04-13  4:56   ` Herbert Xu
@ 2017-04-13  8:20   ` Thomas Gleixner
  2017-04-13 14:51     ` David Miller
  2017-04-15 14:22     ` [tip:sched/core] crypto: N2 " tip-bot for Thomas Gleixner
  1 sibling, 2 replies; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-13  8:20 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, linux-crypto

spu_queue_register() needs to invoke setup functions on a particular
CPU. This is achieved by temporarily setting the affinity of the
calling user space thread to the requested CPU and resetting it to the original
affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread resulting in code executing on the wrong CPU and overwriting the
new affinity setting.

Replace it by using work_on_cpu_safe() which guarantees to run the code on
the requested CPU or to fail in case the CPU is offline.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-crypto@vger.kernel.org
---

V2: Fixup build-bot complaints

 drivers/crypto/n2_core.c |   31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

--- a/drivers/crypto/n2_core.c
+++ b/drivers/crypto/n2_core.c
@@ -65,6 +65,11 @@ struct spu_queue {
 	struct list_head	list;
 };
 
+struct spu_qreg {
+	struct spu_queue	*queue;
+	unsigned long		type;
+};
+
 static struct spu_queue **cpu_to_cwq;
 static struct spu_queue **cpu_to_mau;
 
@@ -1631,31 +1636,27 @@ static void queue_cache_destroy(void)
 	kmem_cache_destroy(queue_cache[HV_NCS_QTYPE_CWQ - 1]);
 }
 
-static int spu_queue_register(struct spu_queue *p, unsigned long q_type)
+static long spu_queue_register_workfn(void *arg)
 {
-	cpumask_var_t old_allowed;
+	struct spu_qreg *qr = arg;
+	struct spu_queue *p = qr->queue;
+	unsigned long q_type = qr->type;
 	unsigned long hv_ret;
 
-	if (cpumask_empty(&p->sharing))
-		return -EINVAL;
-
-	if (!alloc_cpumask_var(&old_allowed, GFP_KERNEL))
-		return -ENOMEM;
-
-	cpumask_copy(old_allowed, &current->cpus_allowed);
-
-	set_cpus_allowed_ptr(current, &p->sharing);
-
 	hv_ret = sun4v_ncs_qconf(q_type, __pa(p->q),
 				 CWQ_NUM_ENTRIES, &p->qhandle);
 	if (!hv_ret)
 		sun4v_ncs_sethead_marker(p->qhandle, 0);
 
-	set_cpus_allowed_ptr(current, old_allowed);
+	return hv_ret ? -EINVAL : 0;
+}
 
-	free_cpumask_var(old_allowed);
+static int spu_queue_register(struct spu_queue *p, unsigned long q_type)
+{
+	int cpu = cpumask_any_and(&p->sharing, cpu_online_mask);
+	struct spu_qreg qr = { .queue = p, .type = q_type };
 
-	return (hv_ret ? -EINVAL : 0);
+	return work_on_cpu_safe(cpu, spu_queue_register_workfn, &qr);
 }
 
 static int spu_queue_setup(struct spu_queue *p)

* [patch V3 12/13] cpufreq/sparc-us2e: Replace racy task affinity logic
  2017-04-13  8:19   ` [patch V2 " Thomas Gleixner
@ 2017-04-13  8:22     ` Thomas Gleixner
  2017-04-15 14:21       ` [tip:sched/core] " tip-bot for Thomas Gleixner
  2017-04-13 14:50     ` [patch V2 12/13] " David Miller
  1 sibling, 1 reply; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-13  8:22 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar, linux-pm

The access to the HBIRD_ESTAR_MODE register in the cpu frequency control
functions must happen on the target CPU. This is achieved by temporarily
setting the affinity of the calling user space thread to the requested CPU
and resetting it to the original affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread resulting in code executing on the wrong CPU and overwriting the
new affinity setting.

Replace it with a straightforward SMP function call.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: linux-pm@vger.kernel.org
---

V3: Include the proper patch
V2: Fix build-bot complaint (-ENOTAWAKEYET / -EWRONGPATCH)
 
 drivers/cpufreq/sparc-us2e-cpufreq.c |   45 ++++++++++++++++-------------------
 1 file changed, 21 insertions(+), 24 deletions(-)

Index: b/drivers/cpufreq/sparc-us2e-cpufreq.c
===================================================================
--- a/drivers/cpufreq/sparc-us2e-cpufreq.c
+++ b/drivers/cpufreq/sparc-us2e-cpufreq.c
@@ -118,10 +118,6 @@ static void us2e_transition(unsigned lon
 			    unsigned long clock_tick,
 			    unsigned long old_divisor, unsigned long divisor)
 {
-	unsigned long flags;
-
-	local_irq_save(flags);
-
 	estar &= ~ESTAR_MODE_DIV_MASK;
 
 	/* This is based upon the state transition diagram in the IIe manual.  */
@@ -152,8 +148,6 @@ static void us2e_transition(unsigned lon
 	} else {
 		BUG();
 	}
-
-	local_irq_restore(flags);
 }
 
 static unsigned long index_to_estar_mode(unsigned int index)
@@ -229,48 +223,51 @@ static unsigned long estar_to_divisor(un
 	return ret;
 }
 
+static void __us2e_freq_get(void *arg)
+{
+	unsigned long *estar = arg;
+
+	*estar = read_hbreg(HBIRD_ESTAR_MODE_ADDR);
+}
+
 static unsigned int us2e_freq_get(unsigned int cpu)
 {
-	cpumask_t cpus_allowed;
 	unsigned long clock_tick, estar;
 
-	cpumask_copy(&cpus_allowed, &current->cpus_allowed);
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
-
 	clock_tick = sparc64_get_clock_tick(cpu) / 1000;
-	estar = read_hbreg(HBIRD_ESTAR_MODE_ADDR);
-
-	set_cpus_allowed_ptr(current, &cpus_allowed);
+	if (smp_call_function_single(cpu, __us2e_freq_get, &estar, 1))
+		return 0;
 
 	return clock_tick / estar_to_divisor(estar);
 }
 
-static int us2e_freq_target(struct cpufreq_policy *policy, unsigned int index)
+static void __us2e_freq_target(void *arg)
 {
-	unsigned int cpu = policy->cpu;
+	unsigned int cpu = smp_processor_id();
+	unsigned int *index = arg;
 	unsigned long new_bits, new_freq;
 	unsigned long clock_tick, divisor, old_divisor, estar;
-	cpumask_t cpus_allowed;
-
-	cpumask_copy(&cpus_allowed, &current->cpus_allowed);
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
 
 	new_freq = clock_tick = sparc64_get_clock_tick(cpu) / 1000;
-	new_bits = index_to_estar_mode(index);
-	divisor = index_to_divisor(index);
+	new_bits = index_to_estar_mode(*index);
+	divisor = index_to_divisor(*index);
 	new_freq /= divisor;
 
 	estar = read_hbreg(HBIRD_ESTAR_MODE_ADDR);
 
 	old_divisor = estar_to_divisor(estar);
 
-	if (old_divisor != divisor)
+	if (old_divisor != divisor) {
 		us2e_transition(estar, new_bits, clock_tick * 1000,
 				old_divisor, divisor);
+	}
+}
 
-	set_cpus_allowed_ptr(current, &cpus_allowed);
+static int us2e_freq_target(struct cpufreq_policy *policy, unsigned int index)
+{
+	unsigned int cpu = policy->cpu;
 
-	return 0;
+	return smp_call_function_single(cpu, __us2e_freq_target, &index, 1);
 }
 
 static int __init us2e_freq_cpu_init(struct cpufreq_policy *policy)

* Re: [patch 00/13] sched/treewide: Clean up various racy task affinity issues
  2017-04-12 20:07 [patch 00/13] sched/treewide: Clean up various racy task affinity issues Thomas Gleixner
                   ` (12 preceding siblings ...)
  2017-04-12 20:07 ` [patch 13/13] crypto: n2 - " Thomas Gleixner
@ 2017-04-13  9:02 ` Peter Zijlstra
  13 siblings, 0 replies; 56+ messages in thread
From: Peter Zijlstra @ 2017-04-13  9:02 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Sebastian Siewior, Benjamin Herrenschmidt,
	David S. Miller, Fenghua Yu, Herbert Xu, Lai Jiangshan,
	Len Brown, Michael Ellerman, Rafael J. Wysocki, Tejun Heo,
	Tony Luck, Viresh Kumar

On Wed, Apr 12, 2017 at 10:07:26PM +0200, Thomas Gleixner wrote:
> While dealing with the fallout of the scheduler cleanups on RT, we found
> several racy usage sites of the following scheme:
> 
> 	cpumask_copy(&save_cpus_allowed, &current->cpus_allowed);
> 	set_cpus_allowed_ptr(current, cpumask_of(cpu));
> 	do_stuff();
> 	set_cpus_allowed_ptr(current, &save_cpus_allowed);
> 
> That's racy in two aspects:
> 
> 1) Nothing prevents the CPU from being unplugged after the temporary
>    affinity setting is in place. This results in code being executed on the
>    wrong CPU(s).
> 
> 2) Nothing prevents a concurrent affinity setting from user space. That
>    also results in code being executed on the wrong CPU(s) and the restore
>    of the previous affinity setting overwrites the new one.
> 
> Various variants of cleanups:
> 
>  - Removal, because the calling thread is already guaranteed to run on the
>    correct CPU.
> 
>  - Conversion to smp function calls (simple register read/write)
> 
>  - Conversion to work_on_cpu(). There were even files containing comments
>    to that effect.
> 
>  - The rest needs separate hotplug protection for work_on_cpu(). To avoid open
>    coding the
> 
> 	get_online_cpus();
> 	if (cpu_online(cpu))
> 		ret = do_stuff();
> 	else
> 		ret = -ENODEV;
> 	put_online_cpus();
> 
>    scheme this series provides a new helper function work_on_cpu_safe()
>    which implements the above.
> 
> Aside from fixing these races, this allows restricting the access to
> current->cpus_allowed with a follow-up series.

Looks good, thanks for tackling these!

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [patch 02/13] workqueue: Provide work_on_cpu_safe()
  2017-04-12 20:07 ` [patch 02/13] workqueue: Provide work_on_cpu_safe() Thomas Gleixner
@ 2017-04-13 11:11   ` Dou Liyang
  2017-04-13 21:28     ` Thomas Gleixner
  2017-04-14  4:18   ` Tejun Heo
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 56+ messages in thread
From: Dou Liyang @ 2017-04-13 11:11 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar



At 04/13/2017 04:07 AM, Thomas Gleixner wrote:
> work_on_cpu() is not protected against CPU hotplug. Code which must
> either be executed on an online CPU or fail if the CPU is not
> available would have to protect the callsite against CPU hotplug.
>
> Provide a function which does get/put_online_cpus() around the call to
> work_on_cpu() and fails the call with -ENODEV if the target CPU is not
> online.
>
> Preparatory patch to convert several racy task affinity manipulations.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> ---
>  include/linux/workqueue.h |    5 +++++
>  kernel/workqueue.c        |   23 +++++++++++++++++++++++
>  2 files changed, 28 insertions(+)
>
> --- a/include/linux/workqueue.h
> +++ b/include/linux/workqueue.h
> @@ -608,8 +608,13 @@ static inline long work_on_cpu(int cpu,
>  {
>  	return fn(arg);
>  }
> +static inline long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg)
> +{
> +	return fn(arg);
> +}
>  #else
>  long work_on_cpu(int cpu, long (*fn)(void *), void *arg);
> +long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg);
>  #endif /* CONFIG_SMP */
>
>  #ifdef CONFIG_FREEZER
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -4735,6 +4735,29 @@ long work_on_cpu(int cpu, long (*fn)(voi
>  	return wfc.ret;
>  }
>  EXPORT_SYMBOL_GPL(work_on_cpu);
> +
> +/**

There is an extra '*' :)

Thanks,
	Liyang

> + * work_on_cpu_safe - run a function in thread context on a particular cpu
> + * @cpu: the cpu to run on
> + * @fn:  the function to run
> + * @arg: the function argument
> + *
> + * Disables CPU hotplug and calls work_on_cpu(). The caller must not hold
> + * any locks which would prevent @fn from completing.
> + *
> + * Return: The value @fn returns.
> + */
> +long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg)
> +{
> +	long ret = -ENODEV;
> +
> +	get_online_cpus();
> +	if (cpu_online(cpu))
> +		ret = work_on_cpu(cpu, fn, arg);
> +	put_online_cpus();
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(work_on_cpu_safe);
>  #endif /* CONFIG_SMP */
>
>  #ifdef CONFIG_FREEZER

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [patch 08/13] ACPI/processor: Replace racy task affinity logic.
  2017-04-12 20:07 ` [patch 08/13] ACPI/processor: Replace racy task affinity logic Thomas Gleixner
@ 2017-04-13 11:39   ` Peter Zijlstra
  2017-04-13 12:01     ` Thomas Gleixner
  2017-04-15 14:19   ` [tip:sched/core] " tip-bot for Thomas Gleixner
  1 sibling, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2017-04-13 11:39 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Sebastian Siewior, Benjamin Herrenschmidt,
	David S. Miller, Fenghua Yu, Herbert Xu, Lai Jiangshan,
	Len Brown, Michael Ellerman, Rafael J. Wysocki, Tejun Heo,
	Tony Luck, Viresh Kumar, linux-acpi

On Wed, Apr 12, 2017 at 10:07:34PM +0200, Thomas Gleixner wrote:
> acpi_processor_get_throttling() needs to invoke the getter function on
> the target CPU. This is achieved by temporarily setting the affinity of
> the calling user space thread to the requested CPU and resetting it to
> the original affinity afterwards.
> 
> That's racy vs. CPU hotplug and concurrent affinity settings for that
> thread resulting in code executing on the wrong CPU and overwriting the
> new affinity setting.
> 
> acpi_processor_get_throttling() is invoked in two ways:
> 
> 1) The CPU online callback, which is already running on the target CPU and
>    obviously protected against hotplug and not affected by affinity
>    settings.
> 
> 2) The ACPI driver probe function, which is not protected against hotplug
>    during modprobe.
> 
> Switch it over to work_on_cpu() and protect the probe function against CPU
> hotplug.
> 

> +static int acpi_processor_get_throttling(struct acpi_processor *pr)
> +{
>  	if (!pr)
>  		return -EINVAL;
>  
>  	if (!pr->flags.throttling)
>  		return -ENODEV;
>  
> +	 * This is either called from the CPU hotplug callback of
> +	 * processor_driver or via the ACPI probe function. In the latter
> +	 * case the CPU is not guaranteed to be online. Both call sites are
> +	 * protected against CPU hotplug.
>  	 */
> +	if (!cpu_online(pr->id))
>  		return -ENODEV;
>  
> +	return work_on_cpu(pr->id, __acpi_processor_get_throttling, pr);
>  }

That makes my machine sad...


[    9.583030] =============================================
[    9.589053] [ INFO: possible recursive locking detected ]
[    9.595079] 4.11.0-rc6-00385-g5aee78a-dirty #678 Not tainted
[    9.601393] ---------------------------------------------
[    9.607418] kworker/0:0/3 is trying to acquire lock:
[    9.612954]  ((&wfc.work)){+.+.+.}, at: [<ffffffff8110c172>] flush_work+0x12/0x2a0
[    9.621406] 
[    9.621406] but task is already holding lock:
[    9.627915]  ((&wfc.work)){+.+.+.}, at: [<ffffffff8110df17>] process_one_work+0x1e7/0x670
[    9.637044] 
[    9.637044] other info that might help us debug this:
[    9.644330]  Possible unsafe locking scenario:
[    9.644330] 
[    9.650934]        CPU0
[    9.653660]        ----
[    9.656386]   lock((&wfc.work));
[    9.659987]   lock((&wfc.work));
[    9.663586] 
[    9.663586]  *** DEADLOCK ***
[    9.663586] 
[    9.670189]  May be due to missing lock nesting notation
[    9.670189] 
[    9.677765] 2 locks held by kworker/0:0/3:
[    9.682332]  #0:  ("events"){.+.+.+}, at: [<ffffffff8110df17>] process_one_work+0x1e7/0x670
[    9.691654]  #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff8110df17>] process_one_work+0x1e7/0x670
[    9.701267] 
[    9.701267] stack backtrace:
[    9.706127] CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.11.0-rc6-00385-g5aee78a-dirty #678
[    9.715545] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[    9.726999] Workqueue: events work_for_cpu_fn
[    9.731860] Call Trace:
[    9.734591]  dump_stack+0x86/0xcf
[    9.738290]  __lock_acquire+0x790/0x1620
[    9.742667]  ? __lock_acquire+0x4a5/0x1620
[    9.747237]  lock_acquire+0x100/0x210
[    9.751319]  ? lock_acquire+0x100/0x210
[    9.755596]  ? flush_work+0x12/0x2a0
[    9.759583]  flush_work+0x47/0x2a0
[    9.763375]  ? flush_work+0x12/0x2a0
[    9.767362]  ? queue_work_on+0x47/0xa0
[    9.771545]  ? __this_cpu_preempt_check+0x13/0x20
[    9.776792]  ? trace_hardirqs_on_caller+0xfb/0x1d0
[    9.782139]  ? trace_hardirqs_on+0xd/0x10
[    9.786610]  work_on_cpu+0x82/0x90
[    9.790404]  ? __usermodehelper_disable+0x110/0x110
[    9.795846]  ? __acpi_processor_get_throttling+0x20/0x20
[    9.801773]  acpi_processor_set_throttling+0x199/0x220
[    9.807506]  ? trace_hardirqs_on_caller+0xfb/0x1d0
[    9.812851]  acpi_processor_get_throttling_ptc+0xec/0x180
[    9.818876]  __acpi_processor_get_throttling+0xf/0x20
[    9.824511]  work_for_cpu_fn+0x14/0x20
[    9.828692]  process_one_work+0x261/0x670
[    9.833165]  worker_thread+0x21b/0x3f0
[    9.837348]  kthread+0x108/0x140
[    9.840947]  ? process_one_work+0x670/0x670
[    9.845611]  ? kthread_create_on_node+0x40/0x40
[    9.850667]  ret_from_fork+0x31/0x40

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [patch 08/13] ACPI/processor: Replace racy task affinity logic.
  2017-04-13 11:39   ` Peter Zijlstra
@ 2017-04-13 12:01     ` Thomas Gleixner
  2017-04-13 12:52       ` Peter Zijlstra
  0 siblings, 1 reply; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-13 12:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Ingo Molnar, Sebastian Siewior, Benjamin Herrenschmidt,
	David S. Miller, Fenghua Yu, Herbert Xu, Lai Jiangshan,
	Len Brown, Michael Ellerman, Rafael J. Wysocki, Tejun Heo,
	Tony Luck, Viresh Kumar, linux-acpi

On Thu, 13 Apr 2017, Peter Zijlstra wrote:
> 
> That makes my machine sad...
> [    9.786610]  work_on_cpu+0x82/0x90
> [    9.790404]  ? __usermodehelper_disable+0x110/0x110
> [    9.795846]  ? __acpi_processor_get_throttling+0x20/0x20
> [    9.801773]  acpi_processor_set_throttling+0x199/0x220
> [    9.807506]  ? trace_hardirqs_on_caller+0xfb/0x1d0
> [    9.812851]  acpi_processor_get_throttling_ptc+0xec/0x180
> [    9.818876]  __acpi_processor_get_throttling+0xf/0x20
> [    9.824511]  work_for_cpu_fn+0x14/0x20
> [    9.828692]  process_one_work+0x261/0x670
> [    9.833165]  worker_thread+0x21b/0x3f0
> [    9.837348]  kthread+0x108/0x140
> [    9.840947]  ? process_one_work+0x670/0x670
> [    9.845611]  ? kthread_create_on_node+0x40/0x40
> [    9.850667]  ret_from_fork+0x31/0x40

Yuck. So the call chain is:

acpi_processor_get_throttling()
  work_on_cpu(acpi_processor_get_throttling)

That work does:

__acpi_processor_get_throttling()
    acpi_processor_get_throttling_ptc()
      acpi_processor_set_throttling()
        work_on_cpu(__acpi_processor_set_throttling)

Why the heck does a get_throttling() function call set_throttling()? I'm
mildly surprised.

Does the delta patch below cure the problem?

Thanks,

	tglx

8<--------------

--- a/drivers/acpi/processor_throttling.c
+++ b/drivers/acpi/processor_throttling.c
@@ -62,8 +62,8 @@ struct acpi_processor_throttling_arg {
 #define THROTTLING_POSTCHANGE      (2)
 
 static int acpi_processor_get_throttling(struct acpi_processor *pr);
-int acpi_processor_set_throttling(struct acpi_processor *pr,
-						int state, bool force);
+static int __acpi_processor_set_throttling(struct acpi_processor *pr,
+					   int state, bool force, bool direct);
 
 static int acpi_processor_update_tsd_coord(void)
 {
@@ -891,7 +891,8 @@ static int acpi_processor_get_throttling
 			ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 				"Invalid throttling state, reset\n"));
 			state = 0;
-			ret = acpi_processor_set_throttling(pr, state, true);
+			ret = __acpi_processor_set_throttling(pr, state, true,
+							      true);
 			if (ret)
 				return ret;
 		}
@@ -1075,8 +1076,15 @@ static long acpi_processor_throttling_fn
 			arg->target_state, arg->force);
 }
 
-int acpi_processor_set_throttling(struct acpi_processor *pr,
-						int state, bool force)
+static int call_on_cpu(int cpu, long (*fn)(void *), void *arg, bool direct)
+{
+	if (direct)
+		return fn(arg);
+	return work_on_cpu(cpu, fn, arg);
+}
+
+static int __acpi_processor_set_throttling(struct acpi_processor *pr,
+					   int state, bool force, bool direct)
 {
 	int ret = 0;
 	unsigned int i;
@@ -1125,7 +1133,8 @@ int acpi_processor_set_throttling(struct
 		arg.pr = pr;
 		arg.target_state = state;
 		arg.force = force;
-		ret = work_on_cpu(pr->id, acpi_processor_throttling_fn, &arg);
+		ret = call_on_cpu(pr->id, acpi_processor_throttling_fn, &arg,
+				  direct);
 	} else {
 		/*
 		 * When the T-state coordination is SW_ALL or HW_ALL,
@@ -1158,8 +1167,8 @@ int acpi_processor_set_throttling(struct
 			arg.pr = match_pr;
 			arg.target_state = state;
 			arg.force = force;
-			ret = work_on_cpu(pr->id, acpi_processor_throttling_fn,
-				&arg);
+			ret = call_on_cpu(pr->id, acpi_processor_throttling_fn,
+					  &arg, direct);
 		}
 	}
 	/*
@@ -1177,6 +1186,12 @@ int acpi_processor_set_throttling(struct
 	return ret;
 }
 
+int acpi_processor_set_throttling(struct acpi_processor *pr, int state,
+				  bool force)
+{
+	return __acpi_processor_set_throttling(pr, state, force, false);
+}
+
 int acpi_processor_get_throttling_info(struct acpi_processor *pr)
 {
 	int result = 0;

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [patch 08/13] ACPI/processor: Replace racy task affinity logic.
  2017-04-13 12:01     ` Thomas Gleixner
@ 2017-04-13 12:52       ` Peter Zijlstra
  0 siblings, 0 replies; 56+ messages in thread
From: Peter Zijlstra @ 2017-04-13 12:52 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Sebastian Siewior, Benjamin Herrenschmidt,
	David S. Miller, Fenghua Yu, Herbert Xu, Lai Jiangshan,
	Len Brown, Michael Ellerman, Rafael J. Wysocki, Tejun Heo,
	Tony Luck, Viresh Kumar, linux-acpi

On Thu, Apr 13, 2017 at 02:01:42PM +0200, Thomas Gleixner wrote:
> On Thu, 13 Apr 2017, Peter Zijlstra wrote:
> > 
> > That makes my machine sad...
> > [    9.786610]  work_on_cpu+0x82/0x90
> > [    9.790404]  ? __usermodehelper_disable+0x110/0x110
> > [    9.795846]  ? __acpi_processor_get_throttling+0x20/0x20
> > [    9.801773]  acpi_processor_set_throttling+0x199/0x220
> > [    9.807506]  ? trace_hardirqs_on_caller+0xfb/0x1d0
> > [    9.812851]  acpi_processor_get_throttling_ptc+0xec/0x180
> > [    9.818876]  __acpi_processor_get_throttling+0xf/0x20
> > [    9.824511]  work_for_cpu_fn+0x14/0x20
> > [    9.828692]  process_one_work+0x261/0x670
> > [    9.833165]  worker_thread+0x21b/0x3f0
> > [    9.837348]  kthread+0x108/0x140
> > [    9.840947]  ? process_one_work+0x670/0x670
> > [    9.845611]  ? kthread_create_on_node+0x40/0x40
> > [    9.850667]  ret_from_fork+0x31/0x40
> 
> Yuck. So the call chain is:
> 
> acpi_processor_get_throttling()
>   work_on_cpu(acpi_processor_get_throttling)
> 
> That work does:
> 
> __acpi_processor_get_throttling()
>     acpi_processor_get_throttling_ptc()
>       acpi_processor_set_throttling()
>         work_on_cpu(__acpi_processor_set_throttling)
> 
> Why the heck does a get_throttling() function call set_throttling()? I'm
> mildly surprised.
> 
> Does the delta patch below cure the problem?

Yes, aside from a compile warn.

But that's disgusting... then again, you merely reintroduced this hack.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [patch V2 12/13] cpufreq/sparc-us2e: Replace racy task affinity logic
  2017-04-13  8:19   ` [patch V2 " Thomas Gleixner
  2017-04-13  8:22     ` [patch V3 " Thomas Gleixner
@ 2017-04-13 14:50     ` David Miller
  1 sibling, 0 replies; 56+ messages in thread
From: David Miller @ 2017-04-13 14:50 UTC (permalink / raw)
  To: tglx
  Cc: linux-kernel, peterz, mingo, bigeasy, benh, fenghua.yu, herbert,
	jiangshanlai, lenb, mpe, rjw, tj, tony.luck, viresh.kumar,
	linux-pm

From: Thomas Gleixner <tglx@linutronix.de>
Date: Thu, 13 Apr 2017 10:19:07 +0200 (CEST)

> spu_queue_register() needs to invoke setup functions on a particular
> CPU. This is achieved by temporarily setting the affinity of the
> calling user space thread to the requested CPU and resetting it to the
> original affinity afterwards.
> 
> That's racy vs. CPU hotplug and concurrent affinity settings for that
> thread resulting in code executing on the wrong CPU and overwriting the
> new affinity setting.
> 
> Replace it by using work_on_cpu_safe() which guarantees to run the code on
> the requested CPU or to fail in case the CPU is offline.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Acked-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [patch V2 13/13] crypto: n2 - Replace racy task affinity logic
  2017-04-13  8:20   ` [patch V2 " Thomas Gleixner
@ 2017-04-13 14:51     ` David Miller
  2017-04-15 14:22     ` [tip:sched/core] crypto: N2 " tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 56+ messages in thread
From: David Miller @ 2017-04-13 14:51 UTC (permalink / raw)
  To: tglx
  Cc: linux-kernel, peterz, mingo, bigeasy, benh, fenghua.yu, herbert,
	jiangshanlai, lenb, mpe, rjw, tj, tony.luck, viresh.kumar,
	linux-crypto

From: Thomas Gleixner <tglx@linutronix.de>
Date: Thu, 13 Apr 2017 10:20:23 +0200 (CEST)

> spu_queue_register() needs to invoke setup functions on a particular
> CPU. This is achieved by temporarily setting the affinity of the
> calling user space thread to the requested CPU and resetting it to the
> original affinity afterwards.
> 
> That's racy vs. CPU hotplug and concurrent affinity settings for that
> thread resulting in code executing on the wrong CPU and overwriting the
> new affinity setting.
> 
> Replace it by using work_on_cpu_safe() which guarantees to run the code on
> the requested CPU or to fail in case the CPU is offline.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Acked-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [patch 02/13] workqueue: Provide work_on_cpu_safe()
  2017-04-13 11:11   ` Dou Liyang
@ 2017-04-13 21:28     ` Thomas Gleixner
  0 siblings, 0 replies; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-13 21:28 UTC (permalink / raw)
  To: Dou Liyang
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tejun Heo, Tony Luck, Viresh Kumar

On Thu, 13 Apr 2017, Dou Liyang wrote:
> At 04/13/2017 04:07 AM, Thomas Gleixner wrote:
> > +
> > +/**
> 
> There is an extra '*' :)

On purpose. It's the beginning of a kernel-doc format comment for
interface documentation.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [patch 02/13] workqueue: Provide work_on_cpu_safe()
  2017-04-12 20:07 ` [patch 02/13] workqueue: Provide work_on_cpu_safe() Thomas Gleixner
  2017-04-13 11:11   ` Dou Liyang
@ 2017-04-14  4:18   ` Tejun Heo
  2017-04-14  8:54   ` Peter Zijlstra
  2017-04-15 14:16   ` [tip:sched/core] " tip-bot for Thomas Gleixner
  3 siblings, 0 replies; 56+ messages in thread
From: Tejun Heo @ 2017-04-14  4:18 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Sebastian Siewior,
	Benjamin Herrenschmidt, David S. Miller, Fenghua Yu, Herbert Xu,
	Lai Jiangshan, Len Brown, Michael Ellerman, Rafael J. Wysocki,
	Tony Luck, Viresh Kumar

On Wed, Apr 12, 2017 at 10:07:28PM +0200, Thomas Gleixner wrote:
> work_on_cpu() is not protected against CPU hotplug. Code which must
> either be executed on an online CPU or fail if the CPU is not
> available would have to protect the callsite against CPU hotplug.
> 
> Provide a function which does get/put_online_cpus() around the call to
> work_on_cpu() and fails the call with -ENODEV if the target CPU is not
> online.
> 
> Preparatory patch to convert several racy task affinity manipulations.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [patch 02/13] workqueue: Provide work_on_cpu_safe()
  2017-04-12 20:07 ` [patch 02/13] workqueue: Provide work_on_cpu_safe() Thomas Gleixner
  2017-04-13 11:11   ` Dou Liyang
  2017-04-14  4:18   ` Tejun Heo
@ 2017-04-14  8:54   ` Peter Zijlstra
  2017-04-14  9:51     ` Thomas Gleixner
  2017-04-15 14:16   ` [tip:sched/core] " tip-bot for Thomas Gleixner
  3 siblings, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2017-04-14  8:54 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Sebastian Siewior, Benjamin Herrenschmidt,
	David S. Miller, Fenghua Yu, Herbert Xu, Lai Jiangshan,
	Len Brown, Michael Ellerman, Rafael J. Wysocki, Tejun Heo,
	Tony Luck, Viresh Kumar

On Wed, Apr 12, 2017 at 10:07:28PM +0200, Thomas Gleixner wrote:
> +long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg)
> +{
> +	long ret = -ENODEV;
> +
> +	get_online_cpus();
> +	if (cpu_online(cpu))
> +		ret = work_on_cpu(cpu, fn, arg);
> +	put_online_cpus();
> +	return ret;
> +}

But doesn't workqueue have this lovely 'feature' where it will unbind
per-cpu work and run it on random CPUs when hotplug happens?

That is, I think you need a flush_work() before put_online_cpus() if you
want to guarantee anything.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [patch 02/13] workqueue: Provide work_on_cpu_safe()
  2017-04-14  8:54   ` Peter Zijlstra
@ 2017-04-14  9:51     ` Thomas Gleixner
  2017-04-14  9:56       ` Peter Zijlstra
  0 siblings, 1 reply; 56+ messages in thread
From: Thomas Gleixner @ 2017-04-14  9:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Ingo Molnar, Sebastian Siewior, Benjamin Herrenschmidt,
	David S. Miller, Fenghua Yu, Herbert Xu, Lai Jiangshan,
	Len Brown, Michael Ellerman, Rafael J. Wysocki, Tejun Heo,
	Tony Luck, Viresh Kumar

On Fri, 14 Apr 2017, Peter Zijlstra wrote:

> On Wed, Apr 12, 2017 at 10:07:28PM +0200, Thomas Gleixner wrote:
> > +long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg)
> > +{
> > +	long ret = -ENODEV;
> > +
> > +	get_online_cpus();
> > +	if (cpu_online(cpu))
> > +		ret = work_on_cpu(cpu, fn, arg);
> > +	put_online_cpus();
> > +	return ret;
> > +}
> 
> > But doesn't workqueue have this lovely 'feature' where it will unbind
> per-cpu work and run it on random CPUs when hotplug happens?
> 
> That is, I think you need a flush_work() before put_online_cpus() if you
> want to guarantee anything.

work_on_cpu() is synchronous, it flushes already.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [patch 02/13] workqueue: Provide work_on_cpu_safe()
  2017-04-14  9:51     ` Thomas Gleixner
@ 2017-04-14  9:56       ` Peter Zijlstra
  0 siblings, 0 replies; 56+ messages in thread
From: Peter Zijlstra @ 2017-04-14  9:56 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Sebastian Siewior, Benjamin Herrenschmidt,
	David S. Miller, Fenghua Yu, Herbert Xu, Lai Jiangshan,
	Len Brown, Michael Ellerman, Rafael J. Wysocki, Tejun Heo,
	Tony Luck, Viresh Kumar

On Fri, Apr 14, 2017 at 11:51:07AM +0200, Thomas Gleixner wrote:
> On Fri, 14 Apr 2017, Peter Zijlstra wrote:
> 
> > On Wed, Apr 12, 2017 at 10:07:28PM +0200, Thomas Gleixner wrote:
> > > +long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg)
> > > +{
> > > +	long ret = -ENODEV;
> > > +
> > > +	get_online_cpus();
> > > +	if (cpu_online(cpu))
> > > +		ret = work_on_cpu(cpu, fn, arg);
> > > +	put_online_cpus();
> > > +	return ret;
> > > +}
> > 
> > But doesn't workqueue have this lovely 'feature' where it will unbind
> > per-cpu work and run it on random CPUs when hotplug happens?
> > 
> > That is, I think you need a flush_work() before put_online_cpus() if you
> > want to guarantee anything.
> 
> work_on_cpu() is synchronous, it flushes already.

So much for being awake ... 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [tip:sched/core] ia64/topology: Remove cpus_allowed manipulation
  2017-04-12 20:07 ` [patch 01/13] ia64/topology: Remove cpus_allowed manipulation Thomas Gleixner
@ 2017-04-15 14:15   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 56+ messages in thread
From: tip-bot for Thomas Gleixner @ 2017-04-15 14:15 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, viresh.kumar, bigeasy, tj, jiangshanlai, fenghua.yu,
	linux-kernel, peterz, lenb, tony.luck, davem, rjw, hpa, mingo,
	mpe, herbert, benh

Commit-ID:  048c9b954e20396e0c45ee778466994d1be2e612
Gitweb:     http://git.kernel.org/tip/048c9b954e20396e0c45ee778466994d1be2e612
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Wed, 12 Apr 2017 22:07:27 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 15 Apr 2017 12:20:52 +0200

ia64/topology: Remove cpus_allowed manipulation

The CPU hotplug callback fiddles with the cpus_allowed pointer to pin the
calling thread on the plugged CPU. That's already guaranteed by the hotplug
core code.

Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-ia64@vger.kernel.org
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/20170412201042.174518069@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/ia64/kernel/topology.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/arch/ia64/kernel/topology.c b/arch/ia64/kernel/topology.c
index 1a68f01..d76529c 100644
--- a/arch/ia64/kernel/topology.c
+++ b/arch/ia64/kernel/topology.c
@@ -355,18 +355,12 @@ static int cache_add_dev(unsigned int cpu)
 	unsigned long i, j;
 	struct cache_info *this_object;
 	int retval = 0;
-	cpumask_t oldmask;
 
 	if (all_cpu_cache_info[cpu].kobj.parent)
 		return 0;
 
-	oldmask = current->cpus_allowed;
-	retval = set_cpus_allowed_ptr(current, cpumask_of(cpu));
-	if (unlikely(retval))
-		return retval;
 
 	retval = cpu_cache_sysfs_init(cpu);
-	set_cpus_allowed_ptr(current, &oldmask);
 	if (unlikely(retval < 0))
 		return retval;
 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [tip:sched/core] workqueue: Provide work_on_cpu_safe()
  2017-04-12 20:07 ` [patch 02/13] workqueue: Provide work_on_cpu_safe() Thomas Gleixner
                     ` (2 preceding siblings ...)
  2017-04-14  8:54   ` Peter Zijlstra
@ 2017-04-15 14:16   ` tip-bot for Thomas Gleixner
  3 siblings, 0 replies; 56+ messages in thread
From: tip-bot for Thomas Gleixner @ 2017-04-15 14:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: benh, davem, lenb, tj, peterz, hpa, mpe, bigeasy, tglx,
	fenghua.yu, linux-kernel, rjw, tony.luck, herbert, mingo,
	jiangshanlai, viresh.kumar

Commit-ID:  0e8d6a9336b487a1dd6f1991ff376e669d4c87c6
Gitweb:     http://git.kernel.org/tip/0e8d6a9336b487a1dd6f1991ff376e669d4c87c6
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Wed, 12 Apr 2017 22:07:28 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 15 Apr 2017 12:20:53 +0200

workqueue: Provide work_on_cpu_safe()

work_on_cpu() is not protected against CPU hotplug. Code which must
either be executed on an online CPU or fail if the CPU is not available
would have to protect the callsite against CPU hotplug.

Provide a function which does get/put_online_cpus() around the call to
work_on_cpu() and fails the call with -ENODEV if the target CPU is not
online.

Preparatory patch to convert several racy task affinity manipulations.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/20170412201042.262610721@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 include/linux/workqueue.h |  5 +++++
 kernel/workqueue.c        | 23 +++++++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index bde063c..c102ef6 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -608,8 +608,13 @@ static inline long work_on_cpu(int cpu, long (*fn)(void *), void *arg)
 {
 	return fn(arg);
 }
+static inline long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg)
+{
+	return fn(arg);
+}
 #else
 long work_on_cpu(int cpu, long (*fn)(void *), void *arg);
+long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg);
 #endif /* CONFIG_SMP */
 
 #ifdef CONFIG_FREEZER
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index c0168b7..5bf1be0 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4735,6 +4735,29 @@ long work_on_cpu(int cpu, long (*fn)(void *), void *arg)
 	return wfc.ret;
 }
 EXPORT_SYMBOL_GPL(work_on_cpu);
+
+/**
+ * work_on_cpu_safe - run a function in thread context on a particular cpu
+ * @cpu: the cpu to run on
+ * @fn:  the function to run
+ * @arg: the function argument
+ *
+ * Disables CPU hotplug and calls work_on_cpu(). The caller must not hold
+ * any locks which would prevent @fn from completing.
+ *
+ * Return: The value @fn returns.
+ */
+long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg)
+{
+	long ret = -ENODEV;
+
+	get_online_cpus();
+	if (cpu_online(cpu))
+		ret = work_on_cpu(cpu, fn, arg);
+	put_online_cpus();
+	return ret;
+}
+EXPORT_SYMBOL_GPL(work_on_cpu_safe);
 #endif /* CONFIG_SMP */
 
 #ifdef CONFIG_FREEZER

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [tip:sched/core] ia64/salinfo: Replace racy task affinity logic
  2017-04-12 20:07 ` [patch 03/13] ia64/salinfo: Replace racy task affinity logic Thomas Gleixner
@ 2017-04-15 14:17   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 56+ messages in thread
From: tip-bot for Thomas Gleixner @ 2017-04-15 14:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: lenb, hpa, benh, herbert, mingo, jiangshanlai, rjw, peterz,
	viresh.kumar, bigeasy, tj, tglx, fenghua.yu, davem, linux-kernel,
	mpe, tony.luck

Commit-ID:  67cb85fdcee7fbc61c09c00360d1a4ae37641db4
Gitweb:     http://git.kernel.org/tip/67cb85fdcee7fbc61c09c00360d1a4ae37641db4
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Wed, 12 Apr 2017 22:07:29 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 15 Apr 2017 12:20:53 +0200

ia64/salinfo: Replace racy task affinity logic

Some of the file operations in /proc/sal require running code on the
requested CPU. This is achieved by temporarily setting the affinity of the
calling user space thread to the requested CPU and resetting it to the
original affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread resulting in code executing on the wrong CPU and overwriting the
new affinity setting.

Replace this with work_on_cpu_safe(), which guarantees to run the code on
the requested CPU or to fail if the CPU is offline.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-ia64@vger.kernel.org
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/20170412201042.341863457@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/ia64/kernel/salinfo.c | 31 ++++++++++++-------------------
 1 file changed, 12 insertions(+), 19 deletions(-)

diff --git a/arch/ia64/kernel/salinfo.c b/arch/ia64/kernel/salinfo.c
index d194d5c..63dc9cd 100644
--- a/arch/ia64/kernel/salinfo.c
+++ b/arch/ia64/kernel/salinfo.c
@@ -179,14 +179,14 @@ struct salinfo_platform_oemdata_parms {
 	const u8 *efi_guid;
 	u8 **oemdata;
 	u64 *oemdata_size;
-	int ret;
 };
 
-static void
+static long
 salinfo_platform_oemdata_cpu(void *context)
 {
 	struct salinfo_platform_oemdata_parms *parms = context;
-	parms->ret = salinfo_platform_oemdata(parms->efi_guid, parms->oemdata, parms->oemdata_size);
+
+	return salinfo_platform_oemdata(parms->efi_guid, parms->oemdata, parms->oemdata_size);
 }
 
 static void
@@ -380,16 +380,7 @@ salinfo_log_release(struct inode *inode, struct file *file)
 	return 0;
 }
 
-static void
-call_on_cpu(int cpu, void (*fn)(void *), void *arg)
-{
-	cpumask_t save_cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
-	(*fn)(arg);
-	set_cpus_allowed_ptr(current, &save_cpus_allowed);
-}
-
-static void
+static long
 salinfo_log_read_cpu(void *context)
 {
 	struct salinfo_data *data = context;
@@ -399,6 +390,7 @@ salinfo_log_read_cpu(void *context)
 	/* Clear corrected errors as they are read from SAL */
 	if (rh->severity == sal_log_severity_corrected)
 		ia64_sal_clear_state_info(data->type);
+	return 0;
 }
 
 static void
@@ -430,7 +422,7 @@ retry:
 	spin_unlock_irqrestore(&data_saved_lock, flags);
 
 	if (!data->saved_num)
-		call_on_cpu(cpu, salinfo_log_read_cpu, data);
+		work_on_cpu_safe(cpu, salinfo_log_read_cpu, data);
 	if (!data->log_size) {
 		data->state = STATE_NO_DATA;
 		cpumask_clear_cpu(cpu, &data->cpu_event);
@@ -459,11 +451,13 @@ salinfo_log_read(struct file *file, char __user *buffer, size_t count, loff_t *p
 	return simple_read_from_buffer(buffer, count, ppos, buf, bufsize);
 }
 
-static void
+static long
 salinfo_log_clear_cpu(void *context)
 {
 	struct salinfo_data *data = context;
+
 	ia64_sal_clear_state_info(data->type);
+	return 0;
 }
 
 static int
@@ -486,7 +480,7 @@ salinfo_log_clear(struct salinfo_data *data, int cpu)
 	rh = (sal_log_record_header_t *)(data->log_buffer);
 	/* Corrected errors have already been cleared from SAL */
 	if (rh->severity != sal_log_severity_corrected)
-		call_on_cpu(cpu, salinfo_log_clear_cpu, data);
+		work_on_cpu_safe(cpu, salinfo_log_clear_cpu, data);
 	/* clearing a record may make a new record visible */
 	salinfo_log_new_read(cpu, data);
 	if (data->state == STATE_LOG_RECORD) {
@@ -531,9 +525,8 @@ salinfo_log_write(struct file *file, const char __user *buffer, size_t count, lo
 				.oemdata = &data->oemdata,
 				.oemdata_size = &data->oemdata_size
 			};
-			call_on_cpu(cpu, salinfo_platform_oemdata_cpu, &parms);
-			if (parms.ret)
-				count = parms.ret;
+			count = work_on_cpu_safe(cpu, salinfo_platform_oemdata_cpu,
+						 &parms);
 		} else
 			data->oemdata_size = 0;
 	} else


* [tip:sched/core] ia64/sn/hwperf: Replace racy task affinity logic
  2017-04-12 20:53   ` [patch V 2 " Thomas Gleixner
@ 2017-04-15 14:17     ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 56+ messages in thread
From: tip-bot for Thomas Gleixner @ 2017-04-15 14:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jiangshanlai, tj, herbert, mingo, viresh.kumar, fenghua.yu, rjw,
	peterz, davem, tglx, linux-kernel, benh, mpe, bigeasy, lenb,
	tony.luck, hpa

Commit-ID:  9feb42ac88b516e378b9782e82b651ca5bed95c4
Gitweb:     http://git.kernel.org/tip/9feb42ac88b516e378b9782e82b651ca5bed95c4
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Thu, 6 Apr 2017 14:56:18 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 15 Apr 2017 12:20:53 +0200

ia64/sn/hwperf: Replace racy task affinity logic

sn_hwperf_op_cpu(), which is invoked from an ioctl, requires running code
on the requested CPU. This is achieved by temporarily setting the affinity
of the calling user space thread to the requested CPU and resetting it to
the original affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread resulting in code executing on the wrong CPU and overwriting the
new affinity setting.

Replace this with work_on_cpu_safe(), which guarantees to run the code on
the requested CPU or to fail if the CPU is offline.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-ia64@vger.kernel.org
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1704122251450.2548@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/ia64/sn/kernel/sn2/sn_hwperf.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/ia64/sn/kernel/sn2/sn_hwperf.c b/arch/ia64/sn/kernel/sn2/sn_hwperf.c
index 52704f1..55febd6 100644
--- a/arch/ia64/sn/kernel/sn2/sn_hwperf.c
+++ b/arch/ia64/sn/kernel/sn2/sn_hwperf.c
@@ -598,12 +598,17 @@ static void sn_hwperf_call_sal(void *info)
 	op_info->ret = r;
 }
 
+static long sn_hwperf_call_sal_work(void *info)
+{
+	sn_hwperf_call_sal(info);
+	return 0;
+}
+
 static int sn_hwperf_op_cpu(struct sn_hwperf_op_info *op_info)
 {
 	u32 cpu;
 	u32 use_ipi;
 	int r = 0;
-	cpumask_t save_allowed;
 	
 	cpu = (op_info->a->arg & SN_HWPERF_ARG_CPU_MASK) >> 32;
 	use_ipi = op_info->a->arg & SN_HWPERF_ARG_USE_IPI_MASK;
@@ -629,13 +634,9 @@ static int sn_hwperf_op_cpu(struct sn_hwperf_op_info *op_info)
 			/* use an interprocessor interrupt to call SAL */
 			smp_call_function_single(cpu, sn_hwperf_call_sal,
 				op_info, 1);
-		}
-		else {
-			/* migrate the task before calling SAL */ 
-			save_allowed = current->cpus_allowed;
-			set_cpus_allowed_ptr(current, cpumask_of(cpu));
-			sn_hwperf_call_sal(op_info);
-			set_cpus_allowed_ptr(current, &save_allowed);
+		} else {
+			/* Call on the target CPU */
+			work_on_cpu_safe(cpu, sn_hwperf_call_sal_work, op_info);
 		}
 	}
 	r = op_info->ret;


* [tip:sched/core] powerpc/smp: Replace open coded task affinity logic
  2017-04-12 20:07   ` Thomas Gleixner
  (?)
  (?)
@ 2017-04-15 14:18   ` tip-bot for Thomas Gleixner
  -1 siblings, 0 replies; 56+ messages in thread
From: tip-bot for Thomas Gleixner @ 2017-04-15 14:18 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: fenghua.yu, hpa, tj, mpe, bigeasy, herbert, viresh.kumar, lenb,
	jiangshanlai, tony.luck, rjw, mingo, davem, benh, peterz, tglx,
	paulus, linux-kernel

Commit-ID:  6d11b87d55eb75007a3721c2de5938f5bbf607fb
Gitweb:     http://git.kernel.org/tip/6d11b87d55eb75007a3721c2de5938f5bbf607fb
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Wed, 12 Apr 2017 22:07:31 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 15 Apr 2017 12:20:54 +0200

powerpc/smp: Replace open coded task affinity logic

The init task invokes smp_ops->setup_cpu() from smp_cpus_done(). The init
task can run on any online CPU at this point, but the setup_cpu() callback
must be invoked on the boot CPU. This is achieved by temporarily setting
the affinity of the init task to the boot CPU and resetting it to the
original affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread resulting in code executing on the wrong CPU and overwriting the
new affinity setting.

That's actually not a problem in this context as neither CPU hotplug nor
affinity settings can happen, but the access to task_struct::cpus_allowed
is about to be restricted.

Replace it with a call to work_on_cpu_safe() which achieves the same result.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/20170412201042.518053336@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/powerpc/kernel/smp.c | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 46f89e6..d68ed1f 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -787,24 +787,21 @@ static struct sched_domain_topology_level powerpc_topology[] = {
 	{ NULL, },
 };
 
-void __init smp_cpus_done(unsigned int max_cpus)
+static __init long smp_setup_cpu_workfn(void *data __always_unused)
 {
-	cpumask_var_t old_mask;
+	smp_ops->setup_cpu(boot_cpuid);
+	return 0;
+}
 
-	/* We want the setup_cpu() here to be called from CPU 0, but our
-	 * init thread may have been "borrowed" by another CPU in the meantime
-	 * se we pin us down to CPU 0 for a short while
+void __init smp_cpus_done(unsigned int max_cpus)
+{
+	/*
+	 * We want the setup_cpu() here to be called on the boot CPU, but
+	 * init might run on any CPU, so make sure it's invoked on the boot
+	 * CPU.
 	 */
-	alloc_cpumask_var(&old_mask, GFP_NOWAIT);
-	cpumask_copy(old_mask, &current->cpus_allowed);
-	set_cpus_allowed_ptr(current, cpumask_of(boot_cpuid));
-	
 	if (smp_ops && smp_ops->setup_cpu)
-		smp_ops->setup_cpu(boot_cpuid);
-
-	set_cpus_allowed_ptr(current, old_mask);
-
-	free_cpumask_var(old_mask);
+		work_on_cpu_safe(boot_cpuid, smp_setup_cpu_workfn, NULL);
 
 	if (smp_ops && smp_ops->bringup_done)
 		smp_ops->bringup_done();
@@ -812,7 +809,6 @@ void __init smp_cpus_done(unsigned int max_cpus)
 	dump_numa_cpu_topology();
 
 	set_sched_topology(powerpc_topology);
-
 }
 
 #ifdef CONFIG_HOTPLUG_CPU


* [tip:sched/core] sparc/sysfs: Replace racy task affinity logic
  2017-04-13  8:17       ` Thomas Gleixner
  (?)
@ 2017-04-15 14:18       ` tip-bot for Thomas Gleixner
  -1 siblings, 0 replies; 56+ messages in thread
From: tip-bot for Thomas Gleixner @ 2017-04-15 14:18 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: tglx, linux-kernel, mingo, hpa, davem

Commit-ID:  ea875ec94eafb858990f3fe9528501f983105653
Gitweb:     http://git.kernel.org/tip/ea875ec94eafb858990f3fe9528501f983105653
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Thu, 13 Apr 2017 10:17:07 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 15 Apr 2017 12:20:54 +0200

sparc/sysfs: Replace racy task affinity logic

The mmustat_enable sysfs file accessor functions must run code on the
target CPU. This is achieved by temporarily setting the affinity of the
calling user space thread to the requested CPU and resetting it to the
original affinity afterwards.

That's racy vs. concurrent affinity settings for that thread resulting in
code executing on the wrong CPU and overwriting the new affinity setting.

Replace this with work_on_cpu(), which guarantees to run the code on the
requested CPU.

Protection against CPU hotplug is not required, as the open sysfs file
already prevents the CPU offline callback from removing it. Using the
hotplug-protected version would actually be wrong, because it would
deadlock against an in-progress hotplug operation on the CPU associated
with the sysfs file.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: herbert@gondor.apana.org.au
Cc: rjw@rjwysocki.net
Cc: peterz@infradead.org
Cc: benh@kernel.crashing.org
Cc: bigeasy@linutronix.de
Cc: jiangshanlai@gmail.com
Cc: sparclinux@vger.kernel.org
Cc: viresh.kumar@linaro.org
Cc: mpe@ellerman.id.au
Cc: tj@kernel.org
Cc: lenb@kernel.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1704131001270.2408@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/sparc/kernel/sysfs.c | 39 +++++++++++----------------------------
 1 file changed, 11 insertions(+), 28 deletions(-)

diff --git a/arch/sparc/kernel/sysfs.c b/arch/sparc/kernel/sysfs.c
index d63fc61..5fd352b 100644
--- a/arch/sparc/kernel/sysfs.c
+++ b/arch/sparc/kernel/sysfs.c
@@ -98,27 +98,7 @@ static struct attribute_group mmu_stat_group = {
 	.name = "mmu_stats",
 };
 
-/* XXX convert to rusty's on_one_cpu */
-static unsigned long run_on_cpu(unsigned long cpu,
-			        unsigned long (*func)(unsigned long),
-				unsigned long arg)
-{
-	cpumask_t old_affinity;
-	unsigned long ret;
-
-	cpumask_copy(&old_affinity, &current->cpus_allowed);
-	/* should return -EINVAL to userspace */
-	if (set_cpus_allowed_ptr(current, cpumask_of(cpu)))
-		return 0;
-
-	ret = func(arg);
-
-	set_cpus_allowed_ptr(current, &old_affinity);
-
-	return ret;
-}
-
-static unsigned long read_mmustat_enable(unsigned long junk)
+static long read_mmustat_enable(void *data __maybe_unused)
 {
 	unsigned long ra = 0;
 
@@ -127,11 +107,11 @@ static unsigned long read_mmustat_enable(unsigned long junk)
 	return ra != 0;
 }
 
-static unsigned long write_mmustat_enable(unsigned long val)
+static long write_mmustat_enable(void *data)
 {
-	unsigned long ra, orig_ra;
+	unsigned long ra, orig_ra, *val = data;
 
-	if (val)
+	if (*val)
 		ra = __pa(&per_cpu(mmu_stats, smp_processor_id()));
 	else
 		ra = 0UL;
@@ -142,7 +122,8 @@ static unsigned long write_mmustat_enable(unsigned long val)
 static ssize_t show_mmustat_enable(struct device *s,
 				struct device_attribute *attr, char *buf)
 {
-	unsigned long val = run_on_cpu(s->id, read_mmustat_enable, 0);
+	long val = work_on_cpu(s->id, read_mmustat_enable, NULL);
+
 	return sprintf(buf, "%lx\n", val);
 }
 
@@ -150,13 +131,15 @@ static ssize_t store_mmustat_enable(struct device *s,
 			struct device_attribute *attr, const char *buf,
 			size_t count)
 {
-	unsigned long val, err;
-	int ret = sscanf(buf, "%lu", &val);
+	unsigned long val;
+	long err;
+	int ret;
 
+	ret = sscanf(buf, "%lu", &val);
 	if (ret != 1)
 		return -EINVAL;
 
-	err = run_on_cpu(s->id, write_mmustat_enable, val);
+	err = work_on_cpu(s->id, write_mmustat_enable, &val);
 	if (err)
 		return -EIO;
 


* [tip:sched/core] ACPI/processor: Fix error handling in __acpi_processor_start()
  2017-04-12 20:07 ` [patch 07/13] ACPI/processor: Fix error handling in __acpi_processor_start() Thomas Gleixner
@ 2017-04-15 14:19   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 56+ messages in thread
From: tip-bot for Thomas Gleixner @ 2017-04-15 14:19 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: fenghua.yu, herbert, lenb, benh, hpa, mingo, viresh.kumar,
	tony.luck, bigeasy, peterz, tj, mpe, jiangshanlai, davem,
	linux-kernel, tglx, rjw

Commit-ID:  a5cbdf693a60d5b86d4d21dfedd90f17754eb273
Gitweb:     http://git.kernel.org/tip/a5cbdf693a60d5b86d4d21dfedd90f17754eb273
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Wed, 12 Apr 2017 22:07:33 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 15 Apr 2017 12:20:54 +0200

ACPI/processor: Fix error handling in __acpi_processor_start()

When acpi_install_notify_handler() fails, the cooling device stays
registered, the sysfs files created via acpi_pss_perf_init() are leaked,
and the function returns success.

Undo acpi_pss_perf_init() and return a proper error code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-acpi@vger.kernel.org
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/20170412201042.695499645@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 drivers/acpi/processor_driver.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
index 9d5f0c7..eab8cda 100644
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -251,6 +251,9 @@ static int __acpi_processor_start(struct acpi_device *device)
 	if (ACPI_SUCCESS(status))
 		return 0;
 
+	result = -ENODEV;
+	acpi_pss_perf_exit(pr, device);
+
 err_power_exit:
 	acpi_processor_power_exit(pr);
 	return result;


* [tip:sched/core] ACPI/processor: Replace racy task affinity logic
  2017-04-12 20:07 ` [patch 08/13] ACPI/processor: Replace racy task affinity logic Thomas Gleixner
  2017-04-13 11:39   ` Peter Zijlstra
@ 2017-04-15 14:19   ` tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 56+ messages in thread
From: tip-bot for Thomas Gleixner @ 2017-04-15 14:19 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jiangshanlai, tj, mpe, benh, peterz, rjw, tglx, mingo, lenb, hpa,
	fenghua.yu, herbert, bigeasy, linux-kernel, davem, tony.luck,
	viresh.kumar

Commit-ID:  8153f9ac43897f9f4786b30badc134fcc1a4fb11
Gitweb:     http://git.kernel.org/tip/8153f9ac43897f9f4786b30badc134fcc1a4fb11
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Wed, 12 Apr 2017 22:07:34 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 15 Apr 2017 12:20:54 +0200

ACPI/processor: Replace racy task affinity logic

acpi_processor_get_throttling() requires invoking the getter function on
the target CPU. This is achieved by temporarily setting the affinity of the
calling user space thread to the requested CPU and resetting it to the
original affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread resulting in code executing on the wrong CPU and overwriting the
new affinity setting.

acpi_processor_get_throttling() is invoked in two ways:

1) The CPU online callback, which is already running on the target CPU and
   obviously protected against hotplug and not affected by affinity
   settings.

2) The ACPI driver probe function, which is not protected against hotplug
   during modprobe.

Switch it over to work_on_cpu() and protect the probe function against CPU
hotplug.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-acpi@vger.kernel.org
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/20170412201042.785920903@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 drivers/acpi/processor_driver.c     |  7 ++++-
 drivers/acpi/processor_throttling.c | 62 +++++++++++++++++++++----------------
 2 files changed, 42 insertions(+), 27 deletions(-)

diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
index eab8cda..8697a82 100644
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -262,11 +262,16 @@ err_power_exit:
 static int acpi_processor_start(struct device *dev)
 {
 	struct acpi_device *device = ACPI_COMPANION(dev);
+	int ret;
 
 	if (!device)
 		return -ENODEV;
 
-	return __acpi_processor_start(device);
+	/* Protect against concurrent CPU hotplug operations */
+	get_online_cpus();
+	ret = __acpi_processor_start(device);
+	put_online_cpus();
+	return ret;
 }
 
 static int acpi_processor_stop(struct device *dev)
diff --git a/drivers/acpi/processor_throttling.c b/drivers/acpi/processor_throttling.c
index a12f96c..3de34633 100644
--- a/drivers/acpi/processor_throttling.c
+++ b/drivers/acpi/processor_throttling.c
@@ -62,8 +62,8 @@ struct acpi_processor_throttling_arg {
 #define THROTTLING_POSTCHANGE      (2)
 
 static int acpi_processor_get_throttling(struct acpi_processor *pr);
-int acpi_processor_set_throttling(struct acpi_processor *pr,
-						int state, bool force);
+static int __acpi_processor_set_throttling(struct acpi_processor *pr,
+					   int state, bool force, bool direct);
 
 static int acpi_processor_update_tsd_coord(void)
 {
@@ -891,7 +891,8 @@ static int acpi_processor_get_throttling_ptc(struct acpi_processor *pr)
 			ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 				"Invalid throttling state, reset\n"));
 			state = 0;
-			ret = acpi_processor_set_throttling(pr, state, true);
+			ret = __acpi_processor_set_throttling(pr, state, true,
+							      true);
 			if (ret)
 				return ret;
 		}
@@ -901,36 +902,31 @@ static int acpi_processor_get_throttling_ptc(struct acpi_processor *pr)
 	return 0;
 }
 
-static int acpi_processor_get_throttling(struct acpi_processor *pr)
+static long __acpi_processor_get_throttling(void *data)
 {
-	cpumask_var_t saved_mask;
-	int ret;
+	struct acpi_processor *pr = data;
+
+	return pr->throttling.acpi_processor_get_throttling(pr);
+}
 
+static int acpi_processor_get_throttling(struct acpi_processor *pr)
+{
 	if (!pr)
 		return -EINVAL;
 
 	if (!pr->flags.throttling)
 		return -ENODEV;
 
-	if (!alloc_cpumask_var(&saved_mask, GFP_KERNEL))
-		return -ENOMEM;
-
 	/*
-	 * Migrate task to the cpu pointed by pr.
+	 * This is either called from the CPU hotplug callback of
+	 * processor_driver or via the ACPI probe function. In the latter
+	 * case the CPU is not guaranteed to be online. Both call sites are
+	 * protected against CPU hotplug.
 	 */
-	cpumask_copy(saved_mask, &current->cpus_allowed);
-	/* FIXME: use work_on_cpu() */
-	if (set_cpus_allowed_ptr(current, cpumask_of(pr->id))) {
-		/* Can't migrate to the target pr->id CPU. Exit */
-		free_cpumask_var(saved_mask);
+	if (!cpu_online(pr->id))
 		return -ENODEV;
-	}
-	ret = pr->throttling.acpi_processor_get_throttling(pr);
-	/* restore the previous state */
-	set_cpus_allowed_ptr(current, saved_mask);
-	free_cpumask_var(saved_mask);
 
-	return ret;
+	return work_on_cpu(pr->id, __acpi_processor_get_throttling, pr);
 }
 
 static int acpi_processor_get_fadt_info(struct acpi_processor *pr)
@@ -1080,8 +1076,15 @@ static long acpi_processor_throttling_fn(void *data)
 			arg->target_state, arg->force);
 }
 
-int acpi_processor_set_throttling(struct acpi_processor *pr,
-						int state, bool force)
+static int call_on_cpu(int cpu, long (*fn)(void *), void *arg, bool direct)
+{
+	if (direct)
+		return fn(arg);
+	return work_on_cpu(cpu, fn, arg);
+}
+
+static int __acpi_processor_set_throttling(struct acpi_processor *pr,
+					   int state, bool force, bool direct)
 {
 	int ret = 0;
 	unsigned int i;
@@ -1130,7 +1133,8 @@ int acpi_processor_set_throttling(struct acpi_processor *pr,
 		arg.pr = pr;
 		arg.target_state = state;
 		arg.force = force;
-		ret = work_on_cpu(pr->id, acpi_processor_throttling_fn, &arg);
+		ret = call_on_cpu(pr->id, acpi_processor_throttling_fn, &arg,
+				  direct);
 	} else {
 		/*
 		 * When the T-state coordination is SW_ALL or HW_ALL,
@@ -1163,8 +1167,8 @@ int acpi_processor_set_throttling(struct acpi_processor *pr,
 			arg.pr = match_pr;
 			arg.target_state = state;
 			arg.force = force;
-			ret = work_on_cpu(pr->id, acpi_processor_throttling_fn,
-				&arg);
+			ret = call_on_cpu(pr->id, acpi_processor_throttling_fn,
+					  &arg, direct);
 		}
 	}
 	/*
@@ -1182,6 +1186,12 @@ int acpi_processor_set_throttling(struct acpi_processor *pr,
 	return ret;
 }
 
+int acpi_processor_set_throttling(struct acpi_processor *pr, int state,
+				  bool force)
+{
+	return __acpi_processor_set_throttling(pr, state, force, false);
+}
+
 int acpi_processor_get_throttling_info(struct acpi_processor *pr)
 {
 	int result = 0;


* [tip:sched/core] cpufreq/ia64: Replace racy task affinity logic
  2017-04-12 20:55   ` [patch V2 " Thomas Gleixner
@ 2017-04-15 14:20     ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 56+ messages in thread
From: tip-bot for Thomas Gleixner @ 2017-04-15 14:20 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, benh, mingo, lenb, viresh.kumar, tj, rjw,
	bigeasy, tony.luck, mpe, fenghua.yu, herbert, peterz, tglx,
	davem, jiangshanlai

Commit-ID:  38f05ed04beb276f780fcd2b5c0b78c76d0b3c0c
Gitweb:     http://git.kernel.org/tip/38f05ed04beb276f780fcd2b5c0b78c76d0b3c0c
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Wed, 12 Apr 2017 22:55:03 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 15 Apr 2017 12:20:55 +0200

cpufreq/ia64: Replace racy task affinity logic

The get() and target() callbacks must run on the affected CPU. This is
achieved by temporarily setting the affinity of the calling thread to the
requested CPU and resetting it to the original affinity afterwards.

That's racy vs. concurrent affinity settings for that thread resulting in
code executing on the wrong CPU and overwriting the new affinity setting.

Replace it with work_on_cpu(). All call paths which invoke the callbacks
are already protected against CPU hotplug.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: linux-pm@vger.kernel.org
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1704122231100.2548@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 drivers/cpufreq/ia64-acpi-cpufreq.c | 92 ++++++++++++++++---------------------
 1 file changed, 39 insertions(+), 53 deletions(-)

diff --git a/drivers/cpufreq/ia64-acpi-cpufreq.c b/drivers/cpufreq/ia64-acpi-cpufreq.c
index e28a31a..a757c0a 100644
--- a/drivers/cpufreq/ia64-acpi-cpufreq.c
+++ b/drivers/cpufreq/ia64-acpi-cpufreq.c
@@ -34,6 +34,11 @@ struct cpufreq_acpi_io {
 	unsigned int				resume;
 };
 
+struct cpufreq_acpi_req {
+	unsigned int		cpu;
+	unsigned int		state;
+};
+
 static struct cpufreq_acpi_io	*acpi_io_data[NR_CPUS];
 
 static struct cpufreq_driver acpi_cpufreq_driver;
@@ -83,8 +88,7 @@ processor_get_pstate (
 static unsigned
 extract_clock (
 	struct cpufreq_acpi_io *data,
-	unsigned value,
-	unsigned int cpu)
+	unsigned value)
 {
 	unsigned long i;
 
@@ -98,60 +102,43 @@ extract_clock (
 }
 
 
-static unsigned int
+static long
 processor_get_freq (
-	struct cpufreq_acpi_io	*data,
-	unsigned int		cpu)
+	void *arg)
 {
-	int			ret = 0;
-	u32			value = 0;
-	cpumask_t		saved_mask;
-	unsigned long 		clock_freq;
+	struct cpufreq_acpi_req *req = arg;
+	unsigned int		cpu = req->cpu;
+	struct cpufreq_acpi_io	*data = acpi_io_data[cpu];
+	u32			value;
+	int			ret;
 
 	pr_debug("processor_get_freq\n");
-
-	saved_mask = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
 	if (smp_processor_id() != cpu)
-		goto migrate_end;
+		return -EAGAIN;
 
 	/* processor_get_pstate gets the instantaneous frequency */
 	ret = processor_get_pstate(&value);
-
 	if (ret) {
-		set_cpus_allowed_ptr(current, &saved_mask);
 		pr_warn("get performance failed with error %d\n", ret);
-		ret = 0;
-		goto migrate_end;
+		return ret;
 	}
-	clock_freq = extract_clock(data, value, cpu);
-	ret = (clock_freq*1000);
-
-migrate_end:
-	set_cpus_allowed_ptr(current, &saved_mask);
-	return ret;
+	return 1000 * extract_clock(data, value);
 }
 
 
-static int
+static long
 processor_set_freq (
-	struct cpufreq_acpi_io	*data,
-	struct cpufreq_policy   *policy,
-	int			state)
+	void *arg)
 {
-	int			ret = 0;
-	u32			value = 0;
-	cpumask_t		saved_mask;
-	int			retval;
+	struct cpufreq_acpi_req *req = arg;
+	unsigned int		cpu = req->cpu;
+	struct cpufreq_acpi_io	*data = acpi_io_data[cpu];
+	int			ret, state = req->state;
+	u32			value;
 
 	pr_debug("processor_set_freq\n");
-
-	saved_mask = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, cpumask_of(policy->cpu));
-	if (smp_processor_id() != policy->cpu) {
-		retval = -EAGAIN;
-		goto migrate_end;
-	}
+	if (smp_processor_id() != cpu)
+		return -EAGAIN;
 
 	if (state == data->acpi_data.state) {
 		if (unlikely(data->resume)) {
@@ -159,8 +146,7 @@ processor_set_freq (
 			data->resume = 0;
 		} else {
 			pr_debug("Already at target state (P%d)\n", state);
-			retval = 0;
-			goto migrate_end;
+			return 0;
 		}
 	}
 
@@ -171,7 +157,6 @@ processor_set_freq (
 	 * First we write the target state's 'control' value to the
 	 * control_register.
 	 */
-
 	value = (u32) data->acpi_data.states[state].control;
 
 	pr_debug("Transitioning to state: 0x%08x\n", value);
@@ -179,17 +164,11 @@ processor_set_freq (
 	ret = processor_set_pstate(value);
 	if (ret) {
 		pr_warn("Transition failed with error %d\n", ret);
-		retval = -ENODEV;
-		goto migrate_end;
+		return -ENODEV;
 	}
 
 	data->acpi_data.state = state;
-
-	retval = 0;
-
-migrate_end:
-	set_cpus_allowed_ptr(current, &saved_mask);
-	return (retval);
+	return 0;
 }
 
 
@@ -197,11 +176,13 @@ static unsigned int
 acpi_cpufreq_get (
 	unsigned int		cpu)
 {
-	struct cpufreq_acpi_io *data = acpi_io_data[cpu];
+	struct cpufreq_acpi_req req;
+	long ret;
 
-	pr_debug("acpi_cpufreq_get\n");
+	req.cpu = cpu;
+	ret = work_on_cpu(cpu, processor_get_freq, &req);
 
-	return processor_get_freq(data, cpu);
+	return ret > 0 ? (unsigned int) ret : 0;
 }
 
 
@@ -210,7 +191,12 @@ acpi_cpufreq_target (
 	struct cpufreq_policy   *policy,
 	unsigned int index)
 {
-	return processor_set_freq(acpi_io_data[policy->cpu], policy, index);
+	struct cpufreq_acpi_req req;
+
+	req.cpu = policy->cpu;
+	req.state = index;
+
+	return work_on_cpu(req.cpu, processor_set_freq, &req);
 }
 
 static int

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [tip:sched/core] cpufreq/sh: Replace racy task affinity logic
  2017-04-12 20:07 ` [patch 10/13] cpufreq/sh: " Thomas Gleixner
  2017-04-13  2:46   ` Viresh Kumar
@ 2017-04-15 14:20   ` tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 56+ messages in thread
From: tip-bot for Thomas Gleixner @ 2017-04-15 14:20 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, hpa, lenb, tony.luck, viresh.kumar, mpe, linux-kernel,
	jiangshanlai, fenghua.yu, mingo, bigeasy, benh, rjw, peterz,
	herbert, tj, davem

Commit-ID:  205dcc1ecbc566cbc20acf246e68de3b080b3ecf
Gitweb:     http://git.kernel.org/tip/205dcc1ecbc566cbc20acf246e68de3b080b3ecf
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Wed, 12 Apr 2017 22:07:36 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 15 Apr 2017 12:20:55 +0200

cpufreq/sh: Replace racy task affinity logic

The target() callback must run on the affected CPU. This is achieved by
temporarily setting the affinity of the calling thread to the requested CPU
and resetting it to the original affinity afterwards.

That's racy vs. concurrent affinity settings for that thread, resulting in
code executing on the wrong CPU.

Replace it by work_on_cpu(). All call paths which invoke the callbacks are
already protected against CPU hotplug.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: linux-pm@vger.kernel.org
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/20170412201042.958216363@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 drivers/cpufreq/sh-cpufreq.c | 45 ++++++++++++++++++++++++++------------------
 1 file changed, 27 insertions(+), 18 deletions(-)

diff --git a/drivers/cpufreq/sh-cpufreq.c b/drivers/cpufreq/sh-cpufreq.c
index 86628e2..719c3d9 100644
--- a/drivers/cpufreq/sh-cpufreq.c
+++ b/drivers/cpufreq/sh-cpufreq.c
@@ -30,54 +30,63 @@
 
 static DEFINE_PER_CPU(struct clk, sh_cpuclk);
 
+struct cpufreq_target {
+	struct cpufreq_policy	*policy;
+	unsigned int		freq;
+};
+
 static unsigned int sh_cpufreq_get(unsigned int cpu)
 {
 	return (clk_get_rate(&per_cpu(sh_cpuclk, cpu)) + 500) / 1000;
 }
 
-/*
- * Here we notify other drivers of the proposed change and the final change.
- */
-static int sh_cpufreq_target(struct cpufreq_policy *policy,
-			     unsigned int target_freq,
-			     unsigned int relation)
+static long __sh_cpufreq_target(void *arg)
 {
-	unsigned int cpu = policy->cpu;
+	struct cpufreq_target *target = arg;
+	struct cpufreq_policy *policy = target->policy;
+	int cpu = policy->cpu;
 	struct clk *cpuclk = &per_cpu(sh_cpuclk, cpu);
-	cpumask_t cpus_allowed;
 	struct cpufreq_freqs freqs;
 	struct device *dev;
 	long freq;
 
-	cpus_allowed = current->cpus_allowed;
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
-
-	BUG_ON(smp_processor_id() != cpu);
+	if (smp_processor_id() != cpu)
+		return -ENODEV;
 
 	dev = get_cpu_device(cpu);
 
 	/* Convert target_freq from kHz to Hz */
-	freq = clk_round_rate(cpuclk, target_freq * 1000);
+	freq = clk_round_rate(cpuclk, target->freq * 1000);
 
 	if (freq < (policy->min * 1000) || freq > (policy->max * 1000))
 		return -EINVAL;
 
-	dev_dbg(dev, "requested frequency %u Hz\n", target_freq * 1000);
+	dev_dbg(dev, "requested frequency %u Hz\n", target->freq * 1000);
 
 	freqs.old	= sh_cpufreq_get(cpu);
 	freqs.new	= (freq + 500) / 1000;
 	freqs.flags	= 0;
 
-	cpufreq_freq_transition_begin(policy, &freqs);
-	set_cpus_allowed_ptr(current, &cpus_allowed);
+	cpufreq_freq_transition_begin(target->policy, &freqs);
 	clk_set_rate(cpuclk, freq);
-	cpufreq_freq_transition_end(policy, &freqs, 0);
+	cpufreq_freq_transition_end(target->policy, &freqs, 0);
 
 	dev_dbg(dev, "set frequency %lu Hz\n", freq);
-
 	return 0;
 }
 
+/*
+ * Here we notify other drivers of the proposed change and the final change.
+ */
+static int sh_cpufreq_target(struct cpufreq_policy *policy,
+			     unsigned int target_freq,
+			     unsigned int relation)
+{
+	struct cpufreq_target data = { .policy = policy, .freq = target_freq };
+
+	return work_on_cpu(policy->cpu, __sh_cpufreq_target, &data);
+}
+
 static int sh_cpufreq_verify(struct cpufreq_policy *policy)
 {
 	struct clk *cpuclk = &per_cpu(sh_cpuclk, policy->cpu);


* [tip:sched/core] cpufreq/sparc-us3: Replace racy task affinity logic
  2017-04-12 20:07 ` [patch 11/13] cpufreq/sparc-us3: " Thomas Gleixner
  2017-04-13  2:48   ` Viresh Kumar
@ 2017-04-15 14:21   ` tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 56+ messages in thread
From: tip-bot for Thomas Gleixner @ 2017-04-15 14:21 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tony.luck, mingo, jiangshanlai, bigeasy, davem, lenb, tj,
	herbert, tglx, peterz, rjw, linux-kernel, hpa, viresh.kumar,
	benh, mpe, fenghua.yu

Commit-ID:  9fe24c4e92d3963d92d7d383e28ed098bd5689d8
Gitweb:     http://git.kernel.org/tip/9fe24c4e92d3963d92d7d383e28ed098bd5689d8
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Wed, 12 Apr 2017 22:07:37 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 15 Apr 2017 12:20:55 +0200

cpufreq/sparc-us3: Replace racy task affinity logic

The access to the safari config register in the CPU frequency functions
must be executed on the target CPU. This is achieved by temporarily setting
the affinity of the calling user space thread to the requested CPU and
resetting it to the original affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread, resulting in code executing on the wrong CPU and overwriting the
new affinity setting.

Replace it by a straightforward SMP function call.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: linux-pm@vger.kernel.org
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/20170412201043.047558840@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 drivers/cpufreq/sparc-us3-cpufreq.c | 46 +++++++++++++------------------------
 1 file changed, 16 insertions(+), 30 deletions(-)

diff --git a/drivers/cpufreq/sparc-us3-cpufreq.c b/drivers/cpufreq/sparc-us3-cpufreq.c
index a8d86a4..30645b0 100644
--- a/drivers/cpufreq/sparc-us3-cpufreq.c
+++ b/drivers/cpufreq/sparc-us3-cpufreq.c
@@ -35,22 +35,28 @@ static struct us3_freq_percpu_info *us3_freq_table;
 #define SAFARI_CFG_DIV_32	0x0000000080000000UL
 #define SAFARI_CFG_DIV_MASK	0x00000000C0000000UL
 
-static unsigned long read_safari_cfg(void)
+static void read_safari_cfg(void *arg)
 {
-	unsigned long ret;
+	unsigned long ret, *val = arg;
 
 	__asm__ __volatile__("ldxa	[%%g0] %1, %0"
 			     : "=&r" (ret)
 			     : "i" (ASI_SAFARI_CONFIG));
-	return ret;
+	*val = ret;
 }
 
-static void write_safari_cfg(unsigned long val)
+static void update_safari_cfg(void *arg)
 {
+	unsigned long reg, *new_bits = arg;
+
+	read_safari_cfg(&reg);
+	reg &= ~SAFARI_CFG_DIV_MASK;
+	reg |= *new_bits;
+
 	__asm__ __volatile__("stxa	%0, [%%g0] %1\n\t"
 			     "membar	#Sync"
 			     : /* no outputs */
-			     : "r" (val), "i" (ASI_SAFARI_CONFIG)
+			     : "r" (reg), "i" (ASI_SAFARI_CONFIG)
 			     : "memory");
 }
 
@@ -78,29 +84,17 @@ static unsigned long get_current_freq(unsigned int cpu, unsigned long safari_cfg
 
 static unsigned int us3_freq_get(unsigned int cpu)
 {
-	cpumask_t cpus_allowed;
 	unsigned long reg;
-	unsigned int ret;
-
-	cpumask_copy(&cpus_allowed, &current->cpus_allowed);
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
-
-	reg = read_safari_cfg();
-	ret = get_current_freq(cpu, reg);
-
-	set_cpus_allowed_ptr(current, &cpus_allowed);
 
-	return ret;
+	if (smp_call_function_single(cpu, read_safari_cfg, &reg, 1))
+		return 0;
+	return get_current_freq(cpu, reg);
 }
 
 static int us3_freq_target(struct cpufreq_policy *policy, unsigned int index)
 {
 	unsigned int cpu = policy->cpu;
-	unsigned long new_bits, new_freq, reg;
-	cpumask_t cpus_allowed;
-
-	cpumask_copy(&cpus_allowed, &current->cpus_allowed);
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
+	unsigned long new_bits, new_freq;
 
 	new_freq = sparc64_get_clock_tick(cpu) / 1000;
 	switch (index) {
@@ -121,15 +115,7 @@ static int us3_freq_target(struct cpufreq_policy *policy, unsigned int index)
 		BUG();
 	}
 
-	reg = read_safari_cfg();
-
-	reg &= ~SAFARI_CFG_DIV_MASK;
-	reg |= new_bits;
-	write_safari_cfg(reg);
-
-	set_cpus_allowed_ptr(current, &cpus_allowed);
-
-	return 0;
+	return smp_call_function_single(cpu, update_safari_cfg, &new_bits, 1);
 }
 
 static int __init us3_freq_cpu_init(struct cpufreq_policy *policy)


* [tip:sched/core] cpufreq/sparc-us2e: Replace racy task affinity logic
  2017-04-13  8:22     ` [patch V3 " Thomas Gleixner
@ 2017-04-15 14:21       ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 56+ messages in thread
From: tip-bot for Thomas Gleixner @ 2017-04-15 14:21 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: rjw, tglx, peterz, benh, linux-kernel, fenghua.yu, mpe, mingo,
	viresh.kumar, lenb, hpa, davem, tj, bigeasy, jiangshanlai,
	tony.luck, herbert

Commit-ID:  12699ac53a2e5fbd1fd7c164b11685d55c8aa28b
Gitweb:     http://git.kernel.org/tip/12699ac53a2e5fbd1fd7c164b11685d55c8aa28b
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Thu, 13 Apr 2017 10:22:43 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 15 Apr 2017 12:20:56 +0200

cpufreq/sparc-us2e: Replace racy task affinity logic

The access to the HBIRD_ESTAR_MODE register in the CPU frequency control
functions must happen on the target CPU. This is achieved by temporarily
setting the affinity of the calling user space thread to the requested CPU
and resetting it to the original affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread, resulting in code executing on the wrong CPU and overwriting the
new affinity setting.

Replace it by a straightforward SMP function call.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: linux-pm@vger.kernel.org
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1704131020280.2408@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 drivers/cpufreq/sparc-us2e-cpufreq.c | 45 +++++++++++++++++-------------------
 1 file changed, 21 insertions(+), 24 deletions(-)

diff --git a/drivers/cpufreq/sparc-us2e-cpufreq.c b/drivers/cpufreq/sparc-us2e-cpufreq.c
index 35ddb6d..90f33ef 100644
--- a/drivers/cpufreq/sparc-us2e-cpufreq.c
+++ b/drivers/cpufreq/sparc-us2e-cpufreq.c
@@ -118,10 +118,6 @@ static void us2e_transition(unsigned long estar, unsigned long new_bits,
 			    unsigned long clock_tick,
 			    unsigned long old_divisor, unsigned long divisor)
 {
-	unsigned long flags;
-
-	local_irq_save(flags);
-
 	estar &= ~ESTAR_MODE_DIV_MASK;
 
 	/* This is based upon the state transition diagram in the IIe manual.  */
@@ -152,8 +148,6 @@ static void us2e_transition(unsigned long estar, unsigned long new_bits,
 	} else {
 		BUG();
 	}
-
-	local_irq_restore(flags);
 }
 
 static unsigned long index_to_estar_mode(unsigned int index)
@@ -229,48 +223,51 @@ static unsigned long estar_to_divisor(unsigned long estar)
 	return ret;
 }
 
+static void __us2e_freq_get(void *arg)
+{
+	unsigned long *estar = arg;
+
+	*estar = read_hbreg(HBIRD_ESTAR_MODE_ADDR);
+}
+
 static unsigned int us2e_freq_get(unsigned int cpu)
 {
-	cpumask_t cpus_allowed;
 	unsigned long clock_tick, estar;
 
-	cpumask_copy(&cpus_allowed, &current->cpus_allowed);
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
-
 	clock_tick = sparc64_get_clock_tick(cpu) / 1000;
-	estar = read_hbreg(HBIRD_ESTAR_MODE_ADDR);
-
-	set_cpus_allowed_ptr(current, &cpus_allowed);
+	if (smp_call_function_single(cpu, __us2e_freq_get, &estar, 1))
+		return 0;
 
 	return clock_tick / estar_to_divisor(estar);
 }
 
-static int us2e_freq_target(struct cpufreq_policy *policy, unsigned int index)
+static void __us2e_freq_target(void *arg)
 {
-	unsigned int cpu = policy->cpu;
+	unsigned int cpu = smp_processor_id();
+	unsigned int *index = arg;
 	unsigned long new_bits, new_freq;
 	unsigned long clock_tick, divisor, old_divisor, estar;
-	cpumask_t cpus_allowed;
-
-	cpumask_copy(&cpus_allowed, &current->cpus_allowed);
-	set_cpus_allowed_ptr(current, cpumask_of(cpu));
 
 	new_freq = clock_tick = sparc64_get_clock_tick(cpu) / 1000;
-	new_bits = index_to_estar_mode(index);
-	divisor = index_to_divisor(index);
+	new_bits = index_to_estar_mode(*index);
+	divisor = index_to_divisor(*index);
 	new_freq /= divisor;
 
 	estar = read_hbreg(HBIRD_ESTAR_MODE_ADDR);
 
 	old_divisor = estar_to_divisor(estar);
 
-	if (old_divisor != divisor)
+	if (old_divisor != divisor) {
 		us2e_transition(estar, new_bits, clock_tick * 1000,
 				old_divisor, divisor);
+	}
+}
 
-	set_cpus_allowed_ptr(current, &cpus_allowed);
+static int us2e_freq_target(struct cpufreq_policy *policy, unsigned int index)
+{
+	unsigned int cpu = policy->cpu;
 
-	return 0;
+	return smp_call_function_single(cpu, __us2e_freq_target, &index, 1);
 }
 
 static int __init us2e_freq_cpu_init(struct cpufreq_policy *policy)


* [tip:sched/core] crypto: N2 - Replace racy task affinity logic
  2017-04-13  8:20   ` [patch V2 " Thomas Gleixner
  2017-04-13 14:51     ` David Miller
@ 2017-04-15 14:22     ` tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 56+ messages in thread
From: tip-bot for Thomas Gleixner @ 2017-04-15 14:22 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bigeasy, benh, tglx, lenb, rjw, linux-kernel, jiangshanlai,
	fenghua.yu, mpe, tj, viresh.kumar, davem, peterz, herbert,
	tony.luck, mingo, hpa

Commit-ID:  73810a069120aa831debb4d967310ab900f628ad
Gitweb:     http://git.kernel.org/tip/73810a069120aa831debb4d967310ab900f628ad
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Thu, 13 Apr 2017 10:20:23 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 15 Apr 2017 12:20:56 +0200

crypto: N2 - Replace racy task affinity logic

spu_queue_register() needs to invoke setup functions on a particular
CPU. This is achieved by temporarily setting the affinity of the calling
user space thread to the requested CPU and resetting it to the original
affinity afterwards.

That's racy vs. CPU hotplug and concurrent affinity settings for that
thread, resulting in code executing on the wrong CPU and overwriting the
new affinity setting.

Replace it by using work_on_cpu_safe() which guarantees to run the code on
the requested CPU or to fail in case the CPU is offline.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: "David S. Miller" <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linux-crypto@vger.kernel.org
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1704131019420.2408@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 drivers/crypto/n2_core.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/n2_core.c b/drivers/crypto/n2_core.c
index c5aac25..4ecb77a 100644
--- a/drivers/crypto/n2_core.c
+++ b/drivers/crypto/n2_core.c
@@ -65,6 +65,11 @@ struct spu_queue {
 	struct list_head	list;
 };
 
+struct spu_qreg {
+	struct spu_queue	*queue;
+	unsigned long		type;
+};
+
 static struct spu_queue **cpu_to_cwq;
 static struct spu_queue **cpu_to_mau;
 
@@ -1631,31 +1636,27 @@ static void queue_cache_destroy(void)
 	kmem_cache_destroy(queue_cache[HV_NCS_QTYPE_CWQ - 1]);
 }
 
-static int spu_queue_register(struct spu_queue *p, unsigned long q_type)
+static long spu_queue_register_workfn(void *arg)
 {
-	cpumask_var_t old_allowed;
+	struct spu_qreg *qr = arg;
+	struct spu_queue *p = qr->queue;
+	unsigned long q_type = qr->type;
 	unsigned long hv_ret;
 
-	if (cpumask_empty(&p->sharing))
-		return -EINVAL;
-
-	if (!alloc_cpumask_var(&old_allowed, GFP_KERNEL))
-		return -ENOMEM;
-
-	cpumask_copy(old_allowed, &current->cpus_allowed);
-
-	set_cpus_allowed_ptr(current, &p->sharing);
-
 	hv_ret = sun4v_ncs_qconf(q_type, __pa(p->q),
 				 CWQ_NUM_ENTRIES, &p->qhandle);
 	if (!hv_ret)
 		sun4v_ncs_sethead_marker(p->qhandle, 0);
 
-	set_cpus_allowed_ptr(current, old_allowed);
+	return hv_ret ? -EINVAL : 0;
+}
 
-	free_cpumask_var(old_allowed);
+static int spu_queue_register(struct spu_queue *p, unsigned long q_type)
+{
+	int cpu = cpumask_any_and(&p->sharing, cpu_online_mask);
+	struct spu_qreg qr = { .queue = p, .type = q_type };
 
-	return (hv_ret ? -EINVAL : 0);
+	return work_on_cpu_safe(cpu, spu_queue_register_workfn, &qr);
 }
 
 static int spu_queue_setup(struct spu_queue *p)


end of thread, other threads:[~2017-04-15 14:29 UTC | newest]

Thread overview: 56+ messages
2017-04-12 20:07 [patch 00/13] sched/treewide: Clean up various racy task affinity issues Thomas Gleixner
2017-04-12 20:07 ` [patch 01/13] ia64/topology: Remove cpus_allowed manipulation Thomas Gleixner
2017-04-15 14:15   ` [tip:sched/core] " tip-bot for Thomas Gleixner
2017-04-12 20:07 ` [patch 02/13] workqueue: Provide work_on_cpu_safe() Thomas Gleixner
2017-04-13 11:11   ` Dou Liyang
2017-04-13 21:28     ` Thomas Gleixner
2017-04-14  4:18   ` Tejun Heo
2017-04-14  8:54   ` Peter Zijlstra
2017-04-14  9:51     ` Thomas Gleixner
2017-04-14  9:56       ` Peter Zijlstra
2017-04-15 14:16   ` [tip:sched/core] " tip-bot for Thomas Gleixner
2017-04-12 20:07 ` [patch 03/13] ia64/salinfo: Replace racy task affinity logic Thomas Gleixner
2017-04-15 14:17   ` [tip:sched/core] " tip-bot for Thomas Gleixner
2017-04-12 20:07 ` [patch 04/13] ia64/sn/hwperf: " Thomas Gleixner
2017-04-12 20:53   ` [patch V 2 " Thomas Gleixner
2017-04-15 14:17     ` [tip:sched/core] " tip-bot for Thomas Gleixner
2017-04-12 20:07 ` [patch 05/13] powerpc/smp: Replace open coded " Thomas Gleixner
2017-04-12 20:07   ` Thomas Gleixner
2017-04-13  5:47   ` Michael Ellerman
2017-04-15 14:18   ` [tip:sched/core] " tip-bot for Thomas Gleixner
2017-04-12 20:07 ` [patch 06/13] sparc/sysfs: Replace racy " Thomas Gleixner
2017-04-12 20:07   ` Thomas Gleixner
2017-04-13  1:52   ` David Miller
2017-04-13  1:52     ` David Miller
2017-04-13  8:17     ` [patch V2 " Thomas Gleixner
2017-04-13  8:17       ` Thomas Gleixner
2017-04-15 14:18       ` [tip:sched/core] " tip-bot for Thomas Gleixner
2017-04-12 20:07 ` [patch 07/13] ACPI/processor: Fix error handling in __acpi_processor_start() Thomas Gleixner
2017-04-15 14:19   ` [tip:sched/core] " tip-bot for Thomas Gleixner
2017-04-12 20:07 ` [patch 08/13] ACPI/processor: Replace racy task affinity logic Thomas Gleixner
2017-04-13 11:39   ` Peter Zijlstra
2017-04-13 12:01     ` Thomas Gleixner
2017-04-13 12:52       ` Peter Zijlstra
2017-04-15 14:19   ` [tip:sched/core] " tip-bot for Thomas Gleixner
2017-04-12 20:07 ` [patch 09/13] cpufreq/ia64: " Thomas Gleixner
2017-04-12 20:55   ` [patch V2 " Thomas Gleixner
2017-04-15 14:20     ` [tip:sched/core] " tip-bot for Thomas Gleixner
2017-04-13  2:42   ` [patch 09/13] " Viresh Kumar
2017-04-12 20:07 ` [patch 10/13] cpufreq/sh: " Thomas Gleixner
2017-04-13  2:46   ` Viresh Kumar
2017-04-15 14:20   ` [tip:sched/core] " tip-bot for Thomas Gleixner
2017-04-12 20:07 ` [patch 11/13] cpufreq/sparc-us3: " Thomas Gleixner
2017-04-13  2:48   ` Viresh Kumar
2017-04-15 14:21   ` [tip:sched/core] " tip-bot for Thomas Gleixner
2017-04-12 20:07 ` [patch 12/13] cpufreq/sparc-us2e: " Thomas Gleixner
2017-04-13  2:50   ` Viresh Kumar
2017-04-13  8:19   ` [patch V2 " Thomas Gleixner
2017-04-13  8:22     ` [patch V3 " Thomas Gleixner
2017-04-15 14:21       ` [tip:sched/core] " tip-bot for Thomas Gleixner
2017-04-13 14:50     ` [patch V2 12/13] " David Miller
2017-04-12 20:07 ` [patch 13/13] crypto: n2 - " Thomas Gleixner
2017-04-13  4:56   ` Herbert Xu
2017-04-13  8:20   ` [patch V2 " Thomas Gleixner
2017-04-13 14:51     ` David Miller
2017-04-15 14:22     ` [tip:sched/core] crypto: N2 " tip-bot for Thomas Gleixner
2017-04-13  9:02 ` [patch 00/13] sched/treewide: Clean up various racy task affinity issues Peter Zijlstra
