linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem
@ 2017-04-15 17:01 Thomas Gleixner
  2017-04-15 17:01 ` [patch 01/20] cpu/hotplug: Provide cpuhp_setup/remove_state[_nocalls]_locked() Thomas Gleixner
                   ` (19 more replies)
  0 siblings, 20 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML; +Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior

get_online_cpus() is used in hot pathes in mainline and even more so in
RT. That can show up badly under certain conditions because every locker
contends on a global mutex. RT has it's own homebrewn mitigation which is
an (badly done) open coded implementation of percpu_rwsems with recursion
support.

The proper replacement for that are percpu_rwsems, but that requires to
remove recursion support.

The conversion unearthed real locking issues which were previously not
visible because the get_online_cpus() lockdep annotation was implemented
with recursion support which prevents lockdep from tracking full dependency
chains. These potential deadlocks are not related to recursive calls, they
trigger on the first invocation because lockdep now has the full dependency
chains available.

The following patch series addresses this by

 - Cleaning up places which call get_online_cpus() nested

 - Replacing a few instances with cpu_hotplug_disable() to prevent circular
   locking dependencies.

The series depends on

    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core
  plus
    Linus tree merged in to avoid conflicts

It's available in git from

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.hotplug

Thanks,

	tglx

-------
 arch/arm/kernel/hw_breakpoint.c               |    5 
 arch/powerpc/kvm/book3s_hv.c                  |    8 -
 arch/powerpc/platforms/powernv/subcore.c      |    2 
 arch/s390/kernel/time.c                       |    2 
 arch/x86/events/core.c                        |    1 
 arch/x86/events/intel/core.c                  |    4 
 arch/x86/events/intel/cqm.c                   |   12 +-
 arch/x86/kernel/cpu/mtrr/main.c               |    2 
 drivers/acpi/processor_driver.c               |    4 
 drivers/cpufreq/cpufreq.c                     |    9 -
 drivers/hwtracing/coresight/coresight-etm3x.c |   12 +-
 drivers/hwtracing/coresight/coresight-etm4x.c |   12 +-
 drivers/pci/pci-driver.c                      |   46 ++++----
 include/linux/cpuhotplug.h                    |   29 +++++
 include/linux/padata.h                        |    3 
 include/linux/pci.h                           |    1 
 include/linux/stop_machine.h                  |   26 ++++
 kernel/cpu.c                                  |  149 +++++++-------------------
 kernel/padata.c                               |   38 +++---
 kernel/stop_machine.c                         |    4 
 20 files changed, 177 insertions(+), 192 deletions(-)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 01/20] cpu/hotplug: Provide cpuhp_setup/remove_state[_nocalls]_locked()
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-15 17:01 ` [patch 02/20] stop_machine: Provide stop_machine_locked() Thomas Gleixner
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML; +Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior

[-- Attachment #1: cpuhotplug_Provide_cpuhp_setup_state_locked.patch --]
[-- Type: text/plain, Size: 6778 bytes --]

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Some call sites of cpuhp_setup/remove_state[_nocalls]() are within a
get_online_cpus() protected region.

cpuhp_setup/remove_state[_nocalls]() call get_online_cpus() as well, which
is possible in the current implementation but prevetns converting the
hotplug locking to a percpu rwsem.

Provide locked versions of the interfaces to avoid nested calls to
get_online_cpus().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 include/linux/cpuhotplug.h |   29 +++++++++++++++++++++++++++++
 kernel/cpu.c               |   45 +++++++++++++++++++++++++++++++++------------
 2 files changed, 62 insertions(+), 12 deletions(-)

--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -151,6 +151,11 @@ int __cpuhp_setup_state(enum cpuhp_state
 			int (*startup)(unsigned int cpu),
 			int (*teardown)(unsigned int cpu), bool multi_instance);
 
+int __cpuhp_setup_state_locked(enum cpuhp_state state,	const char *name,
+			       bool invoke,
+			       int (*startup)(unsigned int cpu),
+			       int (*teardown)(unsigned int cpu),
+			       bool multi_instance);
 /**
  * cpuhp_setup_state - Setup hotplug state callbacks with calling the callbacks
  * @state:	The state for which the calls are installed
@@ -169,6 +174,15 @@ static inline int cpuhp_setup_state(enum
 	return __cpuhp_setup_state(state, name, true, startup, teardown, false);
 }
 
+static inline int cpuhp_setup_state_locked(enum cpuhp_state state,
+					   const char *name,
+					   int (*startup)(unsigned int cpu),
+					   int (*teardown)(unsigned int cpu))
+{
+	return __cpuhp_setup_state_locked(state, name, true, startup, teardown,
+					  false);
+}
+
 /**
  * cpuhp_setup_state_nocalls - Setup hotplug state callbacks without calling the
  *			       callbacks
@@ -189,6 +203,15 @@ static inline int cpuhp_setup_state_noca
 				   false);
 }
 
+static inline int cpuhp_setup_state_nocalls_locked(enum cpuhp_state state,
+						   const char *name,
+						   int (*startup)(unsigned int cpu),
+						   int (*teardown)(unsigned int cpu))
+{
+	return __cpuhp_setup_state_locked(state, name, false, startup, teardown,
+					  false);
+}
+
 /**
  * cpuhp_setup_state_multi - Add callbacks for multi state
  * @state:	The state for which the calls are installed
@@ -248,6 +271,7 @@ static inline int cpuhp_state_add_instan
 }
 
 void __cpuhp_remove_state(enum cpuhp_state state, bool invoke);
+void __cpuhp_remove_state_locked(enum cpuhp_state state, bool invoke);
 
 /**
  * cpuhp_remove_state - Remove hotplug state callbacks and invoke the teardown
@@ -271,6 +295,11 @@ static inline void cpuhp_remove_state_no
 	__cpuhp_remove_state(state, false);
 }
 
+static inline void cpuhp_remove_state_nocalls_locked(enum cpuhp_state state)
+{
+	__cpuhp_remove_state_locked(state, false);
+}
+
 /**
  * cpuhp_remove_multi_state - Remove hotplug multi state callback
  * @state:	The state for which the calls are removed
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1457,7 +1457,7 @@ int __cpuhp_state_add_instance(enum cpuh
 EXPORT_SYMBOL_GPL(__cpuhp_state_add_instance);
 
 /**
- * __cpuhp_setup_state - Setup the callbacks for an hotplug machine state
+ * __cpuhp_setup_state_locked - Setup the callbacks for an hotplug machine state
  * @state:		The state to setup
  * @invoke:		If true, the startup function is invoked for cpus where
  *			cpu state >= @state
@@ -1466,17 +1466,18 @@ EXPORT_SYMBOL_GPL(__cpuhp_state_add_inst
  * @multi_instance:	State is set up for multiple instances which get
  *			added afterwards.
  *
+ * The caller needs to hold get_online_cpus() while calling this function.
  * Returns:
  *   On success:
  *      Positive state number if @state is CPUHP_AP_ONLINE_DYN
  *      0 for all other states
  *   On failure: proper (negative) error code
  */
-int __cpuhp_setup_state(enum cpuhp_state state,
-			const char *name, bool invoke,
-			int (*startup)(unsigned int cpu),
-			int (*teardown)(unsigned int cpu),
-			bool multi_instance)
+int __cpuhp_setup_state_locked(enum cpuhp_state state,
+			       const char *name, bool invoke,
+			       int (*startup)(unsigned int cpu),
+			       int (*teardown)(unsigned int cpu),
+			       bool multi_instance)
 {
 	int cpu, ret = 0;
 	bool dynstate;
@@ -1484,7 +1485,6 @@ int __cpuhp_setup_state(enum cpuhp_state
 	if (cpuhp_cb_check(state) || !name)
 		return -EINVAL;
 
-	get_online_cpus();
 	mutex_lock(&cpuhp_state_mutex);
 
 	ret = cpuhp_store_callbacks(state, name, startup, teardown,
@@ -1520,7 +1520,6 @@ int __cpuhp_setup_state(enum cpuhp_state
 	}
 out:
 	mutex_unlock(&cpuhp_state_mutex);
-	put_online_cpus();
 	/*
 	 * If the requested state is CPUHP_AP_ONLINE_DYN, return the
 	 * dynamically allocated state in case of success.
@@ -1529,6 +1528,22 @@ int __cpuhp_setup_state(enum cpuhp_state
 		return state;
 	return ret;
 }
+EXPORT_SYMBOL(__cpuhp_setup_state_locked);
+
+int __cpuhp_setup_state(enum cpuhp_state state,
+			const char *name, bool invoke,
+			int (*startup)(unsigned int cpu),
+			int (*teardown)(unsigned int cpu),
+			bool multi_instance)
+{
+	int ret;
+
+	get_online_cpus();
+	ret = __cpuhp_setup_state_locked(state, name, invoke, startup, teardown,
+					 multi_instance);
+	put_online_cpus();
+	return ret;
+}
 EXPORT_SYMBOL(__cpuhp_setup_state);
 
 int __cpuhp_state_remove_instance(enum cpuhp_state state,
@@ -1570,23 +1585,22 @@ int __cpuhp_state_remove_instance(enum c
 EXPORT_SYMBOL_GPL(__cpuhp_state_remove_instance);
 
 /**
- * __cpuhp_remove_state - Remove the callbacks for an hotplug machine state
+ * __cpuhp_remove_state_locked - Remove the callbacks for an hotplug machine state
  * @state:	The state to remove
  * @invoke:	If true, the teardown function is invoked for cpus where
  *		cpu state >= @state
  *
+ * The caller needs to hold get_online_cpus() while calling this function.
  * The teardown callback is currently not allowed to fail. Think
  * about module removal!
  */
-void __cpuhp_remove_state(enum cpuhp_state state, bool invoke)
+void __cpuhp_remove_state_locked(enum cpuhp_state state, bool invoke)
 {
 	struct cpuhp_step *sp = cpuhp_get_step(state);
 	int cpu;
 
 	BUG_ON(cpuhp_cb_check(state));
 
-	get_online_cpus();
-
 	mutex_lock(&cpuhp_state_mutex);
 	if (sp->multi_instance) {
 		WARN(!hlist_empty(&sp->list),
@@ -1613,6 +1627,13 @@ void __cpuhp_remove_state(enum cpuhp_sta
 remove:
 	cpuhp_store_callbacks(state, NULL, NULL, NULL, false);
 	mutex_unlock(&cpuhp_state_mutex);
+}
+EXPORT_SYMBOL(__cpuhp_remove_state_locked);
+
+void __cpuhp_remove_state(enum cpuhp_state state, bool invoke)
+{
+	get_online_cpus();
+	__cpuhp_remove_state_locked(state, invoke);
 	put_online_cpus();
 }
 EXPORT_SYMBOL(__cpuhp_remove_state);

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 02/20] stop_machine: Provide stop_machine_locked()
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
  2017-04-15 17:01 ` [patch 01/20] cpu/hotplug: Provide cpuhp_setup/remove_state[_nocalls]_locked() Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-15 17:01 ` [patch 03/20] padata: Make padata_alloc() static Thomas Gleixner
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML; +Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior

[-- Attachment #1: kernel-stopmachine-provide-stop-machine-locked.patch --]
[-- Type: text/plain, Size: 2902 bytes --]

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Some call sites of stop_machine() are within a get_online_cpus() protected
region.

stop_machine() calls get_online_cpus() as well, which is possible in the
current implementation but prevents converting the hotplug locking to a
percpu rwsem.

Provide stop_machine_locked() to avoid nested calls to get_online_cpus().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 include/linux/stop_machine.h |   26 +++++++++++++++++++++++---
 kernel/stop_machine.c        |    4 ++--
 2 files changed, 25 insertions(+), 5 deletions(-)

--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -116,15 +116,29 @@ static inline int try_stop_cpus(const st
  * @fn() runs.
  *
  * This can be thought of as a very heavy write lock, equivalent to
- * grabbing every spinlock in the kernel. */
+ * grabbing every spinlock in the kernel.
+ *
+ * Protects against CPU hotplug.
+ */
 int stop_machine(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
 
+/**
+ * stop_machine_locked: freeze the machine on all CPUs and run this function
+ * @fn: the function to run
+ * @data: the data ptr for the @fn()
+ * @cpus: the cpus to run the @fn() on (NULL = any online cpu)
+ *
+ * Same as above. Must be called from with in a get_online_cpus() protected
+ * region. Avoids nested calls to get_online_cpus().
+ */
+int stop_machine_locked(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
+
 int stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data,
 				   const struct cpumask *cpus);
 #else	/* CONFIG_SMP || CONFIG_HOTPLUG_CPU */
 
-static inline int stop_machine(cpu_stop_fn_t fn, void *data,
-				 const struct cpumask *cpus)
+static inline int stop_machine_locked(cpu_stop_fn_t fn, void *data,
+				      const struct cpumask *cpus)
 {
 	unsigned long flags;
 	int ret;
@@ -134,6 +148,12 @@ static inline int stop_machine(cpu_stop_
 	return ret;
 }
 
+static inline int stop_machine(cpu_stop_fn_t fn, void *data,
+			       const struct cpumask *cpus)
+{
+	return stop_machine_locked(fn, data, cpus);
+}
+
 static inline int stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data,
 						 const struct cpumask *cpus)
 {
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -552,7 +552,7 @@ static int __init cpu_stop_init(void)
 }
 early_initcall(cpu_stop_init);
 
-static int __stop_machine(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus)
+int stop_machine_locked(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus)
 {
 	struct multi_stop_data msdata = {
 		.fn = fn,
@@ -591,7 +591,7 @@ int stop_machine(cpu_stop_fn_t fn, void
 
 	/* No CPUs can come up or down during this. */
 	get_online_cpus();
-	ret = __stop_machine(fn, data, cpus);
+	ret = stop_machine_locked(fn, data, cpus);
 	put_online_cpus();
 	return ret;
 }

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 03/20] padata: Make padata_alloc() static
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
  2017-04-15 17:01 ` [patch 01/20] cpu/hotplug: Provide cpuhp_setup/remove_state[_nocalls]_locked() Thomas Gleixner
  2017-04-15 17:01 ` [patch 02/20] stop_machine: Provide stop_machine_locked() Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-16  6:22   ` Jason A. Donenfeld
  2017-04-15 17:01 ` [patch 04/20] padata: Avoid nested calls to get_online_cpus() in pcrypt_init_padata() Thomas Gleixner
                   ` (16 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Steffen Klassert, linux-crypto

[-- Attachment #1: padata--Make-padata_alloc---static.patch --]
[-- Type: text/plain, Size: 3035 bytes --]

No users outside of padata.c

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
---
 include/linux/padata.h |    3 ---
 kernel/padata.c        |   34 +++++++++++++++++-----------------
 2 files changed, 17 insertions(+), 20 deletions(-)

--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -166,9 +166,6 @@ struct padata_instance {
 
 extern struct padata_instance *padata_alloc_possible(
 					struct workqueue_struct *wq);
-extern struct padata_instance *padata_alloc(struct workqueue_struct *wq,
-					    const struct cpumask *pcpumask,
-					    const struct cpumask *cbcpumask);
 extern void padata_free(struct padata_instance *pinst);
 extern int padata_do_parallel(struct padata_instance *pinst,
 			      struct padata_priv *padata, int cb_cpu);
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -913,7 +913,7 @@ static ssize_t padata_sysfs_show(struct
 }
 
 static ssize_t padata_sysfs_store(struct kobject *kobj, struct attribute *attr,
-				  const char *buf, size_t count)
+s				  const char *buf, size_t count)
 {
 	struct padata_instance *pinst;
 	struct padata_sysfs_entry *pentry;
@@ -939,19 +939,6 @@ static struct kobj_type padata_attr_type
 };
 
 /**
- * padata_alloc_possible - Allocate and initialize padata instance.
- *                         Use the cpu_possible_mask for serial and
- *                         parallel workers.
- *
- * @wq: workqueue to use for the allocated padata instance
- */
-struct padata_instance *padata_alloc_possible(struct workqueue_struct *wq)
-{
-	return padata_alloc(wq, cpu_possible_mask, cpu_possible_mask);
-}
-EXPORT_SYMBOL(padata_alloc_possible);
-
-/**
  * padata_alloc - allocate and initialize a padata instance and specify
  *                cpumasks for serial and parallel workers.
  *
@@ -959,9 +946,9 @@ EXPORT_SYMBOL(padata_alloc_possible);
  * @pcpumask: cpumask that will be used for padata parallelization
  * @cbcpumask: cpumask that will be used for padata serialization
  */
-struct padata_instance *padata_alloc(struct workqueue_struct *wq,
-				     const struct cpumask *pcpumask,
-				     const struct cpumask *cbcpumask)
+static struct padata_instance *padata_alloc(struct workqueue_struct *wq,
+					    const struct cpumask *pcpumask,
+					    const struct cpumask *cbcpumask)
 {
 	struct padata_instance *pinst;
 	struct parallel_data *pd = NULL;
@@ -1016,6 +1003,19 @@ struct padata_instance *padata_alloc(str
 }
 
 /**
+ * padata_alloc_possible - Allocate and initialize padata instance.
+ *                         Use the cpu_possible_mask for serial and
+ *                         parallel workers.
+ *
+ * @wq: workqueue to use for the allocated padata instance
+ */
+struct padata_instance *padata_alloc_possible(struct workqueue_struct *wq)
+{
+	return padata_alloc(wq, cpu_possible_mask, cpu_possible_mask);
+}
+EXPORT_SYMBOL(padata_alloc_possible);
+
+/**
  * padata_free - free a padata instance
  *
  * @padata_inst: padata instance to free

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 04/20] padata: Avoid nested calls to get_online_cpus() in pcrypt_init_padata()
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (2 preceding siblings ...)
  2017-04-15 17:01 ` [patch 03/20] padata: Make padata_alloc() static Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-15 17:01 ` [patch 05/20] x86/mtrr: Remove get_online_cpus() from mtrr_save_state() Thomas Gleixner
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Steffen Klassert, linux-crypto

[-- Attachment #1: kernelpadata_Avoid_get_online_cpus_recursion_via_pcrypt_init_padata.patch --]
[-- Type: text/plain, Size: 2349 bytes --]

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

pcrypt_init_padata()
   get_online_cpus()
   padata_alloc_possible()
     padata_alloc()
       get_online_cpus()

The nested call to get_online_cpus() works with the current implementation,
but prevents the conversion to a percpu rwsem.

The other caller of padata_alloc_possible() is pcrypt_init_padata() which
calls from a get_online_cpus() protected region as well.

Remove the get_online_cpus() call in padata_alloc() and document the
calling convention.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org

---
 kernel/padata.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -913,7 +913,7 @@ static ssize_t padata_sysfs_show(struct
 }
 
 static ssize_t padata_sysfs_store(struct kobject *kobj, struct attribute *attr,
-s				  const char *buf, size_t count)
+				  const char *buf, size_t count)
 {
 	struct padata_instance *pinst;
 	struct padata_sysfs_entry *pentry;
@@ -945,6 +945,8 @@ static struct kobj_type padata_attr_type
  * @wq: workqueue to use for the allocated padata instance
  * @pcpumask: cpumask that will be used for padata parallelization
  * @cbcpumask: cpumask that will be used for padata serialization
+ *
+ * Must be called from a get_online_cpus() protected region
  */
 static struct padata_instance *padata_alloc(struct workqueue_struct *wq,
 					    const struct cpumask *pcpumask,
@@ -957,7 +959,6 @@ static struct padata_instance *padata_al
 	if (!pinst)
 		goto err;
 
-	get_online_cpus();
 	if (!alloc_cpumask_var(&pinst->cpumask.pcpu, GFP_KERNEL))
 		goto err_free_inst;
 	if (!alloc_cpumask_var(&pinst->cpumask.cbcpu, GFP_KERNEL)) {
@@ -997,7 +998,6 @@ static struct padata_instance *padata_al
 	free_cpumask_var(pinst->cpumask.cbcpu);
 err_free_inst:
 	kfree(pinst);
-	put_online_cpus();
 err:
 	return NULL;
 }
@@ -1008,6 +1008,8 @@ static struct padata_instance *padata_al
  *                         parallel workers.
  *
  * @wq: workqueue to use for the allocated padata instance
+ *
+ * Must be called from a get_online_cpus() protected region
  */
 struct padata_instance *padata_alloc_possible(struct workqueue_struct *wq)
 {

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 05/20] x86/mtrr: Remove get_online_cpus() from mtrr_save_state()
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (3 preceding siblings ...)
  2017-04-15 17:01 ` [patch 04/20] padata: Avoid nested calls to get_online_cpus() in pcrypt_init_padata() Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-15 17:01 ` [patch 06/20] cpufreq: Use cpuhp_setup_state_nocalls_locked() Thomas Gleixner
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML; +Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior, x86

[-- Attachment #1: x86mtrr_Remove_get_online_cpus_from_mtrr_save_state.patch --]
[-- Type: text/plain, Size: 936 bytes --]

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

mtrr_save_state() is invoked from native_cpu_up() which is in the context
of a CPU hotplug operation and therefor calling get_online_cpus() is
pointless.

While this works in the current get_online_cpus() implementation it
prevents from converting the hotplug locking to percpu rwsems.

Remove it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org

---
 arch/x86/kernel/cpu/mtrr/main.c |    2 --
 1 file changed, 2 deletions(-)

--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -807,10 +807,8 @@ void mtrr_save_state(void)
 	if (!mtrr_enabled())
 		return;
 
-	get_online_cpus();
 	first_cpu = cpumask_first(cpu_online_mask);
 	smp_call_function_single(first_cpu, mtrr_save_fixed_ranges, NULL, 1);
-	put_online_cpus();
 }
 
 void set_mtrr_aps_delayed_init(void)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 06/20] cpufreq: Use cpuhp_setup_state_nocalls_locked()
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (4 preceding siblings ...)
  2017-04-15 17:01 ` [patch 05/20] x86/mtrr: Remove get_online_cpus() from mtrr_save_state() Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-15 22:54   ` Rafael J. Wysocki
  2017-04-17  4:14   ` Viresh Kumar
  2017-04-15 17:01 ` [patch 07/20] KVM/PPC/Book3S HV: " Thomas Gleixner
                   ` (13 subsequent siblings)
  19 siblings, 2 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Rafael J. Wysocki, Viresh Kumar, linux-pm

[-- Attachment #1: cpufreq-Use-cpuhp_setup_state_nocalls_locked.patch --]
[-- Type: text/plain, Size: 1599 bytes --]

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

cpufreq holds get_online_cpus() while invoking cpuhp_setup_state_nocalls()
to make subsys_interface_register() and the registration of hotplug calls
atomic versus cpu hotplug.

cpuhp_setup_state_nocalls() invokes get_online_cpus() as well. This is
correct, but prevents the conversion of the hotplug locking to a percpu
rwsem.

Use cpuhp_setup_state_nocalls_locked() to avoid the nested call.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linux-pm@vger.kernel.org

---
 drivers/cpufreq/cpufreq.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2473,9 +2473,10 @@ int cpufreq_register_driver(struct cpufr
 		goto err_if_unreg;
 	}
 
-	ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "cpufreq:online",
-					cpuhp_cpufreq_online,
-					cpuhp_cpufreq_offline);
+	ret = cpuhp_setup_state_nocalls_locked(CPUHP_AP_ONLINE_DYN,
+					       "cpufreq:online",
+					       cpuhp_cpufreq_online,
+					       cpuhp_cpufreq_offline);
 	if (ret < 0)
 		goto err_if_unreg;
 	hp_online = ret;
@@ -2519,7 +2520,7 @@ int cpufreq_unregister_driver(struct cpu
 	get_online_cpus();
 	subsys_interface_unregister(&cpufreq_interface);
 	remove_boost_sysfs_file();
-	cpuhp_remove_state_nocalls(hp_online);
+	cpuhp_remove_state_nocalls_locked(hp_online);
 
 	write_lock_irqsave(&cpufreq_driver_lock, flags);
 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 07/20] KVM/PPC/Book3S HV: Use cpuhp_setup_state_nocalls_locked()
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (5 preceding siblings ...)
  2017-04-15 17:01 ` [patch 06/20] cpufreq: Use cpuhp_setup_state_nocalls_locked() Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-15 17:01 ` [patch 08/20] hwtracing/coresight-etm3x: Use the locked version of cpuhp_setup_state_nocalls() Thomas Gleixner
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Alexander Graf, Benjamin Herrenschmidt, Michael Ellerman, kvm,
	kvm-ppc, linuxppc-dev

[-- Attachment #1: KVMPPCBook3S_HV_Use_cpuhp_setup_state_nocalls_locked.patch --]
[-- Type: text/plain, Size: 1290 bytes --]

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

kvmppc_alloc_host_rm_ops() holds get_online_cpus() while invoking
cpuhp_setup_state_nocalls().

cpuhp_setup_state_nocalls() invokes get_online_cpus() as well. This is
correct, but prevents the conversion of the hotplug locking to a percpu
rwsem.

Use cpuhp_setup_state_nocalls_locked() to avoid the nested call.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Graf <agraf@suse.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: kvm@vger.kernel.org
Cc: kvm-ppc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org

---
 arch/powerpc/kvm/book3s_hv.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3336,10 +3336,10 @@ void kvmppc_alloc_host_rm_ops(void)
 		return;
 	}
 
-	cpuhp_setup_state_nocalls(CPUHP_KVM_PPC_BOOK3S_PREPARE,
-				  "ppc/kvm_book3s:prepare",
-				  kvmppc_set_host_core,
-				  kvmppc_clear_host_core);
+	cpuhp_setup_state_nocalls_locked(CPUHP_KVM_PPC_BOOK3S_PREPARE,
+					 "ppc/kvm_book3s:prepare",
+					 kvmppc_set_host_core,
+					 kvmppc_clear_host_core);
 	put_online_cpus();
 }
 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 08/20] hwtracing/coresight-etm3x: Use the locked version of cpuhp_setup_state_nocalls()
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (6 preceding siblings ...)
  2017-04-15 17:01 ` [patch 07/20] KVM/PPC/Book3S HV: " Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-15 17:01 ` [patch 09/20] hwtracing/coresight-etm4x: Use cpuhp_setup_state_nocalls_locked() Thomas Gleixner
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Mathieu Poirier, linux-arm-kernel

[-- Attachment #1: wtracingcoresight-etm3x_Use_cpuhp_setup_state_nocalls_locked.patch --]
[-- Type: text/plain, Size: 1709 bytes --]

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

etm_probe() holds get_online_cpus() while invoking
cpuhp_setup_state_nocalls().

cpuhp_setup_state_nocalls() invokes get_online_cpus() as well. This is
correct, but prevents the conversion of the hotplug locking to a percpu
rwsem.

Use cpuhp_setup_state_nocalls_locked() to avoid the nested call.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org

---
 drivers/hwtracing/coresight/coresight-etm3x.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm3x.c b/drivers/hwtracing/coresight/coresight-etm3x.c
index a51b6b64ecdf..0887265f361d 100644
--- a/drivers/hwtracing/coresight/coresight-etm3x.c
+++ b/drivers/hwtracing/coresight/coresight-etm3x.c
@@ -803,12 +803,12 @@ static int etm_probe(struct amba_device *adev, const struct amba_id *id)
 		dev_err(dev, "ETM arch init failed\n");
 
 	if (!etm_count++) {
-		cpuhp_setup_state_nocalls(CPUHP_AP_ARM_CORESIGHT_STARTING,
-					  "arm/coresight:starting",
-					  etm_starting_cpu, etm_dying_cpu);
-		ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
-						"arm/coresight:online",
-						etm_online_cpu, NULL);
+		cpuhp_setup_state_nocalls_locked(CPUHP_AP_ARM_CORESIGHT_STARTING,
+						 "arm/coresight:starting",
+						 etm_starting_cpu, etm_dying_cpu);
+		ret = cpuhp_setup_state_nocalls_locked(CPUHP_AP_ONLINE_DYN,
+						       "arm/coresight:online",
+						       etm_online_cpu, NULL);
 		if (ret < 0)
 			goto err_arch_supported;
 		hp_online = ret;
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [patch 09/20] hwtracing/coresight-etm4x: Use cpuhp_setup_state_nocalls_locked()
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (7 preceding siblings ...)
  2017-04-15 17:01 ` [patch 08/20] hwtracing/coresight-etm3x: Use the locked version of cpuhp_setup_state_nocalls() Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-15 17:01 ` [patch 10/20] perf/x86/intel/cqm: Use cpuhp_setup_state_locked() Thomas Gleixner
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Mathieu Poirier, linux-arm-kernel

[-- Attachment #1: wtracingcoresight-etm4x_Use_cpuhp_setup_state_nocalls_locked.patch --]
[-- Type: text/plain, Size: 1533 bytes --]

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

etm_probe4() holds get_online_cpus() while invoking
cpuhp_setup_state_nocalls().

cpuhp_setup_state_nocalls() invokes get_online_cpus() as well. This is
correct, but prevents the conversion of the hotplug locking to a percpu
rwsem.

Use cpuhp_setup_state_nocalls_locked() to avoid the nested call.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org

---
 drivers/hwtracing/coresight/coresight-etm4x.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- a/drivers/hwtracing/coresight/coresight-etm4x.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x.c
@@ -990,12 +990,12 @@ static int etm4_probe(struct amba_device
 		dev_err(dev, "ETM arch init failed\n");
 
 	if (!etm4_count++) {
-		cpuhp_setup_state_nocalls(CPUHP_AP_ARM_CORESIGHT_STARTING,
-					  "arm/coresight4:starting",
-					  etm4_starting_cpu, etm4_dying_cpu);
-		ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
-						"arm/coresight4:online",
-						etm4_online_cpu, NULL);
+		cpuhp_setup_state_nocalls_locked(CPUHP_AP_ARM_CORESIGHT_STARTING,
+						 "arm/coresight4:starting",
+						 etm4_starting_cpu, etm4_dying_cpu);
+		ret = cpuhp_setup_state_nocalls_locked(CPUHP_AP_ONLINE_DYN,
+						       "arm/coresight4:online",
+						       etm4_online_cpu, NULL);
 		if (ret < 0)
 			goto err_arch_supported;
 		hp_online = ret;

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 10/20] perf/x86/intel/cqm: Use cpuhp_setup_state_locked()
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (8 preceding siblings ...)
  2017-04-15 17:01 ` [patch 09/20] hwtracing/coresight-etm4x: Use cpuhp_setup_state_nocalls_locked() Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-15 17:01 ` [patch 11/20] ARM/hw_breakpoint: " Thomas Gleixner
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	x86, Fenghua Yu

[-- Attachment #1: perfx86intelcqm_Use_cpuhp_setup_state_locked.patch --]
[-- Type: text/plain, Size: 1395 bytes --]

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

intel_cqm_init() holds get_online_cpus() while registerring the hotplug
callbacks.

cpuhp_setup_state() invokes get_online_cpus() as well. This is correct, but
prevents the conversion of the hotplug locking to a percpu rwsem.

Use cpuhp_setup_state_locked() to avoid the nested call.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: Fenghua Yu <fenghua.yu@intel.com>

---
 arch/x86/events/intel/cqm.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- a/arch/x86/events/intel/cqm.c
+++ b/arch/x86/events/intel/cqm.c
@@ -1746,12 +1746,12 @@ static int __init intel_cqm_init(void)
 	 * Setup the hot cpu notifier once we are sure cqm
 	 * is enabled to avoid notifier leak.
 	 */
-	cpuhp_setup_state(CPUHP_AP_PERF_X86_CQM_STARTING,
-			  "perf/x86/cqm:starting",
-			  intel_cqm_cpu_starting, NULL);
-	cpuhp_setup_state(CPUHP_AP_PERF_X86_CQM_ONLINE, "perf/x86/cqm:online",
-			  NULL, intel_cqm_cpu_exit);
-
+	cpuhp_setup_state_locked(CPUHP_AP_PERF_X86_CQM_STARTING,
+				 "perf/x86/cqm:starting",
+				 intel_cqm_cpu_starting, NULL);
+	cpuhp_setup_state_locked(CPUHP_AP_PERF_X86_CQM_ONLINE,
+				 "perf/x86/cqm:online",
+				 NULL, intel_cqm_cpu_exit);
 out:
 	put_online_cpus();
 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 11/20] ARM/hw_breakpoint: Use cpuhp_setup_state_locked()
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (9 preceding siblings ...)
  2017-04-15 17:01 ` [patch 10/20] perf/x86/intel/cqm: Use cpuhp_setup_state_locked() Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-15 17:01 ` [patch 12/20] s390/kernel: Use stop_machine_locked() Thomas Gleixner
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Will Deacon, Mark Rutland, Russell King, linux-arm-kernel

[-- Attachment #1: ARMhw_breakpoint_Use_cpuhp_setup_state_locked.patch --]
[-- Type: text/plain, Size: 1321 bytes --]

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

arch_hw_breakpoint_init() holds get_online_cpus() while registerring the
hotplug callbacks.

cpuhp_setup_state() invokes get_online_cpus() as well. This is correct, but
prevents the conversion of the hotplug locking to a percpu rwsem.

Use cpuhp_setup_state_locked() to avoid the nested call.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org

---
 arch/arm/kernel/hw_breakpoint.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/arch/arm/kernel/hw_breakpoint.c
+++ b/arch/arm/kernel/hw_breakpoint.c
@@ -1098,8 +1098,9 @@ static int __init arch_hw_breakpoint_ini
 	 * assume that a halting debugger will leave the world in a nice state
 	 * for us.
 	 */
-	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "arm/hw_breakpoint:online",
-				dbg_reset_online, NULL);
+	ret = cpuhp_setup_state_locked(CPUHP_AP_ONLINE_DYN,
+				       "arm/hw_breakpoint:online",
+				       dbg_reset_online, NULL);
 	unregister_undef_hook(&debug_reg_hook);
 	if (WARN_ON(ret < 0) || !cpumask_empty(&debug_err_mask)) {
 		core_num_brps = 0;

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 12/20] s390/kernel: Use stop_machine_locked()
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (10 preceding siblings ...)
  2017-04-15 17:01 ` [patch 11/20] ARM/hw_breakpoint: " Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-15 17:01 ` [patch 13/20] powerpc/powernv: " Thomas Gleixner
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Martin Schwidefsky, Heiko Carstens, David Hildenbrand,
	linux-s390

[-- Attachment #1: s390-kernel-Use_stop_machine_locked.patch --]
[-- Type: text/plain, Size: 1054 bytes --]

stp_work_fn() holds get_online_cpus() while invoking stop_machine().

stop_machine() invokes get_online_cpus() as well. This is correct, but
prevents the conversion of the hotplug locking to a percpu rwsem.

Use stop_machine_locked() to avoid the nested call.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: linux-s390@vger.kernel.org

---
 arch/s390/kernel/time.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/s390/kernel/time.c
+++ b/arch/s390/kernel/time.c
@@ -636,7 +636,7 @@ static void stp_work_fn(struct work_stru
 	memset(&stp_sync, 0, sizeof(stp_sync));
 	get_online_cpus();
 	atomic_set(&stp_sync.cpus, num_online_cpus() - 1);
-	stop_machine(stp_sync_clock, &stp_sync, cpu_online_mask);
+	stop_machine_locked(stp_sync_clock, &stp_sync, cpu_online_mask);
 	put_online_cpus();
 
 	if (!check_sync_clock())

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 13/20] powerpc/powernv: Use stop_machine_locked()
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (11 preceding siblings ...)
  2017-04-15 17:01 ` [patch 12/20] s390/kernel: Use stop_machine_locked() Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-15 17:01 ` [patch 14/20] kernel/hotplug: Use stop_machine_locked() in takedown_cpu() Thomas Gleixner
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Benjamin Herrenschmidt, Michael Ellerman, linuxppc-dev

[-- Attachment #1: powerpcpowernv_Use_stop_machine_locked.patch --]
[-- Type: text/plain, Size: 1019 bytes --]

set_subcores_per_core() holds get_online_cpus() while invoking stop_machine().

stop_machine() invokes get_online_cpus() as well. This is correct, but
prevents the conversion of the hotplug locking to a percpu rwsem.

Use stop_machine_locked() to avoid the nested call.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org

---
 arch/powerpc/platforms/powernv/subcore.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/powerpc/platforms/powernv/subcore.c
+++ b/arch/powerpc/platforms/powernv/subcore.c
@@ -356,7 +356,7 @@ static int set_subcores_per_core(int new
 	/* Ensure state is consistent before we call the other cpus */
 	mb();
 
-	stop_machine(cpu_update_split_mode, &new_mode, cpu_online_mask);
+	stop_machine_locked(cpu_update_split_mode, &new_mode, cpu_online_mask);
 
 	put_online_cpus();
 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 14/20] kernel/hotplug: Use stop_machine_locked() in takedown_cpu()
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (12 preceding siblings ...)
  2017-04-15 17:01 ` [patch 13/20] powerpc/powernv: " Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-15 17:01 ` [patch 15/20] x86/perf: Drop EXPORT of perf_check_microcode Thomas Gleixner
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML; +Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior

[-- Attachment #1: cpu-hotplug-Use-stop-machine-locked.patch --]
[-- Type: text/plain, Size: 933 bytes --]

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

takedown_cpu() is a cpu hotplug function invoking stop_machine(). The cpu
hotplug machinery holds the hotplug lock for write.

stop_machine() invokes get_online_cpus() as well. This is correct, but
prevents the conversion of the hotplug locking to a percpu rwsem.

Use stop_machine_locked() to avoid the nested call.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 kernel/cpu.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -701,7 +701,7 @@ static int takedown_cpu(unsigned int cpu
 	/*
 	 * So now all preempt/rcu users must observe !cpu_active().
 	 */
-	err = stop_machine(take_cpu_down, NULL, cpumask_of(cpu));
+	err = stop_machine_locked(take_cpu_down, NULL, cpumask_of(cpu));
 	if (err) {
 		/* CPU refused to die */
 		irq_unlock_sparse();

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 15/20] x86/perf: Drop EXPORT of perf_check_microcode
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (13 preceding siblings ...)
  2017-04-15 17:01 ` [patch 14/20] kernel/hotplug: Use stop_machine_locked() in takedown_cpu() Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-18 11:24   ` Borislav Petkov
  2017-04-15 17:01 ` [patch 16/20] perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode() Thomas Gleixner
                   ` (4 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Borislav Petkov, x86

[-- Attachment #1: x86-perf--Drop-EXPORT-of-perf_check_microcode.patch --]
[-- Type: text/plain, Size: 582 bytes --]

The only caller is the microcode update, which cannot be modular.

Drop the export.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
---
 arch/x86/events/core.c |    1 -
 1 file changed, 1 deletion(-)

--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2224,7 +2224,6 @@ void perf_check_microcode(void)
 	if (x86_pmu.check_microcode)
 		x86_pmu.check_microcode();
 }
-EXPORT_SYMBOL_GPL(perf_check_microcode);
 
 static struct pmu pmu = {
 	.pmu_enable		= x86_pmu_enable,

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 16/20] perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode()
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (14 preceding siblings ...)
  2017-04-15 17:01 ` [patch 15/20] x86/perf: Drop EXPORT of perf_check_microcode Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-18 11:27   ` Borislav Petkov
  2017-04-15 17:01 ` [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner
                   ` (3 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Borislav Petkov, x86

[-- Attachment #1: perfx86intel_Drop_get_online_cpus_in_intel_snb_check_microcode.patch --]
[-- Type: text/plain, Size: 1554 bytes --]

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

If intel_snb_check_microcode() is invoked via
  microcode_init -> perf_check_microcode -> intel_snb_check_microcode

then get_online_cpus() is invoked nested. This works with the current
implementation of get_online_cpus() but prevents converting it to a percpu
rwsem.

intel_snb_check_microcode() is also invoked from intel_sandybridge_quirk()
unprotected.

Drop get_online_cpus() from intel_snb_check_microcode() and add it to
intel_sandybridge_quirk() so both call sites are protected.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org

---
 arch/x86/events/intel/core.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3389,12 +3389,10 @@ static void intel_snb_check_microcode(vo
 	int pebs_broken = 0;
 	int cpu;
 
-	get_online_cpus();
 	for_each_online_cpu(cpu) {
 		if ((pebs_broken = intel_snb_pebs_broken(cpu)))
 			break;
 	}
-	put_online_cpus();
 
 	if (pebs_broken == x86_pmu.pebs_broken)
 		return;
@@ -3467,7 +3465,9 @@ static bool check_msr(unsigned long msr,
 static __init void intel_sandybridge_quirk(void)
 {
 	x86_pmu.check_microcode = intel_snb_check_microcode;
+	get_online_cpus();
 	intel_snb_check_microcode();
+	put_online_cpus();
 }
 
 static const struct { int id; char *name; } intel_arch_events_map[] __initconst = {

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus()
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (15 preceding siblings ...)
  2017-04-15 17:01 ` [patch 16/20] perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode() Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-17  6:46   ` Peter Zijlstra
  2017-04-18 19:44   ` Bjorn Helgaas
  2017-04-15 17:01 ` [patch 18/20] PCI: Replace the racy recursion prevention Thomas Gleixner
                   ` (2 subsequent siblings)
  19 siblings, 2 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Bjorn Helgaas, linux-pci

[-- Attachment #1: PCI--Use-cpu_hotplug_disable-instead-of-get_online_cpus.patch --]
[-- Type: text/plain, Size: 2760 bytes --]

Converting the hotplug locking, i.e. get_online_cpus(), to a percpu rwsem
unearthed a circular lock dependency which was hidden from lockdep due to
the lockdep annotation of get_online_cpus() which prevents lockdep from
creating full dependency chains. There are several variants of this. And
example is:

Chain exists of:

cpu_hotplug_lock.rw_sem --> drm_global_mutex --> &item->mutex

CPU0                    CPU1
----                    ----
lock(&item->mutex);
                        lock(drm_global_mutex);
                        lock(&item->mutex);
lock(cpu_hotplug_lock.rw_sem);

because there are dependencies through workqueues. The call chain is:

	get_online_cpus
	apply_workqueue_attrs
	__alloc_workqueue_key
	ttm_mem_global_init
	ast_ttm_mem_global_init
	drm_global_item_ref
	ast_mm_init
	ast_driver_load
	drm_dev_register
	drm_get_pci_dev
	ast_pci_probe
	local_pci_probe
	work_for_cpu_fn
	process_one_work
	worker_thread

This is not a problem of get_online_cpus() recursion, it's a possible
deadlock undetected by lockdep so far.

The cure is to use cpu_hotplug_disable() instead of get_online_cpus() to
protect the PCI probing.

There is a side effect to this: cpu_hotplug_disable() makes a concurrent
cpu hotplug attempt via the sysfs interfaces fail with -EBUSY, but PCI
probing usually happens during the boot process where no interaction is
possible. Any later invocations are infrequent enough and concurrent
hotplug attempts are so unlikely that the danger of user space visible
regressions is very close to zero. Anyway, thats preferrable over a real
deadlock.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
---
 drivers/pci/pci-driver.c |   15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -320,10 +320,19 @@ static long local_pci_probe(void *_ddi)
 	return 0;
 }
 
+static bool pci_physfn_is_probed(struct pci_dev *dev)
+{
+#ifdef CONFIG_ATS
+	return dev->physfn->is_probed;
+#else
+	return false;
+#endif
+}
+
 static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 			  const struct pci_device_id *id)
 {
-	int error, node;
+	int error, node, cpu;
 	struct drv_dev_and_id ddi = { drv, dev, id };
 
 	/*
@@ -349,13 +358,13 @@ static int pci_call_probe(struct pci_dri
 	if (node >= 0 && node != numa_node_id()) {
 		int cpu;
 
-		get_online_cpus();
+		cpu_hotplug_disable();
 		cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
 		if (cpu < nr_cpu_ids)
 			error = work_on_cpu(cpu, local_pci_probe, &ddi);
 		else
 			error = local_pci_probe(&ddi);
-		put_online_cpus();
+		cpu_hotplug_enable();
 	} else
 		error = local_pci_probe(&ddi);
 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 18/20] PCI: Replace the racy recursion prevention
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (16 preceding siblings ...)
  2017-04-15 17:01 ` [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-15 17:01 ` [patch 19/20] ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner
  2017-04-15 17:01 ` [patch 20/20] cpu/hotplug: Convert hotplug locking to percpu rwsem Thomas Gleixner
  19 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Bjorn Helgaas, linux-pci

[-- Attachment #1: PCI--Replace-the-racy-recursion-prevention.patch --]
[-- Type: text/plain, Size: 3208 bytes --]

pci_call_probe() can called recursively when a physcial function is probed
and the probing creates virtual functions, which are populated via
pci_bus_add_device() which in turn can end up calling pci_call_probe()
again.

The code has an interesting way to prevent recursing into the workqueue
code.  That's accomplished by a check whether the current task runs already
on the numa node which is associated with the device.

While that works to prevent the recursion into the workqueue code, it's
racy versus normal execution as there is no guarantee that the node does
not vanish after the check.

Make the detection reliable by:

 - Mark a probed device as 'is_probed' in pci_call_probe()
 
 - Check in pci_call_probe for a virtual function. If it's a virtual
   function and the associated physical function device is marked
   'is_probed' then this is a recursive call, so the call can be invoked in
   the calling context.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
---
 drivers/pci/pci-driver.c |   35 ++++++++++++++---------------------
 include/linux/pci.h      |    1 +
 2 files changed, 15 insertions(+), 21 deletions(-)

--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -341,33 +341,26 @@ static int pci_call_probe(struct pci_dri
 	 * on the right node.
 	 */
 	node = dev_to_node(&dev->dev);
+	dev->is_probed = 1;
+
+	cpu_hotplug_disable();
 
 	/*
-	 * On NUMA systems, we are likely to call a PF probe function using
-	 * work_on_cpu().  If that probe calls pci_enable_sriov() (which
-	 * adds the VF devices via pci_bus_add_device()), we may re-enter
-	 * this function to call the VF probe function.  Calling
-	 * work_on_cpu() again will cause a lockdep warning.  Since VFs are
-	 * always on the same node as the PF, we can work around this by
-	 * avoiding work_on_cpu() when we're already on the correct node.
-	 *
-	 * Preemption is enabled, so it's theoretically unsafe to use
-	 * numa_node_id(), but even if we run the probe function on the
-	 * wrong node, it should be functionally correct.
+	 * Prevent nesting work_on_cpu() for the case where a Virtual Function
+	 * device is probed from work_on_cpu() of the Physical device.
 	 */
-	if (node >= 0 && node != numa_node_id()) {
-		int cpu;
-
-		cpu_hotplug_disable();
+	if (dev->is_virtfn && pci_physfn_is_probed(dev))
+		cpu = nr_cpu_ids;
+	else
 		cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
-		if (cpu < nr_cpu_ids)
-			error = work_on_cpu(cpu, local_pci_probe, &ddi);
-		else
-			error = local_pci_probe(&ddi);
-		cpu_hotplug_enable();
-	} else
+
+	if (cpu < nr_cpu_ids)
+		error = work_on_cpu(cpu, local_pci_probe, &ddi);
+	else
 		error = local_pci_probe(&ddi);
 
+	dev->is_probed = 0;
+	cpu_hotplug_enable();
 	return error;
 }
 
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -365,6 +365,7 @@ struct pci_dev {
 	unsigned int	irq_managed:1;
 	unsigned int	has_secondary_link:1;
 	unsigned int	non_compliant_bars:1;	/* broken BARs; ignore them */
+	unsigned int	is_probed:1;		/* device probing in progress */
 	pci_dev_flags_t dev_flags;
 	atomic_t	enable_cnt;	/* pci_enable_device has been called */
 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 19/20] ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus()
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (17 preceding siblings ...)
  2017-04-15 17:01 ` [patch 18/20] PCI: Replace the racy recursion prevention Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-16 22:53   ` Rafael J. Wysocki
  2017-04-15 17:01 ` [patch 20/20] cpu/hotplug: Convert hotplug locking to percpu rwsem Thomas Gleixner
  19 siblings, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Rafael J. Wysocki, Len Brown, linux-acpi

[-- Attachment #1: ACPI-processor--Use-cpu_hotplug_disable-instead-of-get_online_cpus.patch --]
[-- Type: text/plain, Size: 1918 bytes --]

Converting the hotplug locking, i.e. get_online_cpus(), to a percpu rwsem
unearthed a circular lock dependency which was hidden from lockdep due to
the lockdep annotation of get_online_cpus() which prevents lockdep from
creating full dependency chains.

CPU0                    CPU1
----                    ----
lock((&wfc.work));
                         lock(cpu_hotplug_lock.rw_sem);
                         lock((&wfc.work));
lock(cpu_hotplug_lock.rw_sem);

This dependency is established via acpi_processor_start() which calls into
the work queue code. And the work queue code establishes the reverse
dependency.

This is not a problem of get_online_cpus() recursion, it's a possible
deadlock undetected by lockdep so far.

The cure is to use cpu_hotplug_disable() instead of get_online_cpus() to
protect the probing from acpi_processor_start().

There is a side effect to this: cpu_hotplug_disable() makes a concurrent
cpu hotplug attempt via the sysfs interfaces fail with -EBUSY, but that
probing usually happens during the boot process where no interaction is
possible. Any later invocations are infrequent enough and concurrent
hotplug attempts are so unlikely that the danger of user space visible
regressions is very close to zero. Anyway, thats preferrable over a real
deadlock.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: linux-acpi@vger.kernel.org
---
 drivers/acpi/processor_driver.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -268,9 +268,9 @@ static int acpi_processor_start(struct d
 		return -ENODEV;
 
 	/* Protect against concurrent CPU hotplug operations */
-	get_online_cpus();
+	cpu_hotplug_disable();
 	ret = __acpi_processor_start(device);
-	put_online_cpus();
+	cpu_hotplug_enable();
 	return ret;
 }
 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch 20/20] cpu/hotplug: Convert hotplug locking to percpu rwsem
  2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
                   ` (18 preceding siblings ...)
  2017-04-15 17:01 ` [patch 19/20] ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner
@ 2017-04-15 17:01 ` Thomas Gleixner
  2017-04-17  6:50   ` Peter Zijlstra
  19 siblings, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-15 17:01 UTC (permalink / raw)
  To: LKML; +Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Sebastian Siewior

[-- Attachment #1: cpu-hotplug--Convert-hotplug-locking-to-percpu-rwsem.patch --]
[-- Type: text/plain, Size: 4854 bytes --]

There are no more (known) nested calls to get_online_cpus() so it's
possible to remove the nested call magic and convert the mutex to a
percpu-rwsem, which speeds up get/put_online_cpus() significantly for the
uncontended case.

The contended case (write locked for hotplug operations) is slow anyway, so
the slightly more expensive down_write of the percpu rwsem does not matter.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/cpu.c |  102 ++++-------------------------------------------------------
 1 file changed, 8 insertions(+), 94 deletions(-)

--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -27,6 +27,7 @@
 #include <linux/smpboot.h>
 #include <linux/relay.h>
 #include <linux/slab.h>
+#include <linux/percpu-rwsem.h>
 
 #include <trace/events/power.h>
 #define CREATE_TRACE_POINTS
@@ -196,121 +197,36 @@ void cpu_maps_update_done(void)
 	mutex_unlock(&cpu_add_remove_lock);
 }
 
-/* If set, cpu_up and cpu_down will return -EBUSY and do nothing.
+/*
+ * If set, cpu_up and cpu_down will return -EBUSY and do nothing.
  * Should always be manipulated under cpu_add_remove_lock
  */
 static int cpu_hotplug_disabled;
 
 #ifdef CONFIG_HOTPLUG_CPU
 
-static struct {
-	struct task_struct *active_writer;
-	/* wait queue to wake up the active_writer */
-	wait_queue_head_t wq;
-	/* verifies that no writer will get active while readers are active */
-	struct mutex lock;
-	/*
-	 * Also blocks the new readers during
-	 * an ongoing cpu hotplug operation.
-	 */
-	atomic_t refcount;
-
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	struct lockdep_map dep_map;
-#endif
-} cpu_hotplug = {
-	.active_writer = NULL,
-	.wq = __WAIT_QUEUE_HEAD_INITIALIZER(cpu_hotplug.wq),
-	.lock = __MUTEX_INITIALIZER(cpu_hotplug.lock),
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	.dep_map = STATIC_LOCKDEP_MAP_INIT("cpu_hotplug.dep_map", &cpu_hotplug.dep_map),
-#endif
-};
-
-/* Lockdep annotations for get/put_online_cpus() and cpu_hotplug_begin/end() */
-#define cpuhp_lock_acquire_read() lock_map_acquire_read(&cpu_hotplug.dep_map)
-#define cpuhp_lock_acquire_tryread() \
-				  lock_map_acquire_tryread(&cpu_hotplug.dep_map)
-#define cpuhp_lock_acquire()      lock_map_acquire(&cpu_hotplug.dep_map)
-#define cpuhp_lock_release()      lock_map_release(&cpu_hotplug.dep_map)
-
+DEFINE_STATIC_PERCPU_RWSEM(cpu_hotplug_lock);
 
 void get_online_cpus(void)
 {
-	might_sleep();
-	if (cpu_hotplug.active_writer == current)
-		return;
-	cpuhp_lock_acquire_read();
-	mutex_lock(&cpu_hotplug.lock);
-	atomic_inc(&cpu_hotplug.refcount);
-	mutex_unlock(&cpu_hotplug.lock);
+	percpu_down_read(&cpu_hotplug_lock);
 }
 EXPORT_SYMBOL_GPL(get_online_cpus);
 
 void put_online_cpus(void)
 {
-	int refcount;
-
-	if (cpu_hotplug.active_writer == current)
-		return;
-
-	refcount = atomic_dec_return(&cpu_hotplug.refcount);
-	if (WARN_ON(refcount < 0)) /* try to fix things up */
-		atomic_inc(&cpu_hotplug.refcount);
-
-	if (refcount <= 0 && waitqueue_active(&cpu_hotplug.wq))
-		wake_up(&cpu_hotplug.wq);
-
-	cpuhp_lock_release();
-
+	percpu_up_read(&cpu_hotplug_lock);
 }
 EXPORT_SYMBOL_GPL(put_online_cpus);
 
-/*
- * This ensures that the hotplug operation can begin only when the
- * refcount goes to zero.
- *
- * Note that during a cpu-hotplug operation, the new readers, if any,
- * will be blocked by the cpu_hotplug.lock
- *
- * Since cpu_hotplug_begin() is always called after invoking
- * cpu_maps_update_begin(), we can be sure that only one writer is active.
- *
- * Note that theoretically, there is a possibility of a livelock:
- * - Refcount goes to zero, last reader wakes up the sleeping
- *   writer.
- * - Last reader unlocks the cpu_hotplug.lock.
- * - A new reader arrives at this moment, bumps up the refcount.
- * - The writer acquires the cpu_hotplug.lock finds the refcount
- *   non zero and goes to sleep again.
- *
- * However, this is very difficult to achieve in practice since
- * get_online_cpus() not an api which is called all that often.
- *
- */
 void cpu_hotplug_begin(void)
 {
-	DEFINE_WAIT(wait);
-
-	cpu_hotplug.active_writer = current;
-	cpuhp_lock_acquire();
-
-	for (;;) {
-		mutex_lock(&cpu_hotplug.lock);
-		prepare_to_wait(&cpu_hotplug.wq, &wait, TASK_UNINTERRUPTIBLE);
-		if (likely(!atomic_read(&cpu_hotplug.refcount)))
-				break;
-		mutex_unlock(&cpu_hotplug.lock);
-		schedule();
-	}
-	finish_wait(&cpu_hotplug.wq, &wait);
+	percpu_down_write(&cpu_hotplug_lock);
 }
 
 void cpu_hotplug_done(void)
 {
-	cpu_hotplug.active_writer = NULL;
-	mutex_unlock(&cpu_hotplug.lock);
-	cpuhp_lock_release();
+	percpu_up_write(&cpu_hotplug_lock);
 }
 
 /*
@@ -344,8 +260,6 @@ void cpu_hotplug_enable(void)
 EXPORT_SYMBOL_GPL(cpu_hotplug_enable);
 #endif	/* CONFIG_HOTPLUG_CPU */
 
-/* Notifier wrappers for transitioning to state machine */
-
 static int bringup_wait_for_ap(unsigned int cpu)
 {
 	struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch 06/20] cpufreq: Use cpuhp_setup_state_nocalls_locked()
  2017-04-15 17:01 ` [patch 06/20] cpufreq: Use cpuhp_setup_state_nocalls_locked() Thomas Gleixner
@ 2017-04-15 22:54   ` Rafael J. Wysocki
  2017-04-17  4:14   ` Viresh Kumar
  1 sibling, 0 replies; 33+ messages in thread
From: Rafael J. Wysocki @ 2017-04-15 22:54 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Steven Rostedt,
	Sebastian Siewior, Rafael J. Wysocki, Viresh Kumar, Linux PM

On Sat, Apr 15, 2017 at 7:01 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>
> cpufreq holds get_online_cpus() while invoking cpuhp_setup_state_nocalls()
> to make subsys_interface_register() and the registration of hotplug calls
> atomic versus cpu hotplug.
>
> cpuhp_setup_state_nocalls() invokes get_online_cpus() as well. This is
> correct, but prevents the conversion of the hotplug locking to a percpu
> rwsem.
>
> Use cpuhp_setup_state_nocalls_locked() to avoid the nested call.
>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: linux-pm@vger.kernel.org

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

>
> ---
>  drivers/cpufreq/cpufreq.c |    9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -2473,9 +2473,10 @@ int cpufreq_register_driver(struct cpufr
>                 goto err_if_unreg;
>         }
>
> -       ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "cpufreq:online",
> -                                       cpuhp_cpufreq_online,
> -                                       cpuhp_cpufreq_offline);
> +       ret = cpuhp_setup_state_nocalls_locked(CPUHP_AP_ONLINE_DYN,
> +                                              "cpufreq:online",
> +                                              cpuhp_cpufreq_online,
> +                                              cpuhp_cpufreq_offline);
>         if (ret < 0)
>                 goto err_if_unreg;
>         hp_online = ret;
> @@ -2519,7 +2520,7 @@ int cpufreq_unregister_driver(struct cpu
>         get_online_cpus();
>         subsys_interface_unregister(&cpufreq_interface);
>         remove_boost_sysfs_file();
> -       cpuhp_remove_state_nocalls(hp_online);
> +       cpuhp_remove_state_nocalls_locked(hp_online);
>
>         write_lock_irqsave(&cpufreq_driver_lock, flags);
>
>
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch 03/20] padata: Make padata_alloc() static
  2017-04-15 17:01 ` [patch 03/20] padata: Make padata_alloc() static Thomas Gleixner
@ 2017-04-16  6:22   ` Jason A. Donenfeld
  2017-04-17  9:14     ` Thomas Gleixner
  0 siblings, 1 reply; 33+ messages in thread
From: Jason A. Donenfeld @ 2017-04-16  6:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Steven Rostedt,
	Sebastian Siewior, Steffen Klassert, Linux Crypto Mailing List

I rather like this option of padata, which, since it lives in
kernel/padata.c and linux/padata.h, should be generic and useful for
other components. Seems like the ability to allocate it for a
particular set of worker CPUs and callback CPUs could be useful down
the line. Would rather not see it become static.

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch 19/20] ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus()
  2017-04-15 17:01 ` [patch 19/20] ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner
@ 2017-04-16 22:53   ` Rafael J. Wysocki
  0 siblings, 0 replies; 33+ messages in thread
From: Rafael J. Wysocki @ 2017-04-16 22:53 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Steven Rostedt,
	Sebastian Siewior, Len Brown, linux-acpi

On Saturday, April 15, 2017 07:01:26 PM Thomas Gleixner wrote:
> Converting the hotplug locking, i.e. get_online_cpus(), to a percpu rwsem
> unearthed a circular lock dependency which was hidden from lockdep due to
> the lockdep annotation of get_online_cpus() which prevents lockdep from
> creating full dependency chains.
> 
> CPU0                    CPU1
> ----                    ----
> lock((&wfc.work));
>                          lock(cpu_hotplug_lock.rw_sem);
>                          lock((&wfc.work));
> lock(cpu_hotplug_lock.rw_sem);
> 
> This dependency is established via acpi_processor_start() which calls into
> the work queue code. And the work queue code establishes the reverse
> dependency.
> 
> This is not a problem of get_online_cpus() recursion, it's a possible
> deadlock undetected by lockdep so far.
> 
> The cure is to use cpu_hotplug_disable() instead of get_online_cpus() to
> protect the probing from acpi_processor_start().
> 
> There is a side effect to this: cpu_hotplug_disable() makes a concurrent
> cpu hotplug attempt via the sysfs interfaces fail with -EBUSY, but that
> probing usually happens during the boot process where no interaction is
> possible. Any later invocations are infrequent enough and concurrent
> hotplug attempts are so unlikely that the danger of user space visible
> regressions is very close to zero. Anyway, thats preferrable over a real
> deadlock.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Len Brown <lenb@kernel.org>
> Cc: linux-acpi@vger.kernel.org

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> ---
>  drivers/acpi/processor_driver.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> --- a/drivers/acpi/processor_driver.c
> +++ b/drivers/acpi/processor_driver.c
> @@ -268,9 +268,9 @@ static int acpi_processor_start(struct d
>  		return -ENODEV;
>  
>  	/* Protect against concurrent CPU hotplug operations */
> -	get_online_cpus();
> +	cpu_hotplug_disable();
>  	ret = __acpi_processor_start(device);
> -	put_online_cpus();
> +	cpu_hotplug_enable();
>  	return ret;
>  }
>  
> 
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch 06/20] cpufreq: Use cpuhp_setup_state_nocalls_locked()
  2017-04-15 17:01 ` [patch 06/20] cpufreq: Use cpuhp_setup_state_nocalls_locked() Thomas Gleixner
  2017-04-15 22:54   ` Rafael J. Wysocki
@ 2017-04-17  4:14   ` Viresh Kumar
  1 sibling, 0 replies; 33+ messages in thread
From: Viresh Kumar @ 2017-04-17  4:14 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Steven Rostedt,
	Sebastian Siewior, Rafael J. Wysocki, linux-pm

On 15-04-17, 19:01, Thomas Gleixner wrote:
> From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> 
> cpufreq holds get_online_cpus() while invoking cpuhp_setup_state_nocalls()
> to make subsys_interface_register() and the registration of hotplug calls
> atomic versus cpu hotplug.
> 
> cpuhp_setup_state_nocalls() invokes get_online_cpus() as well. This is
> correct, but prevents the conversion of the hotplug locking to a percpu
> rwsem.
> 
> Use cpuhp_setup_state_nocalls_locked() to avoid the nested call.
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: linux-pm@vger.kernel.org
> 
> ---
>  drivers/cpufreq/cpufreq.c |    9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -2473,9 +2473,10 @@ int cpufreq_register_driver(struct cpufr
>  		goto err_if_unreg;
>  	}
>  
> -	ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "cpufreq:online",
> -					cpuhp_cpufreq_online,
> -					cpuhp_cpufreq_offline);
> +	ret = cpuhp_setup_state_nocalls_locked(CPUHP_AP_ONLINE_DYN,
> +					       "cpufreq:online",
> +					       cpuhp_cpufreq_online,
> +					       cpuhp_cpufreq_offline);
>  	if (ret < 0)
>  		goto err_if_unreg;
>  	hp_online = ret;
> @@ -2519,7 +2520,7 @@ int cpufreq_unregister_driver(struct cpu
>  	get_online_cpus();
>  	subsys_interface_unregister(&cpufreq_interface);
>  	remove_boost_sysfs_file();
> -	cpuhp_remove_state_nocalls(hp_online);
> +	cpuhp_remove_state_nocalls_locked(hp_online);
>  
>  	write_lock_irqsave(&cpufreq_driver_lock, flags);

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus()
  2017-04-15 17:01 ` [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner
@ 2017-04-17  6:46   ` Peter Zijlstra
  2017-04-17  7:40     ` Thomas Gleixner
  2017-04-18 19:44   ` Bjorn Helgaas
  1 sibling, 1 reply; 33+ messages in thread
From: Peter Zijlstra @ 2017-04-17  6:46 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Bjorn Helgaas, linux-pci

On Sat, Apr 15, 2017 at 07:01:24PM +0200, Thomas Gleixner wrote:
> +++ b/drivers/pci/pci-driver.c
> @@ -320,10 +320,19 @@ static long local_pci_probe(void *_ddi)
>  	return 0;
>  }
>  
> +static bool pci_physfn_is_probed(struct pci_dev *dev)
> +{
> +#ifdef CONFIG_ATS
> +	return dev->physfn->is_probed;
> +#else
> +	return false;
> +#endif
> +}
> +

Should be in the next patch perhaps?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch 20/20] cpu/hotplug: Convert hotplug locking to percpu rwsem
  2017-04-15 17:01 ` [patch 20/20] cpu/hotplug: Convert hotplug locking to percpu rwsem Thomas Gleixner
@ 2017-04-17  6:50   ` Peter Zijlstra
  0 siblings, 0 replies; 33+ messages in thread
From: Peter Zijlstra @ 2017-04-17  6:50 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, Ingo Molnar, Steven Rostedt, Sebastian Siewior

On Sat, Apr 15, 2017 at 07:01:27PM +0200, Thomas Gleixner wrote:
> There are no more (known) nested calls to get_online_cpus() so it's
> possible to remove the nested call magic and convert the mutex to a
> percpu-rwsem, which speeds up get/put_online_cpus() significantly for the
> uncontended case.
> 
> The contended case (write locked for hotplug operations) is slow anyway, so
> the slightly more expensive down_write of the percpu rwsem does not matter.

So the previous thing was reader biassed and thus prone to writer
starvation. So a slightly more expensive write path doesn't matter; esp.
as its now fair and provides a guarantee it will happen, unlike the
previous one.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus()
  2017-04-17  6:46   ` Peter Zijlstra
@ 2017-04-17  7:40     ` Thomas Gleixner
  0 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-17  7:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Ingo Molnar, Steven Rostedt, Sebastian Siewior,
	Bjorn Helgaas, linux-pci

On Mon, 17 Apr 2017, Peter Zijlstra wrote:
> On Sat, Apr 15, 2017 at 07:01:24PM +0200, Thomas Gleixner wrote:
> > +++ b/drivers/pci/pci-driver.c
> > @@ -320,10 +320,19 @@ static long local_pci_probe(void *_ddi)
> >  	return 0;
> >  }
> >  
> > +static bool pci_physfn_is_probed(struct pci_dev *dev)
> > +{
> > +#ifdef CONFIG_ATS
> > +	return dev->physfn->is_probed;
> > +#else
> > +	return false;
> > +#endif
> > +}
> > +
> 
> Should be in the next patch perhaps?

Indeed.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch 03/20] padata: Make padata_alloc() static
  2017-04-16  6:22   ` Jason A. Donenfeld
@ 2017-04-17  9:14     ` Thomas Gleixner
  0 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-17  9:14 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Steven Rostedt,
	Sebastian Siewior, Steffen Klassert, Linux Crypto Mailing List

On Sun, 16 Apr 2017, Jason A. Donenfeld wrote:

> I rather like this option of padata, which, since it lives in
> kernel/padata.c and linux/padata.h, should be generic and useful for
> other components. Seems like the ability to allocate it for a
> particular set of worker CPUs and callback CPUs could be useful down
> the line. Would rather not see it become static.

It's simple enough to export it once there is an actual user. Just keeping
stuff global because it might be useful somewhere down the road is really
pointless.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch 15/20] x86/perf: Drop EXPORT of perf_check_microcode
  2017-04-15 17:01 ` [patch 15/20] x86/perf: Drop EXPORT of perf_check_microcode Thomas Gleixner
@ 2017-04-18 11:24   ` Borislav Petkov
  0 siblings, 0 replies; 33+ messages in thread
From: Borislav Petkov @ 2017-04-18 11:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Steven Rostedt,
	Sebastian Siewior, x86

On Sat, Apr 15, 2017 at 07:01:22PM +0200, Thomas Gleixner wrote:
> The only caller is the microcode update, which cannot be modular.
> 
> Drop the export.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: x86@kernel.org
> ---
>  arch/x86/events/core.c |    1 -
>  1 file changed, 1 deletion(-)
> 
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -2224,7 +2224,6 @@ void perf_check_microcode(void)
>  	if (x86_pmu.check_microcode)
>  		x86_pmu.check_microcode();
>  }
> -EXPORT_SYMBOL_GPL(perf_check_microcode);
>  
>  static struct pmu pmu = {
>  	.pmu_enable		= x86_pmu_enable,

Acked-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch 16/20] perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode()
  2017-04-15 17:01 ` [patch 16/20] perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode() Thomas Gleixner
@ 2017-04-18 11:27   ` Borislav Petkov
  0 siblings, 0 replies; 33+ messages in thread
From: Borislav Petkov @ 2017-04-18 11:27 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Steven Rostedt,
	Sebastian Siewior, x86

On Sat, Apr 15, 2017 at 07:01:23PM +0200, Thomas Gleixner wrote:
> From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> 
> If intel_snb_check_microcode() is invoked via
>   microcode_init -> perf_check_microcode -> intel_snb_check_microcode
> 
> then get_online_cpus() is invoked nested. This works with the current
> implementation of get_online_cpus() but prevents converting it to a percpu
> rwsem.
> 
> intel_snb_check_microcode() is also invoked from intel_sandybridge_quirk()
> unprotected.
> 
> Drop get_online_cpus() from intel_snb_check_microcode() and add it to
> intel_sandybridge_quirk() so both call sites are protected.
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: x86@kernel.org

Acked-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus()
  2017-04-15 17:01 ` [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner
  2017-04-17  6:46   ` Peter Zijlstra
@ 2017-04-18 19:44   ` Bjorn Helgaas
  2017-04-18 19:51     ` Thomas Gleixner
  1 sibling, 1 reply; 33+ messages in thread
From: Bjorn Helgaas @ 2017-04-18 19:44 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Steven Rostedt,
	Sebastian Siewior, Bjorn Helgaas, linux-pci

On Sat, Apr 15, 2017 at 07:01:24PM +0200, Thomas Gleixner wrote:
> Converting the hotplug locking, i.e. get_online_cpus(), to a percpu rwsem
> unearthed a circular lock dependency which was hidden from lockdep due to
> the lockdep annotation of get_online_cpus() which prevents lockdep from
> creating full dependency chains. There are several variants of this. And
> example is:
> 
> Chain exists of:
> 
> cpu_hotplug_lock.rw_sem --> drm_global_mutex --> &item->mutex
> 
> CPU0                    CPU1
> ----                    ----
> lock(&item->mutex);
>                         lock(drm_global_mutex);
>                         lock(&item->mutex);
> lock(cpu_hotplug_lock.rw_sem);
> 
> because there are dependencies through workqueues. The call chain is:
> 
> 	get_online_cpus
> 	apply_workqueue_attrs
> 	__alloc_workqueue_key
> 	ttm_mem_global_init
> 	ast_ttm_mem_global_init
> 	drm_global_item_ref
> 	ast_mm_init
> 	ast_driver_load
> 	drm_dev_register
> 	drm_get_pci_dev
> 	ast_pci_probe
> 	local_pci_probe
> 	work_for_cpu_fn
> 	process_one_work
> 	worker_thread
> 
> This is not a problem of get_online_cpus() recursion, it's a possible
> deadlock undetected by lockdep so far.
> 
> The cure is to use cpu_hotplug_disable() instead of get_online_cpus() to
> protect the PCI probing.
> 
> There is a side effect to this: cpu_hotplug_disable() makes a concurrent
> cpu hotplug attempt via the sysfs interfaces fail with -EBUSY, but PCI
> probing usually happens during the boot process where no interaction is
> possible. Any later invocations are infrequent enough and concurrent
> hotplug attempts are so unlikely that the danger of user space visible
> regressions is very close to zero. Anyway, thats preferrable over a real
> deadlock.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: linux-pci@vger.kernel.org
> ---
>  drivers/pci/pci-driver.c |   15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -320,10 +320,19 @@ static long local_pci_probe(void *_ddi)
>  	return 0;
>  }
>  
> +static bool pci_physfn_is_probed(struct pci_dev *dev)
> +{
> +#ifdef CONFIG_ATS

I think this was intended to be CONFIG_PCI_ATS, not CONFIG_ATS.

But I think CONFIG_PCI_IOV would be more appropriate.  With that, and
squashing this into the next patch,

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

I expect you'll merge this along with the rest of the series.  Let me
know if you need anything else from me.

> +	return dev->physfn->is_probed;
> +#else
> +	return false;
> +#endif
> +}
> +
>  static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
>  			  const struct pci_device_id *id)
>  {
> -	int error, node;
> +	int error, node, cpu;
>  	struct drv_dev_and_id ddi = { drv, dev, id };
>  
>  	/*
> @@ -349,13 +358,13 @@ static int pci_call_probe(struct pci_dri
>  	if (node >= 0 && node != numa_node_id()) {
>  		int cpu;
>  
> -		get_online_cpus();
> +		cpu_hotplug_disable();
>  		cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
>  		if (cpu < nr_cpu_ids)
>  			error = work_on_cpu(cpu, local_pci_probe, &ddi);
>  		else
>  			error = local_pci_probe(&ddi);
> -		put_online_cpus();
> +		cpu_hotplug_enable();
>  	} else
>  		error = local_pci_probe(&ddi);
>  
> 
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus()
  2017-04-18 19:44   ` Bjorn Helgaas
@ 2017-04-18 19:51     ` Thomas Gleixner
  0 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2017-04-18 19:51 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Steven Rostedt,
	Sebastian Siewior, Bjorn Helgaas, linux-pci

On Tue, 18 Apr 2017, Bjorn Helgaas wrote:
> > +static bool pci_physfn_is_probed(struct pci_dev *dev)
> > +{
> > +#ifdef CONFIG_ATS
> 
> I think this was intended to be CONFIG_PCI_ATS, not CONFIG_ATS.

yes.

> 
> But I think CONFIG_PCI_IOV would be more appropriate.  With that, and

The physfn member is under CONFIG_PCI_ATS so I took that one, but you are
right PCI_IOV is the proper one. Will fix.

> squashing this into the next patch,

Did so

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2017-04-18 19:51 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-15 17:01 [patch 00/20] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem Thomas Gleixner
2017-04-15 17:01 ` [patch 01/20] cpu/hotplug: Provide cpuhp_setup/remove_state[_nocalls]_locked() Thomas Gleixner
2017-04-15 17:01 ` [patch 02/20] stop_machine: Provide stop_machine_locked() Thomas Gleixner
2017-04-15 17:01 ` [patch 03/20] padata: Make padata_alloc() static Thomas Gleixner
2017-04-16  6:22   ` Jason A. Donenfeld
2017-04-17  9:14     ` Thomas Gleixner
2017-04-15 17:01 ` [patch 04/20] padata: Avoid nested calls to get_online_cpus() in pcrypt_init_padata() Thomas Gleixner
2017-04-15 17:01 ` [patch 05/20] x86/mtrr: Remove get_online_cpus() from mtrr_save_state() Thomas Gleixner
2017-04-15 17:01 ` [patch 06/20] cpufreq: Use cpuhp_setup_state_nocalls_locked() Thomas Gleixner
2017-04-15 22:54   ` Rafael J. Wysocki
2017-04-17  4:14   ` Viresh Kumar
2017-04-15 17:01 ` [patch 07/20] KVM/PPC/Book3S HV: " Thomas Gleixner
2017-04-15 17:01 ` [patch 08/20] hwtracing/coresight-etm3x: Use the locked version of cpuhp_setup_state_nocalls() Thomas Gleixner
2017-04-15 17:01 ` [patch 09/20] hwtracing/coresight-etm4x: Use cpuhp_setup_state_nocalls_locked() Thomas Gleixner
2017-04-15 17:01 ` [patch 10/20] perf/x86/intel/cqm: Use cpuhp_setup_state_locked() Thomas Gleixner
2017-04-15 17:01 ` [patch 11/20] ARM/hw_breakpoint: " Thomas Gleixner
2017-04-15 17:01 ` [patch 12/20] s390/kernel: Use stop_machine_locked() Thomas Gleixner
2017-04-15 17:01 ` [patch 13/20] powerpc/powernv: " Thomas Gleixner
2017-04-15 17:01 ` [patch 14/20] kernel/hotplug: Use stop_machine_locked() in takedown_cpu() Thomas Gleixner
2017-04-15 17:01 ` [patch 15/20] x86/perf: Drop EXPORT of perf_check_microcode Thomas Gleixner
2017-04-18 11:24   ` Borislav Petkov
2017-04-15 17:01 ` [patch 16/20] perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode() Thomas Gleixner
2017-04-18 11:27   ` Borislav Petkov
2017-04-15 17:01 ` [patch 17/20] PCI: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner
2017-04-17  6:46   ` Peter Zijlstra
2017-04-17  7:40     ` Thomas Gleixner
2017-04-18 19:44   ` Bjorn Helgaas
2017-04-18 19:51     ` Thomas Gleixner
2017-04-15 17:01 ` [patch 18/20] PCI: Replace the racy recursion prevention Thomas Gleixner
2017-04-15 17:01 ` [patch 19/20] ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus() Thomas Gleixner
2017-04-16 22:53   ` Rafael J. Wysocki
2017-04-15 17:01 ` [patch 20/20] cpu/hotplug: Convert hotplug locking to percpu rwsem Thomas Gleixner
2017-04-17  6:50   ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).