From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Sebastian Siewior <bigeasy@linutronix.de>,
	Bjorn Helgaas <bhelgaas@google.com>,
	linux-pci@vger.kernel.org
Subject: [patch V2 17/24] PCI: Use cpu_hotplug_disable() instead of get_online_cpus()
Date: Tue, 18 Apr 2017 19:04:59 +0200
Message-ID: <20170418170553.806707929@linutronix.de>
In-Reply-To: 20170418170442.665445272@linutronix.de

[-- Attachment #1: PCI--Use-cpu_hotplug_disable-instead-of-get_online_cpus.patch --]
[-- Type: text/plain, Size: 2312 bytes --]

Converting the hotplug locking, i.e. get_online_cpus(), to a percpu rwsem
unearthed a circular lock dependency which was hidden from lockdep due to
the lockdep annotation of get_online_cpus() which prevents lockdep from
creating full dependency chains. There are several variants of this. An
example is:

Chain exists of:

cpu_hotplug_lock.rw_sem --> drm_global_mutex --> &item->mutex

CPU0                    CPU1
----                    ----
lock(&item->mutex);
                        lock(drm_global_mutex);
                        lock(&item->mutex);
lock(cpu_hotplug_lock.rw_sem);

because there are dependencies through workqueues. The call chain is:

	get_online_cpus
	apply_workqueue_attrs
	__alloc_workqueue_key
	ttm_mem_global_init
	ast_ttm_mem_global_init
	drm_global_item_ref
	ast_mm_init
	ast_driver_load
	drm_dev_register
	drm_get_pci_dev
	ast_pci_probe
	local_pci_probe
	work_for_cpu_fn
	process_one_work
	worker_thread

This is not a problem of get_online_cpus() recursion, it's a possible
deadlock undetected by lockdep so far.

The cure is to use cpu_hotplug_disable() instead of get_online_cpus() to
protect the PCI probing.

There is a side effect to this: cpu_hotplug_disable() makes a concurrent
cpu hotplug attempt via the sysfs interfaces fail with -EBUSY, but PCI
probing usually happens during the boot process where no interaction is
possible. Any later invocations are infrequent enough and concurrent
hotplug attempts are so unlikely that the danger of user space visible
regressions is very close to zero. Anyway, that's preferable over a real
deadlock.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
---
 drivers/pci/pci-driver.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -349,13 +349,13 @@ static int pci_call_probe(struct pci_dri
 	if (node >= 0 && node != numa_node_id()) {
 		int cpu;

-		get_online_cpus();
+		cpu_hotplug_disable();
 		cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
 		if (cpu < nr_cpu_ids)
 			error = work_on_cpu(cpu, local_pci_probe, &ddi);
 		else
 			error = local_pci_probe(&ddi);
-		put_online_cpus();
+		cpu_hotplug_enable();
 	} else
 		error = local_pci_probe(&ddi);
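
To illustrate the side effect described in the changelog, here is a minimal
userspace sketch of a counter-based disable scheme. It is not the kernel
implementation: every model_* name is hypothetical, and a pthread mutex
stands in for whatever serializes the real hotplug paths. The point it tries
to show is that the probing side holds no lock across the protected region,
so it adds no edge to the lock chain above, while a concurrent offline
request is refused with -EBUSY instead of blocking.

```c
/*
 * Userspace model only, not kernel code. All model_* identifiers are
 * hypothetical and exist purely for illustration.
 */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t model_maps_lock = PTHREAD_MUTEX_INITIALIZER;
static int model_hotplug_disabled;	/* nesting count of disable calls */

/* Forbid hotplug; the lock is dropped again before returning. */
void model_hotplug_disable(void)
{
	pthread_mutex_lock(&model_maps_lock);
	model_hotplug_disabled++;
	pthread_mutex_unlock(&model_maps_lock);
}

/* Undo one model_hotplug_disable(); unbalanced calls are a bug. */
void model_hotplug_enable(void)
{
	pthread_mutex_lock(&model_maps_lock);
	assert(model_hotplug_disabled > 0);
	model_hotplug_disabled--;
	pthread_mutex_unlock(&model_maps_lock);
}

/* Stand-in for an offline request arriving via sysfs. */
int model_cpu_down(int cpu)
{
	int ret = 0;

	(void)cpu;	/* the model does not track individual CPUs */
	pthread_mutex_lock(&model_maps_lock);
	if (model_hotplug_disabled)
		ret = -EBUSY;	/* refused, the caller does not wait */
	pthread_mutex_unlock(&model_maps_lock);
	return ret;
}
```

In this model a caller would bracket the probe with model_hotplug_disable()
and model_hotplug_enable(), and any model_cpu_down() issued in between
returns -EBUSY. Because the count nests, an inner disable/enable pair does
not re-enable hotplug for an outer caller.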