From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S945038AbdDTLak (ORCPT ); Thu, 20 Apr 2017 07:30:40 -0400
Received: from terminus.zytor.com ([65.50.211.136]:40991 "EHLO
	terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S944405AbdDTLae (ORCPT );
	Thu, 20 Apr 2017 07:30:34 -0400
Date: Thu, 20 Apr 2017 04:27:00 -0700
From: tip-bot for Thomas Gleixner
Message-ID:
Cc: bhelgaas@google.com, bigeasy@linutronix.de, peterz@infradead.org,
	rostedt@goodmis.org, mingo@kernel.org, hpa@zytor.com,
	linux-kernel@vger.kernel.org, tglx@linutronix.de
Reply-To: tglx@linutronix.de, linux-kernel@vger.kernel.org, hpa@zytor.com,
	mingo@kernel.org, peterz@infradead.org, bigeasy@linutronix.de,
	rostedt@goodmis.org, bhelgaas@google.com
In-Reply-To: <20170418170553.806707929@linutronix.de>
References: <20170418170553.806707929@linutronix.de>
To: linux-tip-commits@vger.kernel.org
Subject: [tip:smp/hotplug] PCI: Use cpu_hotplug_disable() instead of
	get_online_cpus()
Git-Commit-ID: b4d1673371196dd9aebdd2f61d946165c777b931
X-Mailer: tip-git-log-daemon
Robot-ID:
Robot-Unsubscribe: Contact to get blacklisted from these emails
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Commit-ID:  b4d1673371196dd9aebdd2f61d946165c777b931
Gitweb:     http://git.kernel.org/tip/b4d1673371196dd9aebdd2f61d946165c777b931
Author:     Thomas Gleixner
AuthorDate: Tue, 18 Apr 2017 19:04:59 +0200
Committer:  Thomas Gleixner
CommitDate: Thu, 20 Apr 2017 13:08:55 +0200

PCI: Use cpu_hotplug_disable() instead of get_online_cpus()

Converting the hotplug locking, i.e.
get_online_cpus(), to a percpu rwsem unearthed a circular lock dependency
which was hidden from lockdep due to the lockdep annotation of
get_online_cpus() which prevents lockdep from creating full dependency
chains. There are several variants of this. An example is:

Chain exists of:

cpu_hotplug_lock.rw_sem --> drm_global_mutex --> &item->mutex

CPU0                    CPU1
----                    ----
lock(&item->mutex);
                        lock(drm_global_mutex);
                        lock(&item->mutex);
lock(cpu_hotplug_lock.rw_sem);

because there are dependencies through workqueues. The call chain is:

	get_online_cpus
	apply_workqueue_attrs
	__alloc_workqueue_key
	ttm_mem_global_init
	ast_ttm_mem_global_init
	drm_global_item_ref
	ast_mm_init
	ast_driver_load
	drm_dev_register
	drm_get_pci_dev
	ast_pci_probe
	local_pci_probe
	work_for_cpu_fn
	process_one_work
	worker_thread

This is not a problem of get_online_cpus() recursion, it's a possible
deadlock undetected by lockdep so far.

The cure is to use cpu_hotplug_disable() instead of get_online_cpus() to
protect the PCI probing.

There is a side effect to this: cpu_hotplug_disable() makes a concurrent
CPU hotplug attempt via the sysfs interfaces fail with -EBUSY, but PCI
probing usually happens during the boot process where no interaction is
possible. Any later invocations are infrequent enough and concurrent
hotplug attempts are so unlikely that the danger of user space visible
regressions is very close to zero. Anyway, that's preferable over a real
deadlock.

Signed-off-by: Thomas Gleixner
Acked-by: Bjorn Helgaas
Cc: Peter Zijlstra
Cc: Sebastian Siewior
Cc: Steven Rostedt
Cc: linux-pci@vger.kernel.org
Link: http://lkml.kernel.org/r/20170418170553.806707929@linutronix.de
---
 drivers/pci/pci-driver.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index afa7271..f00e4d9 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -349,13 +349,13 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 	if (node >= 0 && node != numa_node_id()) {
 		int cpu;
 
-		get_online_cpus();
+		cpu_hotplug_disable();
 		cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
 		if (cpu < nr_cpu_ids)
 			error = work_on_cpu(cpu, local_pci_probe, &ddi);
 		else
 			error = local_pci_probe(&ddi);
-		put_online_cpus();
+		cpu_hotplug_enable();
 	} else
 		error = local_pci_probe(&ddi);
 
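
The side effect described in the changelog (a concurrent hotplug attempt
failing with -EBUSY while probing is in progress) can be illustrated with a
small user-space model. This is only a sketch and not the kernel
implementation: every model_* name, the pthread mutex and the pretend count
of four online CPUs are assumptions made for illustration. The point it
tries to show is that the disable side only bumps a counter and holds no
lock while the probe runs, so the probe path cannot become part of a lock
chain involving the hotplug lock.

```c
/* User-space model only. All model_* names are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Protects the two counters below; held only for short updates. */
static pthread_mutex_t model_lock = PTHREAD_MUTEX_INITIALIZER;
static int model_disabled;	/* nesting depth of disable calls */
static int model_online = 4;	/* assumption: four CPUs are online */

/* Stand-in for cpu_hotplug_disable(): count up, then drop the lock. */
void model_hotplug_disable(void)
{
	pthread_mutex_lock(&model_lock);
	model_disabled++;
	pthread_mutex_unlock(&model_lock);
}

/* Stand-in for cpu_hotplug_enable(): undo one disable call. */
void model_hotplug_enable(void)
{
	pthread_mutex_lock(&model_lock);
	assert(model_disabled > 0);
	model_disabled--;
	pthread_mutex_unlock(&model_lock);
}

/*
 * Stand-in for an offline request coming from user space. It does not
 * wait for the disabled section to end, it fails right away.
 */
int model_cpu_down(void)
{
	int ret = 0;

	pthread_mutex_lock(&model_lock);
	if (model_disabled > 0)
		ret = -EBUSY;
	else
		model_online--;
	pthread_mutex_unlock(&model_lock);
	return ret;
}

/* Report how many CPUs the model currently considers online. */
int model_online_cpus(void)
{
	int n;

	pthread_mutex_lock(&model_lock);
	n = model_online;
	pthread_mutex_unlock(&model_lock);
	return n;
}
```

In this model, a caller that brackets its work with model_hotplug_disable()
and model_hotplug_enable() sees a stable online count in between, and any
model_cpu_down() issued during that window returns -EBUSY instead of
blocking.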