All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: linux-cxl@vger.kernel.org
Cc: nvdimm@lists.linux.dev
Subject: [PATCH v4 8/8] nvdimm: Fix firmware activation deadlock scenarios
Date: Sat, 23 Apr 2022 14:22:18 -0700	[thread overview]
Message-ID: <165074883800.4116052.10737040861825806582.stgit@dwillia2-desk3.amr.corp.intel.com> (raw)
In-Reply-To: <165055523099.3745911.9091010720291846249.stgit@dwillia2-desk3.amr.corp.intel.com>

Lockdep reports the following deadlock scenarios for CXL root device
power-management, device_prepare(), operations, and device_shutdown()
operations for 'nd_region' devices:

---
 Chain exists of:
   &nvdimm_region_key --> &nvdimm_bus->reconfig_mutex --> system_transition_mutex

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(system_transition_mutex);
                                lock(&nvdimm_bus->reconfig_mutex);
                                lock(system_transition_mutex);
   lock(&nvdimm_region_key);

--

 Chain exists of:
   &cxl_nvdimm_bridge_key --> acpi_scan_lock --> &cxl_root_key

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&cxl_root_key);
                                lock(acpi_scan_lock);
                                lock(&cxl_root_key);
   lock(&cxl_nvdimm_bridge_key);

---

These stem from holding nvdimm_bus_lock() over hibernate_quiet_exec()
which walks the entire system device topology taking device_lock() along
the way. The nvdimm_bus_lock() is protecting against unregistration,
multiple simultaneous ops callers, and preventing activate_show() from
racing activate_store(). For the first 2, the lock is redundant.
Unregistration already flushes all ops users, and sysfs already prevents
multiple threads to be active in an ops handler at the same time. For
the last userspace should already be waiting for its last
activate_store() to complete, and does not need activate_show() to flush
the write side, so this lock usage can be deleted in these attributes.

Fixes: 48001ea50d17 ("PM, libnvdimm: Add runtime firmware activation support")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
Changes since v3:
- Remove nvdimm_bus_lock() from all ->capability() invocations (Ira)

 drivers/nvdimm/core.c |    9 ---------
 1 file changed, 9 deletions(-)

diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index 144926b7451c..d91799b71d23 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -368,9 +368,7 @@ static ssize_t capability_show(struct device *dev,
 	if (!nd_desc->fw_ops)
 		return -EOPNOTSUPP;
 
-	nvdimm_bus_lock(dev);
 	cap = nd_desc->fw_ops->capability(nd_desc);
-	nvdimm_bus_unlock(dev);
 
 	switch (cap) {
 	case NVDIMM_FWA_CAP_QUIESCE:
@@ -395,10 +393,8 @@ static ssize_t activate_show(struct device *dev,
 	if (!nd_desc->fw_ops)
 		return -EOPNOTSUPP;
 
-	nvdimm_bus_lock(dev);
 	cap = nd_desc->fw_ops->capability(nd_desc);
 	state = nd_desc->fw_ops->activate_state(nd_desc);
-	nvdimm_bus_unlock(dev);
 
 	if (cap < NVDIMM_FWA_CAP_QUIESCE)
 		return -EOPNOTSUPP;
@@ -443,7 +439,6 @@ static ssize_t activate_store(struct device *dev,
 	else
 		return -EINVAL;
 
-	nvdimm_bus_lock(dev);
 	state = nd_desc->fw_ops->activate_state(nd_desc);
 
 	switch (state) {
@@ -461,7 +456,6 @@ static ssize_t activate_store(struct device *dev,
 	default:
 		rc = -ENXIO;
 	}
-	nvdimm_bus_unlock(dev);
 
 	if (rc == 0)
 		rc = len;
@@ -484,10 +478,7 @@ static umode_t nvdimm_bus_firmware_visible(struct kobject *kobj, struct attribut
 	if (!nd_desc->fw_ops)
 		return 0;
 
-	nvdimm_bus_lock(dev);
 	cap = nd_desc->fw_ops->capability(nd_desc);
-	nvdimm_bus_unlock(dev);
-
 	if (cap < NVDIMM_FWA_CAP_QUIESCE)
 		return 0;
 


  parent reply	other threads:[~2022-04-23 21:22 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-21 15:33 [PATCH v3 0/8] device-core: Enable device_lock() lockdep validation Dan Williams
2022-04-21 15:33 ` [PATCH v3 1/8] cxl: Replace lockdep_mutex with local lock classes Dan Williams
2022-04-22 23:43   ` Ira Weiny
2022-04-21 15:33 ` [PATCH v3 2/8] cxl/acpi: Add root device lockdep validation Dan Williams
2022-04-21 16:10   ` Greg Kroah-Hartman
2022-04-22 23:58   ` Ira Weiny
2022-04-23  0:08     ` Dan Williams
2022-04-23 17:27       ` Dan Williams
2022-04-25 10:33         ` Peter Zijlstra
2022-04-25 16:05           ` Dan Williams
2022-04-25 18:57             ` Dan Williams
2022-04-23 21:05   ` [PATCH v4 " Dan Williams
2022-04-26  4:23     ` [PATCH v5 " Dan Williams
2022-04-26 19:22       ` [PATCH v6 " Dan Williams
2022-04-21 15:33 ` [PATCH v3 3/8] cxl: Drop cxl_device_lock() Dan Williams
2022-04-23  0:07   ` Ira Weiny
2022-04-21 15:33 ` [PATCH v3 4/8] nvdimm: Replace lockdep_mutex with local lock classes Dan Williams
2022-04-23  0:19   ` Ira Weiny
2022-04-21 15:33 ` [PATCH v3 5/8] ACPI: NFIT: Drop nfit_device_lock() Dan Williams
2022-04-23  0:21   ` Ira Weiny
2022-04-21 15:33 ` [PATCH v3 6/8] nvdimm: Drop nd_device_lock() Dan Williams
2022-04-23  0:24   ` Ira Weiny
2022-04-21 15:33 ` [PATCH v3 7/8] device-core: Kill the lockdep_mutex Dan Williams
2022-04-21 16:09   ` Greg Kroah-Hartman
2022-04-23  0:25   ` Ira Weiny
2022-04-21 15:33 ` [PATCH v3 8/8] nvdimm: Fix firmware activation deadlock scenarios Dan Williams
2022-04-23  4:28   ` Ira Weiny
2022-04-23 17:29     ` Dan Williams
2022-04-23 21:22   ` Dan Williams [this message]
2022-04-24 23:30     ` [PATCH v4 " Ira Weiny

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=165074883800.4116052.10737040861825806582.stgit@dwillia2-desk3.amr.corp.intel.com \
    --to=dan.j.williams@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=nvdimm@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.