* [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation
@ 2022-04-13  6:01 Dan Williams
  2022-04-13  6:01 ` [PATCH v2 01/12] device-core: Move device_lock() lockdep init to a helper Dan Williams
                   ` (12 more replies)
  0 siblings, 13 replies; 23+ messages in thread
From: Dan Williams @ 2022-04-13  6:01 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ira Weiny, Dave Jiang, Peter Zijlstra, Jonathan Cameron,
	Vishal Verma, Ben Widawsky, Kevin Tian, Pierre-Louis Bossart,
	Alison Schofield, Boqun Feng, Ingo Molnar, Greg Kroah-Hartman,
	Will Deacon, Waiman Long, Rafael J. Wysocki, gregkh,
	linux-kernel, nvdimm

Changes since v1 [1]:
- Improve the clarity of the cover letter and changelogs of the
  major patches (Patch2 and Patch12) (Pierre, Kevin, and Dave)
- Fix device_lock_interruptible() false negative deadlock detection
  (Kevin)
- Fix off-by-one error in the device_set_lock_class() enable case (Kevin)
- Spelling fixes in Patch2 changelog (Pierre)
- Compilation fixes when both CONFIG_CXL_BUS=n and
  CONFIG_LIBNVDIMM=n. (0day robot)

[1]: https://lore.kernel.org/all/164610292916.2682974.12924748003366352335.stgit@dwillia2-desk3.amr.corp.intel.com/

---

The device_lock() is why the lockdep_set_novalidate_class() API exists.
The lock is taken in too many disparate contexts, and lockdep validates
by lock class, not lock instance, so every device_lock() acquisition
looks identical to it and legitimate parent/child nesting registers as a
potential deadlock. Disabling validation avoids those false positives,
but the resulting lack of lockdep coverage leads to real deadlock
scenarios landing upstream. To mitigate that problem the lockdep_mutex
was added [2].

The lockdep_mutex lets a subsystem mirror device_lock() acquisitions
without lockdep_set_novalidate_class() to gain some limited lockdep
coverage. The mirroring approach is limited to taking the device_lock()
after-the-fact in a subsystem's 'struct bus_type' operations and fails
to cover device_lock() acquisitions in the driver-core. It also can only
track the needs of one subsystem at a time, so, for example, the kernel
needs to be recompiled between CONFIG_PROVE_NVDIMM_LOCKING and
CONFIG_PROVE_CXL_LOCKING depending on which subsystem is being
regression tested. Obviously that also means that inter-subsystem
locking dependencies cannot be validated.
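
For illustration, the "mirror" pattern being replaced looks roughly like
the nfit_device_lock() wrapper that patch 9 deletes (a sketch only; the
foo_ names are placeholders, and it assumes CONFIG_PROVE_LOCKING so that
dev->lockdep_mutex exists):

	/* per-subsystem, after-the-fact mirroring of device_lock() */
	static inline void foo_device_lock(struct device *dev)
	{
		device_lock(dev);	/* novalidate, invisible to lockdep */
		mutex_lock(&dev->lockdep_mutex); /* mirrored lock lockdep tracks */
	}

	static inline void foo_device_unlock(struct device *dev)
	{
		mutex_unlock(&dev->lockdep_mutex);
		device_unlock(dev);
	}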

Two enhancements are proposed to improve the current state of
device_lock() lockdep validation:

1/ Communicate a lock class to the device-core and let it acquire
   dev->lockdep_mutex per the subsystem's nested locking expectations.

2/ Go further and provide a lockdep_mutex per subsystem so that each
   subsystem has the full span of MAX_LOCKDEP_SUBCLASSES available for
   its use.

This enabling has already prevented at least one device_lock() deadlock
from making its way upstream.
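
With these changes the opt-in moves out of the lock/unlock paths and
into device registration. A minimal sketch, assuming the
device_set_lock_class() API from patch 2 (foo_lock_class() is a
placeholder for a subsystem-specific classifier such as
cxl_lock_class()):

	static int foo_register(struct device *dev)
	{
		/*
		 * Tell the driver core how this device nests relative
		 * to its peers; device_lock()/device_unlock() then take
		 * dev->lockdep_mutex with this subclass automatically.
		 */
		device_set_lock_class(dev, foo_lock_class(dev));
		return device_add(dev);
	}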

[2]: commit 87a30e1f05d7 ("driver-core, libnvdimm: Let device subsystems add local lockdep coverage")

---

Dan Williams (12):
      device-core: Move device_lock() lockdep init to a helper
      device-core: Add dev->lock_class to enable device_lock() lockdep validation
      cxl/core: Refactor a cxl_lock_class() out of cxl_nested_lock()
      cxl/core: Remove cxl_device_lock()
      cxl/core: Clamp max lock_class
      cxl/core: Use dev->lock_class for device_lock() lockdep validation
      cxl/acpi: Add a device_lock() lock class for the root platform device
      libnvdimm: Refactor an nvdimm_lock_class() helper
      ACPI: NFIT: Drop nfit_device_lock()
      libnvdimm: Drop nd_device_lock()
      libnvdimm: Enable lockdep validation
      device-core: Enable multi-subsystem device_lock() lockdep validation


 drivers/acpi/nfit/core.c        |   30 ++++---
 drivers/acpi/nfit/nfit.h        |   24 ------
 drivers/base/core.c             |    5 -
 drivers/cxl/acpi.c              |    1 
 drivers/cxl/core/memdev.c       |    1 
 drivers/cxl/core/pmem.c         |    6 +
 drivers/cxl/core/port.c         |   56 ++++++-------
 drivers/cxl/cxl.h               |   76 +++++++-----------
 drivers/cxl/mem.c               |    4 -
 drivers/cxl/pmem.c              |   12 +--
 drivers/cxl/port.c              |    2 
 drivers/nvdimm/btt_devs.c       |   16 ++--
 drivers/nvdimm/bus.c            |   26 +++---
 drivers/nvdimm/core.c           |   10 +-
 drivers/nvdimm/dimm_devs.c      |    8 +-
 drivers/nvdimm/namespace_devs.c |   36 ++++-----
 drivers/nvdimm/nd-core.h        |   51 +++---------
 drivers/nvdimm/pfn_devs.c       |   24 +++---
 drivers/nvdimm/pmem.c           |    2 
 drivers/nvdimm/region.c         |    2 
 drivers/nvdimm/region_devs.c    |   16 ++--
 include/linux/device.h          |  162 ++++++++++++++++++++++++++++++++++++++-
 lib/Kconfig.debug               |   23 ------
 23 files changed, 325 insertions(+), 268 deletions(-)

--

base-commit: ce522ba9ef7e2d9fb22a39eb3371c0c64e2a433e


* [PATCH v2 01/12] device-core: Move device_lock() lockdep init to a helper
  2022-04-13  6:01 [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation Dan Williams
@ 2022-04-13  6:01 ` Dan Williams
  2022-04-13  6:01 ` [PATCH v2 02/12] device-core: Add dev->lock_class to enable device_lock() lockdep validation Dan Williams
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Dan Williams @ 2022-04-13  6:01 UTC (permalink / raw)
  To: linux-cxl
  Cc: Pierre-Louis Bossart, Dave Jiang, Kevin Tian, peterz,
	vishal.l.verma, alison.schofield, gregkh, linux-kernel, nvdimm

In preparation for new infrastructure to support lockdep validation of
device_lock() usage across driver subsystems, add a
device_lockdep_init() helper to contain those updates.

Suggested-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/base/core.c    |    5 +----
 include/linux/device.h |   13 +++++++++++++
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 3d6430eb0c6a..cb782299ae44 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2864,10 +2864,7 @@ void device_initialize(struct device *dev)
 	kobject_init(&dev->kobj, &device_ktype);
 	INIT_LIST_HEAD(&dev->dma_pools);
 	mutex_init(&dev->mutex);
-#ifdef CONFIG_PROVE_LOCKING
-	mutex_init(&dev->lockdep_mutex);
-#endif
-	lockdep_set_novalidate_class(&dev->mutex);
+	device_lockdep_init(dev);
 	spin_lock_init(&dev->devres_lock);
 	INIT_LIST_HEAD(&dev->devres_head);
 	device_pm_init(dev);
diff --git a/include/linux/device.h b/include/linux/device.h
index 93459724dcde..af2576ace130 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -762,6 +762,19 @@ static inline bool dev_pm_test_driver_flags(struct device *dev, u32 flags)
 	return !!(dev->power.driver_flags & flags);
 }
 
+#ifdef CONFIG_PROVE_LOCKING
+static inline void device_lockdep_init(struct device *dev)
+{
+	mutex_init(&dev->lockdep_mutex);
+	lockdep_set_novalidate_class(&dev->mutex);
+}
+#else
+static inline void device_lockdep_init(struct device *dev)
+{
+	lockdep_set_novalidate_class(&dev->mutex);
+}
+#endif
+
 static inline void device_lock(struct device *dev)
 {
 	mutex_lock(&dev->mutex);



* [PATCH v2 02/12] device-core: Add dev->lock_class to enable device_lock() lockdep validation
  2022-04-13  6:01 [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation Dan Williams
  2022-04-13  6:01 ` [PATCH v2 01/12] device-core: Move device_lock() lockdep init to a helper Dan Williams
@ 2022-04-13  6:01 ` Dan Williams
  2022-04-13  8:43   ` Peter Zijlstra
  2022-04-13  6:01 ` [PATCH v2 03/12] cxl/core: Refactor a cxl_lock_class() out of cxl_nested_lock() Dan Williams
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 23+ messages in thread
From: Dan Williams @ 2022-04-13  6:01 UTC (permalink / raw)
  To: linux-cxl
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Dave Jiang, Kevin Tian,
	peterz, vishal.l.verma, alison.schofield, gregkh, linux-kernel,
	nvdimm

The device_lock() is hidden from lockdep by default because, for
example, a device subsystem may do something like:

---
device_add(dev1);
...in driver core...
device_lock(dev1);
bus->probe(dev1); /* where bus->probe() calls driver1_probe() */

driver1_probe(struct device *dev)
{
	...do some enumeration...
	dev2->parent = dev;
	/* this triggers probe under device_lock(dev2); */
	device_add(dev2);
}
---

To lockdep, that device_lock(dev2) looks like a deadlock because lockdep
only sees lock classes, not individual lock instances. All device_lock()
instances across the entire kernel are the same class. However, this is
not a deadlock in practice because the locking is strictly hierarchical.
I.e. device_lock(dev1) is held over device_lock(dev2), but never the
reverse. In order for lockdep to be satisfied and see that the locking
is hierarchical in practice, the mutex_lock() call in device_lock()
needs to be moved to mutex_lock_nested(), where the @subclass argument
represents the nesting level, i.e.:

s/device_lock(dev1)/mutex_lock_nested(&dev1->mutex, 1)/

s/device_lock(dev2)/mutex_lock_nested(&dev2->mutex, 2)/
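
With distinct subclasses the hierarchical acquisition becomes visible to
lockdep instead of collapsing into a single class. Continuing the
dev1/dev2 example above (sketch, not code from this patch):

	mutex_lock_nested(&dev1->mutex, 1);	/* parent held first */
	mutex_lock_nested(&dev2->mutex, 2);	/* child nests under parent */
	/* ...enumerate dev2... */
	mutex_unlock(&dev2->mutex);
	mutex_unlock(&dev1->mutex);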

Now, what if the internals of the device_lock() could be annotated with
the right @subclass argument to call mutex_lock_nested()?

With device_set_lock_class() a subsystem can optionally add that
metadata. The device_lock() still takes dev->mutex, but when
dev->lock_class is >= 0 it additionally takes dev->lockdep_mutex with
the proper nesting. Unlike dev->mutex, dev->lockdep_mutex is not marked
with lockdep_set_novalidate_class(), so lockdep becomes useful... at
least for one subsystem at a time.

It is still the case that only one subsystem can use lockdep with
lockdep_mutex at a time because different subsystems' class numbers
would collide. You might say "well, how about subsystem1 gets class ids
0 to 9 and subsystem2 gets class ids 10 to 20?". MAX_LOCKDEP_SUBCLASSES
is 8, and 8 is just enough class ids for one subsystem of moderate
complexity.

Fixing that problem needs deeper changes, but for now moving the ability
to set a lock class into the core lets the NVDIMM and CXL subsystems
drop their incomplete solutions which attempt to set the lock class and
take the lockdep mutex after the fact.

This approach has prevented at least one deadlock scenario from making
its way upstream that was not caught by the current "local /
after-the-fact" usage of dev->lockdep_mutex (commit 87a30e1f05d7
("driver-core, libnvdimm: Let device subsystems add local lockdep
coverage")).

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 include/linux/device.h |   92 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 88 insertions(+), 4 deletions(-)

diff --git a/include/linux/device.h b/include/linux/device.h
index af2576ace130..6083e757e804 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -402,6 +402,7 @@ struct dev_msi_info {
  * @mutex:	Mutex to synchronize calls to its driver.
  * @lockdep_mutex: An optional debug lock that a subsystem can use as a
  * 		peer lock to gain localized lockdep coverage of the device_lock.
+ * @lock_class: per-subsystem annotated device lock class
  * @bus:	Type of bus device is on.
  * @driver:	Which driver has allocated this
  * @platform_data: Platform data specific to the device.
@@ -501,6 +502,7 @@ struct device {
 					   dev_set_drvdata/dev_get_drvdata */
 #ifdef CONFIG_PROVE_LOCKING
 	struct mutex		lockdep_mutex;
+	int			lock_class;
 #endif
 	struct mutex		mutex;	/* mutex to synchronize calls to
 					 * its driver.
@@ -762,18 +764,100 @@ static inline bool dev_pm_test_driver_flags(struct device *dev, u32 flags)
 	return !!(dev->power.driver_flags & flags);
 }
 
+static inline void device_lock_assert(struct device *dev)
+{
+	lockdep_assert_held(&dev->mutex);
+}
+
 #ifdef CONFIG_PROVE_LOCKING
 static inline void device_lockdep_init(struct device *dev)
 {
 	mutex_init(&dev->lockdep_mutex);
+	dev->lock_class = -1;
 	lockdep_set_novalidate_class(&dev->mutex);
 }
-#else
+
+static inline void device_lock(struct device *dev)
+{
+	/*
+	 * For double-lock programming errors the kernel will hang
+	 * trying to acquire @dev->mutex before lockdep can report the
+	 * problem acquiring @dev->lockdep_mutex, so manually assert
+	 * before that hang.
+	 */
+	lockdep_assert_not_held(&dev->lockdep_mutex);
+
+	mutex_lock(&dev->mutex);
+	if (dev->lock_class >= 0)
+		mutex_lock_nested(&dev->lockdep_mutex, dev->lock_class);
+}
+
+static inline int device_lock_interruptible(struct device *dev)
+{
+	int rc;
+
+	lockdep_assert_not_held(&dev->lockdep_mutex);
+
+	rc = mutex_lock_interruptible(&dev->mutex);
+	if (rc || dev->lock_class < 0)
+		return rc;
+
+	return mutex_lock_interruptible_nested(&dev->lockdep_mutex,
+					       dev->lock_class);
+}
+
+static inline int device_trylock(struct device *dev)
+{
+	if (mutex_trylock(&dev->mutex)) {
+		if (dev->lock_class >= 0)
+			mutex_lock_nested(&dev->lockdep_mutex, dev->lock_class);
+		return 1;
+	}
+
+	return 0;
+}
+
+static inline void device_unlock(struct device *dev)
+{
+	if (dev->lock_class >= 0)
+		mutex_unlock(&dev->lockdep_mutex);
+	mutex_unlock(&dev->mutex);
+}
+
+/*
+ * Note: this routine expects that the state of @dev->mutex is stable
+ * from entry to exit. There is no support for changing lockdep
+ * validation classes, only enabling and disabling validation.
+ */
+static inline void device_set_lock_class(struct device *dev, int lock_class)
+{
+	/*
+	 * Allow for setting or clearing the lock class while the
+	 * device_lock() is held, in which case the paired nested lock
+	 * might need to be acquired or released now to accommodate the
+	 * next device_unlock().
+	 */
+	if (dev->lock_class < 0 && lock_class >= 0) {
+		/* Enabling lockdep validation... */
+		if (mutex_is_locked(&dev->mutex))
+			mutex_lock_nested(&dev->lockdep_mutex, lock_class);
+	} else if (dev->lock_class >= 0 && lock_class < 0) {
+		/* Disabling lockdep validation... */
+		if (mutex_is_locked(&dev->mutex))
+			mutex_unlock(&dev->lockdep_mutex);
+	} else {
+		dev_warn(dev,
+			 "%s: failed to change lock_class from: %d to %d\n",
+			 __func__, dev->lock_class, lock_class);
+		return;
+	}
+	dev->lock_class = lock_class;
+}
+#else /* !CONFIG_PROVE_LOCKING */
 static inline void device_lockdep_init(struct device *dev)
 {
 	lockdep_set_novalidate_class(&dev->mutex);
 }
-#endif
 
 static inline void device_lock(struct device *dev)
 {
@@ -795,10 +879,10 @@ static inline void device_unlock(struct device *dev)
 	mutex_unlock(&dev->mutex);
 }
 
-static inline void device_lock_assert(struct device *dev)
+static inline void device_set_lock_class(struct device *dev, int lock_class)
 {
-	lockdep_assert_held(&dev->mutex);
 }
+#endif /* CONFIG_PROVE_LOCKING */
 
 static inline struct device_node *dev_of_node(struct device *dev)
 {



* [PATCH v2 03/12] cxl/core: Refactor a cxl_lock_class() out of cxl_nested_lock()
  2022-04-13  6:01 [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation Dan Williams
  2022-04-13  6:01 ` [PATCH v2 01/12] device-core: Move device_lock() lockdep init to a helper Dan Williams
  2022-04-13  6:01 ` [PATCH v2 02/12] device-core: Add dev->lock_class to enable device_lock() lockdep validation Dan Williams
@ 2022-04-13  6:01 ` Dan Williams
  2022-04-13  9:39   ` Peter Zijlstra
  2022-04-13  6:01 ` [PATCH v2 04/12] cxl/core: Remove cxl_device_lock() Dan Williams
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 23+ messages in thread
From: Dan Williams @ 2022-04-13  6:01 UTC (permalink / raw)
  To: linux-cxl
  Cc: Alison Schofield, Vishal Verma, Ira Weiny, Ben Widawsky,
	Jonathan Cameron, Dave Jiang, Kevin Tian, peterz, gregkh,
	linux-kernel, nvdimm

In preparation for upleveling device_lock() lockdep annotation support into
the core, provide a helper to retrieve the lock class. This lock_class
will be used with device_set_lock_class() to identify the CXL nested
locking rules.
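
For reference, patch 6 in this series pairs the two via a trivial
wrapper:

	static inline void cxl_set_lock_class(struct device *dev)
	{
		device_set_lock_class(dev, cxl_lock_class(dev));
	}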

Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/cxl.h |   19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 990b6670222e..c9fda9304c54 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -418,13 +418,12 @@ enum cxl_lock_class {
 	 */
 };
 
-static inline void cxl_nested_lock(struct device *dev)
+static inline int cxl_lock_class(struct device *dev)
 {
 	if (is_cxl_port(dev)) {
 		struct cxl_port *port = to_cxl_port(dev);
 
-		mutex_lock_nested(&dev->lockdep_mutex,
-				  CXL_PORT_LOCK + port->depth);
+		return CXL_PORT_LOCK + port->depth;
 	} else if (is_cxl_decoder(dev)) {
 		struct cxl_port *port = to_cxl_port(dev->parent);
 
@@ -432,14 +431,18 @@ static inline void cxl_nested_lock(struct device *dev)
 		 * A decoder is the immediate child of a port, so set
 		 * its lock class equal to other child device siblings.
 		 */
-		mutex_lock_nested(&dev->lockdep_mutex,
-				  CXL_PORT_LOCK + port->depth + 1);
+		return CXL_PORT_LOCK + port->depth + 1;
 	} else if (is_cxl_nvdimm_bridge(dev))
-		mutex_lock_nested(&dev->lockdep_mutex, CXL_NVDIMM_BRIDGE_LOCK);
+		return CXL_NVDIMM_BRIDGE_LOCK;
 	else if (is_cxl_nvdimm(dev))
-		mutex_lock_nested(&dev->lockdep_mutex, CXL_NVDIMM_LOCK);
+		return CXL_NVDIMM_LOCK;
 	else
-		mutex_lock_nested(&dev->lockdep_mutex, CXL_ANON_LOCK);
+		return CXL_ANON_LOCK;
+}
+
+static inline void cxl_nested_lock(struct device *dev)
+{
+	mutex_lock_nested(&dev->lockdep_mutex, cxl_lock_class(dev));
 }
 
 static inline void cxl_nested_unlock(struct device *dev)



* [PATCH v2 04/12] cxl/core: Remove cxl_device_lock()
  2022-04-13  6:01 [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation Dan Williams
                   ` (2 preceding siblings ...)
  2022-04-13  6:01 ` [PATCH v2 03/12] cxl/core: Refactor a cxl_lock_class() out of cxl_nested_lock() Dan Williams
@ 2022-04-13  6:01 ` Dan Williams
  2022-04-13  6:01 ` [PATCH v2 05/12] cxl/core: Clamp max lock_class Dan Williams
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Dan Williams @ 2022-04-13  6:01 UTC (permalink / raw)
  To: linux-cxl
  Cc: Alison Schofield, Vishal Verma, Ira Weiny, Ben Widawsky,
	Jonathan Cameron, Dave Jiang, Kevin Tian, peterz, gregkh,
	linux-kernel, nvdimm

In preparation for moving lockdep_mutex nested lock acquisition into the
core, remove the cxl_device_lock() wrapper, but preserve
cxl_lock_class(), which will be used to inform the core of the
subsystem's lock ordering rules.

Note that this reverts to the default state of unvalidated
device_lock(). A follow-on change that uses the new dev->lock_class
facility re-enables lockdep validation.

Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/pmem.c |    4 ++-
 drivers/cxl/core/port.c |   54 ++++++++++++++++++++---------------------------
 drivers/cxl/cxl.h       |   46 ----------------------------------------
 drivers/cxl/mem.c       |    4 ++-
 drivers/cxl/pmem.c      |   12 +++++-----
 lib/Kconfig.debug       |    2 +-
 6 files changed, 34 insertions(+), 88 deletions(-)

diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
index 8de240c4d96b..60f71caa5dbd 100644
--- a/drivers/cxl/core/pmem.c
+++ b/drivers/cxl/core/pmem.c
@@ -121,10 +121,10 @@ static void unregister_nvb(void *_cxl_nvb)
 	 * work to flush. Once the state has been changed to 'dead' then no new
 	 * work can be queued by user-triggered bind.
 	 */
-	cxl_device_lock(&cxl_nvb->dev);
+	device_lock(&cxl_nvb->dev);
 	flush = cxl_nvb->state != CXL_NVB_NEW;
 	cxl_nvb->state = CXL_NVB_DEAD;
-	cxl_device_unlock(&cxl_nvb->dev);
+	device_unlock(&cxl_nvb->dev);
 
 	/*
 	 * Even though the device core will trigger device_release_driver()
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 2ab1ba4499b3..437e8a71e5dc 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -312,10 +312,10 @@ static void cxl_port_release(struct device *dev)
 	struct cxl_port *port = to_cxl_port(dev);
 	struct cxl_ep *ep, *_e;
 
-	cxl_device_lock(dev);
+	device_lock(dev);
 	list_for_each_entry_safe(ep, _e, &port->endpoints, list)
 		cxl_ep_release(ep);
-	cxl_device_unlock(dev);
+	device_unlock(dev);
 	ida_free(&cxl_port_ida, port->id);
 	kfree(port);
 }
@@ -554,7 +554,7 @@ static int match_root_child(struct device *dev, const void *match)
 		return 0;
 
 	port = to_cxl_port(dev);
-	cxl_device_lock(dev);
+	device_lock(dev);
 	list_for_each_entry(dport, &port->dports, list) {
 		iter = match;
 		while (iter) {
@@ -564,7 +564,7 @@ static int match_root_child(struct device *dev, const void *match)
 		}
 	}
 out:
-	cxl_device_unlock(dev);
+	device_unlock(dev);
 
 	return !!iter;
 }
@@ -623,13 +623,13 @@ static int add_dport(struct cxl_port *port, struct cxl_dport *new)
 static void cond_cxl_root_lock(struct cxl_port *port)
 {
 	if (is_cxl_root(port))
-		cxl_device_lock(&port->dev);
+		device_lock(&port->dev);
 }
 
 static void cond_cxl_root_unlock(struct cxl_port *port)
 {
 	if (is_cxl_root(port))
-		cxl_device_unlock(&port->dev);
+		device_unlock(&port->dev);
 }
 
 static void cxl_dport_remove(void *data)
@@ -736,15 +736,15 @@ static int add_ep(struct cxl_port *port, struct cxl_ep *new)
 {
 	struct cxl_ep *dup;
 
-	cxl_device_lock(&port->dev);
+	device_lock(&port->dev);
 	if (port->dead) {
-		cxl_device_unlock(&port->dev);
+		device_unlock(&port->dev);
 		return -ENXIO;
 	}
 	dup = find_ep(port, new->ep);
 	if (!dup)
 		list_add_tail(&new->list, &port->endpoints);
-	cxl_device_unlock(&port->dev);
+	device_unlock(&port->dev);
 
 	return dup ? -EEXIST : 0;
 }
@@ -854,12 +854,12 @@ static void delete_endpoint(void *data)
 		goto out;
 	parent = &parent_port->dev;
 
-	cxl_device_lock(parent);
+	device_lock(parent);
 	if (parent->driver && endpoint->uport) {
 		devm_release_action(parent, cxl_unlink_uport, endpoint);
 		devm_release_action(parent, unregister_port, endpoint);
 	}
-	cxl_device_unlock(parent);
+	device_unlock(parent);
 	put_device(parent);
 out:
 	put_device(&endpoint->dev);
@@ -920,7 +920,7 @@ static void cxl_detach_ep(void *data)
 		}
 
 		parent_port = to_cxl_port(port->dev.parent);
-		cxl_device_lock(&parent_port->dev);
+		device_lock(&parent_port->dev);
 		if (!parent_port->dev.driver) {
 			/*
 			 * The bottom-up race to delete the port lost to a
@@ -928,12 +928,12 @@ static void cxl_detach_ep(void *data)
 			 * parent_port ->remove() will have cleaned up all
 			 * descendants.
 			 */
-			cxl_device_unlock(&parent_port->dev);
+			device_unlock(&parent_port->dev);
 			put_device(&port->dev);
 			continue;
 		}
 
-		cxl_device_lock(&port->dev);
+		device_lock(&port->dev);
 		ep = find_ep(port, &cxlmd->dev);
 		dev_dbg(&cxlmd->dev, "disconnect %s from %s\n",
 			ep ? dev_name(ep->ep) : "", dev_name(&port->dev));
@@ -948,7 +948,7 @@ static void cxl_detach_ep(void *data)
 			port->dead = true;
 			list_splice_init(&port->dports, &reap_dports);
 		}
-		cxl_device_unlock(&port->dev);
+		device_unlock(&port->dev);
 
 		if (!list_empty(&reap_dports)) {
 			dev_dbg(&cxlmd->dev, "delete %s\n",
@@ -956,7 +956,7 @@ static void cxl_detach_ep(void *data)
 			delete_switch_port(port, &reap_dports);
 		}
 		put_device(&port->dev);
-		cxl_device_unlock(&parent_port->dev);
+		device_unlock(&parent_port->dev);
 	}
 }
 
@@ -1004,7 +1004,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
 		return -EAGAIN;
 	}
 
-	cxl_device_lock(&parent_port->dev);
+	device_lock(&parent_port->dev);
 	if (!parent_port->dev.driver) {
 		dev_warn(&cxlmd->dev,
 			 "port %s:%s disabled, failed to enumerate CXL.mem\n",
@@ -1022,7 +1022,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
 			get_device(&port->dev);
 	}
 out:
-	cxl_device_unlock(&parent_port->dev);
+	device_unlock(&parent_port->dev);
 
 	if (IS_ERR(port))
 		rc = PTR_ERR(port);
@@ -1133,14 +1133,14 @@ struct cxl_dport *cxl_find_dport_by_dev(struct cxl_port *port,
 {
 	struct cxl_dport *dport;
 
-	cxl_device_lock(&port->dev);
+	device_lock(&port->dev);
 	list_for_each_entry(dport, &port->dports, list)
 		if (dport->dport == dev) {
-			cxl_device_unlock(&port->dev);
+			device_unlock(&port->dev);
 			return dport;
 		}
 
-	cxl_device_unlock(&port->dev);
+	device_unlock(&port->dev);
 	return NULL;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_find_dport_by_dev, CXL);
@@ -1379,9 +1379,9 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
 
 	port = to_cxl_port(cxld->dev.parent);
 
-	cxl_device_lock(&port->dev);
+	device_lock(&port->dev);
 	rc = cxl_decoder_add_locked(cxld, target_map);
-	cxl_device_unlock(&port->dev);
+	device_unlock(&port->dev);
 
 	return rc;
 }
@@ -1452,13 +1452,7 @@ static int cxl_bus_probe(struct device *dev)
 {
 	int rc;
 
-	/*
-	 * Take the CXL nested lock since the driver core only holds
-	 * @dev->mutex and not @dev->lockdep_mutex.
-	 */
-	cxl_nested_lock(dev);
 	rc = to_cxl_drv(dev->driver)->probe(dev);
-	cxl_nested_unlock(dev);
 
 	dev_dbg(dev, "probe: %d\n", rc);
 	return rc;
@@ -1468,10 +1462,8 @@ static void cxl_bus_remove(struct device *dev)
 {
 	struct cxl_driver *cxl_drv = to_cxl_drv(dev->driver);
 
-	cxl_nested_lock(dev);
 	if (cxl_drv->remove)
 		cxl_drv->remove(dev);
-	cxl_nested_unlock(dev);
 }
 
 static struct workqueue_struct *cxl_bus_wq;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index c9fda9304c54..a6c1a027e389 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -439,51 +439,5 @@ static inline int cxl_lock_class(struct device *dev)
 	else
 		return CXL_ANON_LOCK;
 }
-
-static inline void cxl_nested_lock(struct device *dev)
-{
-	mutex_lock_nested(&dev->lockdep_mutex, cxl_lock_class(dev));
-}
-
-static inline void cxl_nested_unlock(struct device *dev)
-{
-	mutex_unlock(&dev->lockdep_mutex);
-}
-
-static inline void cxl_device_lock(struct device *dev)
-{
-	/*
-	 * For double lock errors the lockup will happen before lockdep
-	 * warns at cxl_nested_lock(), so assert explicitly.
-	 */
-	lockdep_assert_not_held(&dev->lockdep_mutex);
-
-	device_lock(dev);
-	cxl_nested_lock(dev);
-}
-
-static inline void cxl_device_unlock(struct device *dev)
-{
-	cxl_nested_unlock(dev);
-	device_unlock(dev);
-}
-#else
-static inline void cxl_nested_lock(struct device *dev)
-{
-}
-
-static inline void cxl_nested_unlock(struct device *dev)
-{
-}
-
-static inline void cxl_device_lock(struct device *dev)
-{
-	device_lock(dev);
-}
-
-static inline void cxl_device_unlock(struct device *dev)
-{
-	device_unlock(dev);
-}
 #endif
 #endif /* __CXL_H__ */
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 49a4b1c47299..91fb8d5b21a7 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -195,7 +195,7 @@ static int cxl_mem_probe(struct device *dev)
 		return -ENXIO;
 	}
 
-	cxl_device_lock(&parent_port->dev);
+	device_lock(&parent_port->dev);
 	if (!parent_port->dev.driver) {
 		dev_err(dev, "CXL port topology %s not enabled\n",
 			dev_name(&parent_port->dev));
@@ -205,7 +205,7 @@ static int cxl_mem_probe(struct device *dev)
 
 	rc = create_endpoint(cxlmd, parent_port);
 out:
-	cxl_device_unlock(&parent_port->dev);
+	device_unlock(&parent_port->dev);
 	put_device(&parent_port->dev);
 	return rc;
 }
diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
index 15ad666ab03e..b65a272a2d6d 100644
--- a/drivers/cxl/pmem.c
+++ b/drivers/cxl/pmem.c
@@ -43,7 +43,7 @@ static int cxl_nvdimm_probe(struct device *dev)
 	if (!cxl_nvb)
 		return -ENXIO;
 
-	cxl_device_lock(&cxl_nvb->dev);
+	device_lock(&cxl_nvb->dev);
 	if (!cxl_nvb->nvdimm_bus) {
 		rc = -ENXIO;
 		goto out;
@@ -68,7 +68,7 @@ static int cxl_nvdimm_probe(struct device *dev)
 	dev_set_drvdata(dev, nvdimm);
 	rc = devm_add_action_or_reset(dev, unregister_nvdimm, nvdimm);
 out:
-	cxl_device_unlock(&cxl_nvb->dev);
+	device_unlock(&cxl_nvb->dev);
 	put_device(&cxl_nvb->dev);
 
 	return rc;
@@ -233,7 +233,7 @@ static void cxl_nvb_update_state(struct work_struct *work)
 	struct nvdimm_bus *victim_bus = NULL;
 	bool release = false, rescan = false;
 
-	cxl_device_lock(&cxl_nvb->dev);
+	device_lock(&cxl_nvb->dev);
 	switch (cxl_nvb->state) {
 	case CXL_NVB_ONLINE:
 		if (!online_nvdimm_bus(cxl_nvb)) {
@@ -251,7 +251,7 @@ static void cxl_nvb_update_state(struct work_struct *work)
 	default:
 		break;
 	}
-	cxl_device_unlock(&cxl_nvb->dev);
+	device_unlock(&cxl_nvb->dev);
 
 	if (release)
 		device_release_driver(&cxl_nvb->dev);
@@ -327,9 +327,9 @@ static int cxl_nvdimm_bridge_reset(struct device *dev, void *data)
 		return 0;
 
 	cxl_nvb = to_cxl_nvdimm_bridge(dev);
-	cxl_device_lock(dev);
+	device_lock(dev);
 	cxl_nvb->state = CXL_NVB_NEW;
-	cxl_device_unlock(dev);
+	device_unlock(dev);
 
 	return 0;
 }
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 075cd25363ac..333d41da7621 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1563,7 +1563,7 @@ config PROVE_CXL_LOCKING
 	bool "CXL"
 	depends on CXL_BUS
 	help
-	  Enable lockdep to validate cxl_device_lock() usage.
+	  Enable lockdep to validate CXL subsystem usage of the device lock.
 
 endchoice
 



* [PATCH v2 05/12] cxl/core: Clamp max lock_class
  2022-04-13  6:01 [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation Dan Williams
                   ` (3 preceding siblings ...)
  2022-04-13  6:01 ` [PATCH v2 04/12] cxl/core: Remove cxl_device_lock() Dan Williams
@ 2022-04-13  6:01 ` Dan Williams
  2022-04-13  6:02 ` [PATCH v2 06/12] cxl/core: Use dev->lock_class for device_lock() lockdep validation Dan Williams
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Dan Williams @ 2022-04-13  6:01 UTC (permalink / raw)
  To: linux-cxl
  Cc: Alison Schofield, Vishal Verma, Ira Weiny, Ben Widawsky,
	Dave Jiang, Kevin Tian, peterz, gregkh, linux-kernel, nvdimm

MAX_LOCKDEP_SUBCLASSES limits the depth of the CXL topology that can be
validated by lockdep. Given that the cxl_test topology is already at
this limit, collapse some of the levels and clamp the max depth.

Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/cxl.h |   25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index a6c1a027e389..b86aac8cde4f 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -410,20 +410,37 @@ enum cxl_lock_class {
 	CXL_ANON_LOCK,
 	CXL_NVDIMM_LOCK,
 	CXL_NVDIMM_BRIDGE_LOCK,
-	CXL_PORT_LOCK,
+	/*
+	 * Collapse the compatible port and nvdimm-bridge lock classes
+	 * to save space
+	 */
+	CXL_PORT_LOCK = CXL_NVDIMM_BRIDGE_LOCK,
 	/*
 	 * Be careful to add new lock classes here, CXL_PORT_LOCK is
 	 * extended by the port depth, so a maximum CXL port topology
-	 * depth would need to be defined first.
+	 * depth would need to be defined first. Also, the max
+	 * validation depth is limited by MAX_LOCKDEP_SUBCLASSES.
 	 */
 };
 
+static inline int clamp_lock_class(struct device *dev, int lock_class)
+{
+	if (lock_class >= MAX_LOCKDEP_SUBCLASSES) {
+		dev_warn_once(dev,
+			      "depth: %d, disabling lockdep for this device\n",
+			      lock_class);
+		return -1;
+	}
+
+	return lock_class;
+}
+
 static inline int cxl_lock_class(struct device *dev)
 {
 	if (is_cxl_port(dev)) {
 		struct cxl_port *port = to_cxl_port(dev);
 
-		return CXL_PORT_LOCK + port->depth;
+		return clamp_lock_class(dev, CXL_PORT_LOCK + port->depth);
 	} else if (is_cxl_decoder(dev)) {
 		struct cxl_port *port = to_cxl_port(dev->parent);
 
@@ -431,7 +448,7 @@ static inline int cxl_lock_class(struct device *dev)
 		 * A decoder is the immediate child of a port, so set
 		 * its lock class equal to other child device siblings.
 		 */
-		return CXL_PORT_LOCK + port->depth + 1;
+		return clamp_lock_class(dev, CXL_PORT_LOCK + port->depth + 1);
 	} else if (is_cxl_nvdimm_bridge(dev))
 		return CXL_NVDIMM_BRIDGE_LOCK;
 	else if (is_cxl_nvdimm(dev))



* [PATCH v2 06/12] cxl/core: Use dev->lock_class for device_lock() lockdep validation
  2022-04-13  6:01 [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation Dan Williams
                   ` (4 preceding siblings ...)
  2022-04-13  6:01 ` [PATCH v2 05/12] cxl/core: Clamp max lock_class Dan Williams
@ 2022-04-13  6:02 ` Dan Williams
  2022-04-13  6:02 ` [PATCH v2 07/12] cxl/acpi: Add a device_lock() lock class for the root platform device Dan Williams
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Dan Williams @ 2022-04-13  6:02 UTC (permalink / raw)
  To: linux-cxl
  Cc: Alison Schofield, Vishal Verma, Ira Weiny, Ben Widawsky,
	Dave Jiang, Kevin Tian, peterz, gregkh, linux-kernel, nvdimm

Update CONFIG_PROVE_CXL_LOCKING to use the common device-core helpers
for device_lock validation.

When CONFIG_PROVE_LOCKING is enabled and device_set_lock_class() is
passed a valid (>= 0) lock class, the core acquires the 'struct device'
@lockdep_mutex everywhere it acquires the device_lock(). Unlike
device_lock(), lockdep_mutex is not exempted from lockdep validation.

cxl_set_lock_class() arranges for CXL lock validation to be enabled or
disabled depending on the value of CONFIG_PROVE_CXL_LOCKING, which is a
Kconfig 'choice' among all the subsystems that want to use
dev->lockdep_mutex for lockdep validation. For now, that 'choice' is
just between the NVDIMM and CXL subsystems.

Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/memdev.c |    1 +
 drivers/cxl/core/pmem.c   |    2 ++
 drivers/cxl/core/port.c   |    2 ++
 drivers/cxl/cxl.h         |    9 +++++++++
 4 files changed, 14 insertions(+)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 1f76b28f9826..ad8c9f61c38a 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -343,6 +343,7 @@ struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
 	cxlmd->cxlds = cxlds;
 
 	cdev = &cxlmd->cdev;
+	cxl_set_lock_class(dev);
 	rc = cdev_device_add(cdev, dev);
 	if (rc)
 		goto err;
diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
index 60f71caa5dbd..b0eb0b795140 100644
--- a/drivers/cxl/core/pmem.c
+++ b/drivers/cxl/core/pmem.c
@@ -165,6 +165,7 @@ struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
 	if (rc)
 		goto err;
 
+	cxl_set_lock_class(dev);
 	rc = device_add(dev);
 	if (rc)
 		goto err;
@@ -261,6 +262,7 @@ int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd)
 	if (rc)
 		goto err;
 
+	cxl_set_lock_class(dev);
 	rc = device_add(dev);
 	if (rc)
 		goto err;
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 437e8a71e5dc..eb450397104b 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -469,6 +469,7 @@ struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
 	if (rc)
 		goto err;
 
+	cxl_set_lock_class(dev);
 	rc = device_add(dev);
 	if (rc)
 		goto err;
@@ -1349,6 +1350,7 @@ int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map)
 	if (is_root_decoder(dev))
 		cxld->platform_res.name = dev_name(dev);
 
+	cxl_set_lock_class(dev);
 	return device_add(dev);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_decoder_add_locked, CXL);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index b86aac8cde4f..fddbcb380e84 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -456,5 +456,14 @@ static inline int cxl_lock_class(struct device *dev)
 	else
 		return CXL_ANON_LOCK;
 }
+
+static inline void cxl_set_lock_class(struct device *dev)
+{
+	device_set_lock_class(dev, cxl_lock_class(dev));
+}
+#else
+static inline void cxl_set_lock_class(struct device *dev)
+{
+}
 #endif
 #endif /* __CXL_H__ */



* [PATCH v2 07/12] cxl/acpi: Add a device_lock() lock class for the root platform device
  2022-04-13  6:01 [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation Dan Williams
                   ` (5 preceding siblings ...)
  2022-04-13  6:02 ` [PATCH v2 06/12] cxl/core: Use dev->lock_class for device_lock() lockdep validation Dan Williams
@ 2022-04-13  6:02 ` Dan Williams
  2022-04-13  6:02 ` [PATCH v2 08/12] libnvdimm: Refactor an nvdimm_lock_class() helper Dan Williams
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Dan Williams @ 2022-04-13  6:02 UTC (permalink / raw)
  To: linux-cxl
  Cc: Alison Schofield, Vishal Verma, Ira Weiny, Ben Widawsky,
	Dave Jiang, Kevin Tian, peterz, gregkh, linux-kernel, nvdimm

Now that the device-core can enable device_lock() lockdep validation
even after a device has been added, use that capability to validate
device_lock() usage against the ACPI0017 device relative to other
subsystem locks.

The 'enum cxl_lock_class' definition moves outside of the ifdef guard to
support device_set_lock_class() called from cxl_acpi_probe().

Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/acpi.c |    1 +
 drivers/cxl/cxl.h  |    3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index d15a6aec0331..ef5c3252bdb2 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -283,6 +283,7 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 	struct acpi_device *adev = ACPI_COMPANION(host);
 	struct cxl_cfmws_context ctx;
 
+	device_set_lock_class(&pdev->dev, CXL_ROOT_LOCK);
 	root_port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
 	if (IS_ERR(root_port))
 		return PTR_ERR(root_port);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index fddbcb380e84..05dc4c081ad2 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -405,9 +405,9 @@ struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_nvdimm *cxl_nvd);
 #define __mock static
 #endif
 
-#ifdef CONFIG_PROVE_CXL_LOCKING
 enum cxl_lock_class {
 	CXL_ANON_LOCK,
+	CXL_ROOT_LOCK,
 	CXL_NVDIMM_LOCK,
 	CXL_NVDIMM_BRIDGE_LOCK,
 	/*
@@ -423,6 +423,7 @@ enum cxl_lock_class {
 	 */
 };
 
+#ifdef CONFIG_PROVE_CXL_LOCKING
 static inline int clamp_lock_class(struct device *dev, int lock_class)
 {
 	if (lock_class >= MAX_LOCKDEP_SUBCLASSES) {



* [PATCH v2 08/12] libnvdimm: Refactor an nvdimm_lock_class() helper
  2022-04-13  6:01 [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation Dan Williams
                   ` (6 preceding siblings ...)
  2022-04-13  6:02 ` [PATCH v2 07/12] cxl/acpi: Add a device_lock() lock class for the root platform device Dan Williams
@ 2022-04-13  6:02 ` Dan Williams
  2022-04-13  6:02 ` [PATCH v2 09/12] ACPI: NFIT: Drop nfit_device_lock() Dan Williams
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Dan Williams @ 2022-04-13  6:02 UTC (permalink / raw)
  To: linux-cxl
  Cc: Vishal Verma, Dave Jiang, Ira Weiny, Dave Jiang, Kevin Tian,
	peterz, alison.schofield, gregkh, linux-kernel, nvdimm

In preparation for moving to the device-core's common device_lock()
lockdep validation, refactor an nvdimm_lock_class() helper out of
debug_nvdimm_lock() for use with device_set_lock_class().
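
The intended pairing, which the follow-on "libnvdimm: Enable lockdep
validation" patch is expected to add (sketch, not part of this patch):

	device_set_lock_class(dev, nvdimm_lock_class(dev));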

Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/nd-core.h |   21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/nvdimm/nd-core.h b/drivers/nvdimm/nd-core.h
index 448f9dcb4bb7..deb3d047571e 100644
--- a/drivers/nvdimm/nd-core.h
+++ b/drivers/nvdimm/nd-core.h
@@ -174,22 +174,27 @@ enum {
 	LOCK_CLAIM,
 };
 
-static inline void debug_nvdimm_lock(struct device *dev)
+static inline int nvdimm_lock_class(struct device *dev)
 {
 	if (is_nd_region(dev))
-		mutex_lock_nested(&dev->lockdep_mutex, LOCK_REGION);
+		return LOCK_REGION;
 	else if (is_nvdimm(dev))
-		mutex_lock_nested(&dev->lockdep_mutex, LOCK_DIMM);
+		return LOCK_DIMM;
 	else if (is_nd_btt(dev) || is_nd_pfn(dev) || is_nd_dax(dev))
-		mutex_lock_nested(&dev->lockdep_mutex, LOCK_CLAIM);
+		return LOCK_CLAIM;
 	else if (dev->parent && (is_nd_region(dev->parent)))
-		mutex_lock_nested(&dev->lockdep_mutex, LOCK_NAMESPACE);
+		return LOCK_NAMESPACE;
 	else if (is_nvdimm_bus(dev))
-		mutex_lock_nested(&dev->lockdep_mutex, LOCK_BUS);
+		return LOCK_BUS;
 	else if (dev->class && dev->class == nd_class)
-		mutex_lock_nested(&dev->lockdep_mutex, LOCK_NDCTL);
+		return LOCK_NDCTL;
 	else
-		dev_WARN(dev, "unknown lock level\n");
+		return -1;
+}
+
+static inline void debug_nvdimm_lock(struct device *dev)
+{
+	mutex_lock_nested(&dev->lockdep_mutex, nvdimm_lock_class(dev));
 }
 
 static inline void debug_nvdimm_unlock(struct device *dev)



* [PATCH v2 09/12] ACPI: NFIT: Drop nfit_device_lock()
  2022-04-13  6:01 [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation Dan Williams
                   ` (7 preceding siblings ...)
  2022-04-13  6:02 ` [PATCH v2 08/12] libnvdimm: Refactor an nvdimm_lock_class() helper Dan Williams
@ 2022-04-13  6:02 ` Dan Williams
  2022-04-13  6:02 ` [PATCH v2 10/12] libnvdimm: Drop nd_device_lock() Dan Williams
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Dan Williams @ 2022-04-13  6:02 UTC (permalink / raw)
  To: linux-cxl
  Cc: Vishal Verma, Dave Jiang, Ira Weiny, Rafael J. Wysocki,
	Dave Jiang, Kevin Tian, peterz, alison.schofield, gregkh,
	linux-kernel, nvdimm

In preparation for the libnvdimm subsystem switching to device-core
common lockdep validation, delete nfit_device_lock(), which will need to
be replaced with an implementation that specifies a non-zero lock class.

Note that this reverts to the default state of unvalidated
device_lock() until a lock class for ACPI NFIT devices is added.

Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/acpi/nfit/core.c |   30 +++++++++++++++---------------
 drivers/acpi/nfit/nfit.h |   24 ------------------------
 2 files changed, 15 insertions(+), 39 deletions(-)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index fe61f617a943..ae5f4acf2675 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -1230,7 +1230,7 @@ static ssize_t hw_error_scrub_store(struct device *dev,
 	if (rc)
 		return rc;
 
-	nfit_device_lock(dev);
+	device_lock(dev);
 	nd_desc = dev_get_drvdata(dev);
 	if (nd_desc) {
 		struct acpi_nfit_desc *acpi_desc = to_acpi_desc(nd_desc);
@@ -1247,7 +1247,7 @@ static ssize_t hw_error_scrub_store(struct device *dev,
 			break;
 		}
 	}
-	nfit_device_unlock(dev);
+	device_unlock(dev);
 	if (rc)
 		return rc;
 	return size;
@@ -1267,10 +1267,10 @@ static ssize_t scrub_show(struct device *dev,
 	ssize_t rc = -ENXIO;
 	bool busy;
 
-	nfit_device_lock(dev);
+	device_lock(dev);
 	nd_desc = dev_get_drvdata(dev);
 	if (!nd_desc) {
-		nfit_device_unlock(dev);
+		device_unlock(dev);
 		return rc;
 	}
 	acpi_desc = to_acpi_desc(nd_desc);
@@ -1287,7 +1287,7 @@ static ssize_t scrub_show(struct device *dev,
 	}
 
 	mutex_unlock(&acpi_desc->init_mutex);
-	nfit_device_unlock(dev);
+	device_unlock(dev);
 	return rc;
 }
 
@@ -1304,14 +1304,14 @@ static ssize_t scrub_store(struct device *dev,
 	if (val != 1)
 		return -EINVAL;
 
-	nfit_device_lock(dev);
+	device_lock(dev);
 	nd_desc = dev_get_drvdata(dev);
 	if (nd_desc) {
 		struct acpi_nfit_desc *acpi_desc = to_acpi_desc(nd_desc);
 
 		rc = acpi_nfit_ars_rescan(acpi_desc, ARS_REQ_LONG);
 	}
-	nfit_device_unlock(dev);
+	device_unlock(dev);
 	if (rc)
 		return rc;
 	return size;
@@ -1697,9 +1697,9 @@ static void acpi_nvdimm_notify(acpi_handle handle, u32 event, void *data)
 	struct acpi_device *adev = data;
 	struct device *dev = &adev->dev;
 
-	nfit_device_lock(dev->parent);
+	device_lock(dev->parent);
 	__acpi_nvdimm_notify(dev, event);
-	nfit_device_unlock(dev->parent);
+	device_unlock(dev->parent);
 }
 
 static bool acpi_nvdimm_has_method(struct acpi_device *adev, char *method)
@@ -3152,8 +3152,8 @@ static int acpi_nfit_flush_probe(struct nvdimm_bus_descriptor *nd_desc)
 	struct device *dev = acpi_desc->dev;
 
 	/* Bounce the device lock to flush acpi_nfit_add / acpi_nfit_notify */
-	nfit_device_lock(dev);
-	nfit_device_unlock(dev);
+	device_lock(dev);
+	device_unlock(dev);
 
 	/* Bounce the init_mutex to complete initial registration */
 	mutex_lock(&acpi_desc->init_mutex);
@@ -3305,8 +3305,8 @@ void acpi_nfit_shutdown(void *data)
 	 * acpi_nfit_ars_rescan() submissions have had a chance to
 	 * either submit or see ->cancel set.
 	 */
-	nfit_device_lock(bus_dev);
-	nfit_device_unlock(bus_dev);
+	device_lock(bus_dev);
+	device_unlock(bus_dev);
 
 	flush_workqueue(nfit_wq);
 }
@@ -3449,9 +3449,9 @@ EXPORT_SYMBOL_GPL(__acpi_nfit_notify);
 
 static void acpi_nfit_notify(struct acpi_device *adev, u32 event)
 {
-	nfit_device_lock(&adev->dev);
+	device_lock(&adev->dev);
 	__acpi_nfit_notify(&adev->dev, adev->handle, event);
-	nfit_device_unlock(&adev->dev);
+	device_unlock(&adev->dev);
 }
 
 static const struct acpi_device_id acpi_nfit_ids[] = {
diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h
index 50882bdbeb96..6023ad61831a 100644
--- a/drivers/acpi/nfit/nfit.h
+++ b/drivers/acpi/nfit/nfit.h
@@ -337,30 +337,6 @@ static inline struct acpi_nfit_desc *to_acpi_desc(
 	return container_of(nd_desc, struct acpi_nfit_desc, nd_desc);
 }
 
-#ifdef CONFIG_PROVE_LOCKING
-static inline void nfit_device_lock(struct device *dev)
-{
-	device_lock(dev);
-	mutex_lock(&dev->lockdep_mutex);
-}
-
-static inline void nfit_device_unlock(struct device *dev)
-{
-	mutex_unlock(&dev->lockdep_mutex);
-	device_unlock(dev);
-}
-#else
-static inline void nfit_device_lock(struct device *dev)
-{
-	device_lock(dev);
-}
-
-static inline void nfit_device_unlock(struct device *dev)
-{
-	device_unlock(dev);
-}
-#endif
-
 const guid_t *to_nfit_uuid(enum nfit_uuids id);
 int acpi_nfit_init(struct acpi_nfit_desc *acpi_desc, void *nfit, acpi_size sz);
 void acpi_nfit_shutdown(void *data);



* [PATCH v2 10/12] libnvdimm: Drop nd_device_lock()
  2022-04-13  6:01 [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation Dan Williams
                   ` (8 preceding siblings ...)
  2022-04-13  6:02 ` [PATCH v2 09/12] ACPI: NFIT: Drop nfit_device_lock() Dan Williams
@ 2022-04-13  6:02 ` Dan Williams
  2022-04-13  6:02 ` [PATCH v2 11/12] libnvdimm: Enable lockdep validation Dan Williams
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Dan Williams @ 2022-04-13  6:02 UTC (permalink / raw)
  To: linux-cxl
  Cc: Vishal Verma, Dave Jiang, Ira Weiny, Dave Jiang, Kevin Tian,
	peterz, alison.schofield, gregkh, linux-kernel, nvdimm

In preparation for switching to the core device_lock lockdep validation
scheme, convert nd_device_lock() calls back to device_lock().

Note that this temporarily reverts to the default, unvalidated
device_lock() behavior until a follow-on patch sets dev->lock_class.

Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/btt_devs.c       |   16 ++++++++--------
 drivers/nvdimm/bus.c            |   23 +++++++++-------------
 drivers/nvdimm/core.c           |   10 +++++-----
 drivers/nvdimm/dimm_devs.c      |    8 ++++----
 drivers/nvdimm/namespace_devs.c |   36 ++++++++++++++++++-----------------
 drivers/nvdimm/nd-core.h        |   40 ---------------------------------------
 drivers/nvdimm/pfn_devs.c       |   24 ++++++++++++-----------
 drivers/nvdimm/pmem.c           |    2 +-
 drivers/nvdimm/region.c         |    2 +-
 drivers/nvdimm/region_devs.c    |   16 ++++++++--------
 lib/Kconfig.debug               |    3 ++-
 11 files changed, 68 insertions(+), 112 deletions(-)

diff --git a/drivers/nvdimm/btt_devs.c b/drivers/nvdimm/btt_devs.c
index e5a58520d398..60c19b988bc4 100644
--- a/drivers/nvdimm/btt_devs.c
+++ b/drivers/nvdimm/btt_devs.c
@@ -50,14 +50,14 @@ static ssize_t sector_size_store(struct device *dev,
 	struct nd_btt *nd_btt = to_nd_btt(dev);
 	ssize_t rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	nvdimm_bus_lock(dev);
 	rc = nd_size_select_store(dev, buf, &nd_btt->lbasize,
 			btt_lbasize_supported);
 	dev_dbg(dev, "result: %zd wrote: %s%s", rc, buf,
 			buf[len - 1] == '\n' ? "" : "\n");
 	nvdimm_bus_unlock(dev);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc ? rc : len;
 }
@@ -79,11 +79,11 @@ static ssize_t uuid_store(struct device *dev,
 	struct nd_btt *nd_btt = to_nd_btt(dev);
 	ssize_t rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	rc = nd_uuid_store(dev, &nd_btt->uuid, buf, len);
 	dev_dbg(dev, "result: %zd wrote: %s%s", rc, buf,
 			buf[len - 1] == '\n' ? "" : "\n");
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc ? rc : len;
 }
@@ -108,13 +108,13 @@ static ssize_t namespace_store(struct device *dev,
 	struct nd_btt *nd_btt = to_nd_btt(dev);
 	ssize_t rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	nvdimm_bus_lock(dev);
 	rc = nd_namespace_store(dev, &nd_btt->ndns, buf, len);
 	dev_dbg(dev, "result: %zd wrote: %s%s", rc, buf,
 			buf[len - 1] == '\n' ? "" : "\n");
 	nvdimm_bus_unlock(dev);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc;
 }
@@ -126,14 +126,14 @@ static ssize_t size_show(struct device *dev,
 	struct nd_btt *nd_btt = to_nd_btt(dev);
 	ssize_t rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	if (dev->driver)
 		rc = sprintf(buf, "%llu\n", nd_btt->size);
 	else {
 		/* no size to convey if the btt instance is disabled */
 		rc = -ENXIO;
 	}
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc;
 }
diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index 7b0d1443217a..b5a1317c30dd 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -88,9 +88,7 @@ static int nvdimm_bus_probe(struct device *dev)
 			dev->driver->name, dev_name(dev));
 
 	nvdimm_bus_probe_start(nvdimm_bus);
-	debug_nvdimm_lock(dev);
 	rc = nd_drv->probe(dev);
-	debug_nvdimm_unlock(dev);
 
 	if ((rc == 0 || rc == -EOPNOTSUPP) &&
 			dev->parent && is_nd_region(dev->parent))
@@ -111,11 +109,8 @@ static void nvdimm_bus_remove(struct device *dev)
 	struct module *provider = to_bus_provider(dev);
 	struct nvdimm_bus *nvdimm_bus = walk_to_nvdimm_bus(dev);
 
-	if (nd_drv->remove) {
-		debug_nvdimm_lock(dev);
+	if (nd_drv->remove)
 		nd_drv->remove(dev);
-		debug_nvdimm_unlock(dev);
-	}
 
 	dev_dbg(&nvdimm_bus->dev, "%s.remove(%s)\n", dev->driver->name,
 			dev_name(dev));
@@ -139,7 +134,7 @@ static void nvdimm_bus_shutdown(struct device *dev)
 
 void nd_device_notify(struct device *dev, enum nvdimm_event event)
 {
-	nd_device_lock(dev);
+	device_lock(dev);
 	if (dev->driver) {
 		struct nd_device_driver *nd_drv;
 
@@ -147,7 +142,7 @@ void nd_device_notify(struct device *dev, enum nvdimm_event event)
 		if (nd_drv->notify)
 			nd_drv->notify(dev, event);
 	}
-	nd_device_unlock(dev);
+	device_unlock(dev);
 }
 EXPORT_SYMBOL(nd_device_notify);
 
@@ -572,9 +567,9 @@ void nd_device_unregister(struct device *dev, enum nd_async_mode mode)
 		 * or otherwise let the async path handle it if the
 		 * unregistration was already queued.
 		 */
-		nd_device_lock(dev);
+		device_lock(dev);
 		killed = kill_device(dev);
-		nd_device_unlock(dev);
+		device_unlock(dev);
 
 		if (!killed)
 			return;
@@ -930,10 +925,10 @@ void wait_nvdimm_bus_probe_idle(struct device *dev)
 		if (nvdimm_bus->probe_active == 0)
 			break;
 		nvdimm_bus_unlock(dev);
-		nd_device_unlock(dev);
+		device_unlock(dev);
 		wait_event(nvdimm_bus->wait,
 				nvdimm_bus->probe_active == 0);
-		nd_device_lock(dev);
+		device_lock(dev);
 		nvdimm_bus_lock(dev);
 	} while (true);
 }
@@ -1167,7 +1162,7 @@ static int __nd_ioctl(struct nvdimm_bus *nvdimm_bus, struct nvdimm *nvdimm,
 		goto out;
 	}
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	nvdimm_bus_lock(dev);
 	rc = nd_cmd_clear_to_send(nvdimm_bus, nvdimm, func, buf);
 	if (rc)
@@ -1189,7 +1184,7 @@ static int __nd_ioctl(struct nvdimm_bus *nvdimm_bus, struct nvdimm *nvdimm,
 
 out_unlock:
 	nvdimm_bus_unlock(dev);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 out:
 	kfree(in_env);
 	kfree(out_env);
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index 69a03358817f..144926b7451c 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -215,7 +215,7 @@ EXPORT_SYMBOL_GPL(to_nvdimm_bus_dev);
  *
  * Enforce that uuids can only be changed while the device is disabled
  * (driver detached)
- * LOCKING: expects nd_device_lock() is held on entry
+ * LOCKING: expects device_lock() is held on entry
  */
 int nd_uuid_store(struct device *dev, uuid_t **uuid_out, const char *buf,
 		size_t len)
@@ -316,15 +316,15 @@ static DEVICE_ATTR_RO(provider);
 
 static int flush_namespaces(struct device *dev, void *data)
 {
-	nd_device_lock(dev);
-	nd_device_unlock(dev);
+	device_lock(dev);
+	device_unlock(dev);
 	return 0;
 }
 
 static int flush_regions_dimms(struct device *dev, void *data)
 {
-	nd_device_lock(dev);
-	nd_device_unlock(dev);
+	device_lock(dev);
+	device_unlock(dev);
 	device_for_each_child(dev, NULL, flush_namespaces);
 	return 0;
 }
diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index ee507eed42b5..93113c0c4dc8 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -341,9 +341,9 @@ static ssize_t available_slots_show(struct device *dev,
 {
 	ssize_t rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	rc = __available_slots_show(dev_get_drvdata(dev), buf);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc;
 }
@@ -386,12 +386,12 @@ static ssize_t security_store(struct device *dev,
 	 * done while probing is idle and the DIMM is not in active use
 	 * in any region.
 	 */
-	nd_device_lock(dev);
+	device_lock(dev);
 	nvdimm_bus_lock(dev);
 	wait_nvdimm_bus_probe_idle(dev);
 	rc = nvdimm_security_store(dev, buf, len);
 	nvdimm_bus_unlock(dev);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc;
 }
diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
index 62b83b2e26e3..bc578dbea472 100644
--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -264,7 +264,7 @@ static ssize_t alt_name_store(struct device *dev,
 	struct nd_region *nd_region = to_nd_region(dev->parent);
 	ssize_t rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	nvdimm_bus_lock(dev);
 	wait_nvdimm_bus_probe_idle(dev);
 	rc = __alt_name_store(dev, buf, len);
@@ -272,7 +272,7 @@ static ssize_t alt_name_store(struct device *dev,
 		rc = nd_namespace_label_update(nd_region, dev);
 	dev_dbg(dev, "%s(%zd)\n", rc < 0 ? "fail " : "", rc);
 	nvdimm_bus_unlock(dev);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc < 0 ? rc : len;
 }
@@ -846,7 +846,7 @@ static ssize_t size_store(struct device *dev,
 	if (rc)
 		return rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	nvdimm_bus_lock(dev);
 	wait_nvdimm_bus_probe_idle(dev);
 	rc = __size_store(dev, val);
@@ -868,7 +868,7 @@ static ssize_t size_store(struct device *dev,
 	dev_dbg(dev, "%llx %s (%d)\n", val, rc < 0 ? "fail" : "success", rc);
 
 	nvdimm_bus_unlock(dev);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc < 0 ? rc : len;
 }
@@ -1043,7 +1043,7 @@ static ssize_t uuid_store(struct device *dev,
 	} else
 		return -ENXIO;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	nvdimm_bus_lock(dev);
 	wait_nvdimm_bus_probe_idle(dev);
 	if (to_ndns(dev)->claim)
@@ -1059,7 +1059,7 @@ static ssize_t uuid_store(struct device *dev,
 	dev_dbg(dev, "result: %zd wrote: %s%s", rc, buf,
 			buf[len - 1] == '\n' ? "" : "\n");
 	nvdimm_bus_unlock(dev);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc < 0 ? rc : len;
 }
@@ -1118,7 +1118,7 @@ static ssize_t sector_size_store(struct device *dev,
 	} else
 		return -ENXIO;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	nvdimm_bus_lock(dev);
 	if (to_ndns(dev)->claim)
 		rc = -EBUSY;
@@ -1129,7 +1129,7 @@ static ssize_t sector_size_store(struct device *dev,
 	dev_dbg(dev, "result: %zd %s: %s%s", rc, rc < 0 ? "tried" : "wrote",
 			buf, buf[len - 1] == '\n' ? "" : "\n");
 	nvdimm_bus_unlock(dev);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc ? rc : len;
 }
@@ -1239,9 +1239,9 @@ static ssize_t holder_show(struct device *dev,
 	struct nd_namespace_common *ndns = to_ndns(dev);
 	ssize_t rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	rc = sprintf(buf, "%s\n", ndns->claim ? dev_name(ndns->claim) : "");
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc;
 }
@@ -1278,7 +1278,7 @@ static ssize_t holder_class_store(struct device *dev,
 	struct nd_region *nd_region = to_nd_region(dev->parent);
 	int rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	nvdimm_bus_lock(dev);
 	wait_nvdimm_bus_probe_idle(dev);
 	rc = __holder_class_store(dev, buf);
@@ -1286,7 +1286,7 @@ static ssize_t holder_class_store(struct device *dev,
 		rc = nd_namespace_label_update(nd_region, dev);
 	dev_dbg(dev, "%s(%d)\n", rc < 0 ? "fail " : "", rc);
 	nvdimm_bus_unlock(dev);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc < 0 ? rc : len;
 }
@@ -1297,7 +1297,7 @@ static ssize_t holder_class_show(struct device *dev,
 	struct nd_namespace_common *ndns = to_ndns(dev);
 	ssize_t rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	if (ndns->claim_class == NVDIMM_CCLASS_NONE)
 		rc = sprintf(buf, "\n");
 	else if ((ndns->claim_class == NVDIMM_CCLASS_BTT) ||
@@ -1309,7 +1309,7 @@ static ssize_t holder_class_show(struct device *dev,
 		rc = sprintf(buf, "dax\n");
 	else
 		rc = sprintf(buf, "<unknown>\n");
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc;
 }
@@ -1323,7 +1323,7 @@ static ssize_t mode_show(struct device *dev,
 	char *mode;
 	ssize_t rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	claim = ndns->claim;
 	if (claim && is_nd_btt(claim))
 		mode = "safe";
@@ -1336,7 +1336,7 @@ static ssize_t mode_show(struct device *dev,
 	else
 		mode = "raw";
 	rc = sprintf(buf, "%s\n", mode);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc;
 }
@@ -1456,8 +1456,8 @@ struct nd_namespace_common *nvdimm_namespace_common_probe(struct device *dev)
 		 * Flush any in-progess probes / removals in the driver
 		 * for the raw personality of this namespace.
 		 */
-		nd_device_lock(&ndns->dev);
-		nd_device_unlock(&ndns->dev);
+		device_lock(&ndns->dev);
+		device_unlock(&ndns->dev);
 		if (ndns->dev.driver) {
 			dev_dbg(&ndns->dev, "is active, can't bind %s\n",
 					dev_name(dev));
diff --git a/drivers/nvdimm/nd-core.h b/drivers/nvdimm/nd-core.h
index deb3d047571e..1668a10e41ba 100644
--- a/drivers/nvdimm/nd-core.h
+++ b/drivers/nvdimm/nd-core.h
@@ -191,45 +191,5 @@ static inline int nvdimm_lock_class(struct device *dev)
 	else
 		return -1;
 }
-
-static inline void debug_nvdimm_lock(struct device *dev)
-{
-	mutex_lock_nested(&dev->lockdep_mutex, nvdimm_lock_class(dev));
-}
-
-static inline void debug_nvdimm_unlock(struct device *dev)
-{
-	mutex_unlock(&dev->lockdep_mutex);
-}
-
-static inline void nd_device_lock(struct device *dev)
-{
-	device_lock(dev);
-	debug_nvdimm_lock(dev);
-}
-
-static inline void nd_device_unlock(struct device *dev)
-{
-	debug_nvdimm_unlock(dev);
-	device_unlock(dev);
-}
-#else
-static inline void nd_device_lock(struct device *dev)
-{
-	device_lock(dev);
-}
-
-static inline void nd_device_unlock(struct device *dev)
-{
-	device_unlock(dev);
-}
-
-static inline void debug_nvdimm_lock(struct device *dev)
-{
-}
-
-static inline void debug_nvdimm_unlock(struct device *dev)
-{
-}
 #endif
 #endif /* __ND_CORE_H__ */
diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index c31e184bfa45..1672d139f708 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -55,7 +55,7 @@ static ssize_t mode_store(struct device *dev,
 	struct nd_pfn *nd_pfn = to_nd_pfn_safe(dev);
 	ssize_t rc = 0;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	nvdimm_bus_lock(dev);
 	if (dev->driver)
 		rc = -EBUSY;
@@ -77,7 +77,7 @@ static ssize_t mode_store(struct device *dev,
 	dev_dbg(dev, "result: %zd wrote: %s%s", rc, buf,
 			buf[len - 1] == '\n' ? "" : "\n");
 	nvdimm_bus_unlock(dev);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc ? rc : len;
 }
@@ -123,14 +123,14 @@ static ssize_t align_store(struct device *dev,
 	unsigned long aligns[MAX_NVDIMM_ALIGN] = { [0] = 0, };
 	ssize_t rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	nvdimm_bus_lock(dev);
 	rc = nd_size_select_store(dev, buf, &nd_pfn->align,
 			nd_pfn_supported_alignments(aligns));
 	dev_dbg(dev, "result: %zd wrote: %s%s", rc, buf,
 			buf[len - 1] == '\n' ? "" : "\n");
 	nvdimm_bus_unlock(dev);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc ? rc : len;
 }
@@ -152,11 +152,11 @@ static ssize_t uuid_store(struct device *dev,
 	struct nd_pfn *nd_pfn = to_nd_pfn_safe(dev);
 	ssize_t rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	rc = nd_uuid_store(dev, &nd_pfn->uuid, buf, len);
 	dev_dbg(dev, "result: %zd wrote: %s%s", rc, buf,
 			buf[len - 1] == '\n' ? "" : "\n");
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc ? rc : len;
 }
@@ -181,13 +181,13 @@ static ssize_t namespace_store(struct device *dev,
 	struct nd_pfn *nd_pfn = to_nd_pfn_safe(dev);
 	ssize_t rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	nvdimm_bus_lock(dev);
 	rc = nd_namespace_store(dev, &nd_pfn->ndns, buf, len);
 	dev_dbg(dev, "result: %zd wrote: %s%s", rc, buf,
 			buf[len - 1] == '\n' ? "" : "\n");
 	nvdimm_bus_unlock(dev);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc;
 }
@@ -199,7 +199,7 @@ static ssize_t resource_show(struct device *dev,
 	struct nd_pfn *nd_pfn = to_nd_pfn_safe(dev);
 	ssize_t rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	if (dev->driver) {
 		struct nd_pfn_sb *pfn_sb = nd_pfn->pfn_sb;
 		u64 offset = __le64_to_cpu(pfn_sb->dataoff);
@@ -213,7 +213,7 @@ static ssize_t resource_show(struct device *dev,
 		/* no address to convey if the pfn instance is disabled */
 		rc = -ENXIO;
 	}
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc;
 }
@@ -225,7 +225,7 @@ static ssize_t size_show(struct device *dev,
 	struct nd_pfn *nd_pfn = to_nd_pfn_safe(dev);
 	ssize_t rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	if (dev->driver) {
 		struct nd_pfn_sb *pfn_sb = nd_pfn->pfn_sb;
 		u64 offset = __le64_to_cpu(pfn_sb->dataoff);
@@ -241,7 +241,7 @@ static ssize_t size_show(struct device *dev,
 		/* no size to convey if the pfn instance is disabled */
 		rc = -ENXIO;
 	}
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc;
 }
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 58d95242a836..3992521c151f 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -573,7 +573,7 @@ static void nd_pmem_remove(struct device *dev)
 		nvdimm_namespace_detach_btt(to_nd_btt(dev));
 	else {
 		/*
-		 * Note, this assumes nd_device_lock() context to not
+		 * Note, this assumes device_lock() context to not
 		 * race nd_pmem_notify()
 		 */
 		sysfs_put(pmem->bb_state);
diff --git a/drivers/nvdimm/region.c b/drivers/nvdimm/region.c
index 188560b1c110..390123d293ea 100644
--- a/drivers/nvdimm/region.c
+++ b/drivers/nvdimm/region.c
@@ -95,7 +95,7 @@ static void nd_region_remove(struct device *dev)
 	nvdimm_bus_unlock(dev);
 
 	/*
-	 * Note, this assumes nd_device_lock() context to not race
+	 * Note, this assumes device_lock() context to not race
 	 * nd_region_notify()
 	 */
 	sysfs_put(nd_region->bb_state);
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index 0cb274c2b508..83c1ce7f26a6 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -279,7 +279,7 @@ static ssize_t set_cookie_show(struct device *dev,
 	 * the v1.1 namespace label cookie definition. To read all this
 	 * data we need to wait for probing to settle.
 	 */
-	nd_device_lock(dev);
+	device_lock(dev);
 	nvdimm_bus_lock(dev);
 	wait_nvdimm_bus_probe_idle(dev);
 	if (nd_region->ndr_mappings) {
@@ -296,7 +296,7 @@ static ssize_t set_cookie_show(struct device *dev,
 		}
 	}
 	nvdimm_bus_unlock(dev);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	if (rc)
 		return rc;
@@ -353,12 +353,12 @@ static ssize_t available_size_show(struct device *dev,
 	 * memory nvdimm_bus_lock() is dropped, but that's userspace's
 	 * problem to not race itself.
 	 */
-	nd_device_lock(dev);
+	device_lock(dev);
 	nvdimm_bus_lock(dev);
 	wait_nvdimm_bus_probe_idle(dev);
 	available = nd_region_available_dpa(nd_region);
 	nvdimm_bus_unlock(dev);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return sprintf(buf, "%llu\n", available);
 }
@@ -370,12 +370,12 @@ static ssize_t max_available_extent_show(struct device *dev,
 	struct nd_region *nd_region = to_nd_region(dev);
 	unsigned long long available = 0;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	nvdimm_bus_lock(dev);
 	wait_nvdimm_bus_probe_idle(dev);
 	available = nd_region_allocatable_dpa(nd_region);
 	nvdimm_bus_unlock(dev);
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return sprintf(buf, "%llu\n", available);
 }
@@ -549,12 +549,12 @@ static ssize_t region_badblocks_show(struct device *dev,
 	struct nd_region *nd_region = to_nd_region(dev);
 	ssize_t rc;
 
-	nd_device_lock(dev);
+	device_lock(dev);
 	if (dev->driver)
 		rc = badblocks_show(&nd_region->bb, buf, 0);
 	else
 		rc = -ENXIO;
-	nd_device_unlock(dev);
+	device_unlock(dev);
 
 	return rc;
 }
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 333d41da7621..1f8db59bdddd 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1557,7 +1557,8 @@ config PROVE_NVDIMM_LOCKING
 	bool "NVDIMM"
 	depends on LIBNVDIMM
 	help
-	  Enable lockdep to validate nd_device_lock() usage.
+	  Enable lockdep to validate libnvdimm subsystem usage of the
+	  device lock.
 
 config PROVE_CXL_LOCKING
 	bool "CXL"


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 11/12] libnvdimm: Enable lockdep validation
  2022-04-13  6:01 [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation Dan Williams
                   ` (9 preceding siblings ...)
  2022-04-13  6:02 ` [PATCH v2 10/12] libnvdimm: Drop nd_device_lock() Dan Williams
@ 2022-04-13  6:02 ` Dan Williams
  2022-04-13  6:02 ` [PATCH v2 12/12] device-core: Enable multi-subsystem device_lock() " Dan Williams
  2022-04-13 14:02 ` [PATCH v2 00/12] device-core: Enable " Waiman Long
  12 siblings, 0 replies; 23+ messages in thread
From: Dan Williams @ 2022-04-13  6:02 UTC (permalink / raw)
  To: linux-cxl
  Cc: Vishal Verma, Dave Jiang, Ira Weiny, Dave Jiang, Kevin Tian,
	peterz, alison.schofield, gregkh, linux-kernel, nvdimm

Register libnvdimm subsystem devices with a non-zero lock_class to
enable the device-core to track lock dependencies.

Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/bus.c     |    3 +++
 drivers/nvdimm/nd-core.h |    9 +++++++++
 2 files changed, 12 insertions(+)

diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index b5a1317c30dd..f2ae6825f533 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -360,6 +360,7 @@ struct nvdimm_bus *nvdimm_bus_register(struct device *parent,
 	if (rc)
 		goto err;
 
+	nvdimm_set_lock_class(&nvdimm_bus->dev);
 	rc = device_add(&nvdimm_bus->dev);
 	if (rc) {
 		dev_dbg(&nvdimm_bus->dev, "registration failed: %d\n", rc);
@@ -485,6 +486,7 @@ static void nd_async_device_register(void *d, async_cookie_t cookie)
 {
 	struct device *dev = d;
 
+	nvdimm_set_lock_class(dev);
 	if (device_add(dev) != 0) {
 		dev_err(dev, "%s: failed\n", __func__);
 		put_device(dev);
@@ -738,6 +740,7 @@ int nvdimm_bus_create_ndctl(struct nvdimm_bus *nvdimm_bus)
 	if (rc)
 		goto err;
 
+	nvdimm_set_lock_class(dev);
 	rc = device_add(dev);
 	if (rc) {
 		dev_dbg(&nvdimm_bus->dev, "failed to register ndctl%d: %d\n",
diff --git a/drivers/nvdimm/nd-core.h b/drivers/nvdimm/nd-core.h
index 1668a10e41ba..75892e43b2c9 100644
--- a/drivers/nvdimm/nd-core.h
+++ b/drivers/nvdimm/nd-core.h
@@ -191,5 +191,14 @@ static inline int nvdimm_lock_class(struct device *dev)
 	else
 		return -1;
 }
+
+static inline void nvdimm_set_lock_class(struct device *dev)
+{
+	device_set_lock_class(dev, nvdimm_lock_class(dev));
+}
+#else
+static inline void nvdimm_set_lock_class(struct device *dev)
+{
+}
 #endif
 #endif /* __ND_CORE_H__ */


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 12/12] device-core: Enable multi-subsystem device_lock() lockdep validation
  2022-04-13  6:01 [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation Dan Williams
                   ` (10 preceding siblings ...)
  2022-04-13  6:02 ` [PATCH v2 11/12] libnvdimm: Enable lockdep validation Dan Williams
@ 2022-04-13  6:02 ` Dan Williams
  2022-04-13 14:02 ` [PATCH v2 00/12] device-core: Enable " Waiman Long
  12 siblings, 0 replies; 23+ messages in thread
From: Dan Williams @ 2022-04-13  6:02 UTC (permalink / raw)
  To: linux-cxl
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Pierre-Louis Bossart,
	Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
	Boqun Feng, Dave Jiang, Kevin Tian, vishal.l.verma,
	alison.schofield, gregkh, linux-kernel, nvdimm

While device_set_lock_class() allows device_lock() to take a nested lock
per the requirements of a given subsystem, it does not allow multiple
subsystems to enable lockdep validation at the same time. For example if
CXL does:

device_set_lock_class(&cxlmd->dev, 1);

...and LIBNVDIMM does:

device_set_lock_class(&nvdimm->dev, 1);

...lockdep will treat those as identical locks and flag false positive
reports depending on how those 2 subsystems interact. Given that there
are not enough available lock class ids for each subsystem to own a
slice of the number space, the only other way for lockdep to see that
those are different locks is to literally make them different locks.

Update device_set_lock_class() to also take a subsystem id which also
acts as an index into a new array of lockdep_mutex instances. So now CXL
and NVDIMM can do:

device_set_lock_class(&cxlmd->dev, DEVICE_LOCK_CXL, 1);
device_set_lock_class(&nvdimm->dev, DEVICE_LOCK_NVDIMM, 1);

...and lockdep will manage their dependency chains individually per
expectations.

Note that the reason the mutex_init() for the various subsystem
device-locks is unrolled in device_lockdep_init(), vs a loop over [0 ..
DEVICE_LOCK_MAX], is to give each lock a unique @lock_class_key and name
in reports. While the approach is not elegant, and requires manual
enabling, it seems the best that can be done given lockdep's
expectations. Otherwise this validation has already proven useful in
keeping device_lock() deadlocks from landing upstream.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/acpi.c       |    2 -
 drivers/cxl/cxl.h        |    4 +-
 drivers/cxl/port.c       |    2 -
 drivers/nvdimm/nd-core.h |    5 +-
 include/linux/device.h   |   95 ++++++++++++++++++++++++++++++++++++----------
 lib/Kconfig.debug        |   24 ------------
 6 files changed, 82 insertions(+), 50 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index ef5c3252bdb2..ed2fdb45b424 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -283,7 +283,7 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 	struct acpi_device *adev = ACPI_COMPANION(host);
 	struct cxl_cfmws_context ctx;
 
-	device_set_lock_class(&pdev->dev, CXL_ROOT_LOCK);
+	device_set_lock_class(&pdev->dev, DEVICE_LOCK_CXL, CXL_ROOT_LOCK);
 	root_port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
 	if (IS_ERR(root_port))
 		return PTR_ERR(root_port);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 05dc4c081ad2..2528d4ab2e74 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -423,7 +423,7 @@ enum cxl_lock_class {
 	 */
 };
 
-#ifdef CONFIG_PROVE_CXL_LOCKING
+#ifdef CONFIG_PROVE_LOCKING
 static inline int clamp_lock_class(struct device *dev, int lock_class)
 {
 	if (lock_class >= MAX_LOCKDEP_SUBCLASSES) {
@@ -460,7 +460,7 @@ static inline int cxl_lock_class(struct device *dev)
 
 static inline void cxl_set_lock_class(struct device *dev)
 {
-	device_set_lock_class(dev, cxl_lock_class(dev));
+	device_set_lock_class(dev, DEVICE_LOCK_CXL, cxl_lock_class(dev));
 }
 #else
 static inline void cxl_set_lock_class(struct device *dev)
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index d420da5fc39c..493a195a9f5a 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -17,7 +17,7 @@
  * firmware) are managed in this drivers context. Each driver instance
  * is responsible for tearing down the driver context of immediate
  * descendant ports. The locking for this is validated by
- * CONFIG_PROVE_CXL_LOCKING.
+ * CONFIG_PROVE_LOCKING.
  *
  * The primary service this driver provides is presenting APIs to other
  * drivers to utilize the decoders, and indicating to userspace (via bind
diff --git a/drivers/nvdimm/nd-core.h b/drivers/nvdimm/nd-core.h
index 75892e43b2c9..56a2e56e552b 100644
--- a/drivers/nvdimm/nd-core.h
+++ b/drivers/nvdimm/nd-core.h
@@ -162,7 +162,7 @@ static inline void devm_nsio_disable(struct device *dev,
 }
 #endif
 
-#ifdef CONFIG_PROVE_NVDIMM_LOCKING
+#ifdef CONFIG_PROVE_LOCKING
 extern struct class *nd_class;
 
 enum {
@@ -194,7 +194,8 @@ static inline int nvdimm_lock_class(struct device *dev)
 
 static inline void nvdimm_set_lock_class(struct device *dev)
 {
-	device_set_lock_class(dev, nvdimm_lock_class(dev));
+	device_set_lock_class(dev, DEVICE_LOCK_NVDIMM,
+			      nvdimm_lock_class(dev));
 }
 #else
 static inline void nvdimm_set_lock_class(struct device *dev)
diff --git a/include/linux/device.h b/include/linux/device.h
index 6083e757e804..12c675edb28d 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -386,6 +386,17 @@ struct dev_msi_info {
 #endif
 };
 
+enum device_lock_subsys {
+	DEVICE_LOCK_NONE = -1,
+#if IS_ENABLED(CONFIG_LIBNVDIMM)
+	DEVICE_LOCK_NVDIMM,
+#endif
+#if IS_ENABLED(CONFIG_CXL_BUS)
+	DEVICE_LOCK_CXL,
+#endif
+	DEVICE_LOCK_MAX,
+};
+
 /**
  * struct device - The basic device structure
  * @parent:	The device's "parent" device, the device to which it is attached.
@@ -400,8 +411,8 @@ struct dev_msi_info {
  * 		This identifies the device type and carries type-specific
  * 		information.
  * @mutex:	Mutex to synchronize calls to its driver.
- * @lockdep_mutex: An optional debug lock that a subsystem can use as a
- * 		peer lock to gain localized lockdep coverage of the device_lock.
+ * @lockdep_mutex: A set of optional debug locks that subsystems can use as a
+ *		peer lock to @mutex to gain lockdep coverage of device_lock().
  * @lock_class: per-subsystem annotated device lock class
  * @bus:	Type of bus device is on.
  * @driver:	Which driver has allocated this
@@ -501,8 +512,9 @@ struct device {
 	void		*driver_data;	/* Driver data, set and get with
 					   dev_set_drvdata/dev_get_drvdata */
 #ifdef CONFIG_PROVE_LOCKING
-	struct mutex		lockdep_mutex;
+	struct mutex		lockdep_mutex[DEVICE_LOCK_MAX];
 	int			lock_class;
+	enum device_lock_subsys	lock_subsys;
 #endif
 	struct mutex		mutex;	/* mutex to synchronize calls to
 					 * its driver.
@@ -772,45 +784,77 @@ static inline void device_lock_assert(struct device *dev)
 #ifdef CONFIG_PROVE_LOCKING
 static inline void device_lockdep_init(struct device *dev)
 {
-	mutex_init(&dev->lockdep_mutex);
+#if IS_ENABLED(CONFIG_CXL_BUS)
+	mutex_init(&dev->lockdep_mutex[DEVICE_LOCK_CXL]);
+#endif
+#if IS_ENABLED(CONFIG_LIBNVDIMM)
+	mutex_init(&dev->lockdep_mutex[DEVICE_LOCK_NVDIMM]);
+#endif
+	dev->lock_subsys = DEVICE_LOCK_NONE;
 	dev->lock_class = -1;
 	lockdep_set_novalidate_class(&dev->mutex);
 }
 
+static inline bool subsys_valid(enum device_lock_subsys subsys)
+{
+	if (DEVICE_LOCK_MAX == 0)
+		return false;
+	if (subsys <= DEVICE_LOCK_NONE)
+		return false;
+	if (subsys >= DEVICE_LOCK_MAX)
+		return false;
+	return true;
+}
+
+static inline struct mutex *device_lockdep_mutex(struct device *dev)
+{
+	if (!subsys_valid(dev->lock_subsys))
+		return NULL;
+	if (dev->lock_class < 0)
+		return NULL;
+	return &dev->lockdep_mutex[dev->lock_subsys];
+}
+
 static inline void device_lock(struct device *dev)
 {
+	struct mutex *lockdep_mutex = device_lockdep_mutex(dev);
+
 	/*
 	 * For double-lock programming errors the kernel will hang
 	 * trying to acquire @dev->mutex before lockdep can report the
-	 * problem acquiring @dev->lockdep_mutex, so manually assert
-	 * before that hang.
+	 * problem acquiring @lockdep_mutex, so manually assert before
+	 * that hang.
 	 */
-	lockdep_assert_not_held(&dev->lockdep_mutex);
+	if (lockdep_mutex)
+		lockdep_assert_not_held(lockdep_mutex);
 
 	mutex_lock(&dev->mutex);
-	if (dev->lock_class >= 0)
-		mutex_lock_nested(&dev->lockdep_mutex, dev->lock_class);
+	if (lockdep_mutex)
+		mutex_lock_nested(lockdep_mutex, dev->lock_class);
 }
 
 static inline int device_lock_interruptible(struct device *dev)
 {
+	struct mutex *lockdep_mutex = device_lockdep_mutex(dev);
 	int rc;
 
-	lockdep_assert_not_held(&dev->lockdep_mutex);
+	if (lockdep_mutex)
+		lockdep_assert_not_held(lockdep_mutex);
 
 	rc = mutex_lock_interruptible(&dev->mutex);
-	if (rc || dev->lock_class < 0)
+	if (rc || !lockdep_mutex)
 		return rc;
 
-	return mutex_lock_interruptible_nested(&dev->lockdep_mutex,
-					       dev->lock_class);
+	return mutex_lock_interruptible_nested(lockdep_mutex, dev->lock_class);
 }
 
 static inline int device_trylock(struct device *dev)
 {
+	struct mutex *lockdep_mutex = device_lockdep_mutex(dev);
+
 	if (mutex_trylock(&dev->mutex)) {
-		if (dev->lock_class >= 0)
-			mutex_lock_nested(&dev->lockdep_mutex, dev->lock_class);
+		if (lockdep_mutex)
+			mutex_lock_nested(lockdep_mutex, dev->lock_class);
 		return 1;
 	}
 
@@ -819,8 +863,10 @@ static inline int device_trylock(struct device *dev)
 
 static inline void device_unlock(struct device *dev)
 {
+	struct mutex *lockdep_mutex = device_lockdep_mutex(dev);
+
 	if (dev->lock_class >= 0)
-		mutex_unlock(&dev->lockdep_mutex);
+		mutex_unlock(lockdep_mutex);
 	mutex_unlock(&dev->mutex);
 }
 
@@ -829,8 +875,13 @@ static inline void device_unlock(struct device *dev)
  * from entry to exit. There is no support for changing lockdep
  * validation classes, only enabling and disabling validation.
  */
-static inline void device_set_lock_class(struct device *dev, int lock_class)
+static inline void device_set_lock_class(struct device *dev,
+					 enum device_lock_subsys subsys,
+					 int lock_class)
 {
+	if (!subsys_valid(subsys))
+		return;
+
 	/*
 	 * Allow for setting or clearing the lock class while the
 	 * device_lock() is held, in which case the paired nested lock
@@ -840,11 +891,12 @@ static inline void device_set_lock_class(struct device *dev, int lock_class)
 	if (dev->lock_class < 0 && lock_class >= 0) {
 		/* Enabling lockdep validation... */
 		if (mutex_is_locked(&dev->mutex))
-			mutex_lock_nested(&dev->lockdep_mutex, lock_class);
+			mutex_lock_nested(&dev->lockdep_mutex[subsys],
+					  lock_class);
 	} else if (dev->lock_class >= 0 && lock_class < 0) {
 		/* Disabling lockdep validation... */
 		if (mutex_is_locked(&dev->mutex))
-			mutex_unlock(&dev->lockdep_mutex);
+			mutex_unlock(&dev->lockdep_mutex[subsys]);
 	} else {
 		dev_warn(dev,
 			 "%s: failed to change lock_class from: %d to %d\n",
@@ -852,6 +904,7 @@ static inline void device_set_lock_class(struct device *dev, int lock_class)
 		return;
 	}
 	dev->lock_class = lock_class;
+	dev->lock_subsys = subsys;
 }
 #else /* !CONFIG_PROVE_LOCKING */
 static inline void device_lockdep_init(struct device *dev)
@@ -879,7 +932,9 @@ static inline void device_unlock(struct device *dev)
 	mutex_unlock(&dev->mutex);
 }
 
-static inline void device_set_lock_class(struct device *dev, int lock_class)
+static inline void device_set_lock_class(struct device *dev,
+					 enum device_lock_subsys subsys,
+					 int lock_class)
 {
 }
 #endif /* CONFIG_PROVE_LOCKING */
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 1f8db59bdddd..cfe3b092c31d 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1544,30 +1544,6 @@ config CSD_LOCK_WAIT_DEBUG
 	  include the IPI handler function currently executing (if any)
 	  and relevant stack traces.
 
-choice
-	prompt "Lock debugging: prove subsystem device_lock() correctness"
-	depends on PROVE_LOCKING
-	help
-	  For subsystems that have instrumented their usage of the device_lock()
-	  with nested annotations, enable lock dependency checking. The locking
-	  hierarchy 'subclass' identifiers are not compatible across
-	  sub-systems, so only one can be enabled at a time.
-
-config PROVE_NVDIMM_LOCKING
-	bool "NVDIMM"
-	depends on LIBNVDIMM
-	help
-	  Enable lockdep to validate libnvdimm subsystem usage of the
-	  device lock.
-
-config PROVE_CXL_LOCKING
-	bool "CXL"
-	depends on CXL_BUS
-	help
-	  Enable lockdep to validate CXL subsystem usage of the device lock.
-
-endchoice
-
 endmenu # lock debugging
 
 config TRACE_IRQFLAGS


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 02/12] device-core: Add dev->lock_class to enable device_lock() lockdep validation
  2022-04-13  6:01 ` [PATCH v2 02/12] device-core: Add dev->lock_class to enable device_lock() lockdep validation Dan Williams
@ 2022-04-13  8:43   ` Peter Zijlstra
  2022-04-13 22:01     ` Dan Williams
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2022-04-13  8:43 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Greg Kroah-Hartman, Rafael J. Wysocki, Dave Jiang,
	Kevin Tian, vishal.l.verma, alison.schofield, linux-kernel,
	nvdimm

On Tue, Apr 12, 2022 at 11:01:38PM -0700, Dan Williams wrote:
> The device_lock() is hidden from lockdep by default because, for
> example, a device subsystem may do something like:
> 
> ---
> device_add(dev1);
> ...in driver core...
> device_lock(dev1);
> bus->probe(dev1); /* where bus->probe() calls driver1_probe() */
> 
> driver1_probe(struct device *dev)
> {
> 	...do some enumeration...
> 	dev2->parent = dev;
> 	/* this triggers probe under device_lock(dev2); */
> 	device_add(dev2);
> }
> ---
> 
> To lockdep, that device_lock(dev2) looks like a deadlock because lockdep

Recursion, you're meaning to say it looks like same lock recursion.

> only sees lock classes, not individual lock instances. All device_lock()
> instances across the entire kernel are the same class. However, this is
> not a deadlock in practice because the locking is strictly hierarchical.
> I.e. device_lock(dev1) is held over device_lock(dev2), but never the
> reverse.

I have some very vague memories from a conversation with Alan Stern,
some maybe 10 years ago, where I think he was explaining to me this was
not in fact a simple hierarchy.

> In order for lockdep to be satisfied and see that it is
> hierarchical in practice the mutex_lock() call in device_lock() needs to
> be moved to mutex_lock_nested() where the @subclass argument to
> mutex_lock_nested() represents the nesting level, i.e.:

That's not an obvious conclusion; lockdep has lots of funny annotations,
subclasses is just one.

I think the big new development in lockdep since that time with Alan
Stern is that lockdep now has support for dynamic keys; that is lock
keys in heap memory (as opposed to static storage).
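
Roughly, the two annotation styles in question look like the below
(illustrative sketch only; "foo" is a made-up subsystem and the foo_*
names are not existing symbols):

/* static key: one lock class shared by every foo device */
static struct lock_class_key foo_device_key;

static void foo_device_init(struct device *dev)
{
	device_initialize(dev);	/* mutex_init(&dev->mutex) happens here */
	lockdep_set_class(&dev->mutex, &foo_device_key);
}

/* dynamic key: the lock class lives in heap memory with the object */
struct foo_device {
	struct device dev;
	struct lock_class_key lock_key;
};

static void foo_device_init_dynamic(struct foo_device *foo)
{
	device_initialize(&foo->dev);
	lockdep_register_key(&foo->lock_key);
	lockdep_set_class(&foo->dev.mutex, &foo->lock_key);
}

static void foo_device_release(struct foo_device *foo)
{
	/* the key must not be unregistered while the mutex is held */
	lockdep_unregister_key(&foo->lock_key);
}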

> s/device_lock(dev1)/mutex_lock_nested(&dev1->mutex, 1)/
> 
> s/device_lock(dev2)/mutex_lock_nested(&dev2->mutex, 2)/
> 
> Now, what if the internals of the device_lock() could be annotated with
> the right @subclass argument to call mutex_lock_nested()?
> 
> With device_set_lock_class() a subsystem can optionally add that
> metadata. The device_lock() still takes dev->mutex, but when
> dev->lock_class is >= 0 it additionally takes dev->lockdep_mutex with
> the proper nesting. Unlike dev->mutex, dev->lockdep_mutex is not marked
> lockdep_set_novalidate_class() and lockdep will become useful... at
> least for one subsystem at a time.
> 
> It is still the case that only one subsystem can be using lockdep with
> lockdep_mutex at a time because different subsystems will collide class
> numbers. You might say "well, how about subsystem1 gets class ids 0 to 9
> and subsystem2 gets class ids 10 to 20?". MAX_LOCKDEP_SUBCLASSES is 8,
> and 8 is just enough class ids for one subsystem of moderate complexity.

Again, that doesn't seem like an obvious suggestion at all. Why not give
each subsystem a different lock key?


> diff --git a/include/linux/device.h b/include/linux/device.h
> index af2576ace130..6083e757e804 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -402,6 +402,7 @@ struct dev_msi_info {
>   * @mutex:	Mutex to synchronize calls to its driver.
>   * @lockdep_mutex: An optional debug lock that a subsystem can use as a
>   * 		peer lock to gain localized lockdep coverage of the device_lock.
> + * @lock_class: per-subsystem annotated device lock class
>   * @bus:	Type of bus device is on.
>   * @driver:	Which driver has allocated this
>   * @platform_data: Platform data specific to the device.
> @@ -501,6 +502,7 @@ struct device {
>  					   dev_set_drvdata/dev_get_drvdata */
>  #ifdef CONFIG_PROVE_LOCKING
>  	struct mutex		lockdep_mutex;
> +	int			lock_class;
>  #endif
>  	struct mutex		mutex;	/* mutex to synchronize calls to
>  					 * its driver.
> @@ -762,18 +764,100 @@ static inline bool dev_pm_test_driver_flags(struct device *dev, u32 flags)
>  	return !!(dev->power.driver_flags & flags);
>  }
>  
> +static inline void device_lock_assert(struct device *dev)
> +{
> +	lockdep_assert_held(&dev->mutex);
> +}
> +
>  #ifdef CONFIG_PROVE_LOCKING
>  static inline void device_lockdep_init(struct device *dev)
>  {
>  	mutex_init(&dev->lockdep_mutex);
> +	dev->lock_class = -1;
>  	lockdep_set_novalidate_class(&dev->mutex);
>  }
> -#else
> +
> +static inline void device_lock(struct device *dev)
> +{
> +	/*
> +	 * For double-lock programming errors the kernel will hang
> +	 * trying to acquire @dev->mutex before lockdep can report the
> +	 * problem acquiring @dev->lockdep_mutex, so manually assert
> +	 * before that hang.
> +	 */
> +	lockdep_assert_not_held(&dev->lockdep_mutex);
> +
> +	mutex_lock(&dev->mutex);
> +	if (dev->lock_class >= 0)
> +		mutex_lock_nested(&dev->lockdep_mutex, dev->lock_class);
> +}
> +
> +static inline int device_lock_interruptible(struct device *dev)
> +{
> +	int rc;
> +
> +	lockdep_assert_not_held(&dev->lockdep_mutex);
> +
> +	rc = mutex_lock_interruptible(&dev->mutex);
> +	if (rc || dev->lock_class < 0)
> +		return rc;
> +
> +	return mutex_lock_interruptible_nested(&dev->lockdep_mutex,
> +					       dev->lock_class);
> +}
> +
> +static inline int device_trylock(struct device *dev)
> +{
> +	if (mutex_trylock(&dev->mutex)) {
> +		if (dev->lock_class >= 0)
> +			mutex_lock_nested(&dev->lockdep_mutex, dev->lock_class);

This must be the weirdest stuff I've seen in a while.

> +		return 1;
> +	}
> +
> +	return 0;
> +}
> +
> +static inline void device_unlock(struct device *dev)
> +{
> +	if (dev->lock_class >= 0)
> +		mutex_unlock(&dev->lockdep_mutex);
> +	mutex_unlock(&dev->mutex);
> +}
> +
> +/*
> + * Note: this routine expects that the state of @dev->mutex is stable
> + * from entry to exit. There is no support for changing lockdep
> + * validation classes, only enabling and disabling validation.
> + */
> +static inline void device_set_lock_class(struct device *dev, int lock_class)
> +{
> +	/*
> +	 * Allow for setting or clearing the lock class while the
> +	 * device_lock() is held, in which case the paired nested lock
> +	 * might need to be acquired or released now to accommodate the
> +	 * next device_unlock().
> +	 */
> +	if (dev->lock_class < 0 && lock_class >= 0) {
> +		/* Enabling lockdep validation... */
> +		if (mutex_is_locked(&dev->mutex))
> +			mutex_lock_nested(&dev->lockdep_mutex, lock_class);
> +	} else if (dev->lock_class >= 0 && lock_class < 0) {
> +		/* Disabling lockdep validation... */
> +		if (mutex_is_locked(&dev->mutex))
> +			mutex_unlock(&dev->lockdep_mutex);
> +	} else {
> +		dev_warn(dev,
> +			 "%s: failed to change lock_class from: %d to %d\n",
> +			 __func__, dev->lock_class, lock_class);
> +		return;
> +	}
> +	dev->lock_class = lock_class;
> +}
> +#else /* !CONFIG_PROVE_LOCKING */

This all reads like something utterly surreal... *WHAT*!?!?

If you want lockdep validation for one (or more) dev->mutex instances,
why not pull them out of the no_validate class and use the normal
locking?

This is all quite insane.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 03/12] cxl/core: Refactor a cxl_lock_class() out of cxl_nested_lock()
  2022-04-13  6:01 ` [PATCH v2 03/12] cxl/core: Refactor a cxl_lock_class() out of cxl_nested_lock() Dan Williams
@ 2022-04-13  9:39   ` Peter Zijlstra
  2022-04-13 22:05     ` Dan Williams
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2022-04-13  9:39 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Alison Schofield, Vishal Verma, Ira Weiny,
	Ben Widawsky, Jonathan Cameron, Dave Jiang, Kevin Tian, gregkh,
	linux-kernel, nvdimm


*Completely* untested (I wouldn't know where to begin and probably don't
have the hardware anyway), and known incomplete.

What's wrong with something like this then?

---
 drivers/cxl/core/pmem.c |  8 +++++
 drivers/cxl/core/port.c | 58 +++++++++++++++++++++---------------
 drivers/cxl/cxl.h       | 78 -------------------------------------------------
 3 files changed, 42 insertions(+), 102 deletions(-)

diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
index 8de240c4d96b..aca0dd5eefeb 100644
--- a/drivers/cxl/core/pmem.c
+++ b/drivers/cxl/core/pmem.c
@@ -80,6 +80,8 @@ struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_nvdimm *cxl_nvd)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_find_nvdimm_bridge, CXL);
 
+static struct lock_class_key cxl_nvdimm_bridge_key;
+
 static struct cxl_nvdimm_bridge *cxl_nvdimm_bridge_alloc(struct cxl_port *port)
 {
 	struct cxl_nvdimm_bridge *cxl_nvb;
@@ -104,6 +106,8 @@ static struct cxl_nvdimm_bridge *cxl_nvdimm_bridge_alloc(struct cxl_port *port)
 	dev->bus = &cxl_bus_type;
 	dev->type = &cxl_nvdimm_bridge_type;
 
+	lockdep_set_class(&dev->mutex, &cxl_nvdimm_bridge_key);
+
 	return cxl_nvb;
 
 err:
@@ -214,6 +218,8 @@ struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev)
 }
 EXPORT_SYMBOL_NS_GPL(to_cxl_nvdimm, CXL);
 
+static struct lock_class_key cxl_nvdimm_key;
+
 static struct cxl_nvdimm *cxl_nvdimm_alloc(struct cxl_memdev *cxlmd)
 {
 	struct cxl_nvdimm *cxl_nvd;
@@ -231,6 +237,8 @@ static struct cxl_nvdimm *cxl_nvdimm_alloc(struct cxl_memdev *cxlmd)
 	dev->bus = &cxl_bus_type;
 	dev->type = &cxl_nvdimm_type;
 
+	lockdep_set_class(&dev->mutex, &cxl_nvdimm_key);
+
 	return cxl_nvd;
 }
 
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 2ab1ba4499b3..cae88404612f 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -312,10 +312,10 @@ static void cxl_port_release(struct device *dev)
 	struct cxl_port *port = to_cxl_port(dev);
 	struct cxl_ep *ep, *_e;
 
-	cxl_device_lock(dev);
+	device_lock(dev);
 	list_for_each_entry_safe(ep, _e, &port->endpoints, list)
 		cxl_ep_release(ep);
-	cxl_device_unlock(dev);
+	device_unlock(dev);
 	ida_free(&cxl_port_ida, port->id);
 	kfree(port);
 }
@@ -391,6 +391,8 @@ static int devm_cxl_link_uport(struct device *host, struct cxl_port *port)
 	return devm_add_action_or_reset(host, cxl_unlink_uport, port);
 }
 
+static struct lock_class_key cxl_port_key;
+
 static struct cxl_port *cxl_port_alloc(struct device *uport,
 				       resource_size_t component_reg_phys,
 				       struct cxl_port *parent_port)
@@ -431,6 +433,8 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
 	dev->bus = &cxl_bus_type;
 	dev->type = &cxl_port_type;
 
+	lockdep_set_class(&dev->mutex, &cxl_port_key);
+
 	return port;
 
 err:
@@ -457,8 +461,10 @@ struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
 	if (IS_ERR(port))
 		return port;
 
-	if (parent_port)
+	if (parent_port) {
 		port->depth = parent_port->depth + 1;
+		lockdep_set_class_and_subclass(&port->dev.mutex, &cxl_port_key, port->depth);
+	}
 	dev = &port->dev;
 	if (is_cxl_memdev(uport))
 		rc = dev_set_name(dev, "endpoint%d", port->id);
@@ -554,7 +560,7 @@ static int match_root_child(struct device *dev, const void *match)
 		return 0;
 
 	port = to_cxl_port(dev);
-	cxl_device_lock(dev);
+	device_lock(dev);
 	list_for_each_entry(dport, &port->dports, list) {
 		iter = match;
 		while (iter) {
@@ -564,7 +570,7 @@ static int match_root_child(struct device *dev, const void *match)
 		}
 	}
 out:
-	cxl_device_unlock(dev);
+	device_unlock(dev);
 
 	return !!iter;
 }
@@ -623,13 +629,13 @@ static int add_dport(struct cxl_port *port, struct cxl_dport *new)
 static void cond_cxl_root_lock(struct cxl_port *port)
 {
 	if (is_cxl_root(port))
-		cxl_device_lock(&port->dev);
+		device_lock(&port->dev);
 }
 
 static void cond_cxl_root_unlock(struct cxl_port *port)
 {
 	if (is_cxl_root(port))
-		cxl_device_unlock(&port->dev);
+		device_unlock(&port->dev);
 }
 
 static void cxl_dport_remove(void *data)
@@ -736,15 +742,15 @@ static int add_ep(struct cxl_port *port, struct cxl_ep *new)
 {
 	struct cxl_ep *dup;
 
-	cxl_device_lock(&port->dev);
+	device_lock(&port->dev);
 	if (port->dead) {
-		cxl_device_unlock(&port->dev);
+		device_unlock(&port->dev);
 		return -ENXIO;
 	}
 	dup = find_ep(port, new->ep);
 	if (!dup)
 		list_add_tail(&new->list, &port->endpoints);
-	cxl_device_unlock(&port->dev);
+	device_unlock(&port->dev);
 
 	return dup ? -EEXIST : 0;
 }
@@ -854,12 +860,12 @@ static void delete_endpoint(void *data)
 		goto out;
 	parent = &parent_port->dev;
 
-	cxl_device_lock(parent);
+	device_lock(parent);
 	if (parent->driver && endpoint->uport) {
 		devm_release_action(parent, cxl_unlink_uport, endpoint);
 		devm_release_action(parent, unregister_port, endpoint);
 	}
-	cxl_device_unlock(parent);
+	device_unlock(parent);
 	put_device(parent);
 out:
 	put_device(&endpoint->dev);
@@ -920,7 +926,7 @@ static void cxl_detach_ep(void *data)
 		}
 
 		parent_port = to_cxl_port(port->dev.parent);
-		cxl_device_lock(&parent_port->dev);
+		device_lock(&parent_port->dev);
 		if (!parent_port->dev.driver) {
 			/*
 			 * The bottom-up race to delete the port lost to a
@@ -928,12 +934,12 @@ static void cxl_detach_ep(void *data)
 			 * parent_port ->remove() will have cleaned up all
 			 * descendants.
 			 */
-			cxl_device_unlock(&parent_port->dev);
+			device_unlock(&parent_port->dev);
 			put_device(&port->dev);
 			continue;
 		}
 
-		cxl_device_lock(&port->dev);
+		device_lock(&port->dev);
 		ep = find_ep(port, &cxlmd->dev);
 		dev_dbg(&cxlmd->dev, "disconnect %s from %s\n",
 			ep ? dev_name(ep->ep) : "", dev_name(&port->dev));
@@ -948,7 +954,7 @@ static void cxl_detach_ep(void *data)
 			port->dead = true;
 			list_splice_init(&port->dports, &reap_dports);
 		}
-		cxl_device_unlock(&port->dev);
+		device_unlock(&port->dev);
 
 		if (!list_empty(&reap_dports)) {
 			dev_dbg(&cxlmd->dev, "delete %s\n",
@@ -956,7 +962,7 @@ static void cxl_detach_ep(void *data)
 			delete_switch_port(port, &reap_dports);
 		}
 		put_device(&port->dev);
-		cxl_device_unlock(&parent_port->dev);
+		device_unlock(&parent_port->dev);
 	}
 }
 
@@ -1004,7 +1010,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
 		return -EAGAIN;
 	}
 
-	cxl_device_lock(&parent_port->dev);
+	device_lock(&parent_port->dev);
 	if (!parent_port->dev.driver) {
 		dev_warn(&cxlmd->dev,
 			 "port %s:%s disabled, failed to enumerate CXL.mem\n",
@@ -1022,7 +1028,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
 			get_device(&port->dev);
 	}
 out:
-	cxl_device_unlock(&parent_port->dev);
+	device_unlock(&parent_port->dev);
 
 	if (IS_ERR(port))
 		rc = PTR_ERR(port);
@@ -1133,14 +1139,14 @@ struct cxl_dport *cxl_find_dport_by_dev(struct cxl_port *port,
 {
 	struct cxl_dport *dport;
 
-	cxl_device_lock(&port->dev);
+	device_lock(&port->dev);
 	list_for_each_entry(dport, &port->dports, list)
 		if (dport->dport == dev) {
-			cxl_device_unlock(&port->dev);
+			device_unlock(&port->dev);
 			return dport;
 		}
 
-	cxl_device_unlock(&port->dev);
+	device_unlock(&port->dev);
 	return NULL;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_find_dport_by_dev, CXL);
@@ -1173,6 +1179,8 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
 	return rc;
 }
 
+static struct lock_class_key cxl_decoder_key;
+
 /**
  * cxl_decoder_alloc - Allocate a new CXL decoder
  * @port: owning port of this decoder
@@ -1224,6 +1232,8 @@ static struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
 	else
 		cxld->dev.type = &cxl_decoder_switch_type;
 
+	lockdep_set_class(&dev->mutex, &cxl_decoder_key);
+
 	/* Pre initialize an "empty" decoder */
 	cxld->interleave_ways = 1;
 	cxld->interleave_granularity = PAGE_SIZE;
@@ -1379,9 +1389,9 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
 
 	port = to_cxl_port(cxld->dev.parent);
 
-	cxl_device_lock(&port->dev);
+	device_lock(&port->dev);
 	rc = cxl_decoder_add_locked(cxld, target_map);
-	cxl_device_unlock(&port->dev);
+	device_unlock(&port->dev);
 
 	return rc;
 }
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 990b6670222e..140dc3278cde 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -405,82 +405,4 @@ struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_nvdimm *cxl_nvd);
 #define __mock static
 #endif
 
-#ifdef CONFIG_PROVE_CXL_LOCKING
-enum cxl_lock_class {
-	CXL_ANON_LOCK,
-	CXL_NVDIMM_LOCK,
-	CXL_NVDIMM_BRIDGE_LOCK,
-	CXL_PORT_LOCK,
-	/*
-	 * Be careful to add new lock classes here, CXL_PORT_LOCK is
-	 * extended by the port depth, so a maximum CXL port topology
-	 * depth would need to be defined first.
-	 */
-};
-
-static inline void cxl_nested_lock(struct device *dev)
-{
-	if (is_cxl_port(dev)) {
-		struct cxl_port *port = to_cxl_port(dev);
-
-		mutex_lock_nested(&dev->lockdep_mutex,
-				  CXL_PORT_LOCK + port->depth);
-	} else if (is_cxl_decoder(dev)) {
-		struct cxl_port *port = to_cxl_port(dev->parent);
-
-		/*
-		 * A decoder is the immediate child of a port, so set
-		 * its lock class equal to other child device siblings.
-		 */
-		mutex_lock_nested(&dev->lockdep_mutex,
-				  CXL_PORT_LOCK + port->depth + 1);
-	} else if (is_cxl_nvdimm_bridge(dev))
-		mutex_lock_nested(&dev->lockdep_mutex, CXL_NVDIMM_BRIDGE_LOCK);
-	else if (is_cxl_nvdimm(dev))
-		mutex_lock_nested(&dev->lockdep_mutex, CXL_NVDIMM_LOCK);
-	else
-		mutex_lock_nested(&dev->lockdep_mutex, CXL_ANON_LOCK);
-}
-
-static inline void cxl_nested_unlock(struct device *dev)
-{
-	mutex_unlock(&dev->lockdep_mutex);
-}
-
-static inline void cxl_device_lock(struct device *dev)
-{
-	/*
-	 * For double lock errors the lockup will happen before lockdep
-	 * warns at cxl_nested_lock(), so assert explicitly.
-	 */
-	lockdep_assert_not_held(&dev->lockdep_mutex);
-
-	device_lock(dev);
-	cxl_nested_lock(dev);
-}
-
-static inline void cxl_device_unlock(struct device *dev)
-{
-	cxl_nested_unlock(dev);
-	device_unlock(dev);
-}
-#else
-static inline void cxl_nested_lock(struct device *dev)
-{
-}
-
-static inline void cxl_nested_unlock(struct device *dev)
-{
-}
-
-static inline void cxl_device_lock(struct device *dev)
-{
-	device_lock(dev);
-}
-
-static inline void cxl_device_unlock(struct device *dev)
-{
-	device_unlock(dev);
-}
-#endif
 #endif /* __CXL_H__ */

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation
  2022-04-13  6:01 [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation Dan Williams
                   ` (11 preceding siblings ...)
  2022-04-13  6:02 ` [PATCH v2 12/12] device-core: Enable multi-subsystem device_lock() " Dan Williams
@ 2022-04-13 14:02 ` Waiman Long
  12 siblings, 0 replies; 23+ messages in thread
From: Waiman Long @ 2022-04-13 14:02 UTC (permalink / raw)
  To: Dan Williams, linux-cxl
  Cc: Ira Weiny, Dave Jiang, Peter Zijlstra, Jonathan Cameron,
	Vishal Verma, Ben Widawsky, Kevin Tian, Pierre-Louis Bossart,
	Alison Schofield, Boqun Feng, Ingo Molnar, Greg Kroah-Hartman,
	Will Deacon, Rafael J. Wysocki, linux-kernel, nvdimm

On 4/13/22 02:01, Dan Williams wrote:
> Changes since v1 [1]:
> - Improve the clarity of the cover letter and changelogs of the
>    major patches (Patch2 and Patch12) (Pierre, Kevin, and Dave)
> - Fix device_lock_interruptible() false negative deadlock detection
>    (Kevin)
> - Fix off-by-one error in the device_set_lock_class() enable case (Kevin)
> - Spelling fixes in Patch2 changelog (Pierre)
> - Compilation fixes when both CONFIG_CXL_BUS=n and
>    CONFIG_LIBNVDIMM=n. (0day robot)
>
> [1]: https://lore.kernel.org/all/164610292916.2682974.12924748003366352335.stgit@dwillia2-desk3.amr.corp.intel.com/
>
> ---
>
> The device_lock() is why the lockdep_set_novalidate_class() API exists.
> The lock is taken in too many disparate contexts, and lockdep by design
> assumes that all device_lock() acquisitions are identical. The lack of
> lockdep coverage leads to deadlock scenarios landing upstream. To
> mitigate that problem the lockdep_mutex was added [2].
>
> The lockdep_mutex lets a subsystem mirror device_lock() acquisitions
> without lockdep_set_novalidate_class() to gain some limited lockdep
> coverage. The mirroring approach is limited to taking the device_lock()
> after-the-fact in a subsystem's 'struct bus_type' operations and fails
> to cover device_lock() acquisition in the driver-core. It also can only
> track the needs of one subsystem at a time so, for example the kernel
> needs to be recompiled between CONFIG_PROVE_NVDIMM_LOCKING and
> CONFIG_PROVE_CXL_LOCKING depending on which subsystem is being
> regression tested. Obviously that also means that intra-subsystem
> locking dependencies can not be validated.

Instead of using a fake lockdep_mutex, maybe you can just use a unique 
lockdep key for each subsystem and call lockdep_set_class() in 
device_initialize() if such a key is present, or 
lockdep_set_novalidate_class() otherwise. The unique key can be passed 
either as a parameter to device_initialize() or as part of the device 
structure. It is certainly less cumbersome than having a fake 
lockdep_mutex array in the device structure.
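
A rough sketch of that idea (the "key" parameter here is hypothetical;
whether it is passed to device_initialize() or stashed in struct device
beforehand is the open design choice):

static void device_mutex_lockdep_init(struct device *dev,
				      struct lock_class_key *key)
{
	/* called right after mutex_init(&dev->mutex) */
	if (key)
		lockdep_set_class(&dev->mutex, key);
	else
		lockdep_set_novalidate_class(&dev->mutex);
}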

Cheers,
Longman


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 02/12] device-core: Add dev->lock_class to enable device_lock() lockdep validation
  2022-04-13  8:43   ` Peter Zijlstra
@ 2022-04-13 22:01     ` Dan Williams
  2022-04-14 10:16       ` Peter Zijlstra
  0 siblings, 1 reply; 23+ messages in thread
From: Dan Williams @ 2022-04-13 22:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-cxl, Greg Kroah-Hartman, Rafael J. Wysocki, Dave Jiang,
	Kevin Tian, Vishal L Verma, Schofield, Alison,
	Linux Kernel Mailing List, Linux NVDIMM

On Wed, Apr 13, 2022 at 1:43 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Apr 12, 2022 at 11:01:38PM -0700, Dan Williams wrote:
> > The device_lock() is hidden from lockdep by default because, for
> > example, a device subsystem may do something like:
> >
> > ---
> > device_add(dev1);
> > ...in driver core...
> > device_lock(dev1);
> > bus->probe(dev1); /* where bus->probe() calls driver1_probe() */
> >
> > driver1_probe(struct device *dev)
> > {
> >       ...do some enumeration...
> >       dev2->parent = dev;
> >       /* this triggers probe under device_lock(dev2); */
> >       device_add(dev2);
> > }
> > ---
> >
> > To lockdep, that device_lock(dev2) looks like a deadlock because lockdep
>
> Recursion, you're meaning to say it looks like same lock recursion.

Yes, wrong terminology on my part.

>
> > only sees lock classes, not individual lock instances. All device_lock()
> > instances across the entire kernel are the same class. However, this is
> > not a deadlock in practice because the locking is strictly hierarchical.
> > I.e. device_lock(dev1) is held over device_lock(dev2), but never the
> > reverse.
>
> I have some very vague memories from a conversation with Alan Stern,
> some maybe 10 years ago, where I think he was explaining to me this was
> not in fact a simple hierarchy.
>
> > In order for lockdep to be satisfied and see that it is
> > hierarchical in practice the mutex_lock() call in device_lock() needs to
> > be moved to mutex_lock_nested() where the @subclass argument to
> > mutex_lock_nested() represents the nesting level, i.e.:
>
> That's not an obvious conclusion; lockdep has lots of funny annotations,
> subclasses is just one.
>
> I think the big new development in lockdep since that time with Alan
> Stern is that lockdep now has support for dynamic keys; that is lock
> keys in heap memory (as opposed to static storage).

Ah, I was not aware of that, that should allow for deep cleanups of
this proposal.

>
> > s/device_lock(dev1)/mutex_lock_nested(&dev1->mutex, 1)/
> >
> > s/device_lock(dev2)/mutex_lock_nested(&dev2->mutex, 2)/
> >
> > Now, what if the internals of the device_lock() could be annotated with
> > the right @subclass argument to call mutex_lock_nested()?
> >
> > With device_set_lock_class() a subsystem can optionally add that
> > metadata. The device_lock() still takes dev->mutex, but when
> > dev->lock_class is >= 0 it additionally takes dev->lockdep_mutex with
> > the proper nesting. Unlike dev->mutex, dev->lockdep_mutex is not marked
> > lockdep_set_novalidate_class() and lockdep will become useful... at
> > least for one subsystem at a time.
> >
> > It is still the case that only one subsystem can be using lockdep with
> > lockdep_mutex at a time because different subsystems will collide class
> > numbers. You might say "well, how about subsystem1 gets class ids 0 to 9
> > and subsystem2 gets class ids 10 to 20?". MAX_LOCKDEP_SUBCLASSES is 8,
> > and 8 is just enough class ids for one subsystem of moderate complexity.
>
> Again, that doesn't seem like an obvious suggestion at all. Why not give
> each subsystem a different lock key?
>

Yes, that would also save a source of merge conflicts if every
subsystem needed to add conditional extensions to 'struct device' for
an array of lock metadata.

>
> > diff --git a/include/linux/device.h b/include/linux/device.h
> > index af2576ace130..6083e757e804 100644
> > --- a/include/linux/device.h
> > +++ b/include/linux/device.h
> > @@ -402,6 +402,7 @@ struct dev_msi_info {
> >   * @mutex:   Mutex to synchronize calls to its driver.
> >   * @lockdep_mutex: An optional debug lock that a subsystem can use as a
> >   *           peer lock to gain localized lockdep coverage of the device_lock.
> > + * @lock_class: per-subsystem annotated device lock class
> >   * @bus:     Type of bus device is on.
> >   * @driver:  Which driver has allocated this
> >   * @platform_data: Platform data specific to the device.
> > @@ -501,6 +502,7 @@ struct device {
> >                                          dev_set_drvdata/dev_get_drvdata */
> >  #ifdef CONFIG_PROVE_LOCKING
> >       struct mutex            lockdep_mutex;
> > +     int                     lock_class;
> >  #endif
> >       struct mutex            mutex;  /* mutex to synchronize calls to
> >                                        * its driver.
> > @@ -762,18 +764,100 @@ static inline bool dev_pm_test_driver_flags(struct device *dev, u32 flags)
> >       return !!(dev->power.driver_flags & flags);
> >  }
> >
> > +static inline void device_lock_assert(struct device *dev)
> > +{
> > +     lockdep_assert_held(&dev->mutex);
> > +}
> > +
> >  #ifdef CONFIG_PROVE_LOCKING
> >  static inline void device_lockdep_init(struct device *dev)
> >  {
> >       mutex_init(&dev->lockdep_mutex);
> > +     dev->lock_class = -1;
> >       lockdep_set_novalidate_class(&dev->mutex);
> >  }
> > -#else
> > +
> > +static inline void device_lock(struct device *dev)
> > +{
> > +     /*
> > +      * For double-lock programming errors the kernel will hang
> > +      * trying to acquire @dev->mutex before lockdep can report the
> > +      * problem acquiring @dev->lockdep_mutex, so manually assert
> > +      * before that hang.
> > +      */
> > +     lockdep_assert_not_held(&dev->lockdep_mutex);
> > +
> > +     mutex_lock(&dev->mutex);
> > +     if (dev->lock_class >= 0)
> > +             mutex_lock_nested(&dev->lockdep_mutex, dev->lock_class);
> > +}
> > +
> > +static inline int device_lock_interruptible(struct device *dev)
> > +{
> > +     int rc;
> > +
> > +     lockdep_assert_not_held(&dev->lockdep_mutex);
> > +
> > +     rc = mutex_lock_interruptible(&dev->mutex);
> > +     if (rc || dev->lock_class < 0)
> > +             return rc;
> > +
> > +     return mutex_lock_interruptible_nested(&dev->lockdep_mutex,
> > +                                            dev->lock_class);
> > +}
> > +
> > +static inline int device_trylock(struct device *dev)
> > +{
> > +     if (mutex_trylock(&dev->mutex)) {
> > +             if (dev->lock_class >= 0)
> > +                     mutex_lock_nested(&dev->lockdep_mutex, dev->lock_class);
>
> This must be the weirdest stuff I've seen in a while.
>
> > +             return 1;
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +static inline void device_unlock(struct device *dev)
> > +{
> > +     if (dev->lock_class >= 0)
> > +             mutex_unlock(&dev->lockdep_mutex);
> > +     mutex_unlock(&dev->mutex);
> > +}
> > +
> > +/*
> > + * Note: this routine expects that the state of @dev->mutex is stable
> > + * from entry to exit. There is no support for changing lockdep
> > + * validation classes, only enabling and disabling validation.
> > + */
> > +static inline void device_set_lock_class(struct device *dev, int lock_class)
> > +{
> > +     /*
> > +      * Allow for setting or clearing the lock class while the
> > +      * device_lock() is held, in which case the paired nested lock
> > +      * might need to be acquired or released now to accommodate the
> > +      * next device_unlock().
> > +      */
> > +     if (dev->lock_class < 0 && lock_class >= 0) {
> > +             /* Enabling lockdep validation... */
> > +             if (mutex_is_locked(&dev->mutex))
> > +                     mutex_lock_nested(&dev->lockdep_mutex, lock_class);
> > +     } else if (dev->lock_class >= 0 && lock_class < 0) {
> > +             /* Disabling lockdep validation... */
> > +             if (mutex_is_locked(&dev->mutex))
> > +                     mutex_unlock(&dev->lockdep_mutex);
> > +     } else {
> > +             dev_warn(dev,
> > +                      "%s: failed to change lock_class from: %d to %d\n",
> > +                      __func__, dev->lock_class, lock_class);
> > +             return;
> > +     }
> > +     dev->lock_class = lock_class;
> > +}
> > +#else /* !CONFIG_PROVE_LOCKING */
>
> This all reads like something utterly surreal... *WHAT*!?!?

A pile of hacks to work around cases where the lock_class needed to be
specified after the fact. For example, ACPI does not annotate its
locks, but CXL knows that an "ACPI0017" device will always be the root
of a CXL topology, so the CXL subsystem can specify that as lock_class 0.
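
For illustration, with the device_set_lock_class() helper from the hunk
quoted above, that late annotation could look roughly like the sketch
below. CXL_ROOT_LOCK and the probe body are invented for the example
(the series presumably uses one of its enum cxl_lock_class values):

/*
 * Hypothetical sketch: cxl_acpi marks the ACPI0017 platform device as
 * the root of the CXL device_lock() ordering. The driver core already
 * holds dev->mutex across ->probe(); device_set_lock_class() copes
 * with that via its mutex_is_locked() check.
 */
#define CXL_ROOT_LOCK	0	/* invented name for "lock_class 0" */

static int cxl_acpi_probe(struct platform_device *pdev)
{
        struct device *dev = &pdev->dev;

        device_set_lock_class(dev, CXL_ROOT_LOCK);
        /* ... enumerate the CXL topology hosted by ACPI0017 ... */
        return 0;
}

On remove the driver would presumably pass -1 to disable validation
again, since the helper only supports enabling and disabling, not
changing classes.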

> If you want lockdep validation for one (or more) dev->mutex instances,
> why not pull them out of the no_validate class and use the normal
> locking?

Sounds perfect, just didn't know how to do that with my current
understanding of how to communicate this to lockdep.

>
> This is all quite insane.

Yes, certainly in comparison to your suggestion on the next patch.
That looks much more sane, and even better I think it allows for
optional lockdep validation without even needing to touch
include/linux/device.h.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 03/12] cxl/core: Refactor a cxl_lock_class() out of cxl_nested_lock()
  2022-04-13  9:39   ` Peter Zijlstra
@ 2022-04-13 22:05     ` Dan Williams
  0 siblings, 0 replies; 23+ messages in thread
From: Dan Williams @ 2022-04-13 22:05 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-cxl, Alison Schofield, Vishal Verma, Ira Weiny,
	Ben Widawsky, Jonathan Cameron, Dave Jiang, Kevin Tian, Greg KH,
	Linux Kernel Mailing List, Linux NVDIMM

On Wed, Apr 13, 2022 at 2:40 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
>
> *Completely* untested (I wouldn't know where to begin and probably don't
> have the hardware anyway), and known incomplete.
>
> What's wrong with something like this then?
>

I'll give it a shot, I think it solves all the cringeworthy aspects of
what I proposed.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 02/12] device-core: Add dev->lock_class to enable device_lock() lockdep validation
  2022-04-13 22:01     ` Dan Williams
@ 2022-04-14 10:16       ` Peter Zijlstra
  2022-04-14 17:17         ` Dan Williams
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2022-04-14 10:16 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Greg Kroah-Hartman, Rafael J. Wysocki, Dave Jiang,
	Kevin Tian, Vishal L Verma, Schofield, Alison,
	Linux Kernel Mailing List, Linux NVDIMM

On Wed, Apr 13, 2022 at 03:01:21PM -0700, Dan Williams wrote:

> > That's not an obvious conclusion; lockdep has lots of funny annotations,
> > subclasses is just one.
> >
> > I think the big new development in lockdep since that time with Alan
> > Stern is that lockdep now has support for dynamic keys; that is lock
> > keys in heap memory (as opposed to static storage).
> 
> Ah, I was not aware of that, that should allow for deep cleanups of
> this proposal.

> > If you want lockdep validation for one (or more) dev->mutex instances,
> > why not pull them out of the no_validate class and use the normal
> > locking?
> 
> Sounds perfect, just didn't know how to do that with my current
> understanding of how to communicate this to lockdep.
> 
> >
> > This is all quite insane.
> 
> Yes, certainly in comparison to your suggestion on the next patch.
> That looks much more sane, and even better I think it allows for
> optional lockdep validation without even needing to touch
> include/linux/device.h.

Right, so lockdep has:

 - classes, based off of lock_class_key address;

   * lock_class_key should be static storage; except now we also have:

       lockdep_{,un}register_key()

     which allows dynamic memory (aka. heap) to be used for classes,
     important to note that lockdep memory usage is still static storage
     because the memory allocators use locks too. So if you register too
     many dynamic keys, you'll run out of lockdep memory etc., so be
     careful.

   * things like mutex_init() have a static lock_class_key per site
     and hence every lock initialized by the same code ends up in the
     same class by default.

   * can be trivially changed at any time, assuming the lock isn't held,
     using lockdep_set_class*() family.

     (extensively used all over the kernel, for example by the vfs to
      give each filesystem type their own locking order rules)

   * lockdep_set_novalidate_class() is a magical variant of
     lockdep_set_class() that sets a magical lock_class_key.

   * can be changed while held using lock_set_class()

     ( from a lockdep pov it unlocks the held stack,
       changes the class of your lock, and re-locks the
       held stack, now with a different class nesting ).

     Be careful! It doesn't forget the old nesting order, so you
     can trivially generate cycles.

 - subclasses, basically distinct addresses inside above mentioned
   lock_class_key object, limited to 8. Normally used with
   *lock_nested() family of functions. Typically used to lock multiple
   instances of a single lock class where there is defined order between
   instances (see for instance: double_rq_lock()).

 - nest_lock; eg. mutex_lock_nest_lock(), which allows expressing things
   like: multiple locks of class A can be taken in any order, provided
   we hold lock B.

With many of these, it's possible to get it wrong and annotate real
deadlocks away, so be careful :-)
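
As a concrete reference for the annotations above, a minimal sketch;
the "foo" structure and function names are made up and not from the
patch set:

#include <linux/lockdep.h>
#include <linux/mutex.h>

struct foo {
        struct mutex lock;
        struct lock_class_key key;  /* heap-backed key for the dynamic case */
};

/* static key: every foo re-classed here ends up in one class */
static struct lock_class_key foo_static_key;

static void foo_init_static(struct foo *foo)
{
        mutex_init(&foo->lock);     /* default: one class per init site */
        lockdep_set_class(&foo->lock, &foo_static_key); /* lock must not be held */
}

/* dynamic key: register before use, unregister before freeing the memory */
static void foo_init_dynamic(struct foo *foo)
{
        mutex_init(&foo->lock);
        lockdep_register_key(&foo->key);
        lockdep_set_class(&foo->lock, &foo->key);
}

static void foo_exit_dynamic(struct foo *foo)
{
        lockdep_unregister_key(&foo->key);
}

/*
 * subclasses: two instances of the same class with a defined lock
 * order, cf. double_rq_lock()
 */
static void foo_lock_pair(struct foo *parent, struct foo *child)
{
        mutex_lock(&parent->lock);
        mutex_lock_nested(&child->lock, SINGLE_DEPTH_NESTING);
}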

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 02/12] device-core: Add dev->lock_class to enable device_lock() lockdep validation
  2022-04-14 10:16       ` Peter Zijlstra
@ 2022-04-14 17:17         ` Dan Williams
  2022-04-14 19:33           ` Peter Zijlstra
  0 siblings, 1 reply; 23+ messages in thread
From: Dan Williams @ 2022-04-14 17:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-cxl, Greg Kroah-Hartman, Rafael J. Wysocki, Dave Jiang,
	Kevin Tian, Vishal L Verma, Schofield, Alison,
	Linux Kernel Mailing List, Linux NVDIMM

On Thu, Apr 14, 2022 at 3:16 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Apr 13, 2022 at 03:01:21PM -0700, Dan Williams wrote:
>
> > > That's not an obvious conclusion; lockdep has lots of funny annotations,
> > > subclasses is just one.
> > >
> > > I think the big new development in lockdep since that time with Alan
> > > Stern is that lockdep now has support for dynamic keys; that is lock
> > > keys in heap memory (as opposed to static storage).
> >
> > Ah, I was not aware of that, that should allow for deep cleanups of
> > this proposal.
>
> > > If you want lockdep validation for one (or more) dev->mutex instances,
> > > why not pull them out of the no_validate class and use the normal
> > > locking?
> >
> > Sounds perfect, just didn't know how to do that with my current
> > understanding of how to communicate this to lockdep.
> >
> > >
> > > This is all quite insane.
> >
> > Yes, certainly in comparison to your suggestion on the next patch.
> > That looks much more sane, and even better I think it allows for
> > optional lockdep validation without even needing to touch
> > include/linux/device.h.
>
> Right, so lockdep has:
>
>  - classes, based off of lock_class_key address;
>
>    * lock_class_key should be static storage; except now we also have:
>
>        lockdep_{,un}register_key()
>
>      which allows dynamic memory (aka. heap) to be used for classes,
>      important to note that lockdep memory usage is still static storage
>      because the memory allocators use locks too. So if you register too
>      many dynamic keys, you'll run out of lockdep memory etc., so be
>      careful.
>
>    * things like mutex_init() have a static lock_class_key per site
>      and hence every lock initialized by the same code ends up in the
>      same class by default.
>
>    * can be trivially changed at any time, assuming the lock isn't held,
>      using lockdep_set_class*() family.
>
>      (extensively used all over the kernel, for example by the vfs to
>       give each filesystem type their own locking order rules)
>
>    * lockdep_set_novalidate_class() is a magical variant of
>      lockdep_set_class() that sets a magical lock_class_key.

Golden, thanks Peter!

>
>    * can be changed while held using lock_set_class()

One more sanity check... So in driver subsystems there are cases where
a device on busA hosts a topology on busB. When that happens there's a
need to set the lock class late in a driver since busA knows nothing
about the locking rules of busB. Since the device has a longer
lifetime than its driver, the driver must set dev->mutex back to the
novalidate class when it exits; otherwise it sets up a use-after-free
of the static lock_class_key. I came up with this and it seems to work,
just want to make sure I'm properly using the lock_set_class() API and
it is ok to transition back and forth from the novalidate case:

diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 990b6670222e..32673e1a736d 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -405,6 +405,29 @@ struct cxl_nvdimm_bridge
*cxl_find_nvdimm_bridge(struct cxl_nvdimm *cxl_nvd);
 #define __mock static
 #endif

+#ifdef CONFIG_PROVE_LOCKING
+static inline void cxl_lock_reset_class(void *_dev)
+{
+       struct device *dev = _dev;
+
+       lock_set_class(&dev->mutex.dep_map, "__lockdep_no_validate__",
+                      &__lockdep_no_validate__, 0, _THIS_IP_);
+}
+
+static inline int cxl_lock_set_class(struct device *dev, const char *name,
+                                    struct lock_class_key *key)
+{
+       lock_set_class(&dev->mutex.dep_map, name, key, 0, _THIS_IP_);
+       return devm_add_action_or_reset(dev, cxl_lock_reset_class, dev);
+}
+#else
+static inline int cxl_lock_set_class(struct device *dev, const char *name,
+                                    struct lock_class_key *key)
+{
+       return 0;
+}
+#endif
+
 #ifdef CONFIG_PROVE_CXL_LOCKING
 enum cxl_lock_class {
        CXL_ANON_LOCK,

>      ( from a lockdep pov it unlocks the held stack,
>        changes the class of your lock, and re-locks the
>        held stack, now with a different class nesting ).
>
>      Be careful! It doesn't forget the old nesting order, so you
>      can trivially generate cycles.
>
>  - subclasses, basically distinct addresses inside above mentioned
>    lock_class_key object, limited to 8. Normally used with
>    *lock_nested() family of functions. Typically used to lock multiple
>    instances of a single lock class where there is defined order between
>    instances (see for instance: double_rq_lock()).
>
>  - nest_lock; eg. mutex_lock_nest_lock(), which allows expressing things
>    like: multiple locks of class A can be taken in any order, provided
>    we hold lock B.
>
> With many of these, it's possible to get it wrong and annotate real
> deadlocks away, so be careful :-)

Noted.

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 02/12] device-core: Add dev->lock_class to enable device_lock() lockdep validation
  2022-04-14 17:17         ` Dan Williams
@ 2022-04-14 19:33           ` Peter Zijlstra
  2022-04-14 19:43             ` Dan Williams
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2022-04-14 19:33 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Greg Kroah-Hartman, Rafael J. Wysocki, Dave Jiang,
	Kevin Tian, Vishal L Verma, Schofield, Alison,
	Linux Kernel Mailing List, Linux NVDIMM

On Thu, Apr 14, 2022 at 10:17:13AM -0700, Dan Williams wrote:

> One more sanity check... So in driver subsystems there are cases where
> a device on busA hosts a topology on busB. When that happens there's a
> need to set the lock class late in a driver since busA knows nothing
> about the locking rules of busB.

I'll pretend I know what you're talking about ;-)

> Since the device has a longer lifetime than its driver, the driver
> must set dev->mutex back to the novalidate class when it exits;
> otherwise it sets up a use-after-free of the static lock_class_key.

I'm not following, static storage has infinite lifetime.

> I came up with this and it seems to work, just want to make sure I'm
> properly using the lock_set_class() API and it is ok to transition
> back and forth from the novalidate case:
> 
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 990b6670222e..32673e1a736d 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -405,6 +405,29 @@ struct cxl_nvdimm_bridge
> *cxl_find_nvdimm_bridge(struct cxl_nvdimm *cxl_nvd);
>  #define __mock static
>  #endif
> 
> +#ifdef CONFIG_PROVE_LOCKING
> +static inline void cxl_lock_reset_class(void *_dev)
> +{
> +       struct device *dev = _dev;
> +
> +       lock_set_class(&dev->mutex.dep_map, "__lockdep_no_validate__",
> +                      &__lockdep_no_validate__, 0, _THIS_IP_);
> +}
> +
> +static inline int cxl_lock_set_class(struct device *dev, const char *name,
> +                                    struct lock_class_key *key)
> +{
> +       lock_set_class(&dev->mutex.dep_map, name, key, 0, _THIS_IP_);
> +       return devm_add_action_or_reset(dev, cxl_lock_reset_class, dev);
> +}
> +#else
> +static inline int cxl_lock_set_class(struct device *dev, const char *name,
> +                                    struct lock_class_key *key)
> +{
> +       return 0;
> +}
> +#endif

Under the assumption that the lock is held (lock_set_class() will
actually barf if @lock isn't held) this should indeed work as expected
(although I think you got the @name part 'wrong', I think that's
canonically something like "&dev->mutex" or something).
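
A hedged usage sketch of the helper quoted above, taking the point that
dev->mutex must already be held at the call site; cxl_root_key and the
probe body are placeholders, not from the series:

static struct lock_class_key cxl_root_key;      /* illustrative */

static int cxl_acpi_probe(struct platform_device *pdev)
{
        struct device *dev = &pdev->dev;
        int rc;

        /*
         * The driver core holds dev->mutex across ->probe(), so
         * lock_set_class() inside the helper sees the lock held, as it
         * requires. The devm action later restores the novalidate
         * class, also under device_lock(), before the module's static
         * key can go away.
         */
        rc = cxl_lock_set_class(dev, "cxl_root_key", &cxl_root_key);
        if (rc)
                return rc;

        /* ... rest of probe ... */
        return 0;
}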

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 02/12] device-core: Add dev->lock_class to enable device_lock() lockdep validation
  2022-04-14 19:33           ` Peter Zijlstra
@ 2022-04-14 19:43             ` Dan Williams
  2022-04-15  7:53               ` Peter Zijlstra
  0 siblings, 1 reply; 23+ messages in thread
From: Dan Williams @ 2022-04-14 19:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-cxl, Greg Kroah-Hartman, Rafael J. Wysocki, Dave Jiang,
	Kevin Tian, Vishal L Verma, Schofield, Alison,
	Linux Kernel Mailing List, Linux NVDIMM

On Thu, Apr 14, 2022 at 12:34 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Apr 14, 2022 at 10:17:13AM -0700, Dan Williams wrote:
>
> > One more sanity check... So in driver subsystems there are cases where
> > a device on busA hosts a topology on busB. When that happens there's a
> > need to set the lock class late in a driver since busA knows nothing
> > about the locking rules of busB.
>
> I'll pretend I know what you're talking about ;-)
>
> > Since the device has a longer lifetime than a driver when the driver
> > exits it must set dev->mutex back to the novalidate class, otherwise
> > it sets up a use after free of the static lock_class_key.
>
> I'm not following, static storage has infinite lifetime.

Not static storage in a driver module.

modprobe -r fancy_lockdep_using_driver.ko

Any use of device_lock() by the core on a device that a driver in this
module was driving will de-reference a now invalid pointer into
whatever memory was vmalloc'd for the module static data.

>
> > I came up with this and it seems to work, just want to make sure I'm
> > properly using the lock_set_class() API and it is ok to transition
> > back and forth from the novalidate case:
> >
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 990b6670222e..32673e1a736d 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -405,6 +405,29 @@ struct cxl_nvdimm_bridge
> > *cxl_find_nvdimm_bridge(struct cxl_nvdimm *cxl_nvd);
> >  #define __mock static
> >  #endif
> >
> > +#ifdef CONFIG_PROVE_LOCKING
> > +static inline void cxl_lock_reset_class(void *_dev)
> > +{
> > +       struct device *dev = _dev;
> > +
> > +       lock_set_class(&dev->mutex.dep_map, "__lockdep_no_validate__",
> > +                      &__lockdep_no_validate__, 0, _THIS_IP_);
> > +}
> > +
> > +static inline int cxl_lock_set_class(struct device *dev, const char *name,
> > +                                    struct lock_class_key *key)
> > +{
> > +       lock_set_class(&dev->mutex.dep_map, name, key, 0, _THIS_IP_);
> > +       return devm_add_action_or_reset(dev, cxl_lock_reset_class, dev);
> > +}
> > +#else
> > +static inline int cxl_lock_set_class(struct device *dev, const char *name,
> > +                                    struct lock_class_key *key)
> > +{
> > +       return 0;
> > +}
> > +#endif
>
> Under the assumption that the lock is held (lock_set_class() will
> actually barf if @lock isn't held) this should indeed work as expected

Nice.

> (although I think you got the @name part 'wrong', I think that's
> canonically something like "&dev->mutex" or something).

Ah, yes, I got it wrong for restoring the same name that results from
lockdep_set_novalidate_class(), will fix.
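
Presumably the fix amounts to restoring the "&dev->mutex" name that
lockdep_set_novalidate_class() gave the lock, i.e. something like:

static inline void cxl_lock_reset_class(void *_dev)
{
        struct device *dev = _dev;

        lock_set_class(&dev->mutex.dep_map, "&dev->mutex",
                       &__lockdep_no_validate__, 0, _THIS_IP_);
}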

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 02/12] device-core: Add dev->lock_class to enable device_lock() lockdep validation
  2022-04-14 19:43             ` Dan Williams
@ 2022-04-15  7:53               ` Peter Zijlstra
  0 siblings, 0 replies; 23+ messages in thread
From: Peter Zijlstra @ 2022-04-15  7:53 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Greg Kroah-Hartman, Rafael J. Wysocki, Dave Jiang,
	Kevin Tian, Vishal L Verma, Schofield, Alison,
	Linux Kernel Mailing List, Linux NVDIMM

On Thu, Apr 14, 2022 at 12:43:31PM -0700, Dan Williams wrote:
> On Thu, Apr 14, 2022 at 12:34 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Apr 14, 2022 at 10:17:13AM -0700, Dan Williams wrote:
> >
> > > One more sanity check... So in driver subsystems there are cases where
> > > a device on busA hosts a topology on busB. When that happens there's a
> > > need to set the lock class late in a driver since busA knows nothing
> > > about the locking rules of busB.
> >
> > I'll pretend I know what you're talking about ;-)
> >
> > > Since the device has a longer lifetime than its driver, the driver
> > > must set dev->mutex back to the novalidate class when it exits;
> > > otherwise it sets up a use-after-free of the static lock_class_key.
> >
> > I'm not following, static storage has infinite lifetime.
> 
> Not static storage in a driver module.
> 
> modprobe -r fancy_lockdep_using_driver.ko
> 
> Any use of device_lock() by the core on a device that a driver in this
> module was driving will de-reference a now invalid pointer into
> whatever memory was vmalloc'd for the module static data.

Ooh, modules (I always, conveniently, forget they exist). Yes, setting a
lock instance from the core kernel to a key that's inside a module and
then taking the module out will be 'interesting'.

Most likely you'll get a splat from lockdep when doing this.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2022-04-15  7:53 UTC | newest]

Thread overview: 23+ messages
2022-04-13  6:01 [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation Dan Williams
2022-04-13  6:01 ` [PATCH v2 01/12] device-core: Move device_lock() lockdep init to a helper Dan Williams
2022-04-13  6:01 ` [PATCH v2 02/12] device-core: Add dev->lock_class to enable device_lock() lockdep validation Dan Williams
2022-04-13  8:43   ` Peter Zijlstra
2022-04-13 22:01     ` Dan Williams
2022-04-14 10:16       ` Peter Zijlstra
2022-04-14 17:17         ` Dan Williams
2022-04-14 19:33           ` Peter Zijlstra
2022-04-14 19:43             ` Dan Williams
2022-04-15  7:53               ` Peter Zijlstra
2022-04-13  6:01 ` [PATCH v2 03/12] cxl/core: Refactor a cxl_lock_class() out of cxl_nested_lock() Dan Williams
2022-04-13  9:39   ` Peter Zijlstra
2022-04-13 22:05     ` Dan Williams
2022-04-13  6:01 ` [PATCH v2 04/12] cxl/core: Remove cxl_device_lock() Dan Williams
2022-04-13  6:01 ` [PATCH v2 05/12] cxl/core: Clamp max lock_class Dan Williams
2022-04-13  6:02 ` [PATCH v2 06/12] cxl/core: Use dev->lock_class for device_lock() lockdep validation Dan Williams
2022-04-13  6:02 ` [PATCH v2 07/12] cxl/acpi: Add a device_lock() lock class for the root platform device Dan Williams
2022-04-13  6:02 ` [PATCH v2 08/12] libnvdimm: Refactor an nvdimm_lock_class() helper Dan Williams
2022-04-13  6:02 ` [PATCH v2 09/12] ACPI: NFIT: Drop nfit_device_lock() Dan Williams
2022-04-13  6:02 ` [PATCH v2 10/12] libnvdimm: Drop nd_device_lock() Dan Williams
2022-04-13  6:02 ` [PATCH v2 11/12] libnvdimm: Enable lockdep validation Dan Williams
2022-04-13  6:02 ` [PATCH v2 12/12] device-core: Enable multi-subsystem device_lock() " Dan Williams
2022-04-13 14:02 ` [PATCH v2 00/12] device-core: Enable " Waiman Long
