All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation
@ 2022-04-13  6:01 Dan Williams
  2022-04-13  6:01 ` [PATCH v2 01/12] device-core: Move device_lock() lockdep init to a helper Dan Williams
                   ` (12 more replies)
  0 siblings, 13 replies; 23+ messages in thread
From: Dan Williams @ 2022-04-13  6:01 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ira Weiny, Dave Jiang, Peter Zijlstra, Jonathan Cameron,
	Vishal Verma, Ben Widawsky, Kevin Tian, Pierre-Louis Bossart,
	Alison Schofield, Boqun Feng, Ingo Molnar, Greg Kroah-Hartman,
	Will Deacon, Waiman Long, Rafael J. Wysocki, gregkh,
	linux-kernel, nvdimm

Changes since v1 [1]:
- Improve the clarity of the cover letter and changelogs of the
  major patches (Patch2 and Patch12) (Pierre, Kevin, and Dave)
- Fix device_lock_interruptible() false negative deadlock detection
  (Kevin)
- Fix off-by-one error in the device_set_lock_class() enable case (Kevin)
- Spelling fixes in Patch2 changelog (Pierre)
- Compilation fixes when both CONFIG_CXL_BUS=n and
  CONFIG_LIBNVDIMM=n. (0day robot)

[1]: https://lore.kernel.org/all/164610292916.2682974.12924748003366352335.stgit@dwillia2-desk3.amr.corp.intel.com/

---

The device_lock() is why the lockdep_set_novalidate_class() API exists.
The lock is taken in too many disparate contexts, and lockdep by design
assumes that all device_lock() acquisitions are identical. The lack of
lockdep coverage leads to deadlock scenarios landing upstream. To
mitigate that problem the lockdep_mutex was added [2].

The lockdep_mutex lets a subsystem mirror device_lock() acquisitions
without lockdep_set_novalidate_class() to gain some limited lockdep
coverage. The mirroring approach is limited to taking the device_lock()
after-the-fact in a subsystem's 'struct bus_type' operations and fails
to cover device_lock() acquisition in the driver-core. It also can only
track the needs of one subsystem at a time so, for example the kernel
needs to be recompiled between CONFIG_PROVE_NVDIMM_LOCKING and
CONFIG_PROVE_CXL_LOCKING depending on which subsystem is being
regression tested. Obviously that also means that intra-subsystem
locking dependencies can not be validated.

Two enhancements are proposed to improve the current state of
device_lock() lockdep validation:

1/ Communicate a lock class to the device-core and let it acquire
   dev->lockdep_mutex per the subsystem's nested locking expectations.

2/ Go further and provide a lockdep_mutex per-subsystem so each 
   has the full span of MAX_LOCKDEP_SUBCLASSES available for its use.

This enabling has already prevented at least one device_lock() deadlock
from making its way upstream.

[2]: commit 87a30e1f05d7 ("driver-core, libnvdimm: Let device subsystems add local lockdep coverage")

---

Dan Williams (12):
      device-core: Move device_lock() lockdep init to a helper
      device-core: Add dev->lock_class to enable device_lock() lockdep validation
      cxl/core: Refactor a cxl_lock_class() out of cxl_nested_lock()
      cxl/core: Remove cxl_device_lock()
      cxl/core: Clamp max lock_class
      cxl/core: Use dev->lock_class for device_lock() lockdep validation
      cxl/acpi: Add a device_lock() lock class for the root platform device
      libnvdimm: Refactor an nvdimm_lock_class() helper
      ACPI: NFIT: Drop nfit_device_lock()
      libnvdimm: Drop nd_device_lock()
      libnvdimm: Enable lockdep validation
      device-core: Enable multi-subsystem device_lock() lockdep validation


 drivers/acpi/nfit/core.c        |   30 ++++---
 drivers/acpi/nfit/nfit.h        |   24 ------
 drivers/base/core.c             |    5 -
 drivers/cxl/acpi.c              |    1 
 drivers/cxl/core/memdev.c       |    1 
 drivers/cxl/core/pmem.c         |    6 +
 drivers/cxl/core/port.c         |   56 ++++++-------
 drivers/cxl/cxl.h               |   76 +++++++-----------
 drivers/cxl/mem.c               |    4 -
 drivers/cxl/pmem.c              |   12 +--
 drivers/cxl/port.c              |    2 
 drivers/nvdimm/btt_devs.c       |   16 ++--
 drivers/nvdimm/bus.c            |   26 +++---
 drivers/nvdimm/core.c           |   10 +-
 drivers/nvdimm/dimm_devs.c      |    8 +-
 drivers/nvdimm/namespace_devs.c |   36 ++++-----
 drivers/nvdimm/nd-core.h        |   51 +++---------
 drivers/nvdimm/pfn_devs.c       |   24 +++---
 drivers/nvdimm/pmem.c           |    2 
 drivers/nvdimm/region.c         |    2 
 drivers/nvdimm/region_devs.c    |   16 ++--
 include/linux/device.h          |  162 ++++++++++++++++++++++++++++++++++++++-
 lib/Kconfig.debug               |   23 ------
 23 files changed, 325 insertions(+), 268 deletions(-)

--

base-commit: ce522ba9ef7e2d9fb22a39eb3371c0c64e2a433e

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2022-04-15  7:53 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-13  6:01 [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation Dan Williams
2022-04-13  6:01 ` [PATCH v2 01/12] device-core: Move device_lock() lockdep init to a helper Dan Williams
2022-04-13  6:01 ` [PATCH v2 02/12] device-core: Add dev->lock_class to enable device_lock() lockdep validation Dan Williams
2022-04-13  8:43   ` Peter Zijlstra
2022-04-13 22:01     ` Dan Williams
2022-04-14 10:16       ` Peter Zijlstra
2022-04-14 17:17         ` Dan Williams
2022-04-14 19:33           ` Peter Zijlstra
2022-04-14 19:43             ` Dan Williams
2022-04-15  7:53               ` Peter Zijlstra
2022-04-13  6:01 ` [PATCH v2 03/12] cxl/core: Refactor a cxl_lock_class() out of cxl_nested_lock() Dan Williams
2022-04-13  9:39   ` Peter Zijlstra
2022-04-13 22:05     ` Dan Williams
2022-04-13  6:01 ` [PATCH v2 04/12] cxl/core: Remove cxl_device_lock() Dan Williams
2022-04-13  6:01 ` [PATCH v2 05/12] cxl/core: Clamp max lock_class Dan Williams
2022-04-13  6:02 ` [PATCH v2 06/12] cxl/core: Use dev->lock_class for device_lock() lockdep validation Dan Williams
2022-04-13  6:02 ` [PATCH v2 07/12] cxl/acpi: Add a device_lock() lock class for the root platform device Dan Williams
2022-04-13  6:02 ` [PATCH v2 08/12] libnvdimm: Refactor an nvdimm_lock_class() helper Dan Williams
2022-04-13  6:02 ` [PATCH v2 09/12] ACPI: NFIT: Drop nfit_device_lock() Dan Williams
2022-04-13  6:02 ` [PATCH v2 10/12] libnvdimm: Drop nd_device_lock() Dan Williams
2022-04-13  6:02 ` [PATCH v2 11/12] libnvdimm: Enable lockdep validation Dan Williams
2022-04-13  6:02 ` [PATCH v2 12/12] device-core: Enable multi-subsystem device_lock() " Dan Williams
2022-04-13 14:02 ` [PATCH v2 00/12] device-core: Enable " Waiman Long

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.