linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [driver-core PATCH v9 0/9] Add NUMA aware async_schedule calls
@ 2018-12-13  0:44 Alexander Duyck
  2018-12-13  0:44 ` [driver-core PATCH v9 1/9] driver core: Establish order of operations for device_add and device_del via bitflag Alexander Duyck
                   ` (8 more replies)
  0 siblings, 9 replies; 18+ messages in thread
From: Alexander Duyck @ 2018-12-13  0:44 UTC (permalink / raw)
  To: linux-kernel, gregkh
  Cc: mcgrof, linux-nvdimm, tj, akpm, linux-pm, jiangshanlai, rafael,
	len.brown, pavel, zwisler, dan.j.williams, dave.jiang,
	bvanassche, alexander.h.duyck

This patch set provides functionality that will help to improve the
locality of the async_schedule calls used to provide deferred
initialization.

This patch set originally started out focused on just the one call to
async_schedule_domain in the nvdimm tree that was being used to defer the
device_add call however after doing some digging I realized the scope of
this was much broader than I had originally planned. As such I went
through and reworked the underlying infrastructure down to replacing the
queue_work call itself with a function of my own and opted to try and
provide a NUMA aware solution that would work for a broader audience.

In addition I have added several tweaks and/or clean-ups to the front of the
patch set. Patches 1 through 3 address a number of issues that actually were
causing the existing async_schedule calls to not show the performance that
they could due to either not scaling on a per device basis, or due to issues
that could result in a potential race. For example, patch 3 addresses the
fact that we were calling async_schedule once per driver instead of once
per device, and as a result we would have still ended up with devices
being probed on a non-local node without addressing this first.

I have also updated the kernel module used to test async driver probing so
that it can expose the original issue I was attempting to address.
It will fail on a system of asynchronous work either takes longer than it
takes to load a single device and a single driver with a device already
added. It will also fail if the NUMA node that the driver is loaded on does
not match the NUMA node the device is associated with.

RFC->v1:
    Dropped nvdimm patch to submit later.
        It relies on code in libnvdimm development tree.
    Simplified queue_work_near to just convert node into a CPU.
    Split up drivers core and PM core patches.
v1->v2:
    Renamed queue_work_near to queue_work_node
    Added WARN_ON_ONCE if we use queue_work_node with per-cpu workqueue
v2->v3:
    Added Acked-by for queue_work_node patch
    Continued rename from _near to _node to be consistent with queue_work_node
        Renamed async_schedule_near_domain to async_schedule_node_domain
        Renamed async_schedule_near to async_schedule_node
    Added kerneldoc for new async_schedule_XXX functions
    Updated patch description for patch 4 to include data on potential gains
v3->v4
    Added patch to consolidate use of need_parent_lock
    Make asynchronous driver probing explicit about use of drvdata
v4->v5
    Added patch to move async_synchronize_full to address deadlock
    Added bit async_probe to act as mutex for probe/remove calls
    Added back nvdimm patch as code it relies on is now in Linus's tree
    Incorporated review comments on parent & device locking consolidation
    Rebased on latest linux-next
v5->v6:
    Drop the "This patch" or "This change" from start of patch descriptions.
    Drop unnecessary parenthesis in first patch
    Use same wording for "selecting a CPU" in comments added in first patch
    Added kernel documentation for async_probe member of device
    Fixed up comments for async_schedule calls in patch 2
    Moved code related setting async driver out of device.h and into dd.c
    Added Reviewed-by for several patches
v6->v7:
    Fixed typo which had kernel doc refer to "lock" when I meant "unlock"
    Dropped "bool X:1" to "u8 X:1" from patch description
    Added async_driver to device_private structure to store driver
    Dropped unecessary code shuffle from async_probe patch
    Reordered patches to move fixes up to front
    Added Reviewed-by for several patches
    Updated cover page and patch descriptions throughout the set
v7->v8:
    Replaced async_probe value with dead, only apply dead in device_del
    Dropped Reviewed-by from patch 2 due to significant changes
    Added Reviewed-by for patches reviewed by Luis Chamberlain
v8->v9:
    Dropped patch 1 as it was applied, shifted remaining patches by 1
    Added new patch 9 that adds test framework for NUMA and sequential init
    Tweaked what is now patch 1, and added Reviewed-by from Dan Williams


---

Alexander Duyck (9):
      driver core: Establish order of operations for device_add and device_del via bitflag
      device core: Consolidate locking and unlocking of parent and device
      driver core: Probe devices asynchronously instead of the driver
      workqueue: Provide queue_work_node to queue work near a given NUMA node
      async: Add support for queueing on specific NUMA node
      driver core: Attach devices on CPU local to device node
      PM core: Use new async_schedule_dev command
      libnvdimm: Schedule device registration on node local to the device
      driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity


 drivers/base/base.h                         |    4 
 drivers/base/bus.c                          |   46 +----
 drivers/base/core.c                         |   11 +
 drivers/base/dd.c                           |  160 +++++++++++++----
 drivers/base/power/main.c                   |   12 +
 drivers/base/test/test_async_driver_probe.c |  261 +++++++++++++++++++++------
 drivers/nvdimm/bus.c                        |   11 +
 include/linux/async.h                       |   82 ++++++++
 include/linux/device.h                      |    5 +
 include/linux/workqueue.h                   |    2 
 kernel/async.c                              |   53 +++--
 kernel/workqueue.c                          |   84 +++++++++
 12 files changed, 565 insertions(+), 166 deletions(-)

--

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [driver-core PATCH v9 1/9] driver core: Establish order of operations for device_add and device_del via bitflag
  2018-12-13  0:44 [driver-core PATCH v9 0/9] Add NUMA aware async_schedule calls Alexander Duyck
@ 2018-12-13  0:44 ` Alexander Duyck
  2018-12-19 14:27   ` Rafael J. Wysocki
  2019-01-18 15:54   ` Greg KH
  2018-12-13  0:45 ` [driver-core PATCH v9 2/9] device core: Consolidate locking and unlocking of parent and device Alexander Duyck
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 18+ messages in thread
From: Alexander Duyck @ 2018-12-13  0:44 UTC (permalink / raw)
  To: linux-kernel, gregkh
  Cc: mcgrof, linux-nvdimm, tj, akpm, linux-pm, jiangshanlai, rafael,
	len.brown, pavel, zwisler, dan.j.williams, dave.jiang,
	bvanassche, alexander.h.duyck

Add an additional bit flag to the device struct named "dead".

This additional flag provides a guarantee that when a device_del is
executed on a given interface an async worker will not attempt to attach
the driver following the earlier device_del call. Previously this
guarantee was not present and could result in the device_del call
attempting to remove a driver from an interface only to have the async
worker attempt to probe the driver later when it finally completes the
asynchronous probe call.

One additional change added was that I pulled the check for dev->driver
out of the __device_attach_driver call and instead placed it in the
__device_attach_async_helper call. This was motivated by the fact that the
only other caller of this, __device_attach, had already taken the
device_lock() and checked for dev->driver. Instead of testing for this
twice in this path it makes more sense to just consolidate the dev->dead
and dev->driver checks together into one set of checks.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 drivers/base/core.c    |   11 +++++++++++
 drivers/base/dd.c      |   22 +++++++++++-----------
 include/linux/device.h |    5 +++++
 3 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 0073b09bb99f..950e25495726 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2080,6 +2080,17 @@ void device_del(struct device *dev)
 	struct kobject *glue_dir = NULL;
 	struct class_interface *class_intf;
 
+	/*
+	 * Hold the device lock and set the "dead" flag to guarantee that
+	 * the update behavior is consistent with the other bitfields near
+	 * it and that we cannot have an asynchronous probe routine trying
+	 * to run while we are tearing out the bus/class/sysfs from
+	 * underneath the device.
+	 */
+	device_lock(dev);
+	dev->dead = true;
+	device_unlock(dev);
+
 	/* Notify clients of device removal.  This call must come
 	 * before dpm_sysfs_remove().
 	 */
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 88713f182086..74c194ac99df 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -731,15 +731,6 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
 	bool async_allowed;
 	int ret;
 
-	/*
-	 * Check if device has already been claimed. This may
-	 * happen with driver loading, device discovery/registration,
-	 * and deferred probe processing happens all at once with
-	 * multiple threads.
-	 */
-	if (dev->driver)
-		return -EBUSY;
-
 	ret = driver_match_device(drv, dev);
 	if (ret == 0) {
 		/* no match */
@@ -774,6 +765,15 @@ static void __device_attach_async_helper(void *_dev, async_cookie_t cookie)
 
 	device_lock(dev);
 
+	/*
+	 * Check if device has already been removed or claimed. This may
+	 * happen with driver loading, device discovery/registration,
+	 * and deferred probe processing happens all at once with
+	 * multiple threads.
+	 */
+	if (dev->dead || dev->driver)
+		goto out_unlock;
+
 	if (dev->parent)
 		pm_runtime_get_sync(dev->parent);
 
@@ -784,7 +784,7 @@ static void __device_attach_async_helper(void *_dev, async_cookie_t cookie)
 
 	if (dev->parent)
 		pm_runtime_put(dev->parent);
-
+out_unlock:
 	device_unlock(dev);
 
 	put_device(dev);
@@ -897,7 +897,7 @@ static int __driver_attach(struct device *dev, void *data)
 	if (dev->parent && dev->bus->need_parent_lock)
 		device_lock(dev->parent);
 	device_lock(dev);
-	if (!dev->driver)
+	if (!dev->dead && !dev->driver)
 		driver_probe_device(drv, dev);
 	device_unlock(dev);
 	if (dev->parent && dev->bus->need_parent_lock)
diff --git a/include/linux/device.h b/include/linux/device.h
index 1b25c7a43f4c..f73dad81e811 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -957,6 +957,10 @@ struct dev_links_info {
  *              device.
  * @dma_coherent: this particular device is dma coherent, even if the
  *		architecture supports non-coherent devices.
+ * @dead:	This device is currently either in the process of or has
+ *		been removed from the system. Any asynchronous events
+ *		scheduled for this device should exit without taking any
+ *		action.
  *
  * At the lowest level, every device in a Linux system is represented by an
  * instance of struct device. The device structure contains the information
@@ -1051,6 +1055,7 @@ struct device {
     defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
 	bool			dma_coherent:1;
 #endif
+	bool			dead:1;
 };
 
 static inline struct device *kobj_to_dev(struct kobject *kobj)


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [driver-core PATCH v9 2/9] device core: Consolidate locking and unlocking of parent and device
  2018-12-13  0:44 [driver-core PATCH v9 0/9] Add NUMA aware async_schedule calls Alexander Duyck
  2018-12-13  0:44 ` [driver-core PATCH v9 1/9] driver core: Establish order of operations for device_add and device_del via bitflag Alexander Duyck
@ 2018-12-13  0:45 ` Alexander Duyck
  2018-12-14 10:40   ` Rafael J. Wysocki
  2018-12-13  0:45 ` [driver-core PATCH v9 3/9] driver core: Probe devices asynchronously instead of the driver Alexander Duyck
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Alexander Duyck @ 2018-12-13  0:45 UTC (permalink / raw)
  To: linux-kernel, gregkh
  Cc: mcgrof, linux-nvdimm, tj, akpm, linux-pm, jiangshanlai, rafael,
	len.brown, pavel, zwisler, dan.j.williams, dave.jiang,
	bvanassche, alexander.h.duyck

Try to consolidate all of the locking and unlocking of both the parent and
device when attaching or removing a driver from a given device.

To do that I first consolidated the lock pattern into two functions
__device_driver_lock and __device_driver_unlock. After doing that I then
created functions specific to attaching and detaching the driver while
acquiring these locks. By doing this I was able to reduce the number of
spots where we touch need_parent_lock from 12 down to 4.

This patch should produce no functional changes, it is meant to be a code
clean-up/consolidation only.

Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 drivers/base/base.h |    2 +
 drivers/base/bus.c  |   23 ++----------
 drivers/base/dd.c   |   95 ++++++++++++++++++++++++++++++++++++++++-----------
 3 files changed, 81 insertions(+), 39 deletions(-)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index 7a419a7a6235..3f22ebd6117a 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -124,6 +124,8 @@ extern int driver_add_groups(struct device_driver *drv,
 			     const struct attribute_group **groups);
 extern void driver_remove_groups(struct device_driver *drv,
 				 const struct attribute_group **groups);
+int device_driver_attach(struct device_driver *drv, struct device *dev);
+void device_driver_detach(struct device *dev);
 
 extern char *make_class_name(const char *name, struct kobject *kobj);
 
diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index b886b15cb53b..74054481007d 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -184,11 +184,7 @@ static ssize_t unbind_store(struct device_driver *drv, const char *buf,
 
 	dev = bus_find_device_by_name(bus, NULL, buf);
 	if (dev && dev->driver == drv) {
-		if (dev->parent && dev->bus->need_parent_lock)
-			device_lock(dev->parent);
-		device_release_driver(dev);
-		if (dev->parent && dev->bus->need_parent_lock)
-			device_unlock(dev->parent);
+		device_driver_detach(dev);
 		err = count;
 	}
 	put_device(dev);
@@ -211,13 +207,7 @@ static ssize_t bind_store(struct device_driver *drv, const char *buf,
 
 	dev = bus_find_device_by_name(bus, NULL, buf);
 	if (dev && dev->driver == NULL && driver_match_device(drv, dev)) {
-		if (dev->parent && bus->need_parent_lock)
-			device_lock(dev->parent);
-		device_lock(dev);
-		err = driver_probe_device(drv, dev);
-		device_unlock(dev);
-		if (dev->parent && bus->need_parent_lock)
-			device_unlock(dev->parent);
+		err = device_driver_attach(drv, dev);
 
 		if (err > 0) {
 			/* success */
@@ -771,13 +761,8 @@ EXPORT_SYMBOL_GPL(bus_rescan_devices);
  */
 int device_reprobe(struct device *dev)
 {
-	if (dev->driver) {
-		if (dev->parent && dev->bus->need_parent_lock)
-			device_lock(dev->parent);
-		device_release_driver(dev);
-		if (dev->parent && dev->bus->need_parent_lock)
-			device_unlock(dev->parent);
-	}
+	if (dev->driver)
+		device_driver_detach(dev);
 	return bus_rescan_devices_helper(dev, NULL);
 }
 EXPORT_SYMBOL_GPL(device_reprobe);
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 74c194ac99df..f07c16277ed9 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -867,6 +867,64 @@ void device_initial_probe(struct device *dev)
 	__device_attach(dev, true);
 }
 
+/*
+ * __device_driver_lock - acquire locks needed to manipulate dev->drv
+ * @dev: Device we will update driver info for
+ * @parent: Parent device. Needed if the bus requires parent lock
+ *
+ * This function will take the required locks for manipulating dev->drv.
+ * Normally this will just be the @dev lock, but when called for a USB
+ * interface, @parent lock will be held as well.
+ */
+static void __device_driver_lock(struct device *dev, struct device *parent)
+{
+	if (parent && dev->bus->need_parent_lock)
+		device_lock(parent);
+	device_lock(dev);
+}
+
+/*
+ * __device_driver_unlock - release locks needed to manipulate dev->drv
+ * @dev: Device we will update driver info for
+ * @parent: Parent device. Needed if the bus requires parent lock
+ *
+ * This function will release the required locks for manipulating dev->drv.
+ * Normally this will just be the the @dev lock, but when called for a
+ * USB interface, @parent lock will be released as well.
+ */
+static void __device_driver_unlock(struct device *dev, struct device *parent)
+{
+	device_unlock(dev);
+	if (parent && dev->bus->need_parent_lock)
+		device_unlock(parent);
+}
+
+/**
+ * device_driver_attach - attach a specific driver to a specific device
+ * @drv: Driver to attach
+ * @dev: Device to attach it to
+ *
+ * Manually attach driver to a device. Will acquire both @dev lock and
+ * @dev->parent lock if needed.
+ */
+int device_driver_attach(struct device_driver *drv, struct device *dev)
+{
+	int ret = 0;
+
+	__device_driver_lock(dev, dev->parent);
+
+	/*
+	 * If device has been removed or someone has already successfully
+	 * bound a driver before us just skip the driver probe call.
+	 */
+	if (!dev->dead && !dev->driver)
+		ret = driver_probe_device(drv, dev);
+
+	__device_driver_unlock(dev, dev->parent);
+
+	return ret;
+}
+
 static int __driver_attach(struct device *dev, void *data)
 {
 	struct device_driver *drv = data;
@@ -894,14 +952,7 @@ static int __driver_attach(struct device *dev, void *data)
 		return ret;
 	} /* ret > 0 means positive match */
 
-	if (dev->parent && dev->bus->need_parent_lock)
-		device_lock(dev->parent);
-	device_lock(dev);
-	if (!dev->dead && !dev->driver)
-		driver_probe_device(drv, dev);
-	device_unlock(dev);
-	if (dev->parent && dev->bus->need_parent_lock)
-		device_unlock(dev->parent);
+	device_driver_attach(drv, dev);
 
 	return 0;
 }
@@ -932,15 +983,11 @@ static void __device_release_driver(struct device *dev, struct device *parent)
 	drv = dev->driver;
 	if (drv) {
 		while (device_links_busy(dev)) {
-			device_unlock(dev);
-			if (parent)
-				device_unlock(parent);
+			__device_driver_unlock(dev, parent);
 
 			device_links_unbind_consumers(dev);
-			if (parent)
-				device_lock(parent);
 
-			device_lock(dev);
+			__device_driver_lock(dev, parent);
 			/*
 			 * A concurrent invocation of the same function might
 			 * have released the driver successfully while this one
@@ -993,16 +1040,12 @@ void device_release_driver_internal(struct device *dev,
 				    struct device_driver *drv,
 				    struct device *parent)
 {
-	if (parent && dev->bus->need_parent_lock)
-		device_lock(parent);
+	__device_driver_lock(dev, parent);
 
-	device_lock(dev);
 	if (!drv || drv == dev->driver)
 		__device_release_driver(dev, parent);
 
-	device_unlock(dev);
-	if (parent && dev->bus->need_parent_lock)
-		device_unlock(parent);
+	__device_driver_unlock(dev, parent);
 }
 
 /**
@@ -1027,6 +1070,18 @@ void device_release_driver(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(device_release_driver);
 
+/**
+ * device_driver_detach - detach driver from a specific device
+ * @dev: device to detach driver from
+ *
+ * Detach driver from device. Will acquire both @dev lock and @dev->parent
+ * lock if needed.
+ */
+void device_driver_detach(struct device *dev)
+{
+	device_release_driver_internal(dev, NULL, dev->parent);
+}
+
 /**
  * driver_detach - detach driver from all devices it controls.
  * @drv: driver.


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [driver-core PATCH v9 3/9] driver core: Probe devices asynchronously instead of the driver
  2018-12-13  0:44 [driver-core PATCH v9 0/9] Add NUMA aware async_schedule calls Alexander Duyck
  2018-12-13  0:44 ` [driver-core PATCH v9 1/9] driver core: Establish order of operations for device_add and device_del via bitflag Alexander Duyck
  2018-12-13  0:45 ` [driver-core PATCH v9 2/9] device core: Consolidate locking and unlocking of parent and device Alexander Duyck
@ 2018-12-13  0:45 ` Alexander Duyck
  2018-12-13  0:45 ` [driver-core PATCH v9 4/9] workqueue: Provide queue_work_node to queue work near a given NUMA node Alexander Duyck
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2018-12-13  0:45 UTC (permalink / raw)
  To: linux-kernel, gregkh
  Cc: mcgrof, linux-nvdimm, tj, akpm, linux-pm, jiangshanlai, rafael,
	len.brown, pavel, zwisler, dan.j.williams, dave.jiang,
	bvanassche, alexander.h.duyck

Probe devices asynchronously instead of the driver. This results in us
seeing the same behavior if the device is registered before the driver or
after. This way we can avoid serializing the initialization should the
driver not be loaded until after the devices have already been added.

The motivation behind this is that if we have a set of devices that
take a significant amount of time to load we can greatly reduce the time to
load by processing them in parallel instead of one at a time. In addition,
each device can exist on a different node so placing a single thread on one
CPU to initialize all of the devices for a given driver can result in poor
performance on a system with multiple nodes.

This approach can reduce the time needed to scan SCSI LUNs significantly.
The only way to realize that speedup is by enabling more concurrency which
is what is achieved with this patch.

To achieve this it was necessary to add a new member "async_driver" to the
device_private structure to store the driver pointer while we wait on the
deferred probe call.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 drivers/base/base.h |    2 ++
 drivers/base/bus.c  |   23 +++--------------------
 drivers/base/dd.c   |   43 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 48 insertions(+), 20 deletions(-)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index 3f22ebd6117a..c95384a8e53c 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -64,6 +64,7 @@ struct driver_private {
  *	binding of drivers which were unable to get all the resources needed by
  *	the device; typically because it depends on another driver getting
  *	probed first.
+ * @async_driver - pointer to device driver awaiting probe via async_probe
  * @device - pointer back to the struct device that this structure is
  * associated with.
  *
@@ -75,6 +76,7 @@ struct device_private {
 	struct klist_node knode_driver;
 	struct klist_node knode_bus;
 	struct list_head deferred_probe;
+	struct device_driver *async_driver;
 	struct device *device;
 };
 #define to_device_private_parent(obj)	\
diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 74054481007d..be69eb2b07e6 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -608,17 +608,6 @@ static ssize_t uevent_store(struct device_driver *drv, const char *buf,
 }
 static DRIVER_ATTR_WO(uevent);
 
-static void driver_attach_async(void *_drv, async_cookie_t cookie)
-{
-	struct device_driver *drv = _drv;
-	int ret;
-
-	ret = driver_attach(drv);
-
-	pr_debug("bus: '%s': driver %s async attach completed: %d\n",
-		 drv->bus->name, drv->name, ret);
-}
-
 /**
  * bus_add_driver - Add a driver to the bus.
  * @drv: driver.
@@ -651,15 +640,9 @@ int bus_add_driver(struct device_driver *drv)
 
 	klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers);
 	if (drv->bus->p->drivers_autoprobe) {
-		if (driver_allows_async_probing(drv)) {
-			pr_debug("bus: '%s': probing driver %s asynchronously\n",
-				drv->bus->name, drv->name);
-			async_schedule(driver_attach_async, drv);
-		} else {
-			error = driver_attach(drv);
-			if (error)
-				goto out_unregister;
-		}
+		error = driver_attach(drv);
+		if (error)
+			goto out_unregister;
 	}
 	module_add_driver(drv->owner, drv);
 
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index f07c16277ed9..3353f654861b 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -925,6 +925,30 @@ int device_driver_attach(struct device_driver *drv, struct device *dev)
 	return ret;
 }
 
+static void __driver_attach_async_helper(void *_dev, async_cookie_t cookie)
+{
+	struct device *dev = _dev;
+	struct device_driver *drv;
+	int ret = 0;
+
+	__device_driver_lock(dev, dev->parent);
+
+	drv = dev->p->async_driver;
+
+	/*
+	 * If device has been removed or someone has already successfully
+	 * bound a driver before us just skip the driver probe call.
+	 */
+	if (!dev->dead && !dev->driver)
+		ret = driver_probe_device(drv, dev);
+
+	__device_driver_unlock(dev, dev->parent);
+
+	dev_dbg(dev, "driver %s async attach completed: %d\n", drv->name, ret);
+
+	put_device(dev);
+}
+
 static int __driver_attach(struct device *dev, void *data)
 {
 	struct device_driver *drv = data;
@@ -952,6 +976,25 @@ static int __driver_attach(struct device *dev, void *data)
 		return ret;
 	} /* ret > 0 means positive match */
 
+	if (driver_allows_async_probing(drv)) {
+		/*
+		 * Instead of probing the device synchronously we will
+		 * probe it asynchronously to allow for more parallelism.
+		 *
+		 * We only take the device lock here in order to guarantee
+		 * that the dev->driver and async_driver fields are protected
+		 */
+		dev_dbg(dev, "probing driver %s asynchronously\n", drv->name);
+		device_lock(dev);
+		if (!dev->driver) {
+			get_device(dev);
+			dev->p->async_driver = drv;
+			async_schedule(__driver_attach_async_helper, dev);
+		}
+		device_unlock(dev);
+		return 0;
+	}
+
 	device_driver_attach(drv, dev);
 
 	return 0;


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [driver-core PATCH v9 4/9] workqueue: Provide queue_work_node to queue work near a given NUMA node
  2018-12-13  0:44 [driver-core PATCH v9 0/9] Add NUMA aware async_schedule calls Alexander Duyck
                   ` (2 preceding siblings ...)
  2018-12-13  0:45 ` [driver-core PATCH v9 3/9] driver core: Probe devices asynchronously instead of the driver Alexander Duyck
@ 2018-12-13  0:45 ` Alexander Duyck
  2018-12-13  0:45 ` [driver-core PATCH v9 5/9] async: Add support for queueing on specific " Alexander Duyck
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2018-12-13  0:45 UTC (permalink / raw)
  To: linux-kernel, gregkh
  Cc: mcgrof, linux-nvdimm, tj, akpm, linux-pm, jiangshanlai, rafael,
	len.brown, pavel, zwisler, dan.j.williams, dave.jiang,
	bvanassche, alexander.h.duyck

Provide a new function, queue_work_node, which is meant to schedule work on
a "random" CPU of the requested NUMA node. The main motivation for this is
to help assist asynchronous init to better improve boot times for devices
that are local to a specific node.

For now we just default to the first CPU that is in the intersection of the
cpumask of the node and the online cpumask. The only exception is if the
CPU is local to the node we will just use the current CPU. This should work
for our purposes as we are currently only using this for unbound work so
the CPU will be translated to a node anyway instead of being directly used.

As we are only using the first CPU to represent the NUMA node for now I am
limiting the scope of the function so that it can only be used with unbound
workqueues.

Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 include/linux/workqueue.h |    2 +
 kernel/workqueue.c        |   84 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 86 insertions(+)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 60d673e15632..1f50c1e586e7 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -463,6 +463,8 @@ int workqueue_set_unbound_cpumask(cpumask_var_t cpumask);
 
 extern bool queue_work_on(int cpu, struct workqueue_struct *wq,
 			struct work_struct *work);
+extern bool queue_work_node(int node, struct workqueue_struct *wq,
+			    struct work_struct *work);
 extern bool queue_delayed_work_on(int cpu, struct workqueue_struct *wq,
 			struct delayed_work *work, unsigned long delay);
 extern bool mod_delayed_work_on(int cpu, struct workqueue_struct *wq,
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 392be4b252f6..d5a26e456f7a 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1492,6 +1492,90 @@ bool queue_work_on(int cpu, struct workqueue_struct *wq,
 }
 EXPORT_SYMBOL(queue_work_on);
 
+/**
+ * workqueue_select_cpu_near - Select a CPU based on NUMA node
+ * @node: NUMA node ID that we want to select a CPU from
+ *
+ * This function will attempt to find a "random" cpu available on a given
+ * node. If there are no CPUs available on the given node it will return
+ * WORK_CPU_UNBOUND indicating that we should just schedule to any
+ * available CPU if we need to schedule this work.
+ */
+static int workqueue_select_cpu_near(int node)
+{
+	int cpu;
+
+	/* No point in doing this if NUMA isn't enabled for workqueues */
+	if (!wq_numa_enabled)
+		return WORK_CPU_UNBOUND;
+
+	/* Delay binding to CPU if node is not valid or online */
+	if (node < 0 || node >= MAX_NUMNODES || !node_online(node))
+		return WORK_CPU_UNBOUND;
+
+	/* Use local node/cpu if we are already there */
+	cpu = raw_smp_processor_id();
+	if (node == cpu_to_node(cpu))
+		return cpu;
+
+	/* Use "random" otherwise know as "first" online CPU of node */
+	cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
+
+	/* If CPU is valid return that, otherwise just defer */
+	return cpu < nr_cpu_ids ? cpu : WORK_CPU_UNBOUND;
+}
+
+/**
+ * queue_work_node - queue work on a "random" cpu for a given NUMA node
+ * @node: NUMA node that we are targeting the work for
+ * @wq: workqueue to use
+ * @work: work to queue
+ *
+ * We queue the work to a "random" CPU within a given NUMA node. The basic
+ * idea here is to provide a way to somehow associate work with a given
+ * NUMA node.
+ *
+ * This function will only make a best effort attempt at getting this onto
+ * the right NUMA node. If no node is requested or the requested node is
+ * offline then we just fall back to standard queue_work behavior.
+ *
+ * Currently the "random" CPU ends up being the first available CPU in the
+ * intersection of cpu_online_mask and the cpumask of the node, unless we
+ * are running on the node. In that case we just use the current CPU.
+ *
+ * Return: %false if @work was already on a queue, %true otherwise.
+ */
+bool queue_work_node(int node, struct workqueue_struct *wq,
+		     struct work_struct *work)
+{
+	unsigned long flags;
+	bool ret = false;
+
+	/*
+	 * This current implementation is specific to unbound workqueues.
+	 * Specifically we only return the first available CPU for a given
+	 * node instead of cycling through individual CPUs within the node.
+	 *
+	 * If this is used with a per-cpu workqueue then the logic in
+	 * workqueue_select_cpu_near would need to be updated to allow for
+	 * some round robin type logic.
+	 */
+	WARN_ON_ONCE(!(wq->flags & WQ_UNBOUND));
+
+	local_irq_save(flags);
+
+	if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) {
+		int cpu = workqueue_select_cpu_near(node);
+
+		__queue_work(cpu, wq, work);
+		ret = true;
+	}
+
+	local_irq_restore(flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(queue_work_node);
+
 void delayed_work_timer_fn(struct timer_list *t)
 {
 	struct delayed_work *dwork = from_timer(dwork, t, timer);


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [driver-core PATCH v9 5/9] async: Add support for queueing on specific NUMA node
  2018-12-13  0:44 [driver-core PATCH v9 0/9] Add NUMA aware async_schedule calls Alexander Duyck
                   ` (3 preceding siblings ...)
  2018-12-13  0:45 ` [driver-core PATCH v9 4/9] workqueue: Provide queue_work_node to queue work near a given NUMA node Alexander Duyck
@ 2018-12-13  0:45 ` Alexander Duyck
  2018-12-13  0:45 ` [driver-core PATCH v9 6/9] driver core: Attach devices on CPU local to device node Alexander Duyck
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2018-12-13  0:45 UTC (permalink / raw)
  To: linux-kernel, gregkh
  Cc: mcgrof, linux-nvdimm, tj, akpm, linux-pm, jiangshanlai, rafael,
	len.brown, pavel, zwisler, dan.j.williams, dave.jiang,
	bvanassche, alexander.h.duyck

Introduce four new variants of the async_schedule_ functions that allow
scheduling on a specific NUMA node.

The first two functions are async_schedule_near and
async_schedule_near_domain end up mapping to async_schedule and
async_schedule_domain, but provide NUMA node specific functionality. They
replace the original functions which were moved to inline function
definitions that call the new functions while passing NUMA_NO_NODE.

The second two functions are async_schedule_dev and
async_schedule_dev_domain which provide NUMA specific functionality when
passing a device as the data member and that device has a NUMA node other
than NUMA_NO_NODE.

The main motivation behind this is to address the need to be able to
schedule device specific init work on specific NUMA nodes in order to
improve performance of memory initialization.

I have seen a significant improvement in initialziation time for persistent
memory as a result of this approach. In the case of 3TB of memory on a
single node the initialization time in the worst case went from 36s down to
about 26s for a 10s improvement. As such the data shows a general benefit
for affinitizing the async work to the node local to the device.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 include/linux/async.h |   82 +++++++++++++++++++++++++++++++++++++++++++++++--
 kernel/async.c        |   53 +++++++++++++++++---------------
 2 files changed, 108 insertions(+), 27 deletions(-)

diff --git a/include/linux/async.h b/include/linux/async.h
index 6b0226bdaadc..f81d6dbffe68 100644
--- a/include/linux/async.h
+++ b/include/linux/async.h
@@ -14,6 +14,8 @@
 
 #include <linux/types.h>
 #include <linux/list.h>
+#include <linux/numa.h>
+#include <linux/device.h>
 
 typedef u64 async_cookie_t;
 typedef void (*async_func_t) (void *data, async_cookie_t cookie);
@@ -37,9 +39,83 @@ struct async_domain {
 	struct async_domain _name = { .pending = LIST_HEAD_INIT(_name.pending), \
 				      .registered = 0 }
 
-extern async_cookie_t async_schedule(async_func_t func, void *data);
-extern async_cookie_t async_schedule_domain(async_func_t func, void *data,
-					    struct async_domain *domain);
+async_cookie_t async_schedule_node(async_func_t func, void *data,
+				   int node);
+async_cookie_t async_schedule_node_domain(async_func_t func, void *data,
+					  int node,
+					  struct async_domain *domain);
+
+/**
+ * async_schedule - schedule a function for asynchronous execution
+ * @func: function to execute asynchronously
+ * @data: data pointer to pass to the function
+ *
+ * Returns an async_cookie_t that may be used for checkpointing later.
+ * Note: This function may be called from atomic or non-atomic contexts.
+ */
+static inline async_cookie_t async_schedule(async_func_t func, void *data)
+{
+	return async_schedule_node(func, data, NUMA_NO_NODE);
+}
+
+/**
+ * async_schedule_domain - schedule a function for asynchronous execution within a certain domain
+ * @func: function to execute asynchronously
+ * @data: data pointer to pass to the function
+ * @domain: the domain
+ *
+ * Returns an async_cookie_t that may be used for checkpointing later.
+ * @domain may be used in the async_synchronize_*_domain() functions to
+ * wait within a certain synchronization domain rather than globally.
+ * Note: This function may be called from atomic or non-atomic contexts.
+ */
+static inline async_cookie_t
+async_schedule_domain(async_func_t func, void *data,
+		      struct async_domain *domain)
+{
+	return async_schedule_node_domain(func, data, NUMA_NO_NODE, domain);
+}
+
+/**
+ * async_schedule_dev - A device specific version of async_schedule
+ * @func: function to execute asynchronously
+ * @dev: device argument to be passed to function
+ *
+ * Returns an async_cookie_t that may be used for checkpointing later.
+ * @dev is used as both the argument for the function and to provide NUMA
+ * context for where to run the function. By doing this we can try to
+ * provide for the best possible outcome by operating on the device on the
+ * CPUs closest to the device.
+ * Note: This function may be called from atomic or non-atomic contexts.
+ */
+static inline async_cookie_t
+async_schedule_dev(async_func_t func, struct device *dev)
+{
+	return async_schedule_node(func, dev, dev_to_node(dev));
+}
+
+/**
+ * async_schedule_dev_domain - A device specific version of async_schedule_domain
+ * @func: function to execute asynchronously
+ * @dev: device argument to be passed to function
+ * @domain: the domain
+ *
+ * Returns an async_cookie_t that may be used for checkpointing later.
+ * @dev is used as both the argument for the function and to provide NUMA
+ * context for where to run the function. By doing this we can try to
+ * provide for the best possible outcome by operating on the device on the
+ * CPUs closest to the device.
+ * @domain may be used in the async_synchronize_*_domain() functions to
+ * wait within a certain synchronization domain rather than globally.
+ * Note: This function may be called from atomic or non-atomic contexts.
+ */
+static inline async_cookie_t
+async_schedule_dev_domain(async_func_t func, struct device *dev,
+			  struct async_domain *domain)
+{
+	return async_schedule_node_domain(func, dev, dev_to_node(dev), domain);
+}
+
 void async_unregister_domain(struct async_domain *domain);
 extern void async_synchronize_full(void);
 extern void async_synchronize_full_domain(struct async_domain *domain);
diff --git a/kernel/async.c b/kernel/async.c
index 4932e9193fa3..59ce133eef11 100644
--- a/kernel/async.c
+++ b/kernel/async.c
@@ -148,7 +148,25 @@ static void async_run_entry_fn(struct work_struct *work)
 	wake_up(&async_done);
 }
 
-static async_cookie_t __async_schedule(async_func_t func, void *data, struct async_domain *domain)
+/**
+ * async_schedule_node_domain - NUMA specific version of async_schedule_domain
+ * @func: function to execute asynchronously
+ * @data: data pointer to pass to the function
+ * @node: NUMA node that we want to schedule this on or close to
+ * @domain: the domain
+ *
+ * Returns an async_cookie_t that may be used for checkpointing later.
+ * @domain may be used in the async_synchronize_*_domain() functions to
+ * wait within a certain synchronization domain rather than globally.
+ *
+ * Note: This function may be called from atomic or non-atomic contexts.
+ *
+ * The node requested will be honored on a best effort basis. If the node
+ * has no CPUs associated with it then the work is distributed among all
+ * available CPUs.
+ */
+async_cookie_t async_schedule_node_domain(async_func_t func, void *data,
+					  int node, struct async_domain *domain)
 {
 	struct async_entry *entry;
 	unsigned long flags;
@@ -194,43 +212,30 @@ static async_cookie_t __async_schedule(async_func_t func, void *data, struct asy
 	current->flags |= PF_USED_ASYNC;
 
 	/* schedule for execution */
-	queue_work(system_unbound_wq, &entry->work);
+	queue_work_node(node, system_unbound_wq, &entry->work);
 
 	return newcookie;
 }
+EXPORT_SYMBOL_GPL(async_schedule_node_domain);
 
 /**
- * async_schedule - schedule a function for asynchronous execution
+ * async_schedule_node - NUMA specific version of async_schedule
  * @func: function to execute asynchronously
  * @data: data pointer to pass to the function
+ * @node: NUMA node that we want to schedule this on or close to
  *
  * Returns an async_cookie_t that may be used for checkpointing later.
  * Note: This function may be called from atomic or non-atomic contexts.
- */
-async_cookie_t async_schedule(async_func_t func, void *data)
-{
-	return __async_schedule(func, data, &async_dfl_domain);
-}
-EXPORT_SYMBOL_GPL(async_schedule);
-
-/**
- * async_schedule_domain - schedule a function for asynchronous execution within a certain domain
- * @func: function to execute asynchronously
- * @data: data pointer to pass to the function
- * @domain: the domain
  *
- * Returns an async_cookie_t that may be used for checkpointing later.
- * @domain may be used in the async_synchronize_*_domain() functions to
- * wait within a certain synchronization domain rather than globally.  A
- * synchronization domain is specified via @domain.  Note: This function
- * may be called from atomic or non-atomic contexts.
+ * The node requested will be honored on a best effort basis. If the node
+ * has no CPUs associated with it then the work is distributed among all
+ * available CPUs.
  */
-async_cookie_t async_schedule_domain(async_func_t func, void *data,
-				     struct async_domain *domain)
+async_cookie_t async_schedule_node(async_func_t func, void *data, int node)
 {
-	return __async_schedule(func, data, domain);
+	return async_schedule_node_domain(func, data, node, &async_dfl_domain);
 }
-EXPORT_SYMBOL_GPL(async_schedule_domain);
+EXPORT_SYMBOL_GPL(async_schedule_node);
 
 /**
  * async_synchronize_full - synchronize all asynchronous function calls


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [driver-core PATCH v9 6/9] driver core: Attach devices on CPU local to device node
  2018-12-13  0:44 [driver-core PATCH v9 0/9] Add NUMA aware async_schedule calls Alexander Duyck
                   ` (4 preceding siblings ...)
  2018-12-13  0:45 ` [driver-core PATCH v9 5/9] async: Add support for queueing on specific " Alexander Duyck
@ 2018-12-13  0:45 ` Alexander Duyck
  2018-12-13  0:45 ` [driver-core PATCH v9 7/9] PM core: Use new async_schedule_dev command Alexander Duyck
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2018-12-13  0:45 UTC (permalink / raw)
  To: linux-kernel, gregkh
  Cc: mcgrof, linux-nvdimm, tj, akpm, linux-pm, jiangshanlai, rafael,
	len.brown, pavel, zwisler, dan.j.williams, dave.jiang,
	bvanassche, alexander.h.duyck

Call the asynchronous probe routines on a CPU local to the device node. By
doing this we should be able to improve our initialization time
significantly as we can avoid having to access the device from a remote
node which may introduce higher latency.

For example, in the case of initializing memory for NVDIMM this can have a
significant impact as initialing 3TB on remote node can take up to 39
seconds while initialing it on a local node only takes 23 seconds. It is
situations like this where we will see the biggest improvement.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 drivers/base/dd.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 3353f654861b..be9040db8321 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -829,7 +829,7 @@ static int __device_attach(struct device *dev, bool allow_async)
 			 */
 			dev_dbg(dev, "scheduling asynchronous probe\n");
 			get_device(dev);
-			async_schedule(__device_attach_async_helper, dev);
+			async_schedule_dev(__device_attach_async_helper, dev);
 		} else {
 			pm_request_idle(dev);
 		}
@@ -989,7 +989,7 @@ static int __driver_attach(struct device *dev, void *data)
 		if (!dev->driver) {
 			get_device(dev);
 			dev->p->async_driver = drv;
-			async_schedule(__driver_attach_async_helper, dev);
+			async_schedule_dev(__driver_attach_async_helper, dev);
 		}
 		device_unlock(dev);
 		return 0;


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [driver-core PATCH v9 7/9] PM core: Use new async_schedule_dev command
  2018-12-13  0:44 [driver-core PATCH v9 0/9] Add NUMA aware async_schedule calls Alexander Duyck
                   ` (5 preceding siblings ...)
  2018-12-13  0:45 ` [driver-core PATCH v9 6/9] driver core: Attach devices on CPU local to device node Alexander Duyck
@ 2018-12-13  0:45 ` Alexander Duyck
  2018-12-13  0:45 ` [driver-core PATCH v9 8/9] libnvdimm: Schedule device registration on node local to the device Alexander Duyck
  2018-12-13  0:45 ` [driver-core PATCH v9 9/9] driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity Alexander Duyck
  8 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2018-12-13  0:45 UTC (permalink / raw)
  To: linux-kernel, gregkh
  Cc: mcgrof, linux-nvdimm, tj, akpm, linux-pm, jiangshanlai, rafael,
	len.brown, pavel, zwisler, dan.j.williams, dave.jiang,
	bvanassche, alexander.h.duyck

Use the device specific version of the async_schedule commands to defer
various tasks related to power management. By doing this we should see a
slight improvement in performance as any device that is sensitive to
latency/locality in the setup will now be initializing on the node closest
to the device.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 drivers/base/power/main.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index a690fd400260..ebb8b61b52e9 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -726,7 +726,7 @@ void dpm_noirq_resume_devices(pm_message_t state)
 		reinit_completion(&dev->power.completion);
 		if (is_async(dev)) {
 			get_device(dev);
-			async_schedule(async_resume_noirq, dev);
+			async_schedule_dev(async_resume_noirq, dev);
 		}
 	}
 
@@ -883,7 +883,7 @@ void dpm_resume_early(pm_message_t state)
 		reinit_completion(&dev->power.completion);
 		if (is_async(dev)) {
 			get_device(dev);
-			async_schedule(async_resume_early, dev);
+			async_schedule_dev(async_resume_early, dev);
 		}
 	}
 
@@ -1047,7 +1047,7 @@ void dpm_resume(pm_message_t state)
 		reinit_completion(&dev->power.completion);
 		if (is_async(dev)) {
 			get_device(dev);
-			async_schedule(async_resume, dev);
+			async_schedule_dev(async_resume, dev);
 		}
 	}
 
@@ -1366,7 +1366,7 @@ static int device_suspend_noirq(struct device *dev)
 
 	if (is_async(dev)) {
 		get_device(dev);
-		async_schedule(async_suspend_noirq, dev);
+		async_schedule_dev(async_suspend_noirq, dev);
 		return 0;
 	}
 	return __device_suspend_noirq(dev, pm_transition, false);
@@ -1569,7 +1569,7 @@ static int device_suspend_late(struct device *dev)
 
 	if (is_async(dev)) {
 		get_device(dev);
-		async_schedule(async_suspend_late, dev);
+		async_schedule_dev(async_suspend_late, dev);
 		return 0;
 	}
 
@@ -1833,7 +1833,7 @@ static int device_suspend(struct device *dev)
 
 	if (is_async(dev)) {
 		get_device(dev);
-		async_schedule(async_suspend, dev);
+		async_schedule_dev(async_suspend, dev);
 		return 0;
 	}
 


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [driver-core PATCH v9 8/9] libnvdimm: Schedule device registration on node local to the device
  2018-12-13  0:44 [driver-core PATCH v9 0/9] Add NUMA aware async_schedule calls Alexander Duyck
                   ` (6 preceding siblings ...)
  2018-12-13  0:45 ` [driver-core PATCH v9 7/9] PM core: Use new async_schedule_dev command Alexander Duyck
@ 2018-12-13  0:45 ` Alexander Duyck
  2018-12-13  0:45 ` [driver-core PATCH v9 9/9] driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity Alexander Duyck
  8 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2018-12-13  0:45 UTC (permalink / raw)
  To: linux-kernel, gregkh
  Cc: mcgrof, linux-nvdimm, tj, akpm, linux-pm, jiangshanlai, rafael,
	len.brown, pavel, zwisler, dan.j.williams, dave.jiang,
	bvanassche, alexander.h.duyck

Force the device registration for nvdimm devices to be closer to the actual
device. This is achieved by using either the NUMA node ID of the region, or
of the parent. By doing this we can have everything above the region based
on the region, and everything below the region based on the nvdimm bus.

By guaranteeing NUMA locality I see an improvement of as high as 25% for
per-node init of a system with 12TB of persistent memory.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 drivers/nvdimm/bus.c |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index f1fb39921236..b1e193541874 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -23,6 +23,7 @@
 #include <linux/ndctl.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/cpu.h>
 #include <linux/fs.h>
 #include <linux/io.h>
 #include <linux/mm.h>
@@ -513,11 +514,15 @@ void __nd_device_register(struct device *dev)
 		set_dev_node(dev, to_nd_region(dev)->numa_node);
 
 	dev->bus = &nvdimm_bus_type;
-	if (dev->parent)
+	if (dev->parent) {
 		get_device(dev->parent);
+		if (dev_to_node(dev) == NUMA_NO_NODE)
+			set_dev_node(dev, dev_to_node(dev->parent));
+	}
 	get_device(dev);
-	async_schedule_domain(nd_async_device_register, dev,
-			&nd_async_domain);
+
+	async_schedule_dev_domain(nd_async_device_register, dev,
+				  &nd_async_domain);
 }
 
 void nd_device_register(struct device *dev)


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [driver-core PATCH v9 9/9] driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity
  2018-12-13  0:44 [driver-core PATCH v9 0/9] Add NUMA aware async_schedule calls Alexander Duyck
                   ` (7 preceding siblings ...)
  2018-12-13  0:45 ` [driver-core PATCH v9 8/9] libnvdimm: Schedule device registration on node local to the device Alexander Duyck
@ 2018-12-13  0:45 ` Alexander Duyck
  8 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2018-12-13  0:45 UTC (permalink / raw)
  To: linux-kernel, gregkh
  Cc: mcgrof, linux-nvdimm, tj, akpm, linux-pm, jiangshanlai, rafael,
	len.brown, pavel, zwisler, dan.j.williams, dave.jiang,
	bvanassche, alexander.h.duyck

The current async_probe test code is only testing one device allocated
prior to driver load and only loading one device afterwards. Instead of
doing things this way it makes much more sense to load one device per CPU
in order to actually stress the async infrastructure. By doing this we
should see delays significantly increase in the event of devices being
serialized.

In addition I have updated the test to verify that we are trying to place
the work on the correct NUMA node when we are running in async mode. By
doing this we can verify the best possible outcome for device and driver
load times.

I have added a timeout value that is used to disable the sleep and instead
cause the probe routine to report an error indicating it timed out. By
doing this we limit the maximum runtime for the test to 20 seconds or less.

The last major change in this set is that I have gone through and tuned it
for handling the massive number of possible events that will be scheduled.
Instead of reporting the sleep for each individual device it is moved to
only being displayed if we enable debugging.

With this patch applied below are what a failing test and a passing test
should look like. I elided a few hundred lines in the failing test that
were duplicated since the system I was testing on had a massive number of
CPU cores:

-- Failing --
[  243.524697] test_async_driver_probe: registering first set of asynchronous devices...
[  243.535625] test_async_driver_probe: registering asynchronous driver...
[  243.543038] test_async_driver_probe: registration took 0 msecs
[  243.549559] test_async_driver_probe: registering second set of asynchronous devices...
[  243.568350] platform test_async_driver.447: registration took 9 msecs
[  243.575544] test_async_driver_probe: registering first synchronous device...
[  243.583454] test_async_driver_probe: registering synchronous driver...
[  248.825920] test_async_driver_probe: registration took 5235 msecs
[  248.825922] test_async_driver_probe: registering second synchronous device...
[  248.825928] test_async_driver test_async_driver.443: NUMA node mismatch 3 != 1
[  248.825932] test_async_driver test_async_driver.445: NUMA node mismatch 3 != 1
[  248.825935] test_async_driver test_async_driver.446: NUMA node mismatch 3 != 1
[  248.825939] test_async_driver test_async_driver.440: NUMA node mismatch 3 != 1
[  248.825943] test_async_driver test_async_driver.441: NUMA node mismatch 3 != 1
...
[  248.827150] test_async_driver test_async_driver.229: NUMA node mismatch 0 != 1
[  248.827158] test_async_driver test_async_driver.228: NUMA node mismatch 0 != 1
[  248.827220] test_async_driver test_async_driver.281: NUMA node mismatch 2 != 1
[  248.827229] test_async_driver test_async_driver.282: NUMA node mismatch 2 != 1
[  248.827240] test_async_driver test_async_driver.280: NUMA node mismatch 2 != 1
[  253.945834] test_async_driver test_async_driver.1: NUMA node mismatch 0 != 1
[  253.945878] test_sync_driver test_sync_driver.1: registration took 5119 msecs
[  253.961693] test_async_driver_probe: async events still pending, forcing timeout and synchronize
[  259.065839] test_async_driver test_async_driver.2: NUMA node mismatch 0 != 1
[  259.073786] test_async_driver test_async_driver.3: async probe took too long
[  259.081669] test_async_driver test_async_driver.3: NUMA node mismatch 0 != 1
[  259.089569] test_async_driver test_async_driver.4: async probe took too long
[  259.097451] test_async_driver test_async_driver.4: NUMA node mismatch 0 != 1
[  259.105338] test_async_driver test_async_driver.5: async probe took too long
[  259.113204] test_async_driver test_async_driver.5: NUMA node mismatch 0 != 1
[  259.121089] test_async_driver test_async_driver.6: async probe took too long
[  259.128961] test_async_driver test_async_driver.6: NUMA node mismatch 0 != 1
[  259.136850] test_async_driver test_async_driver.7: async probe took too long
...
[  262.124062] test_async_driver test_async_driver.221: async probe took too long
[  262.132130] test_async_driver test_async_driver.221: NUMA node mismatch 3 != 1
[  262.140206] test_async_driver test_async_driver.222: async probe took too long
[  262.148277] test_async_driver test_async_driver.222: NUMA node mismatch 3 != 1
[  262.156351] test_async_driver test_async_driver.223: async probe took too long
[  262.164419] test_async_driver test_async_driver.223: NUMA node mismatch 3 != 1
[  262.172630] test_async_driver_probe: Test failed with 222 errors and 336 warnings

-- Passing --
[  105.419247] test_async_driver_probe: registering first set of asynchronous devices...
[  105.432040] test_async_driver_probe: registering asynchronous driver...
[  105.439718] test_async_driver_probe: registration took 0 msecs
[  105.446239] test_async_driver_probe: registering second set of asynchronous devices...
[  105.477986] platform test_async_driver.447: registration took 22 msecs
[  105.485276] test_async_driver_probe: registering first synchronous device...
[  105.493169] test_async_driver_probe: registering synchronous driver...
[  110.597981] test_async_driver_probe: registration took 5097 msecs
[  110.604806] test_async_driver_probe: registering second synchronous device...
[  115.707490] test_sync_driver test_sync_driver.1: registration took 5094 msecs
[  115.715478] test_async_driver_probe: completed successfully

Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 drivers/base/test/test_async_driver_probe.c |  261 +++++++++++++++++++++------
 1 file changed, 201 insertions(+), 60 deletions(-)

diff --git a/drivers/base/test/test_async_driver_probe.c b/drivers/base/test/test_async_driver_probe.c
index e7f145d662f0..f4b1d8e54daf 100644
--- a/drivers/base/test/test_async_driver_probe.c
+++ b/drivers/base/test/test_async_driver_probe.c
@@ -11,16 +11,47 @@
 #include <linux/module.h>
 #include <linux/platform_device.h>
 #include <linux/time.h>
+#include <linux/numa.h>
+#include <linux/nodemask.h>
+#include <linux/topology.h>
 
 #define TEST_PROBE_DELAY	(5 * 1000)	/* 5 sec */
 #define TEST_PROBE_THRESHOLD	(TEST_PROBE_DELAY / 2)
 
+static atomic_t warnings, errors, timeout, async_completed;
+
 static int test_probe(struct platform_device *pdev)
 {
-	dev_info(&pdev->dev, "sleeping for %d msecs in probe\n",
-		 TEST_PROBE_DELAY);
-	msleep(TEST_PROBE_DELAY);
-	dev_info(&pdev->dev, "done sleeping\n");
+	struct device *dev = &pdev->dev;
+
+	/*
+	 * Determine if we have hit the "timeout" limit for the test if we
+	 * have then report it as an error, otherwise we wil sleep for the
+	 * required amount of time and then report completion.
+	 */
+	if (atomic_read(&timeout)) {
+		dev_err(dev, "async probe took too long\n");
+		atomic_inc(&errors);
+	} else {
+		dev_dbg(&pdev->dev, "sleeping for %d msecs in probe\n",
+			 TEST_PROBE_DELAY);
+		msleep(TEST_PROBE_DELAY);
+		dev_dbg(&pdev->dev, "done sleeping\n");
+	}
+
+	/*
+	 * Report NUMA mismatch if device node is set and we are not
+	 * performing an async init on that node.
+	 */
+	if (dev->driver->probe_type == PROBE_PREFER_ASYNCHRONOUS) {
+		if (dev_to_node(dev) != numa_node_id()) {
+			dev_warn(dev, "NUMA node mismatch %d != %d\n",
+				 dev_to_node(dev), numa_node_id());
+			atomic_inc(&warnings);
+		}
+
+		atomic_inc(&async_completed);
+	}
 
 	return 0;
 }
@@ -41,31 +72,64 @@ static struct platform_driver sync_driver = {
 	.probe = test_probe,
 };
 
-static struct platform_device *async_dev_1, *async_dev_2;
-static struct platform_device *sync_dev_1;
+static struct platform_device *async_dev[NR_CPUS * 2];
+static struct platform_device *sync_dev[2];
+
+static struct platform_device *
+test_platform_device_register_node(char *name, int id, int nid)
+{
+	struct platform_device *pdev;
+	int ret;
+
+	pdev = platform_device_alloc(name, id);
+	if (!pdev)
+		return NULL;
+
+	if (nid != NUMA_NO_NODE)
+		set_dev_node(&pdev->dev, nid);
+
+	ret = platform_device_add(pdev);
+	if (ret) {
+		platform_device_put(pdev);
+		return ERR_PTR(ret);
+	}
+
+	return pdev;
+
+}
 
 static int __init test_async_probe_init(void)
 {
-	ktime_t calltime, delta;
+	struct platform_device **pdev = NULL;
+	int async_id = 0, sync_id = 0;
 	unsigned long long duration;
-	int error;
+	ktime_t calltime, delta;
+	int err, nid, cpu;
+
+	pr_info("registering first set of asynchronous devices...\n");
 
-	pr_info("registering first asynchronous device...\n");
+	for_each_online_cpu(cpu) {
+		nid = cpu_to_node(cpu);
+		pdev = &async_dev[async_id];
+		*pdev =	test_platform_device_register_node("test_async_driver",
+							   async_id,
+							   nid);
+		if (IS_ERR(*pdev)) {
+			err = PTR_ERR(*pdev);
+			*pdev = NULL;
+			pr_err("failed to create async_dev: %d\n", err);
+			goto err_unregister_async_devs;
+		}
 
-	async_dev_1 = platform_device_register_simple("test_async_driver", 1,
-						      NULL, 0);
-	if (IS_ERR(async_dev_1)) {
-		error = PTR_ERR(async_dev_1);
-		pr_err("failed to create async_dev_1: %d\n", error);
-		return error;
+		async_id++;
 	}
 
 	pr_info("registering asynchronous driver...\n");
 	calltime = ktime_get();
-	error = platform_driver_register(&async_driver);
-	if (error) {
-		pr_err("Failed to register async_driver: %d\n", error);
-		goto err_unregister_async_dev_1;
+	err = platform_driver_register(&async_driver);
+	if (err) {
+		pr_err("Failed to register async_driver: %d\n", err);
+		goto err_unregister_async_devs;
 	}
 
 	delta = ktime_sub(ktime_get(), calltime);
@@ -73,86 +137,163 @@ static int __init test_async_probe_init(void)
 	pr_info("registration took %lld msecs\n", duration);
 	if (duration > TEST_PROBE_THRESHOLD) {
 		pr_err("test failed: probe took too long\n");
-		error = -ETIMEDOUT;
+		err = -ETIMEDOUT;
 		goto err_unregister_async_driver;
 	}
 
-	pr_info("registering second asynchronous device...\n");
+	pr_info("registering second set of asynchronous devices...\n");
 	calltime = ktime_get();
-	async_dev_2 = platform_device_register_simple("test_async_driver", 2,
-						      NULL, 0);
-	if (IS_ERR(async_dev_2)) {
-		error = PTR_ERR(async_dev_2);
-		pr_err("failed to create async_dev_2: %d\n", error);
-		goto err_unregister_async_driver;
+	for_each_online_cpu(cpu) {
+		nid = cpu_to_node(cpu);
+		pdev = &sync_dev[sync_id];
+
+		*pdev = test_platform_device_register_node("test_async_driver",
+							   async_id,
+							   nid);
+		if (IS_ERR(*pdev)) {
+			err = PTR_ERR(*pdev);
+			*pdev = NULL;
+			pr_err("failed to create async_dev: %d\n", err);
+			goto err_unregister_async_driver;
+		}
+
+		async_id++;
 	}
 
 	delta = ktime_sub(ktime_get(), calltime);
 	duration = (unsigned long long) ktime_to_ms(delta);
-	pr_info("registration took %lld msecs\n", duration);
+	dev_info(&(*pdev)->dev,
+		 "registration took %lld msecs\n", duration);
 	if (duration > TEST_PROBE_THRESHOLD) {
-		pr_err("test failed: probe took too long\n");
-		error = -ETIMEDOUT;
-		goto err_unregister_async_dev_2;
+		dev_err(&(*pdev)->dev,
+			"test failed: probe took too long\n");
+		err = -ETIMEDOUT;
+		goto err_unregister_async_driver;
 	}
 
-	pr_info("registering synchronous driver...\n");
 
-	error = platform_driver_register(&sync_driver);
-	if (error) {
-		pr_err("Failed to register async_driver: %d\n", error);
-		goto err_unregister_async_dev_2;
+	pr_info("registering first synchronous device...\n");
+	nid = cpu_to_node(cpu);
+	pdev = &sync_dev[sync_id];
+
+	*pdev = test_platform_device_register_node("test_sync_driver",
+						   sync_id,
+						   NUMA_NO_NODE);
+	if (IS_ERR(*pdev)) {
+		err = PTR_ERR(*pdev);
+		*pdev = NULL;
+		pr_err("failed to create sync_dev: %d\n", err);
+		goto err_unregister_async_driver;
 	}
 
-	pr_info("registering synchronous device...\n");
+	sync_id++;
+
+	pr_info("registering synchronous driver...\n");
 	calltime = ktime_get();
-	sync_dev_1 = platform_device_register_simple("test_sync_driver", 1,
-						     NULL, 0);
-	if (IS_ERR(sync_dev_1)) {
-		error = PTR_ERR(sync_dev_1);
-		pr_err("failed to create sync_dev_1: %d\n", error);
-		goto err_unregister_sync_driver;
+	err = platform_driver_register(&sync_driver);
+	if (err) {
+		pr_err("Failed to register async_driver: %d\n", err);
+		goto err_unregister_sync_devs;
 	}
 
 	delta = ktime_sub(ktime_get(), calltime);
 	duration = (unsigned long long) ktime_to_ms(delta);
 	pr_info("registration took %lld msecs\n", duration);
 	if (duration < TEST_PROBE_THRESHOLD) {
-		pr_err("test failed: probe was too quick\n");
-		error = -ETIMEDOUT;
-		goto err_unregister_sync_dev_1;
+		dev_err(&(*pdev)->dev,
+			"test failed: probe was too quick\n");
+		err = -ETIMEDOUT;
+		goto err_unregister_sync_driver;
 	}
 
-	pr_info("completed successfully");
+	pr_info("registering second synchronous device...\n");
+	pdev = &sync_dev[sync_id];
+	calltime = ktime_get();
 
-	return 0;
+	*pdev = test_platform_device_register_node("test_sync_driver",
+						   sync_id,
+						   NUMA_NO_NODE);
+	if (IS_ERR(*pdev)) {
+		err = PTR_ERR(*pdev);
+		*pdev = NULL;
+		pr_err("failed to create sync_dev: %d\n", err);
+		goto err_unregister_sync_driver;
+	}
 
-err_unregister_sync_dev_1:
-	platform_device_unregister(sync_dev_1);
+	sync_id++;
 
-err_unregister_sync_driver:
-	platform_driver_unregister(&sync_driver);
+	delta = ktime_sub(ktime_get(), calltime);
+	duration = (unsigned long long) ktime_to_ms(delta);
+	dev_info(&(*pdev)->dev,
+		 "registration took %lld msecs\n", duration);
+	if (duration < TEST_PROBE_THRESHOLD) {
+		dev_err(&(*pdev)->dev,
+			"test failed: probe was too quick\n");
+		err = -ETIMEDOUT;
+		goto err_unregister_sync_driver;
+	}
 
-err_unregister_async_dev_2:
-	platform_device_unregister(async_dev_2);
+	/*
+	 * The async events should have completed while we were taking care
+	 * of the synchronous events. We will now terminate any outstanding
+	 * asynchronous probe calls remaining by forcing timeout and remove
+	 * the driver before we return which should force the flush of the
+	 * pending asynchronous probe calls.
+	 *
+	 * Otherwise if they completed without errors or warnings then
+	 * report successful completion.
+	 */
+	if (atomic_read(&async_completed) != async_id) {
+		pr_err("async events still pending, forcing timeout\n");
+		atomic_inc(&timeout);
+		err = -ETIMEDOUT;
+	} else if (!atomic_read(&errors) && !atomic_read(&warnings)) {
+		pr_info("completed successfully\n");
+		return 0;
+	}
 
+err_unregister_sync_driver:
+	platform_driver_unregister(&sync_driver);
+err_unregister_sync_devs:
+	while (sync_id--)
+		platform_device_unregister(sync_dev[sync_id]);
 err_unregister_async_driver:
 	platform_driver_unregister(&async_driver);
+err_unregister_async_devs:
+	while (async_id--)
+		platform_device_unregister(async_dev[async_id]);
+
+	/*
+	 * If err is already set then count that as an additional error for
+	 * the test. Otherwise we will report an invalid argument error and
+	 * not count that as we should have reached here as a result of
+	 * errors or warnings being reported by the probe routine.
+	 */
+	if (err)
+		atomic_inc(&errors);
+	else
+		err = -EINVAL;
 
-err_unregister_async_dev_1:
-	platform_device_unregister(async_dev_1);
+	pr_err("Test failed with %d errors and %d warnings\n",
+	       atomic_read(&errors), atomic_read(&warnings));
 
-	return error;
+	return err;
 }
 module_init(test_async_probe_init);
 
 static void __exit test_async_probe_exit(void)
 {
+	int id = 2;
+
 	platform_driver_unregister(&async_driver);
 	platform_driver_unregister(&sync_driver);
-	platform_device_unregister(async_dev_1);
-	platform_device_unregister(async_dev_2);
-	platform_device_unregister(sync_dev_1);
+
+	while (id--)
+		platform_device_unregister(sync_dev[id]);
+
+	id = NR_CPUS * 2;
+	while (id--)
+		platform_device_unregister(async_dev[id]);
 }
 module_exit(test_async_probe_exit);
 


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [driver-core PATCH v9 2/9] device core: Consolidate locking and unlocking of parent and device
  2018-12-13  0:45 ` [driver-core PATCH v9 2/9] device core: Consolidate locking and unlocking of parent and device Alexander Duyck
@ 2018-12-14 10:40   ` Rafael J. Wysocki
  2018-12-17 16:31     ` Alexander Duyck
  0 siblings, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2018-12-14 10:40 UTC (permalink / raw)
  To: alexander.h.duyck
  Cc: Linux Kernel Mailing List, Greg Kroah-Hartman, Luis R. Rodriguez,
	linux-nvdimm, Tejun Heo, Andrew Morton, Linux PM, Lai Jiangshan,
	Rafael J. Wysocki, Len Brown, Pavel Machek, zwisler,
	Dan Williams, dave.jiang, bvanassche

 somOn Thu, Dec 13, 2018 at 1:45 AM Alexander Duyck
<alexander.h.duyck@linux.intel.com> wrote:
>
> Try to consolidate all of the locking and unlocking of both the parent and
> device when attaching or removing a driver from a given device.
>
> To do that I first consolidated the lock pattern into two functions
> __device_driver_lock and __device_driver_unlock. After doing that I then
> created functions specific to attaching and detaching the driver while
> acquiring these locks. By doing this I was able to reduce the number of
> spots where we touch need_parent_lock from 12 down to 4.
>
> This patch should produce no functional changes, it is meant to be a code
> clean-up/consolidation only.
>
> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> ---
>  drivers/base/base.h |    2 +
>  drivers/base/bus.c  |   23 ++----------
>  drivers/base/dd.c   |   95 ++++++++++++++++++++++++++++++++++++++++-----------
>  3 files changed, 81 insertions(+), 39 deletions(-)
>
> diff --git a/drivers/base/base.h b/drivers/base/base.h
> index 7a419a7a6235..3f22ebd6117a 100644
> --- a/drivers/base/base.h
> +++ b/drivers/base/base.h
> @@ -124,6 +124,8 @@ extern int driver_add_groups(struct device_driver *drv,
>                              const struct attribute_group **groups);
>  extern void driver_remove_groups(struct device_driver *drv,
>                                  const struct attribute_group **groups);
> +int device_driver_attach(struct device_driver *drv, struct device *dev);
> +void device_driver_detach(struct device *dev);
>
>  extern char *make_class_name(const char *name, struct kobject *kobj);
>
> diff --git a/drivers/base/bus.c b/drivers/base/bus.c
> index b886b15cb53b..74054481007d 100644
> --- a/drivers/base/bus.c
> +++ b/drivers/base/bus.c
> @@ -184,11 +184,7 @@ static ssize_t unbind_store(struct device_driver *drv, const char *buf,
>
>         dev = bus_find_device_by_name(bus, NULL, buf);
>         if (dev && dev->driver == drv) {
> -               if (dev->parent && dev->bus->need_parent_lock)
> -                       device_lock(dev->parent);
> -               device_release_driver(dev);
> -               if (dev->parent && dev->bus->need_parent_lock)
> -                       device_unlock(dev->parent);
> +               device_driver_detach(dev);
>                 err = count;
>         }
>         put_device(dev);
> @@ -211,13 +207,7 @@ static ssize_t bind_store(struct device_driver *drv, const char *buf,
>
>         dev = bus_find_device_by_name(bus, NULL, buf);
>         if (dev && dev->driver == NULL && driver_match_device(drv, dev)) {
> -               if (dev->parent && bus->need_parent_lock)
> -                       device_lock(dev->parent);
> -               device_lock(dev);
> -               err = driver_probe_device(drv, dev);
> -               device_unlock(dev);
> -               if (dev->parent && bus->need_parent_lock)
> -                       device_unlock(dev->parent);
> +               err = device_driver_attach(drv, dev);
>
>                 if (err > 0) {
>                         /* success */
> @@ -771,13 +761,8 @@ EXPORT_SYMBOL_GPL(bus_rescan_devices);
>   */
>  int device_reprobe(struct device *dev)
>  {
> -       if (dev->driver) {
> -               if (dev->parent && dev->bus->need_parent_lock)
> -                       device_lock(dev->parent);
> -               device_release_driver(dev);
> -               if (dev->parent && dev->bus->need_parent_lock)
> -                       device_unlock(dev->parent);
> -       }
> +       if (dev->driver)
> +               device_driver_detach(dev);
>         return bus_rescan_devices_helper(dev, NULL);
>  }
>  EXPORT_SYMBOL_GPL(device_reprobe);
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 74c194ac99df..f07c16277ed9 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -867,6 +867,64 @@ void device_initial_probe(struct device *dev)
>         __device_attach(dev, true);
>  }
>
> +/*
> + * __device_driver_lock - acquire locks needed to manipulate dev->drv
> + * @dev: Device we will update driver info for
> + * @parent: Parent device. Needed if the bus requires parent lock
> + *
> + * This function will take the required locks for manipulating dev->drv.
> + * Normally this will just be the @dev lock, but when called for a USB
> + * interface, @parent lock will be held as well.
> + */
> +static void __device_driver_lock(struct device *dev, struct device *parent)
> +{
> +       if (parent && dev->bus->need_parent_lock)
> +               device_lock(parent);
> +       device_lock(dev);
> +}
> +
> +/*
> + * __device_driver_unlock - release locks needed to manipulate dev->drv
> + * @dev: Device we will update driver info for
> + * @parent: Parent device. Needed if the bus requires parent lock
> + *
> + * This function will release the required locks for manipulating dev->drv.
> + * Normally this will just be the the @dev lock, but when called for a
> + * USB interface, @parent lock will be released as well.
> + */
> +static void __device_driver_unlock(struct device *dev, struct device *parent)
> +{
> +       device_unlock(dev);
> +       if (parent && dev->bus->need_parent_lock)
> +               device_unlock(parent);
> +}
> +
> +/**
> + * device_driver_attach - attach a specific driver to a specific device
> + * @drv: Driver to attach
> + * @dev: Device to attach it to
> + *
> + * Manually attach driver to a device. Will acquire both @dev lock and
> + * @dev->parent lock if needed.
> + */
> +int device_driver_attach(struct device_driver *drv, struct device *dev)
> +{
> +       int ret = 0;
> +
> +       __device_driver_lock(dev, dev->parent);
> +
> +       /*
> +        * If device has been removed or someone has already successfully
> +        * bound a driver before us just skip the driver probe call.
> +        */
> +       if (!dev->dead && !dev->driver)
> +               ret = driver_probe_device(drv, dev);
> +
> +       __device_driver_unlock(dev, dev->parent);
> +
> +       return ret;
> +}
> +
>  static int __driver_attach(struct device *dev, void *data)
>  {
>         struct device_driver *drv = data;
> @@ -894,14 +952,7 @@ static int __driver_attach(struct device *dev, void *data)
>                 return ret;
>         } /* ret > 0 means positive match */
>
> -       if (dev->parent && dev->bus->need_parent_lock)
> -               device_lock(dev->parent);
> -       device_lock(dev);
> -       if (!dev->dead && !dev->driver)
> -               driver_probe_device(drv, dev);
> -       device_unlock(dev);
> -       if (dev->parent && dev->bus->need_parent_lock)
> -               device_unlock(dev->parent);
> +       device_driver_attach(drv, dev);
>
>         return 0;
>  }
> @@ -932,15 +983,11 @@ static void __device_release_driver(struct device *dev, struct device *parent)
>         drv = dev->driver;
>         if (drv) {
>                 while (device_links_busy(dev)) {
> -                       device_unlock(dev);
> -                       if (parent)
> -                               device_unlock(parent);
> +                       __device_driver_unlock(dev, parent);
>
>                         device_links_unbind_consumers(dev);
> -                       if (parent)
> -                               device_lock(parent);
>
> -                       device_lock(dev);
> +                       __device_driver_lock(dev, parent);
>                         /*
>                          * A concurrent invocation of the same function might
>                          * have released the driver successfully while this one

This change will clash with
https://patchwork.kernel.org/patch/10729571/ somewhat which really
should go it first as a stable-candidate fix.

I would suggest rebasing on top of that one, unless Greg has different
plans here.

> @@ -993,16 +1040,12 @@ void device_release_driver_internal(struct device *dev,
>                                     struct device_driver *drv,
>                                     struct device *parent)
>  {
> -       if (parent && dev->bus->need_parent_lock)
> -               device_lock(parent);
> +       __device_driver_lock(dev, parent);
>
> -       device_lock(dev);
>         if (!drv || drv == dev->driver)
>                 __device_release_driver(dev, parent);
>
> -       device_unlock(dev);
> -       if (parent && dev->bus->need_parent_lock)
> -               device_unlock(parent);
> +       __device_driver_unlock(dev, parent);
>  }
>
>  /**
> @@ -1027,6 +1070,18 @@ void device_release_driver(struct device *dev)
>  }
>  EXPORT_SYMBOL_GPL(device_release_driver);
>
> +/**
> + * device_driver_detach - detach driver from a specific device
> + * @dev: device to detach driver from
> + *
> + * Detach driver from device. Will acquire both @dev lock and @dev->parent
> + * lock if needed.
> + */
> +void device_driver_detach(struct device *dev)
> +{
> +       device_release_driver_internal(dev, NULL, dev->parent);
> +}
> +
>  /**
>   * driver_detach - detach driver from all devices it controls.
>   * @drv: driver.
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [driver-core PATCH v9 2/9] device core: Consolidate locking and unlocking of parent and device
  2018-12-14 10:40   ` Rafael J. Wysocki
@ 2018-12-17 16:31     ` Alexander Duyck
  0 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2018-12-17 16:31 UTC (permalink / raw)
  To: Rafael J. Wysocki, Greg Kroah-Hartman
  Cc: Linux Kernel Mailing List, Luis R. Rodriguez, linux-nvdimm,
	Tejun Heo, Andrew Morton, Linux PM, Lai Jiangshan, Len Brown,
	Pavel Machek, zwisler, Dan Williams, dave.jiang, bvanassche

On Fri, 2018-12-14 at 11:40 +0100, Rafael J. Wysocki wrote:
>  somOn Thu, Dec 13, 2018 at 1:45 AM Alexander Duyck
> <alexander.h.duyck@linux.intel.com> wrote:
> > 
> > Try to consolidate all of the locking and unlocking of both the parent and
> > device when attaching or removing a driver from a given device.
> > 
> > To do that I first consolidated the lock pattern into two functions
> > __device_driver_lock and __device_driver_unlock. After doing that I then
> > created functions specific to attaching and detaching the driver while
> > acquiring these locks. By doing this I was able to reduce the number of
> > spots where we touch need_parent_lock from 12 down to 4.
> > 
> > This patch should produce no functional changes, it is meant to be a code
> > clean-up/consolidation only.
> > 
> > Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
> > Reviewed-by: Bart Van Assche <bvanassche@acm.org>
> > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> > ---
> >  drivers/base/base.h |    2 +
> >  drivers/base/bus.c  |   23 ++----------
> >  drivers/base/dd.c   |   95 ++++++++++++++++++++++++++++++++++++++++-----------
> >  3 files changed, 81 insertions(+), 39 deletions(-)
> > 
> > diff --git a/drivers/base/base.h b/drivers/base/base.h
> > index 7a419a7a6235..3f22ebd6117a 100644
> > --- a/drivers/base/base.h
> > +++ b/drivers/base/base.h
> > @@ -124,6 +124,8 @@ extern int driver_add_groups(struct device_driver *drv,
> >                              const struct attribute_group **groups);
> >  extern void driver_remove_groups(struct device_driver *drv,
> >                                  const struct attribute_group **groups);
> > +int device_driver_attach(struct device_driver *drv, struct device *dev);
> > +void device_driver_detach(struct device *dev);
> > 
> >  extern char *make_class_name(const char *name, struct kobject *kobj);
> > 
> > diff --git a/drivers/base/bus.c b/drivers/base/bus.c
> > index b886b15cb53b..74054481007d 100644
> > --- a/drivers/base/bus.c
> > +++ b/drivers/base/bus.c
> > @@ -184,11 +184,7 @@ static ssize_t unbind_store(struct device_driver *drv, const char *buf,
> > 
> >         dev = bus_find_device_by_name(bus, NULL, buf);
> >         if (dev && dev->driver == drv) {
> > -               if (dev->parent && dev->bus->need_parent_lock)
> > -                       device_lock(dev->parent);
> > -               device_release_driver(dev);
> > -               if (dev->parent && dev->bus->need_parent_lock)
> > -                       device_unlock(dev->parent);
> > +               device_driver_detach(dev);
> >                 err = count;
> >         }
> >         put_device(dev);
> > @@ -211,13 +207,7 @@ static ssize_t bind_store(struct device_driver *drv, const char *buf,
> > 
> >         dev = bus_find_device_by_name(bus, NULL, buf);
> >         if (dev && dev->driver == NULL && driver_match_device(drv, dev)) {
> > -               if (dev->parent && bus->need_parent_lock)
> > -                       device_lock(dev->parent);
> > -               device_lock(dev);
> > -               err = driver_probe_device(drv, dev);
> > -               device_unlock(dev);
> > -               if (dev->parent && bus->need_parent_lock)
> > -                       device_unlock(dev->parent);
> > +               err = device_driver_attach(drv, dev);
> > 
> >                 if (err > 0) {
> >                         /* success */
> > @@ -771,13 +761,8 @@ EXPORT_SYMBOL_GPL(bus_rescan_devices);
> >   */
> >  int device_reprobe(struct device *dev)
> >  {
> > -       if (dev->driver) {
> > -               if (dev->parent && dev->bus->need_parent_lock)
> > -                       device_lock(dev->parent);
> > -               device_release_driver(dev);
> > -               if (dev->parent && dev->bus->need_parent_lock)
> > -                       device_unlock(dev->parent);
> > -       }
> > +       if (dev->driver)
> > +               device_driver_detach(dev);
> >         return bus_rescan_devices_helper(dev, NULL);
> >  }
> >  EXPORT_SYMBOL_GPL(device_reprobe);
> > diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > index 74c194ac99df..f07c16277ed9 100644
> > --- a/drivers/base/dd.c
> > +++ b/drivers/base/dd.c
> > @@ -867,6 +867,64 @@ void device_initial_probe(struct device *dev)
> >         __device_attach(dev, true);
> >  }
> > 
> > +/*
> > + * __device_driver_lock - acquire locks needed to manipulate dev->drv
> > + * @dev: Device we will update driver info for
> > + * @parent: Parent device. Needed if the bus requires parent lock
> > + *
> > + * This function will take the required locks for manipulating dev->drv.
> > + * Normally this will just be the @dev lock, but when called for a USB
> > + * interface, @parent lock will be held as well.
> > + */
> > +static void __device_driver_lock(struct device *dev, struct device *parent)
> > +{
> > +       if (parent && dev->bus->need_parent_lock)
> > +               device_lock(parent);
> > +       device_lock(dev);
> > +}
> > +
> > +/*
> > + * __device_driver_unlock - release locks needed to manipulate dev->drv
> > + * @dev: Device we will update driver info for
> > + * @parent: Parent device. Needed if the bus requires parent lock
> > + *
> > + * This function will release the required locks for manipulating dev->drv.
> > + * Normally this will just be the the @dev lock, but when called for a
> > + * USB interface, @parent lock will be released as well.
> > + */
> > +static void __device_driver_unlock(struct device *dev, struct device *parent)
> > +{
> > +       device_unlock(dev);
> > +       if (parent && dev->bus->need_parent_lock)
> > +               device_unlock(parent);
> > +}
> > +
> > +/**
> > + * device_driver_attach - attach a specific driver to a specific device
> > + * @drv: Driver to attach
> > + * @dev: Device to attach it to
> > + *
> > + * Manually attach driver to a device. Will acquire both @dev lock and
> > + * @dev->parent lock if needed.
> > + */
> > +int device_driver_attach(struct device_driver *drv, struct device *dev)
> > +{
> > +       int ret = 0;
> > +
> > +       __device_driver_lock(dev, dev->parent);
> > +
> > +       /*
> > +        * If device has been removed or someone has already successfully
> > +        * bound a driver before us just skip the driver probe call.
> > +        */
> > +       if (!dev->dead && !dev->driver)
> > +               ret = driver_probe_device(drv, dev);
> > +
> > +       __device_driver_unlock(dev, dev->parent);
> > +
> > +       return ret;
> > +}
> > +
> >  static int __driver_attach(struct device *dev, void *data)
> >  {
> >         struct device_driver *drv = data;
> > @@ -894,14 +952,7 @@ static int __driver_attach(struct device *dev, void *data)
> >                 return ret;
> >         } /* ret > 0 means positive match */
> > 
> > -       if (dev->parent && dev->bus->need_parent_lock)
> > -               device_lock(dev->parent);
> > -       device_lock(dev);
> > -       if (!dev->dead && !dev->driver)
> > -               driver_probe_device(drv, dev);
> > -       device_unlock(dev);
> > -       if (dev->parent && dev->bus->need_parent_lock)
> > -               device_unlock(dev->parent);
> > +       device_driver_attach(drv, dev);
> > 
> >         return 0;
> >  }
> > @@ -932,15 +983,11 @@ static void __device_release_driver(struct device *dev, struct device *parent)
> >         drv = dev->driver;
> >         if (drv) {
> >                 while (device_links_busy(dev)) {
> > -                       device_unlock(dev);
> > -                       if (parent)
> > -                               device_unlock(parent);
> > +                       __device_driver_unlock(dev, parent);
> > 
> >                         device_links_unbind_consumers(dev);
> > -                       if (parent)
> > -                               device_lock(parent);
> > 
> > -                       device_lock(dev);
> > +                       __device_driver_lock(dev, parent);
> >                         /*
> >                          * A concurrent invocation of the same function might
> >                          * have released the driver successfully while this one
> 
> This change will clash with
> https://patchwork.kernel.org/patch/10729571/ somewhat which really
> should go it first as a stable-candidate fix.
> 
> I would suggest rebasing on top of that one, unless Greg has different
> plans here.

Greg, any opinion here? From what I can tell the patch pointed out and
this one both address the same issue. The only real difference is that
the patch called out would be a better stable-candidate fix.

I'm just wondering if the patch set is good as is or if I need to be
submitting a v10 w/ rebase on top of this patch.

Thanks.

- Alex


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [driver-core PATCH v9 1/9] driver core: Establish order of operations for device_add and device_del via bitflag
  2018-12-13  0:44 ` [driver-core PATCH v9 1/9] driver core: Establish order of operations for device_add and device_del via bitflag Alexander Duyck
@ 2018-12-19 14:27   ` Rafael J. Wysocki
  2018-12-20 15:28     ` Greg Kroah-Hartman
  2019-01-18 15:54   ` Greg KH
  1 sibling, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2018-12-19 14:27 UTC (permalink / raw)
  To: alexander.h.duyck
  Cc: Linux Kernel Mailing List, Greg Kroah-Hartman, Luis R. Rodriguez,
	linux-nvdimm, Tejun Heo, Andrew Morton, Linux PM, Lai Jiangshan,
	Rafael J. Wysocki, Len Brown, Pavel Machek, zwisler,
	Dan Williams, dave.jiang, bvanassche

On Thu, Dec 13, 2018 at 1:45 AM Alexander Duyck
<alexander.h.duyck@linux.intel.com> wrote:
>
> Add an additional bit flag to the device struct named "dead".
>
> This additional flag provides a guarantee that when a device_del is
> executed on a given interface an async worker will not attempt to attach
> the driver following the earlier device_del call. Previously this
> guarantee was not present and could result in the device_del call
> attempting to remove a driver from an interface only to have the async
> worker attempt to probe the driver later when it finally completes the
> asynchronous probe call.
>
> One additional change added was that I pulled the check for dev->driver
> out of the __device_attach_driver call and instead placed it in the
> __device_attach_async_helper call. This was motivated by the fact that the
> only other caller of this, __device_attach, had already taken the
> device_lock() and checked for dev->driver. Instead of testing for this
> twice in this path it makes more sense to just consolidate the dev->dead
> and dev->driver checks together into one set of checks.
>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>

Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> ---
>  drivers/base/core.c    |   11 +++++++++++
>  drivers/base/dd.c      |   22 +++++++++++-----------
>  include/linux/device.h |    5 +++++
>  3 files changed, 27 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 0073b09bb99f..950e25495726 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -2080,6 +2080,17 @@ void device_del(struct device *dev)
>         struct kobject *glue_dir = NULL;
>         struct class_interface *class_intf;
>
> +       /*
> +        * Hold the device lock and set the "dead" flag to guarantee that
> +        * the update behavior is consistent with the other bitfields near
> +        * it and that we cannot have an asynchronous probe routine trying
> +        * to run while we are tearing out the bus/class/sysfs from
> +        * underneath the device.
> +        */
> +       device_lock(dev);
> +       dev->dead = true;
> +       device_unlock(dev);
> +
>         /* Notify clients of device removal.  This call must come
>          * before dpm_sysfs_remove().
>          */
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 88713f182086..74c194ac99df 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -731,15 +731,6 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
>         bool async_allowed;
>         int ret;
>
> -       /*
> -        * Check if device has already been claimed. This may
> -        * happen with driver loading, device discovery/registration,
> -        * and deferred probe processing happens all at once with
> -        * multiple threads.
> -        */
> -       if (dev->driver)
> -               return -EBUSY;
> -
>         ret = driver_match_device(drv, dev);
>         if (ret == 0) {
>                 /* no match */
> @@ -774,6 +765,15 @@ static void __device_attach_async_helper(void *_dev, async_cookie_t cookie)
>
>         device_lock(dev);
>
> +       /*
> +        * Check if device has already been removed or claimed. This may
> +        * happen with driver loading, device discovery/registration,
> +        * and deferred probe processing happens all at once with
> +        * multiple threads.
> +        */
> +       if (dev->dead || dev->driver)
> +               goto out_unlock;
> +
>         if (dev->parent)
>                 pm_runtime_get_sync(dev->parent);
>
> @@ -784,7 +784,7 @@ static void __device_attach_async_helper(void *_dev, async_cookie_t cookie)
>
>         if (dev->parent)
>                 pm_runtime_put(dev->parent);
> -
> +out_unlock:
>         device_unlock(dev);
>
>         put_device(dev);
> @@ -897,7 +897,7 @@ static int __driver_attach(struct device *dev, void *data)
>         if (dev->parent && dev->bus->need_parent_lock)
>                 device_lock(dev->parent);
>         device_lock(dev);
> -       if (!dev->driver)
> +       if (!dev->dead && !dev->driver)
>                 driver_probe_device(drv, dev);
>         device_unlock(dev);
>         if (dev->parent && dev->bus->need_parent_lock)
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 1b25c7a43f4c..f73dad81e811 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -957,6 +957,10 @@ struct dev_links_info {
>   *              device.
>   * @dma_coherent: this particular device is dma coherent, even if the
>   *             architecture supports non-coherent devices.
> + * @dead:      This device is currently either in the process of or has
> + *             been removed from the system. Any asynchronous events
> + *             scheduled for this device should exit without taking any
> + *             action.
>   *
>   * At the lowest level, every device in a Linux system is represented by an
>   * instance of struct device. The device structure contains the information
> @@ -1051,6 +1055,7 @@ struct device {
>      defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
>         bool                    dma_coherent:1;
>  #endif
> +       bool                    dead:1;
>  };
>
>  static inline struct device *kobj_to_dev(struct kobject *kobj)
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [driver-core PATCH v9 1/9] driver core: Establish order of operations for device_add and device_del via bitflag
  2018-12-19 14:27   ` Rafael J. Wysocki
@ 2018-12-20 15:28     ` Greg Kroah-Hartman
  2019-01-10 17:37       ` Alexander Duyck
  0 siblings, 1 reply; 18+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-20 15:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: alexander.h.duyck, Linux Kernel Mailing List, Luis R. Rodriguez,
	linux-nvdimm, Tejun Heo, Andrew Morton, Linux PM, Lai Jiangshan,
	Len Brown, Pavel Machek, zwisler, Dan Williams, dave.jiang,
	bvanassche

On Wed, Dec 19, 2018 at 03:27:48PM +0100, Rafael J. Wysocki wrote:
> On Thu, Dec 13, 2018 at 1:45 AM Alexander Duyck
> <alexander.h.duyck@linux.intel.com> wrote:
> >
> > Add an additional bit flag to the device struct named "dead".
> >
> > This additional flag provides a guarantee that when a device_del is
> > executed on a given interface an async worker will not attempt to attach
> > the driver following the earlier device_del call. Previously this
> > guarantee was not present and could result in the device_del call
> > attempting to remove a driver from an interface only to have the async
> > worker attempt to probe the driver later when it finally completes the
> > asynchronous probe call.
> >
> > One additional change added was that I pulled the check for dev->driver
> > out of the __device_attach_driver call and instead placed it in the
> > __device_attach_async_helper call. This was motivated by the fact that the
> > only other caller of this, __device_attach, had already taken the
> > device_lock() and checked for dev->driver. Instead of testing for this
> > twice in this path it makes more sense to just consolidate the dev->dead
> > and dev->driver checks together into one set of checks.
> >
> > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> 
> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

It's too late for 4.21-rc1 as my tree should be closed by now.

So I'll hold on to these in my queue until 4.21-rc1 is out and then
queue them up and see what breaks in linux-next :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [driver-core PATCH v9 1/9] driver core: Establish order of operations for device_add and device_del via bitflag
  2018-12-20 15:28     ` Greg Kroah-Hartman
@ 2019-01-10 17:37       ` Alexander Duyck
  2019-01-18 15:50         ` Greg Kroah-Hartman
  0 siblings, 1 reply; 18+ messages in thread
From: Alexander Duyck @ 2019-01-10 17:37 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Luis R. Rodriguez, linux-nvdimm,
	Tejun Heo, Andrew Morton, Linux PM, Lai Jiangshan, Len Brown,
	Pavel Machek, zwisler, Dan Williams, dave.jiang, bvanassche

On Thu, 2018-12-20 at 16:28 +0100, Greg Kroah-Hartman wrote:
> On Wed, Dec 19, 2018 at 03:27:48PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Dec 13, 2018 at 1:45 AM Alexander Duyck
> > <alexander.h.duyck@linux.intel.com> wrote:
> > > 
> > > Add an additional bit flag to the device struct named "dead".
> > > 
> > > This additional flag provides a guarantee that when a device_del is
> > > executed on a given interface an async worker will not attempt to attach
> > > the driver following the earlier device_del call. Previously this
> > > guarantee was not present and could result in the device_del call
> > > attempting to remove a driver from an interface only to have the async
> > > worker attempt to probe the driver later when it finally completes the
> > > asynchronous probe call.
> > > 
> > > One additional change added was that I pulled the check for dev->driver
> > > out of the __device_attach_driver call and instead placed it in the
> > > __device_attach_async_helper call. This was motivated by the fact that the
> > > only other caller of this, __device_attach, had already taken the
> > > device_lock() and checked for dev->driver. Instead of testing for this
> > > twice in this path it makes more sense to just consolidate the dev->dead
> > > and dev->driver checks together into one set of checks.
> > > 
> > > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> > > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> > 
> > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> It's too late for 4.21-rc1 as my tree should be closed by now.
> 
> So I'll hold on to these in my queue until 4.21-rc1 is out and then
> queue them up and see what breaks in linux-next :)
> 
> thanks,
> 
> greg k-h

I just wanted to check on on this patch set in terms of workflow. Since
it looks like we now have 5.0-rc1 out I was wondering what the ETA for
this patch set being pulled was, or if I need to resubmit the set.

Thanks.

- Alex


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [driver-core PATCH v9 1/9] driver core: Establish order of operations for device_add and device_del via bitflag
  2019-01-10 17:37       ` Alexander Duyck
@ 2019-01-18 15:50         ` Greg Kroah-Hartman
  0 siblings, 0 replies; 18+ messages in thread
From: Greg Kroah-Hartman @ 2019-01-18 15:50 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List, Luis R. Rodriguez,
	linux-nvdimm, Tejun Heo, Andrew Morton, Linux PM, Lai Jiangshan,
	Len Brown, Pavel Machek, zwisler, Dan Williams, dave.jiang,
	bvanassche

On Thu, Jan 10, 2019 at 09:37:32AM -0800, Alexander Duyck wrote:
> On Thu, 2018-12-20 at 16:28 +0100, Greg Kroah-Hartman wrote:
> > On Wed, Dec 19, 2018 at 03:27:48PM +0100, Rafael J. Wysocki wrote:
> > > On Thu, Dec 13, 2018 at 1:45 AM Alexander Duyck
> > > <alexander.h.duyck@linux.intel.com> wrote:
> > > > 
> > > > Add an additional bit flag to the device struct named "dead".
> > > > 
> > > > This additional flag provides a guarantee that when a device_del is
> > > > executed on a given interface an async worker will not attempt to attach
> > > > the driver following the earlier device_del call. Previously this
> > > > guarantee was not present and could result in the device_del call
> > > > attempting to remove a driver from an interface only to have the async
> > > > worker attempt to probe the driver later when it finally completes the
> > > > asynchronous probe call.
> > > > 
> > > > One additional change added was that I pulled the check for dev->driver
> > > > out of the __device_attach_driver call and instead placed it in the
> > > > __device_attach_async_helper call. This was motivated by the fact that the
> > > > only other caller of this, __device_attach, had already taken the
> > > > device_lock() and checked for dev->driver. Instead of testing for this
> > > > twice in this path it makes more sense to just consolidate the dev->dead
> > > > and dev->driver checks together into one set of checks.
> > > > 
> > > > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> > > > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> > > 
> > > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > It's too late for 4.21-rc1 as my tree should be closed by now.
> > 
> > So I'll hold on to these in my queue until 4.21-rc1 is out and then
> > queue them up and see what breaks in linux-next :)
> > 
> > thanks,
> > 
> > greg k-h
> 
> I just wanted to check on on this patch set in terms of workflow. Since
> it looks like we now have 5.0-rc1 out I was wondering what the ETA for
> this patch set being pulled was, or if I need to resubmit the set.

I'm reviewing it now, no need to resend...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [driver-core PATCH v9 1/9] driver core: Establish order of operations for device_add and device_del via bitflag
  2018-12-13  0:44 ` [driver-core PATCH v9 1/9] driver core: Establish order of operations for device_add and device_del via bitflag Alexander Duyck
  2018-12-19 14:27   ` Rafael J. Wysocki
@ 2019-01-18 15:54   ` Greg KH
  2019-01-18 19:00     ` Alexander Duyck
  1 sibling, 1 reply; 18+ messages in thread
From: Greg KH @ 2019-01-18 15:54 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: linux-kernel, mcgrof, linux-nvdimm, tj, akpm, linux-pm,
	jiangshanlai, rafael, len.brown, pavel, zwisler, dan.j.williams,
	dave.jiang, bvanassche

On Wed, Dec 12, 2018 at 04:44:58PM -0800, Alexander Duyck wrote:
> Add an additional bit flag to the device struct named "dead".
> 
> This additional flag provides a guarantee that when a device_del is
> executed on a given interface an async worker will not attempt to attach
> the driver following the earlier device_del call. Previously this
> guarantee was not present and could result in the device_del call
> attempting to remove a driver from an interface only to have the async
> worker attempt to probe the driver later when it finally completes the
> asynchronous probe call.
> 
> One additional change added was that I pulled the check for dev->driver
> out of the __device_attach_driver call and instead placed it in the
> __device_attach_async_helper call. This was motivated by the fact that the
> only other caller of this, __device_attach, had already taken the
> device_lock() and checked for dev->driver. Instead of testing for this
> twice in this path it makes more sense to just consolidate the dev->dead
> and dev->driver checks together into one set of checks.
> 
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/base/core.c    |   11 +++++++++++
>  drivers/base/dd.c      |   22 +++++++++++-----------
>  include/linux/device.h |    5 +++++
>  3 files changed, 27 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 0073b09bb99f..950e25495726 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -2080,6 +2080,17 @@ void device_del(struct device *dev)
>  	struct kobject *glue_dir = NULL;
>  	struct class_interface *class_intf;
>  
> +	/*
> +	 * Hold the device lock and set the "dead" flag to guarantee that
> +	 * the update behavior is consistent with the other bitfields near
> +	 * it and that we cannot have an asynchronous probe routine trying
> +	 * to run while we are tearing out the bus/class/sysfs from
> +	 * underneath the device.
> +	 */
> +	device_lock(dev);
> +	dev->dead = true;
> +	device_unlock(dev);
> +
>  	/* Notify clients of device removal.  This call must come
>  	 * before dpm_sysfs_remove().
>  	 */
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 88713f182086..74c194ac99df 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -731,15 +731,6 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
>  	bool async_allowed;
>  	int ret;
>  
> -	/*
> -	 * Check if device has already been claimed. This may
> -	 * happen with driver loading, device discovery/registration,
> -	 * and deferred probe processing happens all at once with
> -	 * multiple threads.
> -	 */
> -	if (dev->driver)
> -		return -EBUSY;
> -
>  	ret = driver_match_device(drv, dev);
>  	if (ret == 0) {
>  		/* no match */
> @@ -774,6 +765,15 @@ static void __device_attach_async_helper(void *_dev, async_cookie_t cookie)
>  
>  	device_lock(dev);
>  
> +	/*
> +	 * Check if device has already been removed or claimed. This may
> +	 * happen with driver loading, device discovery/registration,
> +	 * and deferred probe processing happens all at once with
> +	 * multiple threads.
> +	 */
> +	if (dev->dead || dev->driver)
> +		goto out_unlock;
> +
>  	if (dev->parent)
>  		pm_runtime_get_sync(dev->parent);
>  
> @@ -784,7 +784,7 @@ static void __device_attach_async_helper(void *_dev, async_cookie_t cookie)
>  
>  	if (dev->parent)
>  		pm_runtime_put(dev->parent);
> -
> +out_unlock:
>  	device_unlock(dev);
>  
>  	put_device(dev);
> @@ -897,7 +897,7 @@ static int __driver_attach(struct device *dev, void *data)
>  	if (dev->parent && dev->bus->need_parent_lock)
>  		device_lock(dev->parent);
>  	device_lock(dev);
> -	if (!dev->driver)
> +	if (!dev->dead && !dev->driver)
>  		driver_probe_device(drv, dev);
>  	device_unlock(dev);
>  	if (dev->parent && dev->bus->need_parent_lock)
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 1b25c7a43f4c..f73dad81e811 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -957,6 +957,10 @@ struct dev_links_info {
>   *              device.
>   * @dma_coherent: this particular device is dma coherent, even if the
>   *		architecture supports non-coherent devices.
> + * @dead:	This device is currently either in the process of or has
> + *		been removed from the system. Any asynchronous events
> + *		scheduled for this device should exit without taking any
> + *		action.
>   *
>   * At the lowest level, every device in a Linux system is represented by an
>   * instance of struct device. The device structure contains the information
> @@ -1051,6 +1055,7 @@ struct device {
>      defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
>  	bool			dma_coherent:1;
>  #endif
> +	bool			dead:1;

This really should live in the struct device_private structure, as
nothing outside of the driver core should care about this, or touch it.

A number of other bitfields should also move there as well, your's isn't
the only one that I missed this for.

So can you make that quick change, and rebase (you needed to for patch 2
anyway), and resend so I can get this into my -next tree for people to
start testing and basing their work on?

sorry this has taken so long, and thanks for sticking with it.

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [driver-core PATCH v9 1/9] driver core: Establish order of operations for device_add and device_del via bitflag
  2019-01-18 15:54   ` Greg KH
@ 2019-01-18 19:00     ` Alexander Duyck
  0 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2019-01-18 19:00 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-kernel, mcgrof, linux-nvdimm, tj, akpm, linux-pm,
	jiangshanlai, rafael, len.brown, pavel, zwisler, dan.j.williams,
	dave.jiang, bvanassche

On Fri, 2019-01-18 at 16:54 +0100, Greg KH wrote:
> On Wed, Dec 12, 2018 at 04:44:58PM -0800, Alexander Duyck wrote:
> > Add an additional bit flag to the device struct named "dead".
> > 
> > This additional flag provides a guarantee that when a device_del is
> > executed on a given interface an async worker will not attempt to attach
> > the driver following the earlier device_del call. Previously this
> > guarantee was not present and could result in the device_del call
> > attempting to remove a driver from an interface only to have the async
> > worker attempt to probe the driver later when it finally completes the
> > asynchronous probe call.
> > 
> > One additional change added was that I pulled the check for dev->driver
> > out of the __device_attach_driver call and instead placed it in the
> > __device_attach_async_helper call. This was motivated by the fact that the
> > only other caller of this, __device_attach, had already taken the
> > device_lock() and checked for dev->driver. Instead of testing for this
> > twice in this path it makes more sense to just consolidate the dev->dead
> > and dev->driver checks together into one set of checks.
> > 
> > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/base/core.c    |   11 +++++++++++
> >  drivers/base/dd.c      |   22 +++++++++++-----------
> >  include/linux/device.h |    5 +++++
> >  3 files changed, 27 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index 0073b09bb99f..950e25495726 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -2080,6 +2080,17 @@ void device_del(struct device *dev)
> >  	struct kobject *glue_dir = NULL;
> >  	struct class_interface *class_intf;
> >  
> > +	/*
> > +	 * Hold the device lock and set the "dead" flag to guarantee that
> > +	 * the update behavior is consistent with the other bitfields near
> > +	 * it and that we cannot have an asynchronous probe routine trying
> > +	 * to run while we are tearing out the bus/class/sysfs from
> > +	 * underneath the device.
> > +	 */
> > +	device_lock(dev);
> > +	dev->dead = true;
> > +	device_unlock(dev);
> > +
> >  	/* Notify clients of device removal.  This call must come
> >  	 * before dpm_sysfs_remove().
> >  	 */
> > diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > index 88713f182086..74c194ac99df 100644
> > --- a/drivers/base/dd.c
> > +++ b/drivers/base/dd.c
> > @@ -731,15 +731,6 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
> >  	bool async_allowed;
> >  	int ret;
> >  
> > -	/*
> > -	 * Check if device has already been claimed. This may
> > -	 * happen with driver loading, device discovery/registration,
> > -	 * and deferred probe processing happens all at once with
> > -	 * multiple threads.
> > -	 */
> > -	if (dev->driver)
> > -		return -EBUSY;
> > -
> >  	ret = driver_match_device(drv, dev);
> >  	if (ret == 0) {
> >  		/* no match */
> > @@ -774,6 +765,15 @@ static void __device_attach_async_helper(void *_dev, async_cookie_t cookie)
> >  
> >  	device_lock(dev);
> >  
> > +	/*
> > +	 * Check if device has already been removed or claimed. This may
> > +	 * happen with driver loading, device discovery/registration,
> > +	 * and deferred probe processing happens all at once with
> > +	 * multiple threads.
> > +	 */
> > +	if (dev->dead || dev->driver)
> > +		goto out_unlock;
> > +
> >  	if (dev->parent)
> >  		pm_runtime_get_sync(dev->parent);
> >  
> > @@ -784,7 +784,7 @@ static void __device_attach_async_helper(void *_dev, async_cookie_t cookie)
> >  
> >  	if (dev->parent)
> >  		pm_runtime_put(dev->parent);
> > -
> > +out_unlock:
> >  	device_unlock(dev);
> >  
> >  	put_device(dev);
> > @@ -897,7 +897,7 @@ static int __driver_attach(struct device *dev, void *data)
> >  	if (dev->parent && dev->bus->need_parent_lock)
> >  		device_lock(dev->parent);
> >  	device_lock(dev);
> > -	if (!dev->driver)
> > +	if (!dev->dead && !dev->driver)
> >  		driver_probe_device(drv, dev);
> >  	device_unlock(dev);
> >  	if (dev->parent && dev->bus->need_parent_lock)
> > diff --git a/include/linux/device.h b/include/linux/device.h
> > index 1b25c7a43f4c..f73dad81e811 100644
> > --- a/include/linux/device.h
> > +++ b/include/linux/device.h
> > @@ -957,6 +957,10 @@ struct dev_links_info {
> >   *              device.
> >   * @dma_coherent: this particular device is dma coherent, even if the
> >   *		architecture supports non-coherent devices.
> > + * @dead:	This device is currently either in the process of or has
> > + *		been removed from the system. Any asynchronous events
> > + *		scheduled for this device should exit without taking any
> > + *		action.
> >   *
> >   * At the lowest level, every device in a Linux system is represented by an
> >   * instance of struct device. The device structure contains the information
> > @@ -1051,6 +1055,7 @@ struct device {
> >      defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
> >  	bool			dma_coherent:1;
> >  #endif
> > +	bool			dead:1;
> 
> This really should live in the struct device_private structure, as
> nothing outside of the driver core should care about this, or touch it.
> 
> A number of other bitfields should also move there as well, your's isn't
> the only one that I missed this for.
> 
> So can you make that quick change, and rebase (you needed to for patch 2
> anyway), and resend so I can get this into my -next tree for people to
> start testing and basing their work on?
> 
> sorry this has taken so long, and thanks for sticking with it.
> 
> greg k-h

Okay. I will try to work it into my schedule and hopefully have the
updated patches ready sometime early next week.

Thanks.

- Alex


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2019-01-18 19:00 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-13  0:44 [driver-core PATCH v9 0/9] Add NUMA aware async_schedule calls Alexander Duyck
2018-12-13  0:44 ` [driver-core PATCH v9 1/9] driver core: Establish order of operations for device_add and device_del via bitflag Alexander Duyck
2018-12-19 14:27   ` Rafael J. Wysocki
2018-12-20 15:28     ` Greg Kroah-Hartman
2019-01-10 17:37       ` Alexander Duyck
2019-01-18 15:50         ` Greg Kroah-Hartman
2019-01-18 15:54   ` Greg KH
2019-01-18 19:00     ` Alexander Duyck
2018-12-13  0:45 ` [driver-core PATCH v9 2/9] device core: Consolidate locking and unlocking of parent and device Alexander Duyck
2018-12-14 10:40   ` Rafael J. Wysocki
2018-12-17 16:31     ` Alexander Duyck
2018-12-13  0:45 ` [driver-core PATCH v9 3/9] driver core: Probe devices asynchronously instead of the driver Alexander Duyck
2018-12-13  0:45 ` [driver-core PATCH v9 4/9] workqueue: Provide queue_work_node to queue work near a given NUMA node Alexander Duyck
2018-12-13  0:45 ` [driver-core PATCH v9 5/9] async: Add support for queueing on specific " Alexander Duyck
2018-12-13  0:45 ` [driver-core PATCH v9 6/9] driver core: Attach devices on CPU local to device node Alexander Duyck
2018-12-13  0:45 ` [driver-core PATCH v9 7/9] PM core: Use new async_schedule_dev command Alexander Duyck
2018-12-13  0:45 ` [driver-core PATCH v9 8/9] libnvdimm: Schedule device registration on node local to the device Alexander Duyck
2018-12-13  0:45 ` [driver-core PATCH v9 9/9] driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity Alexander Duyck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).