* Make NVME shutdown async - version 2
@ 2023-12-15  0:03 Jeremy Allison
  2023-12-15  0:03 ` [PATCH 1/3] driver core: Support asynchronous driver shutdown Jeremy Allison
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Jeremy Allison @ 2023-12-15  0:03 UTC (permalink / raw)
  To: jallison, jra, tansuresh, hch; +Cc: linux-nvme, gregkh, rafael, bhelgaas

This is version 2 of a rebased update and resend of a patchset
originally written by Tanjore Suresh <tansuresh@google.com> to
make shutdown of nvme devices asynchronous. Minor changes were
made to use an enum shutdown_type instead of an integer flag.

Changes from version 1:

As requested by Sagi Grimberg <sagi@grimberg.me>, ensure
that shutdown_pre is only called if shutdown_post is
also defined.

-------------------------------------------------------------
Currently the Linux nvme driver shutdown code steps
through each connected drive, sets the NVME_CC_SHN_NORMAL
(normal shutdown) flag and then polls the given drive
waiting for the response NVME_CSTS_SHST_CMPLT flag
(shutdown complete).

Each drive is taking around 13 seconds to respond to this.

The customer has 20+ drives on the box so this time adds
up on shutdown when the nvme driver is being shut down.

This patchset changes shutdown to proceed in parallel,
so the NVME_CC_SHN_NORMAL (normal shutdown) flag is
sent to all drives first, and then it polls waiting
for the NVME_CSTS_SHST_CMPLT flag (shutdown complete)
for all drives.

In the specific customer case it reduces the NVME
shutdown time from over 300 seconds to around 15
seconds.
-------------------------------------------------------------
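
The arithmetic behind this speed-up can be sketched with a small
user-space model. The function names and the latency values used with
them are illustrative assumptions, not code or measurements from the
patchset: a serial shutdown costs roughly the sum of the per-drive
latencies, while issuing every request first and polling afterwards
costs roughly the largest single latency.

```c
#include <stddef.h>

/*
 * Serial shutdown: each drive is asked to shut down and then waited
 * on in turn, so the total is the sum of the individual latencies.
 */
static unsigned int serial_shutdown_secs(const unsigned int *latency, size_t n)
{
	unsigned int total = 0;

	for (size_t i = 0; i < n; i++)
		total += latency[i];
	return total;
}

/*
 * Parallel shutdown: all drives are asked first and polled afterwards,
 * so the total is bounded by the slowest drive (issue overhead ignored).
 */
static unsigned int parallel_shutdown_secs(const unsigned int *latency, size_t n)
{
	unsigned int worst = 0;

	for (size_t i = 0; i < n; i++)
		if (latency[i] > worst)
			worst = latency[i];
	return worst;
}
```

With a hypothetical 24 drives at 13 seconds each, the serial model gives
312 seconds and the parallel model 13 seconds, which is the same order
as the figures reported above.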

Thanks for your consideration,

Jeremy Allison.
CIQ / Samba Team.




* [PATCH 1/3] driver core: Support asynchronous driver shutdown
  2023-12-15  0:03 Make NVME shutdown async - version 2 Jeremy Allison
@ 2023-12-15  0:03 ` Jeremy Allison
  2023-12-15 12:21   ` Greg KH
  2023-12-19  5:33   ` Christoph Hellwig
  2023-12-15  0:03 ` [PATCH 2/3] PCI: Support asynchronous shutdown Jeremy Allison
  2023-12-15  0:03 ` [PATCH 3/3] nvme: Add async shutdown support Jeremy Allison
  2 siblings, 2 replies; 20+ messages in thread
From: Jeremy Allison @ 2023-12-15  0:03 UTC (permalink / raw)
  To: jallison, jra, tansuresh, hch; +Cc: linux-nvme, gregkh, rafael, bhelgaas

From: Tanjore Suresh <tansuresh@google.com>

This changes the bus driver interface with additional entry points
to enable devices to implement asynchronous shutdown. The existing
synchronous interface to shutdown is unmodified and retained for
backward compatibility. shutdown_pre is only called if a matching
shutdown_post function is also registered; otherwise the
synchronous interface is used.

This changes the common device shutdown code to enable devices to
participate in asynchronous shutdown implementation.

Signed-off-by: Tanjore Suresh <tansuresh@google.com>
Signed-off-by: Jeremy Allison <jallison@ciq.com>
---
 drivers/base/core.c        | 41 +++++++++++++++++++++++++++++++++++++-
 include/linux/device/bus.h | 11 ++++++++++
 2 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 67ba592afc77..a842d402a088 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -4725,6 +4725,7 @@ EXPORT_SYMBOL_GPL(device_change_owner);
 void device_shutdown(void)
 {
 	struct device *dev, *parent;
+	LIST_HEAD(async_shutdown_list);
 
 	wait_for_device_probe();
 	device_block_probing();
@@ -4769,7 +4770,16 @@ void device_shutdown(void)
 				dev_info(dev, "shutdown_pre\n");
 			dev->class->shutdown_pre(dev);
 		}
-		if (dev->bus && dev->bus->shutdown) {
+
+		/* Only call shutdown_pre if a shutdown_post is also defined. */
+		if (dev->bus && dev->bus->shutdown_pre &&
+				dev->bus->shutdown_post) {
+			if (initcall_debug)
+				dev_info(dev, "shutdown_pre\n");
+			dev->bus->shutdown_pre(dev);
+			list_add(&dev->kobj.entry,
+				&async_shutdown_list);
+		} else if (dev->bus && dev->bus->shutdown) {
 			if (initcall_debug)
 				dev_info(dev, "shutdown\n");
 			dev->bus->shutdown(dev);
@@ -4789,6 +4799,35 @@ void device_shutdown(void)
 		spin_lock(&devices_kset->list_lock);
 	}
 	spin_unlock(&devices_kset->list_lock);
+
+	/*
+	 * Second pass: wait only for devices that have configured
+	 * asynchronous shutdown.
+	 */
+	while (!list_empty(&async_shutdown_list)) {
+		dev = list_entry(async_shutdown_list.next, struct device,
+				kobj.entry);
+		parent = get_device(dev->parent);
+		get_device(dev);
+		/*
+		 * Make sure the device is off the list
+		 */
+		list_del_init(&dev->kobj.entry);
+		if (parent)
+			device_lock(parent);
+		device_lock(dev);
+		if (dev->bus && dev->bus->shutdown_post) {
+			if (initcall_debug)
+				dev_info(dev,
+				"shutdown_post called\n");
+			dev->bus->shutdown_post(dev);
+		}
+		device_unlock(dev);
+		if (parent)
+			device_unlock(parent);
+		put_device(dev);
+		put_device(parent);
+	}
 }
 
 /*
diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
index ae10c4322754..d49dae1a280c 100644
--- a/include/linux/device/bus.h
+++ b/include/linux/device/bus.h
@@ -48,6 +48,15 @@ struct fwnode_handle;
  *		will never get called until they do.
  * @remove:	Called when a device removed from this bus.
  * @shutdown:	Called at shut-down time to quiesce the device.
+ * @shutdown_pre:	Called at shutdown time to start the shutdown
+ *			process on the device. This entry point will be called
+ *			only when the bus driver has indicated it would like
+ *			to participate in asynchronous shutdown completion
+ *			and has also defined a shutdown_post function.
+ * @shutdown_post:	Called at shutdown time to complete the shutdown
+ *			process of the device. This entry point will be called
+ *			only when the bus driver has indicated it would like to
+ *			participate in asynchronous shutdown completion.
  *
  * @online:	Called to put the device back online (after offlining it).
  * @offline:	Called to put the device offline for hot-removal. May fail.
@@ -90,6 +99,8 @@ struct bus_type {
 	void (*sync_state)(struct device *dev);
 	void (*remove)(struct device *dev);
 	void (*shutdown)(struct device *dev);
+	void (*shutdown_pre)(struct device *dev);
+	void (*shutdown_post)(struct device *dev);
 
 	int (*online)(struct device *dev);
 	int (*offline)(struct device *dev);
-- 
2.39.3
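
The two-pass flow that this patch adds to device_shutdown() can be
modelled in user space roughly as follows. This is a sketch under
stated assumptions, not the kernel code: the toy_* types, the trace
buffer and the array standing in for async_shutdown_list are all
hypothetical, and locking and reference counting are left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for the callbacks in struct bus_type. */
struct toy_bus {
	void (*shutdown)(int id);
	void (*shutdown_pre)(int id);
	void (*shutdown_post)(int id);
};

struct toy_dev {
	int id;
	const struct toy_bus *bus;
	int queued;	/* models membership of async_shutdown_list */
};

/* Trace of callback invocations, e.g. "p0" = shutdown_pre on device 0. */
static char toy_trace[64];
static size_t toy_trace_len;

static void toy_record(char tag, int id)
{
	if (toy_trace_len + 2 < sizeof(toy_trace)) {
		toy_trace[toy_trace_len++] = tag;
		toy_trace[toy_trace_len++] = (char)('0' + id);
	}
}

static void toy_sync(int id) { toy_record('s', id); }
static void toy_pre(int id)  { toy_record('p', id); }
static void toy_post(int id) { toy_record('w', id); }

static const struct toy_bus toy_sync_bus = { .shutdown = toy_sync };
static const struct toy_bus toy_async_bus = {
	.shutdown_pre = toy_pre,
	.shutdown_post = toy_post,
};

static void toy_device_shutdown(struct toy_dev *devs, size_t n)
{
	/* First pass: start async shutdowns, run synchronous ones in place. */
	for (size_t i = 0; i < n; i++) {
		const struct toy_bus *bus = devs[i].bus;

		if (bus && bus->shutdown_pre && bus->shutdown_post) {
			bus->shutdown_pre(devs[i].id);
			devs[i].queued = 1;
		} else if (bus && bus->shutdown) {
			bus->shutdown(devs[i].id);
		}
	}

	/*
	 * Second pass: complete only the queued devices, most recently
	 * queued first, mirroring list_add() at the head of the list
	 * followed by reading async_shutdown_list.next.
	 */
	for (size_t i = n; i-- > 0; ) {
		if (!devs[i].queued)
			continue;
		devs[i].queued = 0;
		devs[i].bus->shutdown_post(devs[i].id);
	}
}
```

For three devices where devices 0 and 2 sit on the asynchronous bus and
device 1 on the synchronous one, the model records every shutdown_pre
and shutdown call before the first shutdown_post runs.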




* [PATCH 2/3] PCI: Support asynchronous shutdown
  2023-12-15  0:03 Make NVME shutdown async - version 2 Jeremy Allison
  2023-12-15  0:03 ` [PATCH 1/3] driver core: Support asynchronous driver shutdown Jeremy Allison
@ 2023-12-15  0:03 ` Jeremy Allison
  2023-12-15  0:03 ` [PATCH 3/3] nvme: Add async shutdown support Jeremy Allison
  2 siblings, 0 replies; 20+ messages in thread
From: Jeremy Allison @ 2023-12-15  0:03 UTC (permalink / raw)
  To: jallison, jra, tansuresh, hch; +Cc: linux-nvme, gregkh, rafael, bhelgaas

From: Tanjore Suresh <tansuresh@google.com>

Enhances the base PCI driver to add support for asynchronous
shutdown. Adds shutdown_pre/shutdown_post callbacks, which are
called in preference to shutdown only if both are defined.

Assume a device takes n secs to shut down. If a machine has been
populated with M such devices, the total time spent shutting down
all the devices will be M * n secs if the shutdown is done
synchronously. For example, if NVMe PCI controllers take 5 secs
to shut down and there are 16 such NVMe controllers in a system,
the system will spend a total of 80 secs shutting down all
NVMe devices in that system.

To speed up shutdown, an asynchronous interface to shutdown has
been implemented. This will significantly reduce the machine
reboot time.

Signed-off-by: Tanjore Suresh <tansuresh@google.com>
Signed-off-by: Jeremy Allison <jallison@ciq.com>
---
 drivers/pci/pci-driver.c | 18 +++++++++++++++---
 include/linux/pci.h      |  6 ++++++
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 51ec9e7e784f..865316e5236b 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -502,14 +502,17 @@ static void pci_device_remove(struct device *dev)
 	pci_dev_put(pci_dev);
 }
 
-static void pci_device_shutdown(struct device *dev)
+static void pci_device_shutdown_pre(struct device *dev)
 {
 	struct pci_dev *pci_dev = to_pci_dev(dev);
 	struct pci_driver *drv = pci_dev->driver;
 
 	pm_runtime_resume(dev);
 
-	if (drv && drv->shutdown)
+	/* Only call shutdown_pre if shutdown_post is also defined. */
+	if (drv && drv->shutdown_pre && drv->shutdown_post)
+		drv->shutdown_pre(pci_dev);
+	else if (drv && drv->shutdown)
 		drv->shutdown(pci_dev);
 
 	/*
@@ -547,6 +550,14 @@ static int pci_restore_standard_config(struct pci_dev *pci_dev)
 }
 #endif /* CONFIG_PM_SLEEP */
 
+static void pci_device_shutdown_post(struct device *dev)
+{
+	struct pci_dev *pci_dev = to_pci_dev(dev);
+	struct pci_driver *drv = pci_dev->driver;
+
+	if (drv && drv->shutdown_post)
+		drv->shutdown_post(pci_dev);
+}
 #ifdef CONFIG_PM
 
 /* Auxiliary functions used for system resume and run-time resume */
@@ -1681,7 +1692,8 @@ struct bus_type pci_bus_type = {
 	.uevent		= pci_uevent,
 	.probe		= pci_device_probe,
 	.remove		= pci_device_remove,
-	.shutdown	= pci_device_shutdown,
+	.shutdown_pre	= pci_device_shutdown_pre,
+	.shutdown_post	= pci_device_shutdown_post,
 	.dev_groups	= pci_dev_groups,
 	.bus_groups	= pci_bus_groups,
 	.drv_groups	= pci_drv_groups,
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 60ca768bc867..a25d1a3a3764 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -917,6 +917,10 @@ struct module;
  *		Useful for enabling wake-on-lan (NIC) or changing
  *		the power state of a device before reboot.
  *		e.g. drivers/net/e100.c.
+ * @shutdown_pre: Optional driver callback to allow asynchronous
+ *              shutdown request. Called instead of shutdown only
+ *              if shutdown_post is also defined.
+ * @shutdown_post: Matching driver callback to shutdown_pre.
  * @sriov_configure: Optional driver callback to allow configuration of
  *		number of VFs to enable via sysfs "sriov_numvfs" file.
  * @sriov_set_msix_vec_count: PF Driver callback to change number of MSI-X
@@ -948,6 +952,8 @@ struct pci_driver {
 	int  (*suspend)(struct pci_dev *dev, pm_message_t state);	/* Device suspended */
 	int  (*resume)(struct pci_dev *dev);	/* Device woken up */
 	void (*shutdown)(struct pci_dev *dev);
+	void (*shutdown_pre)(struct pci_dev *dev);
+	void (*shutdown_post)(struct pci_dev *dev);
 	int  (*sriov_configure)(struct pci_dev *dev, int num_vfs); /* On PF */
 	int  (*sriov_set_msix_vec_count)(struct pci_dev *vf, int msix_vec_count); /* On PF */
 	u32  (*sriov_get_vf_total_msix)(struct pci_dev *pf);
-- 
2.39.3
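
The callback selection in pci_device_shutdown_pre() can be isolated in
a few lines. The sketch below is a user-space model with hypothetical
toy_* names; it only illustrates the rule that shutdown_pre is used
when a matching shutdown_post exists, with the legacy shutdown callback
as the fallback.

```c
#include <stddef.h>

/* Hypothetical stand-in for the three callbacks in struct pci_driver. */
struct toy_pci_driver {
	void (*shutdown)(void *pdev);
	void (*shutdown_pre)(void *pdev);
	void (*shutdown_post)(void *pdev);
};

enum toy_path { TOY_PATH_NONE, TOY_PATH_SYNC, TOY_PATH_ASYNC };

/* Counts callback invocations so the chosen path can be observed. */
static int toy_calls;

static void toy_count(void *pdev)
{
	(void)pdev;
	toy_calls++;
}

/*
 * Prefer the asynchronous pair, but only when both halves exist;
 * otherwise fall back to the synchronous callback, if any.
 */
static enum toy_path toy_pci_shutdown_pre(const struct toy_pci_driver *drv,
					  void *pdev)
{
	if (drv && drv->shutdown_pre && drv->shutdown_post) {
		drv->shutdown_pre(pdev);
		return TOY_PATH_ASYNC;
	}
	if (drv && drv->shutdown) {
		drv->shutdown(pdev);
		return TOY_PATH_SYNC;
	}
	return TOY_PATH_NONE;
}
```

One consequence visible in the diff above: pci_bus_type registers both
bus-level callbacks unconditionally, so every PCI device is queued for
the second pass, and pci_device_shutdown_post() simply does nothing for
drivers that only provide shutdown.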




* [PATCH 3/3] nvme: Add async shutdown support
  2023-12-15  0:03 Make NVME shutdown async - version 2 Jeremy Allison
  2023-12-15  0:03 ` [PATCH 1/3] driver core: Support asynchronous driver shutdown Jeremy Allison
  2023-12-15  0:03 ` [PATCH 2/3] PCI: Support asynchronous shutdown Jeremy Allison
@ 2023-12-15  0:03 ` Jeremy Allison
  2023-12-19  5:43   ` Christoph Hellwig
  2 siblings, 1 reply; 20+ messages in thread
From: Jeremy Allison @ 2023-12-15  0:03 UTC (permalink / raw)
  To: jallison, jra, tansuresh, hch; +Cc: linux-nvme, gregkh, rafael, bhelgaas

From: Tanjore Suresh <tansuresh@google.com>

This uses the asynchronous shutdown mechanism set up for PCI
drivers, providing both pre and post shutdown routines at the
pci_driver structure level.

The shutdown_pre routine starts the shutdown and does not wait for the
shutdown to complete.  The shutdown_post routine waits for the shutdown
to complete on individual controllers that this driver instance
controls. This mechanism speeds up shutdown in a system which hosts
many controllers.

Signed-off-by: Tanjore Suresh <tansuresh@google.com>
Signed-off-by: Jeremy Allison <jallison@ciq.com>
---
 drivers/nvme/host/core.c   | 31 +++++++++++++++--
 drivers/nvme/host/nvme.h   |  9 ++++-
 drivers/nvme/host/pci.c    | 68 ++++++++++++++++++++++++--------------
 drivers/nvme/host/rdma.c   |  3 +-
 drivers/nvme/host/tcp.c    |  2 +-
 drivers/nvme/target/loop.c |  2 +-
 6 files changed, 84 insertions(+), 31 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 590cd4f097c2..45645af41586 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2201,12 +2201,12 @@ static int nvme_wait_ready(struct nvme_ctrl *ctrl, u32 mask, u32 val,
 	return ret;
 }
 
-int nvme_disable_ctrl(struct nvme_ctrl *ctrl, bool shutdown)
+int nvme_disable_ctrl(struct nvme_ctrl *ctrl, enum shutdown_type shutdown_type)
 {
 	int ret;
 
 	ctrl->ctrl_config &= ~NVME_CC_SHN_MASK;
-	if (shutdown)
+	if (shutdown_type != DO_NOT_SHUTDOWN)
 		ctrl->ctrl_config |= NVME_CC_SHN_NORMAL;
 	else
 		ctrl->ctrl_config &= ~NVME_CC_ENABLE;
@@ -2215,10 +2215,24 @@ int nvme_disable_ctrl(struct nvme_ctrl *ctrl, bool shutdown)
 	if (ret)
 		return ret;
 
-	if (shutdown) {
+	switch (shutdown_type) {
+	case SHUTDOWN_TYPE_ASYNC:
+		/*
+		 * nvme_wait_for_shutdown_cmpl() will read the reply for this.
+		 */
+		return ret;
+	case SHUTDOWN_TYPE_SYNC:
+		/*
+		 * Poll the controller status register (CSTS).
+		 */
 		return nvme_wait_ready(ctrl, NVME_CSTS_SHST_MASK,
 				       NVME_CSTS_SHST_CMPLT,
 				       ctrl->shutdown_timeout, "shutdown");
+	case DO_NOT_SHUTDOWN:
+		/*
+		 * Doing a reset here. Handle below.
+		 */
+		break;
 	}
 	if (ctrl->quirks & NVME_QUIRK_DELAY_BEFORE_CHK_RDY)
 		msleep(NVME_QUIRK_DELAY_AMOUNT);
@@ -2227,6 +2241,17 @@ int nvme_disable_ctrl(struct nvme_ctrl *ctrl, bool shutdown)
 }
 EXPORT_SYMBOL_GPL(nvme_disable_ctrl);
 
+int nvme_wait_for_shutdown_cmpl(struct nvme_ctrl *ctrl)
+{
+	ctrl->ctrl_config &= ~NVME_CC_SHN_MASK;
+	ctrl->ctrl_config |= NVME_CC_SHN_NORMAL;
+
+	return nvme_wait_ready(ctrl, NVME_CSTS_SHST_MASK,
+			       NVME_CSTS_SHST_CMPLT,
+			       ctrl->shutdown_timeout, "shutdown");
+}
+EXPORT_SYMBOL_GPL(nvme_wait_for_shutdown_cmpl);
+
 int nvme_enable_ctrl(struct nvme_ctrl *ctrl)
 {
 	unsigned dev_page_min;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 39a90b7cb125..1ce034185000 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -186,6 +186,12 @@ enum {
 	NVME_MPATH_IO_STATS		= (1 << 2),
 };
 
+enum shutdown_type {
+	DO_NOT_SHUTDOWN = 0,
+	SHUTDOWN_TYPE_SYNC = 1,
+	SHUTDOWN_TYPE_ASYNC = 2,
+};
+
 static inline struct nvme_request *nvme_req(struct request *req)
 {
 	return blk_mq_rq_to_pdu(req);
@@ -746,7 +752,8 @@ void nvme_cancel_tagset(struct nvme_ctrl *ctrl);
 void nvme_cancel_admin_tagset(struct nvme_ctrl *ctrl);
 bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 		enum nvme_ctrl_state new_state);
-int nvme_disable_ctrl(struct nvme_ctrl *ctrl, bool shutdown);
+int nvme_disable_ctrl(struct nvme_ctrl *ctrl, enum shutdown_type shutdown_type);
+int nvme_wait_for_shutdown_cmpl(struct nvme_ctrl *ctrl);
 int nvme_enable_ctrl(struct nvme_ctrl *ctrl);
 int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
 		const struct nvme_ctrl_ops *ops, unsigned long quirks);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 507bc149046d..5379ce6dd21a 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -108,7 +108,7 @@ MODULE_PARM_DESC(noacpi, "disable acpi bios quirks");
 struct nvme_dev;
 struct nvme_queue;
 
-static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown);
+static void nvme_dev_disable(struct nvme_dev *dev, enum shutdown_type shutdown_type);
 static void nvme_delete_io_queues(struct nvme_dev *dev);
 static void nvme_update_attrs(struct nvme_dev *dev);
 
@@ -1330,7 +1330,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req)
 			 "I/O %d QID %d timeout, disable controller\n",
 			 req->tag, nvmeq->qid);
 		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
-		nvme_dev_disable(dev, true);
+		nvme_dev_disable(dev, SHUTDOWN_TYPE_SYNC);
 		return BLK_EH_DONE;
 	case NVME_CTRL_RESETTING:
 		return BLK_EH_RESET_TIMER;
@@ -1390,7 +1390,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req)
 	if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_RESETTING))
 		return BLK_EH_DONE;
 
-	nvme_dev_disable(dev, false);
+	nvme_dev_disable(dev, DO_NOT_SHUTDOWN);
 	if (nvme_try_sched_reset(&dev->ctrl))
 		nvme_unquiesce_io_queues(&dev->ctrl);
 	return BLK_EH_DONE;
@@ -1736,7 +1736,7 @@ static int nvme_pci_configure_admin_queue(struct nvme_dev *dev)
 	 * commands to the admin queue ... and we don't know what memory that
 	 * might be pointing at!
 	 */
-	result = nvme_disable_ctrl(&dev->ctrl, false);
+	result = nvme_disable_ctrl(&dev->ctrl, DO_NOT_SHUTDOWN);
 	if (result < 0)
 		return result;
 
@@ -2571,7 +2571,7 @@ static bool nvme_pci_ctrl_is_dead(struct nvme_dev *dev)
 	return (csts & NVME_CSTS_CFS) || !(csts & NVME_CSTS_RDY);
 }
 
-static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
+static void nvme_dev_disable(struct nvme_dev *dev, enum shutdown_type shutdown_type)
 {
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
 	bool dead;
@@ -2586,7 +2586,7 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 		 * Give the controller a chance to complete all entered requests
 		 * if doing a safe shutdown.
 		 */
-		if (!dead && shutdown)
+		if (!dead && (shutdown_type != DO_NOT_SHUTDOWN))
 			nvme_wait_freeze_timeout(&dev->ctrl, NVME_IO_TIMEOUT);
 	}
 
@@ -2594,7 +2594,7 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 
 	if (!dead && dev->ctrl.queue_count > 0) {
 		nvme_delete_io_queues(dev);
-		nvme_disable_ctrl(&dev->ctrl, shutdown);
+		nvme_disable_ctrl(&dev->ctrl, shutdown_type);
 		nvme_poll_irqdisable(&dev->queues[0]);
 	}
 	nvme_suspend_io_queues(dev);
@@ -2612,7 +2612,7 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 	 * must flush all entered requests to their failed completion to avoid
 	 * deadlocking blk-mq hot-cpu notifier.
 	 */
-	if (shutdown) {
+	if (shutdown_type == SHUTDOWN_TYPE_SYNC) {
 		nvme_unquiesce_io_queues(&dev->ctrl);
 		if (dev->ctrl.admin_q && !blk_queue_dying(dev->ctrl.admin_q))
 			nvme_unquiesce_admin_queue(&dev->ctrl);
@@ -2620,11 +2620,11 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 	mutex_unlock(&dev->shutdown_lock);
 }
 
-static int nvme_disable_prepare_reset(struct nvme_dev *dev, bool shutdown)
+static int nvme_disable_prepare_reset(struct nvme_dev *dev, enum shutdown_type shutdown_type)
 {
 	if (!nvme_wait_reset(&dev->ctrl))
 		return -EBUSY;
-	nvme_dev_disable(dev, shutdown);
+	nvme_dev_disable(dev, shutdown_type);
 	return 0;
 }
 
@@ -2702,7 +2702,7 @@ static void nvme_reset_work(struct work_struct *work)
 	 * moving on.
 	 */
 	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
-		nvme_dev_disable(dev, false);
+		nvme_dev_disable(dev, DO_NOT_SHUTDOWN);
 	nvme_sync_queues(&dev->ctrl);
 
 	mutex_lock(&dev->shutdown_lock);
@@ -2780,7 +2780,7 @@ static void nvme_reset_work(struct work_struct *work)
 	dev_warn(dev->ctrl.device, "Disabling device after reset failure: %d\n",
 		 result);
 	nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
-	nvme_dev_disable(dev, true);
+	nvme_dev_disable(dev, SHUTDOWN_TYPE_SYNC);
 	nvme_sync_queues(&dev->ctrl);
 	nvme_mark_namespaces_dead(&dev->ctrl);
 	nvme_unquiesce_io_queues(&dev->ctrl);
@@ -3058,7 +3058,7 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 out_disable:
 	nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
-	nvme_dev_disable(dev, true);
+	nvme_dev_disable(dev, SHUTDOWN_TYPE_SYNC);
 	nvme_free_host_mem(dev);
 	nvme_dev_remove_admin(dev);
 	nvme_dbbuf_dma_free(dev);
@@ -3084,7 +3084,7 @@ static void nvme_reset_prepare(struct pci_dev *pdev)
 	 * state as pci_dev device lock is held, making it impossible to race
 	 * with ->remove().
 	 */
-	nvme_disable_prepare_reset(dev, false);
+	nvme_disable_prepare_reset(dev, DO_NOT_SHUTDOWN);
 	nvme_sync_queues(&dev->ctrl);
 }
 
@@ -3096,11 +3096,30 @@ static void nvme_reset_done(struct pci_dev *pdev)
 		flush_work(&dev->ctrl.reset_work);
 }
 
-static void nvme_shutdown(struct pci_dev *pdev)
+static void nvme_shutdown_pre(struct pci_dev *pdev)
 {
 	struct nvme_dev *dev = pci_get_drvdata(pdev);
 
-	nvme_disable_prepare_reset(dev, true);
+	nvme_disable_prepare_reset(dev, SHUTDOWN_TYPE_ASYNC);
+}
+
+static void nvme_shutdown_post(struct pci_dev *pdev)
+{
+	struct nvme_dev *dev = pci_get_drvdata(pdev);
+
+	mutex_lock(&dev->shutdown_lock);
+	nvme_wait_for_shutdown_cmpl(&dev->ctrl);
+
+	/*
+	 * The driver will not be starting up queues again if shutting down so
+	 * must flush all entered requests to their failed completion to avoid
+	 * deadlocking blk-mq hot-cpu notifier.
+	 */
+	nvme_unquiesce_io_queues(&dev->ctrl);
+	if (dev->ctrl.admin_q && !blk_queue_dying(dev->ctrl.admin_q))
+		nvme_unquiesce_admin_queue(&dev->ctrl);
+
+	mutex_unlock(&dev->shutdown_lock);
 }
 
 /*
@@ -3117,13 +3136,13 @@ static void nvme_remove(struct pci_dev *pdev)
 
 	if (!pci_device_is_present(pdev)) {
 		nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DEAD);
-		nvme_dev_disable(dev, true);
+		nvme_dev_disable(dev, SHUTDOWN_TYPE_SYNC);
 	}
 
 	flush_work(&dev->ctrl.reset_work);
 	nvme_stop_ctrl(&dev->ctrl);
 	nvme_remove_namespaces(&dev->ctrl);
-	nvme_dev_disable(dev, true);
+	nvme_dev_disable(dev, SHUTDOWN_TYPE_SYNC);
 	nvme_free_host_mem(dev);
 	nvme_dev_remove_admin(dev);
 	nvme_dbbuf_dma_free(dev);
@@ -3186,7 +3205,7 @@ static int nvme_suspend(struct device *dev)
 	if (pm_suspend_via_firmware() || !ctrl->npss ||
 	    !pcie_aspm_enabled(pdev) ||
 	    (ndev->ctrl.quirks & NVME_QUIRK_SIMPLE_SUSPEND))
-		return nvme_disable_prepare_reset(ndev, true);
+		return nvme_disable_prepare_reset(ndev, SHUTDOWN_TYPE_SYNC);
 
 	nvme_start_freeze(ctrl);
 	nvme_wait_freeze(ctrl);
@@ -3229,7 +3248,7 @@ static int nvme_suspend(struct device *dev)
 		 * Clearing npss forces a controller reset on resume. The
 		 * correct value will be rediscovered then.
 		 */
-		ret = nvme_disable_prepare_reset(ndev, true);
+		ret = nvme_disable_prepare_reset(ndev, SHUTDOWN_TYPE_SYNC);
 		ctrl->npss = 0;
 	}
 unfreeze:
@@ -3241,7 +3260,7 @@ static int nvme_simple_suspend(struct device *dev)
 {
 	struct nvme_dev *ndev = pci_get_drvdata(to_pci_dev(dev));
 
-	return nvme_disable_prepare_reset(ndev, true);
+	return nvme_disable_prepare_reset(ndev, SHUTDOWN_TYPE_SYNC);
 }
 
 static int nvme_simple_resume(struct device *dev)
@@ -3279,10 +3298,10 @@ static pci_ers_result_t nvme_error_detected(struct pci_dev *pdev,
 		dev_warn(dev->ctrl.device,
 			"frozen state error detected, reset controller\n");
 		if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_RESETTING)) {
-			nvme_dev_disable(dev, true);
+			nvme_dev_disable(dev, SHUTDOWN_TYPE_SYNC);
 			return PCI_ERS_RESULT_DISCONNECT;
 		}
-		nvme_dev_disable(dev, false);
+		nvme_dev_disable(dev, DO_NOT_SHUTDOWN);
 		return PCI_ERS_RESULT_NEED_RESET;
 	case pci_channel_io_perm_failure:
 		dev_warn(dev->ctrl.device,
@@ -3491,7 +3510,8 @@ static struct pci_driver nvme_driver = {
 	.id_table	= nvme_id_table,
 	.probe		= nvme_probe,
 	.remove		= nvme_remove,
-	.shutdown	= nvme_shutdown,
+	.shutdown_pre	= nvme_shutdown_pre,
+	.shutdown_post	= nvme_shutdown_post,
 	.driver		= {
 		.probe_type	= PROBE_PREFER_ASYNCHRONOUS,
 #ifdef CONFIG_PM_SLEEP
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 6d178d555920..5a5bef40ed2a 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -2136,7 +2136,8 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)
 {
 	nvme_rdma_teardown_io_queues(ctrl, shutdown);
 	nvme_quiesce_admin_queue(&ctrl->ctrl);
-	nvme_disable_ctrl(&ctrl->ctrl, shutdown);
+	nvme_disable_ctrl(&ctrl->ctrl,
+			  shutdown ? SHUTDOWN_TYPE_SYNC : DO_NOT_SHUTDOWN);
 	nvme_rdma_teardown_admin_queue(ctrl, shutdown);
 }
 
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index d79811cfa0ce..a1e33ee14f5b 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2292,7 +2292,7 @@ static void nvme_tcp_teardown_ctrl(struct nvme_ctrl *ctrl, bool shutdown)
 {
 	nvme_tcp_teardown_io_queues(ctrl, shutdown);
 	nvme_quiesce_admin_queue(ctrl);
-	nvme_disable_ctrl(ctrl, shutdown);
+	nvme_disable_ctrl(ctrl, shutdown ? SHUTDOWN_TYPE_SYNC : DO_NOT_SHUTDOWN);
 	nvme_tcp_teardown_admin_queue(ctrl, shutdown);
 }
 
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index 9cb434c58075..9d1221c77061 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -401,7 +401,7 @@ static void nvme_loop_shutdown_ctrl(struct nvme_loop_ctrl *ctrl)
 
 	nvme_quiesce_admin_queue(&ctrl->ctrl);
 	if (ctrl->ctrl.state == NVME_CTRL_LIVE)
-		nvme_disable_ctrl(&ctrl->ctrl, true);
+		nvme_disable_ctrl(&ctrl->ctrl, SHUTDOWN_TYPE_SYNC);
 
 	nvme_cancel_admin_tagset(&ctrl->ctrl);
 	nvme_loop_destroy_admin_queue(ctrl);
-- 
2.39.3
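
The split between requesting a shutdown and waiting for it can be
illustrated with a simulated controller. Everything below is a
hypothetical user-space model (the toy_* names, the poll counter and
the ready_after latency are assumptions); it mirrors only the shape of
nvme_disable_ctrl() and nvme_wait_for_shutdown_cmpl(), not their
register accesses.

```c
#include <stdbool.h>

enum toy_shutdown_type {
	TOY_DO_NOT_SHUTDOWN,
	TOY_SHUTDOWN_SYNC,
	TOY_SHUTDOWN_ASYNC,
};

struct toy_ctrl {
	bool enabled;
	bool shutdown_requested;
	unsigned int ready_after;	/* polls needed before completion */
	unsigned int polls;		/* polls performed so far */
};

/* Poll the simulated status until completion or until max_polls is hit. */
static int toy_wait_for_shutdown(struct toy_ctrl *c, unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		c->polls++;
		if (c->shutdown_requested && c->polls >= c->ready_after)
			return 0;
	}
	return -1;	/* timed out */
}

static int toy_disable_ctrl(struct toy_ctrl *c, enum toy_shutdown_type type,
			    unsigned int max_polls)
{
	if (type == TOY_DO_NOT_SHUTDOWN) {
		c->enabled = false;	/* reset path: just disable */
		return 0;
	}

	c->shutdown_requested = true;
	if (type == TOY_SHUTDOWN_ASYNC)
		return 0;		/* caller waits in a later pass */
	return toy_wait_for_shutdown(c, max_polls);
}
```

In the asynchronous case the request returns without polling, and the
waiting cost is paid only when toy_wait_for_shutdown() is called later,
which is what lets many controllers overlap their shutdown time.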




* Re: [PATCH 1/3] driver core: Support asynchronous driver shutdown
  2023-12-15  0:03 ` [PATCH 1/3] driver core: Support asynchronous driver shutdown Jeremy Allison
@ 2023-12-15 12:21   ` Greg KH
  2023-12-19  5:33   ` Christoph Hellwig
  1 sibling, 0 replies; 20+ messages in thread
From: Greg KH @ 2023-12-15 12:21 UTC (permalink / raw)
  To: Jeremy Allison; +Cc: jra, tansuresh, hch, linux-nvme, rafael, bhelgaas

On Thu, Dec 14, 2023 at 04:03:56PM -0800, Jeremy Allison wrote:
> From: Tanjore Suresh <tansuresh@google.com>
> 
> This changes the bus driver interface with additional entry points
> to enable devices to implement asynchronous shutdown. The existing
> synchronous interface to shutdown is unmodified and retained for
> backward compatibility. shutdown_pre is only called if a matching
> shutdown_post function is also registered; otherwise the
> synchronous interface is used.
> 
> This changes the common device shutdown code to enable devices to
> participate in asynchronous shutdown implementation.
> 
> Signed-off-by: Tanjore Suresh <tansuresh@google.com>
> Signed-off-by: Jeremy Allison <jallison@ciq.com>

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>




* Re: [PATCH 1/3] driver core: Support asynchronous driver shutdown
  2023-12-15  0:03 ` [PATCH 1/3] driver core: Support asynchronous driver shutdown Jeremy Allison
  2023-12-15 12:21   ` Greg KH
@ 2023-12-19  5:33   ` Christoph Hellwig
  2023-12-19  6:19     ` Jeremy Allison
  1 sibling, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-19  5:33 UTC (permalink / raw)
  To: Jeremy Allison; +Cc: jra, tansuresh, hch, linux-nvme, gregkh, rafael, bhelgaas

On Thu, Dec 14, 2023 at 04:03:56PM -0800, Jeremy Allison wrote:
> From: Tanjore Suresh <tansuresh@google.com>
> 
> This changes the bus driver interface with additional entry points
> to enable devices to implement asynchronous shutdown. The existing
> synchronous interface to shutdown is unmodified and retained for
> backward compatibility. shutdown_pre is only called if a matching
> shutdown_post function is also registered; otherwise the
> synchronous interface is used.
> 
> This changes the common device shutdown code to enable devices to
> participate in asynchronous shutdown implementation.

Is there any reason to have a separate shutdown_pre method?
Especially with all the method wrapping in the driver core, yet
another method just keeps confusing everyone.

And on the post side, might shutdown_wait be a better name to
describe the operation, but I'm open to opinions.



* Re: [PATCH 3/3] nvme: Add async shutdown support
  2023-12-15  0:03 ` [PATCH 3/3] nvme: Add async shutdown support Jeremy Allison
@ 2023-12-19  5:43   ` Christoph Hellwig
  2023-12-19  6:35     ` Jeremy Allison
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-19  5:43 UTC (permalink / raw)
  To: Jeremy Allison; +Cc: jra, tansuresh, hch, linux-nvme, gregkh, rafael, bhelgaas

On Thu, Dec 14, 2023 at 04:03:58PM -0800, Jeremy Allison wrote:
> From: Tanjore Suresh <tansuresh@google.com>
> 
> This works with the asynchronous shutdown mechanism setup for the PCI
> drivers and participates to provide both pre and post shutdown
> routines at pci_driver structure level.
> 
> The shutdown_pre routine starts the shutdown and does not wait for the
> shutdown to complete.  The shutdown_post routine waits for the shutdown
> to complete on individual controllers that this driver instance
> controls. This mechanism optimizes to speed up the shutdown in a
> system which hosts many controllers.

I had a really hard time trying to understand this patch.

Please split switching from the bool shutdown to an enum (with initially
just two values) into a separate patch.   And the names really confuse
me.  I would have expected something like:

	NVME_DISABLE_RESET,
	NVME_DISABLE_SHUTDOWN_SYNC,
	NVME_DISABLE_SHUTDOWN_ASYNC,

then again mixing two rather different concepts (reset vs shutdown)
into a single enum is also not very helpful (but neither would be
two bool arguments).  Not really sure what the right thing is, but
as-is it feels pretty obfuscated.
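
To make the naming suggestion concrete, the following sketch shows how
the two tests used in the patch (shutdown_type != DO_NOT_SHUTDOWN and
shutdown_type == SHUTDOWN_TYPE_SYNC) might read with the proposed
enumerators. The enumerator names come from the review comment above;
the enum tag and the two helper functions are hypothetical and appear
nowhere in the posted patches.

```c
#include <stdbool.h>

/* Enumerator names as suggested in the review; the tag is assumed. */
enum nvme_disable_mode {
	NVME_DISABLE_RESET,
	NVME_DISABLE_SHUTDOWN_SYNC,
	NVME_DISABLE_SHUTDOWN_ASYNC,
};

/* True when a shutdown notification should be sent (hypothetical helper). */
static bool nvme_disable_is_shutdown(enum nvme_disable_mode mode)
{
	return mode != NVME_DISABLE_RESET;
}

/* True when the caller must wait for completion before returning. */
static bool nvme_disable_waits(enum nvme_disable_mode mode)
{
	return mode == NVME_DISABLE_SHUTDOWN_SYNC;
}
```

The helpers also show the concern raised here: one enum answers two
separate questions (reset or shutdown, wait or not), which is why a
single value can feel overloaded.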




* Re: [PATCH 1/3] driver core: Support asynchronous driver shutdown
  2023-12-19  5:33   ` Christoph Hellwig
@ 2023-12-19  6:19     ` Jeremy Allison
  2023-12-19  6:21       ` Christoph Hellwig
  0 siblings, 1 reply; 20+ messages in thread
From: Jeremy Allison @ 2023-12-19  6:19 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jeremy Allison, tansuresh, linux-nvme, gregkh, rafael, bhelgaas, jra

On Tue, Dec 19, 2023 at 06:33:18AM +0100, Christoph Hellwig wrote:
>On Thu, Dec 14, 2023 at 04:03:56PM -0800, Jeremy Allison wrote:
>> From: Tanjore Suresh <tansuresh@google.com>
>>
>> This changes the bus driver interface with additional entry points
>> to enable devices to implement asynchronous shutdown. The existing
>> synchronous interface to shutdown is unmodified and retained for
>> backward compatibility. shutdown_pre is only called if a matching
>> shutdown_post function is also registered; otherwise the
>> synchronous interface is used.
>>
>> This changes the common device shutdown code to enable devices to
>> participate in asynchronous shutdown implementation.
>
>Is there any reason to have a separate shutdown_pre method?
>Especially with all the method wrapping in the driver core, yet
>another method just keeps confusing everyone.

Currently in the patch the existence of a shutdown_pre() method
for a device causes it to be added to the async_shutdown_list
which is walked to reap the completion status after all the
calls to shutdown_pre().

I could change this so that the existing shutdown() method
is always called, and the device is only added to the async_shutdown_list
if a shutdown_post() (or as requested below, shutdown_wait())
method is defined for the device.

That makes sense to me and I'm happy to make that change.

>And on the post side, might shutdown_wait be a better name to
>describe the operation, but I'm open to opinions.

I'm happy to change shutdown_post() -> shutdown_wait().

shutdown_pre()/post() seemed a natural fit, but if we're removing
shutdown_pre() then shutdown_wait() works.
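
The revised scheme discussed here, with no shutdown_pre, could look
roughly like the user-space model below. It is a sketch of the proposal
only: the toy_* names and the trace buffer are hypothetical, and the
order in which the second pass visits devices is an assumption, since
the thread does not specify it.

```c
#include <stddef.h>

/* Hypothetical bus callbacks under the revised proposal. */
struct toy_bus {
	void (*shutdown)(int id);	/* always called */
	void (*shutdown_wait)(int id);	/* optional completion step */
};

struct toy_dev {
	int id;
	const struct toy_bus *bus;
};

/* Trace of callback invocations, e.g. "s0" = shutdown on device 0. */
static char toy_trace[64];
static size_t toy_trace_len;

static void toy_record(char tag, int id)
{
	if (toy_trace_len + 2 < sizeof(toy_trace)) {
		toy_trace[toy_trace_len++] = tag;
		toy_trace[toy_trace_len++] = (char)('0' + id);
	}
}

static void toy_start(int id) { toy_record('s', id); }
static void toy_wait(int id)  { toy_record('w', id); }

static const struct toy_bus toy_plain_bus = { .shutdown = toy_start };
static const struct toy_bus toy_async_bus = {
	.shutdown = toy_start,
	.shutdown_wait = toy_wait,
};

static void toy_device_shutdown(struct toy_dev *devs, size_t n)
{
	/* First pass: every device gets its ordinary shutdown() call. */
	for (size_t i = 0; i < n; i++)
		if (devs[i].bus && devs[i].bus->shutdown)
			devs[i].bus->shutdown(devs[i].id);

	/* Second pass: only devices that define shutdown_wait() are reaped. */
	for (size_t i = 0; i < n; i++)
		if (devs[i].bus && devs[i].bus->shutdown_wait)
			devs[i].bus->shutdown_wait(devs[i].id);
}
```

Under this design a driver that defines shutdown_wait() would need its
shutdown() to start the operation and return without waiting, while
drivers without shutdown_wait() keep their current behaviour.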



* Re: [PATCH 1/3] driver core: Support asynchronous driver shutdown
  2023-12-19  6:19     ` Jeremy Allison
@ 2023-12-19  6:21       ` Christoph Hellwig
  2023-12-19 13:49         ` Sagi Grimberg
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-19  6:21 UTC (permalink / raw)
  To: Jeremy Allison
  Cc: Christoph Hellwig, Jeremy Allison, tansuresh, linux-nvme, gregkh,
	rafael, bhelgaas

On Mon, Dec 18, 2023 at 10:19:43PM -0800, Jeremy Allison wrote:
>> Is there any reason to have a separate shutdown_pre method?
>> Especially with all the method wrapping in the driver core, yet
>> another method just keeps confusing everyone.
>
> Currently in the patch the existence of a shutdown_pre() method
> for a device causes it to be added to the async_shutdown_list
> which is walked to reap the completion status after all the
> calls to shutdown_pre().
>
> I could change this so that the existing shutdown() method
> is always called, and the device is only added to the async_shutdown_list
> if a shutdown_post() (or as requested below, shutdown_wait())
> method is defined for the device.

Yes, that's what I mean.




* Re: [PATCH 3/3] nvme: Add async shutdown support
  2023-12-19  5:43   ` Christoph Hellwig
@ 2023-12-19  6:35     ` Jeremy Allison
  0 siblings, 0 replies; 20+ messages in thread
From: Jeremy Allison @ 2023-12-19  6:35 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jeremy Allison, tansuresh, linux-nvme, gregkh, rafael, bhelgaas, jra

On Tue, Dec 19, 2023 at 06:43:49AM +0100, Christoph Hellwig wrote:
>On Thu, Dec 14, 2023 at 04:03:58PM -0800, Jeremy Allison wrote:
>> From: Tanjore Suresh <tansuresh@google.com>
>>
>> This works with the asynchronous shutdown mechanism setup for the PCI
>> drivers and participates to provide both pre and post shutdown
>> routines at pci_driver structure level.
>>
>> The shutdown_pre routine starts the shutdown and does not wait for the
>> shutdown to complete.  The shutdown_post routine waits for the shutdown
>> to complete on individual controllers that this driver instance
>> controls. This mechanism speeds up the shutdown on a
>> system which hosts many controllers.
>
>I had a really hard time trying to understand this patch.

Sorry. Me too when I first read it :-).

>Please split switching from the bool shutdown to an enum (with initially
>just two values) into a separate patch.   And the names really confuse
>me.  I would have expected something like:
>
>	NVME_DISABLE_RESET,
>	NVME_DISABLE_SHUTDOWN_SYNC,
>	NVME_DISABLE_SHUTDOWN_ASYNC,

Makes sense. Start with

'enum shutdown_type { NVME_DISABLE_RESET, NVME_DISABLE_SHUTDOWN_SYNC}'

and then add NVME_DISABLE_SHUTDOWN_ASYNC when the
shutdown_wait() is added in the subsequent async patch.

>then again mixing two rather different concepts (reset vs shutdown)
>into a single enum is also not very helpful (but neither would be
>two bool arguments).  Not really sure what the right thing is, but
>as-is it feels pretty obfuscated.

Yeah. I really didn't want to add another bool here, as having two
bools side by side as parameters to a function is a recipe for
mistakes. The original code passes a single bool parameter, 'shutdown',
to the nvme_disable_ctrl() function: when set it means do the
shutdown request, and when clear it means reset. So 'shutdown' and
'reset' are already conflated in the code.

I think using the names you suggested, and splitting the initial
addition of the enum into a separate preparatory patch, might make
things a little clearer for people to follow. I'm happy to take your
guidance on this though.
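
To make the intent concrete, here is a small standalone sketch of how
the three values could map to behaviour. The enum names are the ones
suggested above; struct disable_plan, shutdown_type_from_bool() and
plan_for() are hypothetical helpers invented for this illustration and
are not part of the driver:

```c
#include <assert.h>
#include <stdbool.h>

/* Names suggested in this thread; ASYNC would arrive in the later patch. */
enum shutdown_type {
	NVME_DISABLE_RESET,
	NVME_DISABLE_SHUTDOWN_SYNC,
	NVME_DISABLE_SHUTDOWN_ASYNC,
};

/* Hypothetical summary of what a disable call does for each type. */
struct disable_plan {
	bool request_shutdown;  /* send the normal-shutdown notification */
	bool wait_for_complete; /* poll for shutdown-complete before returning */
};

/* Hypothetical helper: what the preparatory patch does to the old bool. */
static enum shutdown_type shutdown_type_from_bool(bool shutdown)
{
	return shutdown ? NVME_DISABLE_SHUTDOWN_SYNC : NVME_DISABLE_RESET;
}

static struct disable_plan plan_for(enum shutdown_type type)
{
	struct disable_plan plan = { false, false };

	switch (type) {
	case NVME_DISABLE_RESET:
		/* Reset path: no shutdown notification involved. */
		break;
	case NVME_DISABLE_SHUTDOWN_SYNC:
		plan.request_shutdown = true;
		plan.wait_for_complete = true;
		break;
	case NVME_DISABLE_SHUTDOWN_ASYNC:
		/* Request now; a later shutdown_wait() does the polling. */
		plan.request_shutdown = true;
		break;
	default:
		assert(0 && "unknown shutdown_type");
	}
	return plan;
}
```

The only difference between the SYNC and ASYNC values in this sketch is
who does the waiting, which is what the enum names try to convey.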



* Re: [PATCH 1/3] driver core: Support asynchronous driver shutdown
  2023-12-19  6:21       ` Christoph Hellwig
@ 2023-12-19 13:49         ` Sagi Grimberg
  2023-12-19 13:56           ` Christoph Hellwig
  0 siblings, 1 reply; 20+ messages in thread
From: Sagi Grimberg @ 2023-12-19 13:49 UTC (permalink / raw)
  To: Christoph Hellwig, Jeremy Allison
  Cc: Jeremy Allison, tansuresh, linux-nvme, gregkh, rafael, bhelgaas



On 12/19/23 08:21, Christoph Hellwig wrote:
> On Mon, Dec 18, 2023 at 10:19:43PM -0800, Jeremy Allison wrote:
>>> Is there any reason to have a separate shutdown_pre method?
>>> Especially with all the method wrapping in the driver core, yet
>>> another method just keeps confusing everyone.
>>
>> Currently in the patch the existence of a shutdown_pre() method
>> for a device causes it to be added to the async_shutdown_list
>> which is walked to reap the completion status after all the
>> calls to shutdown_pre().
>>
>> I could change this so that the existing shutdown() method
>> is always called, and the device is only added to the async_shutdown_list
>> if a shutdown_post() (or as requested below, shutdown_wait())
>> method is defined for the device.
> 
> Yes, that's what I mean.

I think it's usually better to separate sync vs async interfaces. However
I assume that the suggested interface exists elsewhere in the kernel, so
it's not a big deal.



* Re: [PATCH 1/3] driver core: Support asynchronous driver shutdown
  2023-12-19 13:49         ` Sagi Grimberg
@ 2023-12-19 13:56           ` Christoph Hellwig
  2023-12-19 14:12             ` Sagi Grimberg
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-19 13:56 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Christoph Hellwig, Jeremy Allison, Jeremy Allison, tansuresh,
	linux-nvme, gregkh, rafael, bhelgaas

On Tue, Dec 19, 2023 at 03:49:18PM +0200, Sagi Grimberg wrote:
>>> I could change this so that the existing shutdown() method
>>> is always called, and the device is only added to the async_shutdown_list
>>> if a shutdown_post() (or as requested below, shutdown_wait())
>>> method is defined for the device.
>>
>> Yes, that's what I mean.
>
> I think it's usually better to separate sync vs async interfaces. However
> I assume that the suggested interface exists elsewhere in the kernel, so
> it's not a big deal.

I don't think we have async shutdown anywhere else.

It's also not really an async interface in the classic sense, but more a
fire now and then wait for completion later interface, i.e. no notification
on completion but pure polling.
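
A toy simulation of the difference, purely as an illustration: struct
sim_dev, the tick counter and both functions are hypothetical and model
no real hardware. Each device is assumed to need a fixed number of ticks
after the shutdown request before polling reports it as done:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: a device completes 'latency' ticks after the request. */
struct sim_dev {
	unsigned int latency;
	unsigned int start_tick;
};

static int sim_done(const struct sim_dev *dev, unsigned int now)
{
	return now - dev->start_tick >= dev->latency;
}

/* Request and poll one device at a time; returns the elapsed ticks. */
static unsigned int sim_serial(struct sim_dev *devs, size_t n)
{
	unsigned int now = 0;
	size_t i;

	assert(devs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		devs[i].start_tick = now;
		while (!sim_done(&devs[i], now))
			now++;
	}
	return now;
}

/* Request everything first, then poll each device in turn. */
static unsigned int sim_fire_then_wait(struct sim_dev *devs, size_t n)
{
	unsigned int now = 0;
	size_t i;

	assert(devs != NULL || n == 0);
	for (i = 0; i < n; i++)
		devs[i].start_tick = now;
	for (i = 0; i < n; i++)
		while (!sim_done(&devs[i], now))
			now++;
	return now;
}
```

In this model the serial variant costs the sum of the latencies, while
the fire-then-wait variant costs only the largest one.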



* Re: [PATCH 1/3] driver core: Support asynchronous driver shutdown
  2023-12-19 13:56           ` Christoph Hellwig
@ 2023-12-19 14:12             ` Sagi Grimberg
  0 siblings, 0 replies; 20+ messages in thread
From: Sagi Grimberg @ 2023-12-19 14:12 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jeremy Allison, Jeremy Allison, tansuresh, linux-nvme, gregkh,
	rafael, bhelgaas



On 12/19/23 15:56, Christoph Hellwig wrote:
> On Tue, Dec 19, 2023 at 03:49:18PM +0200, Sagi Grimberg wrote:
>>>> I could change this so that the existing shutdown() method
>>>> is always called, and the device is only added to the async_shutdown_list
>>>> if a shutdown_post() (or as requested below, shutdown_wait())
>>>> method is defined for the device.
>>>
>>> Yes, that's what I mean.
>>
>> I think it's usually better to separate sync vs async interfaces. However
>> I assume that the suggested interface exists elsewhere in the kernel, so
>> it's not a big deal.
> 
> I don't think we have async shutdown anywhere else.

I meant the case where an interface is either sync where it needs to
complete inside the callback, or it can be async because it is also
paired with an optional _wait|_post|_end handler.

> It's also not really an async interface in the classic sense, but more a
> fire now and then wait for completion later interface, i.e. no notification
> on completion but pure polling.

It could have been waiting on a notify/completion for that matter.

I find the above a bit ambiguous, but maybe it's just me, so I don't
care much about it. Your suggestion is fine too.



* Re: [PATCH 1/3] driver core: Support asynchronous driver shutdown
  2023-12-12 18:09 ` [PATCH 1/3] driver core: Support asynchronous driver shutdown Jeremy Allison
  2023-12-13 13:59   ` Sagi Grimberg
@ 2023-12-13 17:48   ` Bart Van Assche
  1 sibling, 0 replies; 20+ messages in thread
From: Bart Van Assche @ 2023-12-13 17:48 UTC (permalink / raw)
  To: Jeremy Allison, jra, tansuresh, hch; +Cc: linux-nvme, Greg Kroah-Hartman

On 12/12/23 10:09, Jeremy Allison wrote:
> From: Tanjore Suresh <tansuresh@google.com>
> 
> This changes the bus driver interface with additional entry points
> to enable devices to implement asynchronous shutdown. The existing
> synchronous interface to shutdown is unmodified and retained for
> backward compatibility.
> 
> This changes the common device shutdown code to enable devices to
> participate in asynchronous shutdown implementation.
> 
> Signed-off-by: Tanjore Suresh <tansuresh@google.com>
> ---
>   drivers/base/core.c        | 39 +++++++++++++++++++++++++++++++++++++-
>   include/linux/device/bus.h | 10 ++++++++++
>   2 files changed, 48 insertions(+), 1 deletion(-)

From the MAINTAINERS file:

DRIVER CORE, KOBJECTS, DEBUGFS AND SYSFS
M:      Greg Kroah-Hartman <gregkh@linuxfoundation.org>
R:      "Rafael J. Wysocki" <rafael@kernel.org>
S:      Supported
T:      git git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git
F:      Documentation/core-api/kobject.rst
F:      drivers/base/
F:      fs/debugfs/
F:      fs/sysfs/
F:      include/linux/debugfs.h
F:      include/linux/fwnode.h
F:      include/linux/kobj*
F:      include/linux/property.h
F:      lib/kobj*

Please Cc the maintainers of the modified files when posting kernel patches.

Thanks,

Bart.



* Re: [PATCH 1/3] driver core: Support asynchronous driver shutdown
  2023-12-13 13:59   ` Sagi Grimberg
@ 2023-12-13 17:34     ` Jeremy Allison
  0 siblings, 0 replies; 20+ messages in thread
From: Jeremy Allison @ 2023-12-13 17:34 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: jra, tansuresh, hch, linux-nvme

Good catch. Thanks. I'll fix that and re-submit.

On Wed, Dec 13, 2023 at 5:59 AM Sagi Grimberg <sagi@grimberg.me> wrote:
>
> > From: Tanjore Suresh <tansuresh@google.com>
> >
> > This changes the bus driver interface with additional entry points
> > to enable devices to implement asynchronous shutdown. The existing
> > synchronous interface to shutdown is unmodified and retained for
> > backward compatibility.
> >
> > This changes the common device shutdown code to enable devices to
> > participate in asynchronous shutdown implementation.
> >
> > Signed-off-by: Tanjore Suresh <tansuresh@google.com>
> > ---
> >   drivers/base/core.c        | 39 +++++++++++++++++++++++++++++++++++++-
> >   include/linux/device/bus.h | 10 ++++++++++
> >   2 files changed, 48 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index 67ba592afc77..d9745822fb50 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -4725,6 +4725,7 @@ EXPORT_SYMBOL_GPL(device_change_owner);
> >   void device_shutdown(void)
> >   {
> >       struct device *dev, *parent;
> > +     LIST_HEAD(async_shutdown_list);
> >
> >       wait_for_device_probe();
> >       device_block_probing();
> > @@ -4769,7 +4770,14 @@ void device_shutdown(void)
> >                               dev_info(dev, "shutdown_pre\n");
> >                       dev->class->shutdown_pre(dev);
> >               }
> > -             if (dev->bus && dev->bus->shutdown) {
> > +
> > +             if (dev->bus && dev->bus->shutdown_pre) {
>
> I'm assuming that there is no shutdown_pre without a shutdown_post
> paired with it, so I think the code should verify that.
>
> > +                     if (initcall_debug)
> > +                             dev_info(dev, "shutdown_pre\n");
> > +                     dev->bus->shutdown_pre(dev);
> > +                     list_add(&dev->kobj.entry,
> > +                             &async_shutdown_list);
> > +             } else if (dev->bus && dev->bus->shutdown) {
> >                       if (initcall_debug)
> >                               dev_info(dev, "shutdown\n");
> >                       dev->bus->shutdown(dev);
> > @@ -4789,6 +4797,35 @@ void device_shutdown(void)
> >               spin_lock(&devices_kset->list_lock);
> >       }
> >       spin_unlock(&devices_kset->list_lock);
> > +
> > +     /*
> > +      * Second pass spin for only devices, that have configured
> > +      * Asynchronous shutdown.
> > +      */
> > +     while (!list_empty(&async_shutdown_list)) {
> > +             dev = list_entry(async_shutdown_list.next, struct device,
> > +                             kobj.entry);
> > +             parent = get_device(dev->parent);
> > +             get_device(dev);
> > +             /*
> > +              * Make sure the device is off the  list
> > +              */
> > +             list_del_init(&dev->kobj.entry);
> > +             if (parent)
> > +                     device_lock(parent);
> > +             device_lock(dev);
> > +             if (dev->bus && dev->bus->shutdown_post) {
> > +                     if (initcall_debug)
> > +                             dev_info(dev,
> > +                             "shutdown_post called\n");
> > +                     dev->bus->shutdown_post(dev);
> > +             }
> > +             device_unlock(dev);
> > +             if (parent)
> > +                     device_unlock(parent);
> > +             put_device(dev);
> > +             put_device(parent);
> > +     }
> >   }
> >
> >   /*
> > diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
> > index ae10c4322754..cbcb001f6336 100644
> > --- a/include/linux/device/bus.h
> > +++ b/include/linux/device/bus.h
> > @@ -48,6 +48,14 @@ struct fwnode_handle;
> >    *          will never get called until they do.
> >    * @remove: Called when a device removed from this bus.
> >    * @shutdown:       Called at shut-down time to quiesce the device.
> > + * @shutdown_pre:    Called at the shutdown-time to start the shutdown
> > + *                   process on the device. This entry point will be called
> > + *                   only when the bus driver has indicated it would like
> > + *                   to participate in asynchronous shutdown completion.
> > + * @shutdown_post:   Called at shutdown-time  to complete the shutdown
> > + *                   process of the device. This entry point will be called
> > + *                   only when the bus driver has indicated it would like to
> > + *                   participate in the asynchronous shutdown completion.
> >    *
> >    * @online: Called to put the device back online (after offlining it).
> >    * @offline:        Called to put the device offline for hot-removal. May fail.
> > @@ -90,6 +98,8 @@ struct bus_type {
> >       void (*sync_state)(struct device *dev);
> >       void (*remove)(struct device *dev);
> >       void (*shutdown)(struct device *dev);
> > +     void (*shutdown_pre)(struct device *dev);
> > +     void (*shutdown_post)(struct device *dev);
> >
> >       int (*online)(struct device *dev);
> >       int (*offline)(struct device *dev);



* Re: [PATCH 1/3] driver core: Support asynchronous driver shutdown
  2023-12-12 18:09 ` [PATCH 1/3] driver core: Support asynchronous driver shutdown Jeremy Allison
@ 2023-12-13 13:59   ` Sagi Grimberg
  2023-12-13 17:34     ` Jeremy Allison
  2023-12-13 17:48   ` Bart Van Assche
  1 sibling, 1 reply; 20+ messages in thread
From: Sagi Grimberg @ 2023-12-13 13:59 UTC (permalink / raw)
  To: Jeremy Allison, jra, tansuresh, hch; +Cc: linux-nvme

> From: Tanjore Suresh <tansuresh@google.com>
> 
> This changes the bus driver interface with additional entry points
> to enable devices to implement asynchronous shutdown. The existing
> synchronous interface to shutdown is unmodified and retained for
> backward compatibility.
> 
> This changes the common device shutdown code to enable devices to
> participate in asynchronous shutdown implementation.
> 
> Signed-off-by: Tanjore Suresh <tansuresh@google.com>
> ---
>   drivers/base/core.c        | 39 +++++++++++++++++++++++++++++++++++++-
>   include/linux/device/bus.h | 10 ++++++++++
>   2 files changed, 48 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 67ba592afc77..d9745822fb50 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -4725,6 +4725,7 @@ EXPORT_SYMBOL_GPL(device_change_owner);
>   void device_shutdown(void)
>   {
>   	struct device *dev, *parent;
> +	LIST_HEAD(async_shutdown_list);
>   
>   	wait_for_device_probe();
>   	device_block_probing();
> @@ -4769,7 +4770,14 @@ void device_shutdown(void)
>   				dev_info(dev, "shutdown_pre\n");
>   			dev->class->shutdown_pre(dev);
>   		}
> -		if (dev->bus && dev->bus->shutdown) {
> +
> +		if (dev->bus && dev->bus->shutdown_pre) {

I'm assuming that there is no shutdown_pre without a shutdown_post
paired with it, so I think the code should verify that.

> +			if (initcall_debug)
> +				dev_info(dev, "shutdown_pre\n");
> +			dev->bus->shutdown_pre(dev);
> +			list_add(&dev->kobj.entry,
> +				&async_shutdown_list);
> +		} else if (dev->bus && dev->bus->shutdown) {
>   			if (initcall_debug)
>   				dev_info(dev, "shutdown\n");
>   			dev->bus->shutdown(dev);
> @@ -4789,6 +4797,35 @@ void device_shutdown(void)
>   		spin_lock(&devices_kset->list_lock);
>   	}
>   	spin_unlock(&devices_kset->list_lock);
> +
> +	/*
> +	 * Second pass spin for only devices, that have configured
> +	 * Asynchronous shutdown.
> +	 */
> +	while (!list_empty(&async_shutdown_list)) {
> +		dev = list_entry(async_shutdown_list.next, struct device,
> +				kobj.entry);
> +		parent = get_device(dev->parent);
> +		get_device(dev);
> +		/*
> +		 * Make sure the device is off the  list
> +		 */
> +		list_del_init(&dev->kobj.entry);
> +		if (parent)
> +			device_lock(parent);
> +		device_lock(dev);
> +		if (dev->bus && dev->bus->shutdown_post) {
> +			if (initcall_debug)
> +				dev_info(dev,
> +				"shutdown_post called\n");
> +			dev->bus->shutdown_post(dev);
> +		}
> +		device_unlock(dev);
> +		if (parent)
> +			device_unlock(parent);
> +		put_device(dev);
> +		put_device(parent);
> +	}
>   }
>   
>   /*
> diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
> index ae10c4322754..cbcb001f6336 100644
> --- a/include/linux/device/bus.h
> +++ b/include/linux/device/bus.h
> @@ -48,6 +48,14 @@ struct fwnode_handle;
>    *		will never get called until they do.
>    * @remove:	Called when a device removed from this bus.
>    * @shutdown:	Called at shut-down time to quiesce the device.
> + * @shutdown_pre:	Called at the shutdown-time to start the shutdown
> + *			process on the device. This entry point will be called
> + *			only when the bus driver has indicated it would like
> + *			to participate in asynchronous shutdown completion.
> + * @shutdown_post:	Called at shutdown-time  to complete the shutdown
> + *			process of the device. This entry point will be called
> + *			only when the bus driver has indicated it would like to
> + *			participate in the asynchronous shutdown completion.
>    *
>    * @online:	Called to put the device back online (after offlining it).
>    * @offline:	Called to put the device offline for hot-removal. May fail.
> @@ -90,6 +98,8 @@ struct bus_type {
>   	void (*sync_state)(struct device *dev);
>   	void (*remove)(struct device *dev);
>   	void (*shutdown)(struct device *dev);
> +	void (*shutdown_pre)(struct device *dev);
> +	void (*shutdown_post)(struct device *dev);
>   
>   	int (*online)(struct device *dev);
>   	int (*offline)(struct device *dev);



* [PATCH 1/3] driver core: Support asynchronous driver shutdown
  2023-12-12 18:09 Make NVME shutdown async Jeremy Allison
@ 2023-12-12 18:09 ` Jeremy Allison
  2023-12-13 13:59   ` Sagi Grimberg
  2023-12-13 17:48   ` Bart Van Assche
  0 siblings, 2 replies; 20+ messages in thread
From: Jeremy Allison @ 2023-12-12 18:09 UTC (permalink / raw)
  To: jallison, jra, tansuresh, hch; +Cc: linux-nvme

From: Tanjore Suresh <tansuresh@google.com>

This changes the bus driver interface with additional entry points
to enable devices to implement asynchronous shutdown. The existing
synchronous interface to shutdown is unmodified and retained for
backward compatibility.

This changes the common device shutdown code to enable devices to
participate in asynchronous shutdown implementation.

Signed-off-by: Tanjore Suresh <tansuresh@google.com>
---
 drivers/base/core.c        | 39 +++++++++++++++++++++++++++++++++++++-
 include/linux/device/bus.h | 10 ++++++++++
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 67ba592afc77..d9745822fb50 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -4725,6 +4725,7 @@ EXPORT_SYMBOL_GPL(device_change_owner);
 void device_shutdown(void)
 {
 	struct device *dev, *parent;
+	LIST_HEAD(async_shutdown_list);
 
 	wait_for_device_probe();
 	device_block_probing();
@@ -4769,7 +4770,14 @@ void device_shutdown(void)
 				dev_info(dev, "shutdown_pre\n");
 			dev->class->shutdown_pre(dev);
 		}
-		if (dev->bus && dev->bus->shutdown) {
+
+		if (dev->bus && dev->bus->shutdown_pre) {
+			if (initcall_debug)
+				dev_info(dev, "shutdown_pre\n");
+			dev->bus->shutdown_pre(dev);
+			list_add(&dev->kobj.entry,
+				&async_shutdown_list);
+		} else if (dev->bus && dev->bus->shutdown) {
 			if (initcall_debug)
 				dev_info(dev, "shutdown\n");
 			dev->bus->shutdown(dev);
@@ -4789,6 +4797,35 @@ void device_shutdown(void)
 		spin_lock(&devices_kset->list_lock);
 	}
 	spin_unlock(&devices_kset->list_lock);
+
+	/*
+	 * Second pass spin for only devices, that have configured
+	 * Asynchronous shutdown.
+	 */
+	while (!list_empty(&async_shutdown_list)) {
+		dev = list_entry(async_shutdown_list.next, struct device,
+				kobj.entry);
+		parent = get_device(dev->parent);
+		get_device(dev);
+		/*
+		 * Make sure the device is off the  list
+		 */
+		list_del_init(&dev->kobj.entry);
+		if (parent)
+			device_lock(parent);
+		device_lock(dev);
+		if (dev->bus && dev->bus->shutdown_post) {
+			if (initcall_debug)
+				dev_info(dev,
+				"shutdown_post called\n");
+			dev->bus->shutdown_post(dev);
+		}
+		device_unlock(dev);
+		if (parent)
+			device_unlock(parent);
+		put_device(dev);
+		put_device(parent);
+	}
 }
 
 /*
diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
index ae10c4322754..cbcb001f6336 100644
--- a/include/linux/device/bus.h
+++ b/include/linux/device/bus.h
@@ -48,6 +48,14 @@ struct fwnode_handle;
  *		will never get called until they do.
  * @remove:	Called when a device removed from this bus.
  * @shutdown:	Called at shut-down time to quiesce the device.
+ * @shutdown_pre:	Called at the shutdown-time to start the shutdown
+ *			process on the device. This entry point will be called
+ *			only when the bus driver has indicated it would like
+ *			to participate in asynchronous shutdown completion.
+ * @shutdown_post:	Called at shutdown-time  to complete the shutdown
+ *			process of the device. This entry point will be called
+ *			only when the bus driver has indicated it would like to
+ *			participate in the asynchronous shutdown completion.
  *
  * @online:	Called to put the device back online (after offlining it).
  * @offline:	Called to put the device offline for hot-removal. May fail.
@@ -90,6 +98,8 @@ struct bus_type {
 	void (*sync_state)(struct device *dev);
 	void (*remove)(struct device *dev);
 	void (*shutdown)(struct device *dev);
+	void (*shutdown_pre)(struct device *dev);
+	void (*shutdown_post)(struct device *dev);
 
 	int (*online)(struct device *dev);
 	int (*offline)(struct device *dev);
-- 
2.39.3




* Re: [PATCH 1/3] driver core: Support asynchronous driver shutdown
  2022-03-24 21:34 Tanjore Suresh
  2022-03-25  5:59 ` Greg Kroah-Hartman
@ 2022-03-25 13:24 ` Bjorn Helgaas
  1 sibling, 0 replies; 20+ messages in thread
From: Bjorn Helgaas @ 2022-03-25 13:24 UTC (permalink / raw)
  To: Tanjore Suresh; +Cc: Greg Kroah-Hartman, Rafael J . Wysocki, linux-kernel

[dropped "trivial" from cc]

On Thu, Mar 24, 2022 at 02:34:45PM -0700, Tanjore Suresh wrote:
> This changes the bus driver interface to take in a flag to indicate
> whether a bus and associated devices are willing to participate in
> the asynchronous shutdown. If this flag is not set bus driver
> implementation will follow synchronous shutdown semantics.
> 
> Signed-off-by: Tanjore Suresh <tansuresh@google.com>

There's useful functionality here.  Some hints to make it more
digestible:

- Add a cover letter to give an overview.  The patches themselves
  should be sent as responses to the cover letter so everything is
  connected in the email archives.

  [1] is a nice example of what this looks like.  You can currently
  find your series as [2], which searches for everything from you, but
  there's no single permanent URL that finds the whole series.

- Send the whole series (cover letter + patches) to everybody, so
  people can see the context and where each piece fits.

  No need to CC the "trivial@kernel.org" list.  That's for things like
  tiny, obviously correct patches that can be reviewed very quickly.

- Wait a week or so for any comments on this series before sending a
  revised v2.  When you send a v2, use "git format-patch -v 2" or
  similar to mark it as v2.  Also include notes on what changed
  between v1 (this posting) and v2.  [1] has nice examples of how to
  do that, both in the cover letter and the individual patches.

- Update this commit log so it matches the code (there is no longer a
  flag).

- Write commit logs in imperative mood; see [3, 4].

  In this case, your commit log should probably have two parts: the
  first to outline the problem, and the second to say what this
  patch does about it, e.g., something like this:

    Driver .shutdown() methods are all run serially, so there's no
    parallelism even across unrelated devices.

    Add an optional asynchronous shutdown method so drivers can
    schedule work to be done in parallel.

A few code comments below.

[1] https://lore.kernel.org/linux-pci/20220325093827.4983-1-pali@kernel.org/T/#t
[2] https://lore.kernel.org/all/?q=f%3Atansuresh
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/maintainer-tip.rst?id=v5.16#n134
[4] https://chris.beams.io/posts/git-commit/

> ---
>  drivers/base/core.c        | 39 +++++++++++++++++++++++++++++++++++++-
>  include/linux/device/bus.h | 10 ++++++++++
>  2 files changed, 48 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 3d6430eb0c6a..359e7067e8b8 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -4479,6 +4479,7 @@ EXPORT_SYMBOL_GPL(device_change_owner);
>  void device_shutdown(void)
>  {
>  	struct device *dev, *parent;
> +	LIST_HEAD(async_shutdown_list);
>  
>  	wait_for_device_probe();
>  	device_block_probing();
> @@ -4523,7 +4524,14 @@ void device_shutdown(void)
>  				dev_info(dev, "shutdown_pre\n");
>  			dev->class->shutdown_pre(dev);
>  		}
> -		if (dev->bus && dev->bus->shutdown) {
> +
> +		if (dev->bus && dev->bus->shutdown_pre) {
> +			if (initcall_debug)
> +				dev_info(dev, "shutdown_pre\n");
> +			dev->bus->shutdown_pre(dev);
> +			list_add(&dev->kobj.entry,
> +				&async_shutdown_list);
> +		} else if (dev->bus && dev->bus->shutdown) {
>  			if (initcall_debug)
>  				dev_info(dev, "shutdown\n");
>  			dev->bus->shutdown(dev);
> @@ -4543,6 +4551,35 @@ void device_shutdown(void)
>  		spin_lock(&devices_kset->list_lock);
>  	}
>  	spin_unlock(&devices_kset->list_lock);
> +
> +	/*
> +	 * Second pass spin for only devices, that have configured
> +	 * Asynchronous shutdown.
> +	 */
> +	while (!list_empty(&async_shutdown_list)) {
> +		dev = list_entry(async_shutdown_list.next, struct device,
> +				kobj.entry);
> +		parent = get_device(dev->parent);
> +		get_device(dev);
> +		/*
> +		 * Make sure the device is off the  list
> +		 */
> +		list_del_init(&dev->kobj.entry);
> +		if (parent)
> +			device_lock(parent);
> +		device_lock(dev);
> +		if (dev->bus && dev->bus->shutdown_post) {
> +			if (initcall_debug)
> +				dev_info(dev,
> +				"shutdown_post called\n");
> +			dev->bus->shutdown_post(dev);
> +		}
> +		device_unlock(dev);
> +		if (parent)
> +			device_unlock(parent);
> +		put_device(dev);
> +		put_device(parent);
> +	}

I'm not a driver core expert, but AFAICS, the existing model is that
.shutdown() is always synchronous.  We call it for each device
serially.

And your proposal is to allow some shutdown processing to happen in
parallel, by adding .shutdown_pre() to *schedule* work that can happen
after .shutdown_pre() returns, and .shutdown_post() to *wait* for all
that scheduled work to complete.

IIUC, .shutdown_post() is completely synchronous, just like the
existing .shutdown() is, so it seems unnecessary to add it.

Seems like it would be simpler to add an optional .shutdown_async()
method.  This method would be called in a loop *before* the existing
loop that calls .shutdown(), and it could start the async work.

Drivers that implement .shutdown_async() would at the same time update
their .shutdown() methods to wait for all the work started by
.shutdown_async().
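
Roughly, and only as a userspace sketch of the idea (struct alt_dev,
alt_device_shutdown() and the example callbacks are hypothetical names,
not existing driver-core code), that alternative would look like:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device with the two hooks described above. */
struct alt_dev {
	void (*shutdown_async)(struct alt_dev *dev); /* optional: start work */
	void (*shutdown)(struct alt_dev *dev);       /* existing hook; waits */
	int pending;
	int off;
};

static void alt_device_shutdown(struct alt_dev *devs, size_t n)
{
	size_t i;

	assert(devs != NULL || n == 0);

	/* New loop, run first: let drivers kick off their async work. */
	for (i = 0; i < n; i++)
		if (devs[i].shutdown_async)
			devs[i].shutdown_async(&devs[i]);

	/* Existing loop: .shutdown() waits for any work started above. */
	for (i = 0; i < n; i++)
		if (devs[i].shutdown)
			devs[i].shutdown(&devs[i]);
}

/* Example callbacks for a driver that opts in. */
static void alt_start(struct alt_dev *dev)
{
	dev->pending = 1;
}

static void alt_finish(struct alt_dev *dev)
{
	dev->pending = 0;
	dev->off = 1;
}
```

Drivers that do not implement .shutdown_async() are untouched in this
variant; only the opt-in drivers change what their .shutdown() does.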

>  }
>  
>  /*
> diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
> index a039ab809753..e261819601e9 100644
> --- a/include/linux/device/bus.h
> +++ b/include/linux/device/bus.h
> @@ -49,6 +49,14 @@ struct fwnode_handle;
>   *		will never get called until they do.
>   * @remove:	Called when a device removed from this bus.
>   * @shutdown:	Called at shut-down time to quiesce the device.
> + * @shutdown_pre:	Called at the shutdown-time to start the shutdown
> + *			process on the device. This entry point will be called
> + *			only when the bus driver has indicated it would like
> + *			to participate in asynchronous shutdown completion.
> + * @shutdown_post:	Called at shutdown-time  to complete the shutdown
> + *			process of the device. This entry point will be called
> + *			only when the bus driver has indicated it would like to
> + *			participate in the asynchronous shutdown completion.
>   *
>   * @online:	Called to put the device back online (after offlining it).
>   * @offline:	Called to put the device offline for hot-removal. May fail.
> @@ -93,6 +101,8 @@ struct bus_type {
>  	void (*sync_state)(struct device *dev);
>  	void (*remove)(struct device *dev);
>  	void (*shutdown)(struct device *dev);
> +	void (*shutdown_pre)(struct device *dev);
> +	void (*shutdown_post)(struct device *dev);
>  
>  	int (*online)(struct device *dev);
>  	int (*offline)(struct device *dev);
> -- 
> 2.35.1.1021.g381101b075-goog
> 


* Re: [PATCH 1/3] driver core: Support asynchronous driver shutdown
  2022-03-24 21:34 Tanjore Suresh
@ 2022-03-25  5:59 ` Greg Kroah-Hartman
  2022-03-25 13:24 ` Bjorn Helgaas
  1 sibling, 0 replies; 20+ messages in thread
From: Greg Kroah-Hartman @ 2022-03-25  5:59 UTC (permalink / raw)
  To: Tanjore Suresh; +Cc: Rafael J . Wysocki, linux-kernel, trivial

On Thu, Mar 24, 2022 at 02:34:45PM -0700, Tanjore Suresh wrote:
> --- a/include/linux/device/bus.h
> +++ b/include/linux/device/bus.h
> @@ -49,6 +49,14 @@ struct fwnode_handle;
>   *		will never get called until they do.
>   * @remove:	Called when a device removed from this bus.
>   * @shutdown:	Called at shut-down time to quiesce the device.
> + * @shutdown_pre:	Called at shutdown time to start the shutdown
> + *			process on the device. This entry point will be called
> + *			only when the bus driver has indicated it would like
> + *			to participate in asynchronous shutdown completion.
> + * @shutdown_post:	Called at shutdown time to complete the shutdown
> + *			process of the device. This entry point will be called
> + *			only when the bus driver has indicated it would like to
> + *			participate in the asynchronous shutdown completion.

Sorry, but no, this should not be needed, especially as you did not offer
any justification or reason to do so.

Nor did you send the remaining changes in the series to me, and why
would these be "trivial"?

Please work with others at Google who know how to submit changes to the
kernel first and get their review and signed-off-by on the changes
before sending them out again.

good luck!

greg k-h

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/3] driver core: Support asynchronous driver shutdown
@ 2022-03-24 21:34 Tanjore Suresh
  2022-03-25  5:59 ` Greg Kroah-Hartman
  2022-03-25 13:24 ` Bjorn Helgaas
  0 siblings, 2 replies; 20+ messages in thread
From: Tanjore Suresh @ 2022-03-24 21:34 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J . Wysocki
  Cc: linux-kernel, trivial, Tanjore Suresh

This changes the bus driver interface to take in a flag to indicate
whether a bus and associated devices are willing to participate in
the asynchronous shutdown. If this flag is not set, the bus driver
implementation will follow synchronous shutdown semantics.

Signed-off-by: Tanjore Suresh <tansuresh@google.com>
---
 drivers/base/core.c        | 39 +++++++++++++++++++++++++++++++++++++-
 include/linux/device/bus.h | 10 ++++++++++
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 3d6430eb0c6a..359e7067e8b8 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -4479,6 +4479,7 @@ EXPORT_SYMBOL_GPL(device_change_owner);
 void device_shutdown(void)
 {
 	struct device *dev, *parent;
+	LIST_HEAD(async_shutdown_list);
 
 	wait_for_device_probe();
 	device_block_probing();
@@ -4523,7 +4524,14 @@ void device_shutdown(void)
 				dev_info(dev, "shutdown_pre\n");
 			dev->class->shutdown_pre(dev);
 		}
-		if (dev->bus && dev->bus->shutdown) {
+
+		if (dev->bus && dev->bus->shutdown_pre) {
+			if (initcall_debug)
+				dev_info(dev, "shutdown_pre\n");
+			dev->bus->shutdown_pre(dev);
+			list_add(&dev->kobj.entry,
+				&async_shutdown_list);
+		} else if (dev->bus && dev->bus->shutdown) {
 			if (initcall_debug)
 				dev_info(dev, "shutdown\n");
 			dev->bus->shutdown(dev);
@@ -4543,6 +4551,35 @@ void device_shutdown(void)
 		spin_lock(&devices_kset->list_lock);
 	}
 	spin_unlock(&devices_kset->list_lock);
+
+	/*
+	 * Second pass: spin only for devices that have configured
+	 * asynchronous shutdown.
+	 */
+	while (!list_empty(&async_shutdown_list)) {
+		dev = list_entry(async_shutdown_list.next, struct device,
+				kobj.entry);
+		parent = get_device(dev->parent);
+		get_device(dev);
+		/*
+		 * Make sure the device is off the list
+		 */
+		list_del_init(&dev->kobj.entry);
+		if (parent)
+			device_lock(parent);
+		device_lock(dev);
+		if (dev->bus && dev->bus->shutdown_post) {
+			if (initcall_debug)
+				dev_info(dev,
+				"shutdown_post called\n");
+			dev->bus->shutdown_post(dev);
+		}
+		device_unlock(dev);
+		if (parent)
+			device_unlock(parent);
+		put_device(dev);
+		put_device(parent);
+	}
 }
 
 /*
diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
index a039ab809753..e261819601e9 100644
--- a/include/linux/device/bus.h
+++ b/include/linux/device/bus.h
@@ -49,6 +49,14 @@ struct fwnode_handle;
  *		will never get called until they do.
  * @remove:	Called when a device removed from this bus.
  * @shutdown:	Called at shut-down time to quiesce the device.
+ * @shutdown_pre:	Called at shutdown time to start the shutdown
+ *			process on the device. This entry point will be called
+ *			only when the bus driver has indicated it would like
+ *			to participate in asynchronous shutdown completion.
+ * @shutdown_post:	Called at shutdown time to complete the shutdown
+ *			process of the device. This entry point will be called
+ *			only when the bus driver has indicated it would like to
+ *			participate in the asynchronous shutdown completion.
  *
  * @online:	Called to put the device back online (after offlining it).
  * @offline:	Called to put the device offline for hot-removal. May fail.
@@ -93,6 +101,8 @@ struct bus_type {
 	void (*sync_state)(struct device *dev);
 	void (*remove)(struct device *dev);
 	void (*shutdown)(struct device *dev);
+	void (*shutdown_pre)(struct device *dev);
+	void (*shutdown_post)(struct device *dev);
 
 	int (*online)(struct device *dev);
 	int (*offline)(struct device *dev);
-- 
2.35.1.1021.g381101b075-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-12-19 14:12 UTC | newest]

Thread overview: 20+ messages
2023-12-15  0:03 Make NVME shutdown async - version 2 Jeremy Allison
2023-12-15  0:03 ` [PATCH 1/3] driver core: Support asynchronous driver shutdown Jeremy Allison
2023-12-15 12:21   ` Greg KH
2023-12-19  5:33   ` Christoph Hellwig
2023-12-19  6:19     ` Jeremy Allison
2023-12-19  6:21       ` Christoph Hellwig
2023-12-19 13:49         ` Sagi Grimberg
2023-12-19 13:56           ` Christoph Hellwig
2023-12-19 14:12             ` Sagi Grimberg
2023-12-15  0:03 ` [PATCH 2/3] PCI: Support asynchronous shutdown Jeremy Allison
2023-12-15  0:03 ` [PATCH 3/3] nvme: Add async shutdown support Jeremy Allison
2023-12-19  5:43   ` Christoph Hellwig
2023-12-19  6:35     ` Jeremy Allison
  -- strict thread matches above, loose matches on Subject: below --
2023-12-12 18:09 Make NVME shutdown async Jeremy Allison
2023-12-12 18:09 ` [PATCH 1/3] driver core: Support asynchronous driver shutdown Jeremy Allison
2023-12-13 13:59   ` Sagi Grimberg
2023-12-13 17:34     ` Jeremy Allison
2023-12-13 17:48   ` Bart Van Assche
2022-03-24 21:34 Tanjore Suresh
2022-03-25  5:59 ` Greg Kroah-Hartman
2022-03-25 13:24 ` Bjorn Helgaas
