* [PATCH v2 0/1] multi-threading device shutdown
@ 2018-05-05 15:40 ` Pavel Tatashin
  0 siblings, 0 replies; 12+ messages in thread
From: Pavel Tatashin @ 2018-05-05 15:40 UTC (permalink / raw)
  To: pasha.tatashin, steven.sistare, daniel.m.jordan, linux-kernel,
	jeffrey.t.kirsher, intel-wired-lan, netdev, gregkh,
	alexander.duyck, tobin

Changelog v1 - v2
- It turns out we cannot hold more than MAX_LOCK_DEPTH locks in a single
  thread (by default this value is 48; it is used to detect deadlocks).
  So, I rewrote the code to lock only one device per thread instead of
  pre-locking all devices from the main thread.
- Addressed comments from Tobin C. Harding.
- As suggested by Alexander Duyck, removed the ixgbe changes. That can be
  done as separate work on scaling the RTNL mutex.

Do a faster shutdown by calling dev->*->shutdown(dev) in parallel.
device_shutdown() calls these functions for every single device, but using
only one thread. Since nothing else is running on the machine by the time
device_shutdown() is called, there is no reason not to utilize all the
available CPU resources.

Pavel Tatashin (1):
  drivers core: multi-threading device shutdown

 drivers/base/core.c | 275 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 225 insertions(+), 50 deletions(-)

-- 
2.17.0

^ permalink raw reply	[flat|nested] 12+ messages in thread
* [PATCH v2 1/1] drivers core: multi-threading device shutdown
  2018-05-05 15:40 ` [Intel-wired-lan] " Pavel Tatashin
@ 2018-05-05 15:40 ` Pavel Tatashin
  -1 siblings, 0 replies; 12+ messages in thread
From: Pavel Tatashin @ 2018-05-05 15:40 UTC (permalink / raw)
  To: pasha.tatashin, steven.sistare, daniel.m.jordan, linux-kernel,
	jeffrey.t.kirsher, intel-wired-lan, netdev, gregkh,
	alexander.duyck, tobin

When the system is rebooted, halted, or kexeced, device_shutdown() is
called.

This function shuts down every single device by calling either:

	dev->bus->shutdown(dev)
	dev->driver->shutdown(dev)

Even on a machine with just a moderate number of devices,
device_shutdown() may take multiple seconds to complete. This is because
many devices require specific delays to perform this operation.

Here is a sample analysis of the time it takes to call device_shutdown()
on a two-socket Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz machine.

device_shutdown		2.95s
-----------------------------
mlx4_shutdown		1.14s
megasas_shutdown	0.24s
ixgbe_shutdown		0.37s x 4 (four ixgbe devices on this machine)
the rest		0.09s

In mlx4 we spend the most time, but that is because there is a 1-second
sleep, which is defined by the hardware specifications:

mlx4_shutdown
 mlx4_unload_one
  mlx4_free_ownership
   msleep(1000)

With megasas we spend a quarter of a second, but sometimes longer (up to
0.5s), in this path:

megasas_shutdown
 megasas_flush_cache
  megasas_issue_blocked_cmd
   wait_event_timeout

Finally, ixgbe_shutdown() takes 0.37s for each device, but that time is
spread all over the place, with the bigger offenders:

ixgbe_shutdown
 __ixgbe_shutdown
  ixgbe_close_suspend
   ixgbe_down
    ixgbe_init_hw_generic
     ixgbe_reset_hw_X540
      msleep(100);			0.104483472
      ixgbe_get_san_mac_addr_generic	0.048414851
      ixgbe_get_wwn_prefix_generic	0.048409893
     ixgbe_start_hw_X540
      ixgbe_start_hw_generic
       ixgbe_clear_hw_cntrs_generic	0.048581502
       ixgbe_setup_fc_generic		0.024225800

All the ixgbe_*generic functions end up calling:

ixgbe_read_eerd_X540()
 ixgbe_acquire_swfw_sync_X540
  usleep_range(5000, 6000);
 ixgbe_release_swfw_sync_X540
  usleep_range(5000, 6000);

While these are short sleeps, they end up being called over 24 times!
24 * 0.0055s = 0.132s per device, adding up to 0.528s for the four
devices. We also have four msleep(100) calls, for a total of 0.928s.

While we should keep optimizing the individual device drivers, in some
cases this is simply a hardware property that forces a specific delay,
and we must wait.

So, the solution for this problem is to shut down devices in parallel.
However, we must shut down children before shutting down parents, so a
parent device must wait for its children to finish.
With this patch, on the same machine device_shutdown() takes 1.142s, and
without the mlx4 one-second delay only 0.38s.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
---
 drivers/base/core.c | 275 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 225 insertions(+), 50 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index b610816eb887..110d7970deef 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -25,6 +25,7 @@
 #include <linux/netdevice.h>
 #include <linux/sched/signal.h>
 #include <linux/sysfs.h>
+#include <linux/kthread.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -2102,6 +2103,59 @@ const char *device_get_devnode(struct device *dev,
 	return *tmp = s;
 }
 
+/**
+ * device_children_count - device children count
+ * @parent: parent struct device.
+ *
+ * Returns number of children for this device or 0 if none.
+ */
+static int device_children_count(struct device *parent)
+{
+	struct klist_iter i;
+	int children = 0;
+
+	if (!parent->p)
+		return 0;
+
+	klist_iter_init(&parent->p->klist_children, &i);
+	while (next_device(&i))
+		children++;
+	klist_iter_exit(&i);
+
+	return children;
+}
+
+/**
+ * device_get_child_by_index - Return child using the provided index.
+ * @parent: parent struct device.
+ * @index: Index of the child, where 0 is the first child in the children
+ *	   list, and so on.
+ *
+ * Returns child or NULL if child with this index is not present.
+ */
+static struct device *
+device_get_child_by_index(struct device *parent, int index)
+{
+	struct klist_iter i;
+	struct device *dev = NULL, *d;
+	int child_index = 0;
+
+	if (!parent->p || index < 0)
+		return NULL;
+
+	klist_iter_init(&parent->p->klist_children, &i);
+	while ((d = next_device(&i))) {
+		if (child_index == index) {
+			dev = d;
+			break;
+		}
+		child_index++;
+	}
+	klist_iter_exit(&i);
+
+	return dev;
+}
+
 /**
  * device_for_each_child - device child iterator.
  * @parent: parent struct device.
@@ -2765,71 +2819,192 @@ int device_move(struct device *dev, struct device *new_parent,
 }
 EXPORT_SYMBOL_GPL(device_move);
 
+/*
+ * device_shutdown_one - call ->shutdown() for the device passed as
+ * argument.
+ */
+static void device_shutdown_one(struct device *dev)
+{
+	/* Don't allow any more runtime suspends */
+	pm_runtime_get_noresume(dev);
+	pm_runtime_barrier(dev);
+
+	if (dev->class && dev->class->shutdown_pre) {
+		if (initcall_debug)
+			dev_info(dev, "shutdown_pre\n");
+		dev->class->shutdown_pre(dev);
+	}
+	if (dev->bus && dev->bus->shutdown) {
+		if (initcall_debug)
+			dev_info(dev, "shutdown\n");
+		dev->bus->shutdown(dev);
+	} else if (dev->driver && dev->driver->shutdown) {
+		if (initcall_debug)
+			dev_info(dev, "shutdown\n");
+		dev->driver->shutdown(dev);
+	}
+
+	/* decrement the reference counter */
+	put_device(dev);
+}
+
+/**
+ * Passed as an argument to device_shutdown_child_task().
+ * child_next_index	the next available child index.
+ * tasks_running	number of tasks still running. Each task decrements it
+ *			when its job is finished, and the last task signals
+ *			that the job is complete.
+ * complete		Used to signal job completion.
+ * parent		Parent device.
+ */
+struct device_shutdown_task_data {
+	atomic_t		child_next_index;
+	atomic_t		tasks_running;
+	struct completion	complete;
+	struct device		*parent;
+};
+
+static int device_shutdown_child_task(void *data);
+
+/**
+ * These globals are used by tasks that are started for root devices.
+ * device_root_tasks_finished	Number of root devices finished shutting down.
+ * device_root_tasks_started	Total number of root device tasks started.
+ * device_root_tasks_done	The completion signal to the main thread.
+ */
+static atomic_t device_root_tasks_finished;
+static atomic_t device_root_tasks_started;
+struct completion device_root_tasks_done;
+
+/**
+ * Shut down the device tree whose root is dev. If dev has no children,
+ * simply shut down only this device. If dev has children, recursively shut
+ * down the children first, and only then the parent. For performance
+ * reasons children are shut down in parallel using kernel threads. dev is
+ * locked, so its children cannot be removed while this function is running.
+ */
+static void device_shutdown_tree(struct device *dev)
+{
+	int children_count = device_children_count(dev);
+
+	if (children_count) {
+		struct device_shutdown_task_data tdata;
+		int i;
+
+		init_completion(&tdata.complete);
+		atomic_set(&tdata.child_next_index, 0);
+		atomic_set(&tdata.tasks_running, children_count);
+		tdata.parent = dev;
+
+		for (i = 0; i < children_count; i++) {
+			kthread_run(device_shutdown_child_task,
+				    &tdata, "device_shutdown.%s",
+				    dev_name(dev));
+		}
+		wait_for_completion(&tdata.complete);
+	}
+	device_shutdown_one(dev);
+}
+
+/**
+ * Only devices with a parent go through this function. The parent is
+ * locked and waits for all of its children to finish shutting down before
+ * calling the shutdown function for itself.
+ */
+static int device_shutdown_child_task(void *data)
+{
+	struct device_shutdown_task_data *tdata = data;
+	int cidx = atomic_inc_return(&tdata->child_next_index) - 1;
+	struct device *dev = device_get_child_by_index(tdata->parent, cidx);
+
+	/* ref. counter is going to be decremented in device_shutdown_one() */
+	get_device(dev);
+	device_lock(dev);
+	device_shutdown_tree(dev);
+	device_unlock(dev);
+
+	/* If we are the last to exit, signal the completion */
+	if (atomic_dec_return(&tdata->tasks_running) == 0)
+		complete(&tdata->complete);
+	return 0;
+}
+
+/**
+ * On shutdown each root device (the one that does not have a parent) goes
+ * through this function.
+ */
+static int device_shutdown_root_task(void *data)
+{
+	struct device *dev = (struct device *)data;
+	int root_devices;
+
+	device_lock(dev);
+	device_shutdown_tree(dev);
+	device_unlock(dev);
+
+	/* If we are the last to exit, signal the completion */
+	root_devices = atomic_inc_return(&device_root_tasks_finished);
+	if (root_devices == atomic_read(&device_root_tasks_started))
+		complete(&device_root_tasks_done);
+	return 0;
+}
+
 /**
  * device_shutdown - call ->shutdown() on each device to shutdown.
  */
 void device_shutdown(void)
 {
-	struct device *dev, *parent;
+	int root_devices = 0;
+	struct device *dev;
 
-	spin_lock(&devices_kset->list_lock);
-	/*
-	 * Walk the devices list backward, shutting down each in turn.
-	 * Beware that device unplug events may also start pulling
-	 * devices offline, even as the system is shutting down.
+	atomic_set(&device_root_tasks_finished, 0);
+	atomic_set(&device_root_tasks_started, 0);
+	init_completion(&device_root_tasks_done);
+
+	/* Shut down the root devices in parallel. The children are going to
+	 * be shut down first in device_shutdown_tree().
 	 */
+	spin_lock(&devices_kset->list_lock);
 	while (!list_empty(&devices_kset->list)) {
-		dev = list_entry(devices_kset->list.prev, struct device,
-				kobj.entry);
+		dev = list_entry(devices_kset->list.next, struct device,
+				 kobj.entry);
 
-		/*
-		 * hold reference count of device's parent to
-		 * prevent it from being freed because parent's
-		 * lock is to be held
-		 */
-		parent = get_device(dev->parent);
-		get_device(dev);
-		/*
-		 * Make sure the device is off the kset list, in the
-		 * event that dev->*->shutdown() doesn't remove it.
+		/* Make sure the device is off the kset list, in the event that
+		 * dev->*->shutdown() doesn't remove it.
 		 */
 		list_del_init(&dev->kobj.entry);
-		spin_unlock(&devices_kset->list_lock);
-
-		/* hold lock to avoid race with probe/release */
-		if (parent)
-			device_lock(parent);
-		device_lock(dev);
-
-		/* Don't allow any more runtime suspends */
-		pm_runtime_get_noresume(dev);
-		pm_runtime_barrier(dev);
 
-		if (dev->class && dev->class->shutdown_pre) {
-			if (initcall_debug)
-				dev_info(dev, "shutdown_pre\n");
-			dev->class->shutdown_pre(dev);
+		/* Here we start tasks for root devices only */
+		if (!dev->parent) {
+			/* Prevents devices from being freed. The counter is
+			 * going to be decremented in device_shutdown_one()
+			 * once this root device is shut down.
+			 */
+			get_device(dev);
+
+			/* We unlock the list for performance reasons:
+			 * dev->*->shutdown() may try to take this lock to
+			 * remove us from the kset list. To avoid unlocking
+			 * this list we could replace the spin lock in
+			 * dev->kobj.kset->list_lock with a dummy one once the
+			 * device is locked in device_shutdown_root_task() and
+			 * in device_shutdown_child_task().
+			 */
+			spin_unlock(&devices_kset->list_lock);
+
+			root_devices++;
+			kthread_run(device_shutdown_root_task,
+				    dev, "device_root_shutdown.%s",
+				    dev_name(dev));
+			spin_lock(&devices_kset->list_lock);
 		}
-		if (dev->bus && dev->bus->shutdown) {
-			if (initcall_debug)
-				dev_info(dev, "shutdown\n");
-			dev->bus->shutdown(dev);
-		} else if (dev->driver && dev->driver->shutdown) {
-			if (initcall_debug)
-				dev_info(dev, "shutdown\n");
-			dev->driver->shutdown(dev);
-		}
-
-		device_unlock(dev);
-		if (parent)
-			device_unlock(parent);
-
-		put_device(dev);
-		put_device(parent);
-
-		spin_lock(&devices_kset->list_lock);
 	}
 	spin_unlock(&devices_kset->list_lock);
+
+	/* Set the number of root tasks started, and wait for completion */
+	atomic_set(&device_root_tasks_started, root_devices);
+	if (root_devices != atomic_read(&device_root_tasks_finished))
+		wait_for_completion(&device_root_tasks_done);
 }
 
 /*
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread
* Re: [Intel-wired-lan] [PATCH v2 1/1] drivers core: multi-threading device shutdown
  2018-05-05 15:40 ` [Intel-wired-lan] " Pavel Tatashin
@ 2018-05-06  1:56 ` kbuild test robot
  -1 siblings, 0 replies; 12+ messages in thread
From: kbuild test robot @ 2018-05-06 1:56 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: kbuild-all, pasha.tatashin, steven.sistare, daniel.m.jordan,
	linux-kernel, jeffrey.t.kirsher, intel-wired-lan, netdev, gregkh,
	alexander.duyck, tobin

Hi Pavel,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on driver-core/driver-core-testing]
[also build test WARNING on v4.17-rc3 next-20180504]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Pavel-Tatashin/multi-threading-device-shutdown/20180506-035150
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__

sparse warnings: (new ones prefixed by >>)

>> drivers/base/core.c:2877:25: sparse: symbol 'device_root_tasks_done' was not declared. Should it be static?
   drivers/base/core.c:62:5: sparse: context imbalance in 'device_links_read_lock' - wrong count at exit
   drivers/base/core.c:67:6: sparse: context imbalance in 'device_links_read_unlock' - unexpected unlock

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply	[flat|nested] 12+ messages in thread
* [RFC PATCH] drivers core: device_root_tasks_done can be static
  2018-05-05 15:40 ` [Intel-wired-lan] " Pavel Tatashin
@ 2018-05-06  1:56 ` kbuild test robot
  -1 siblings, 0 replies; 12+ messages in thread
From: kbuild test robot @ 2018-05-06 1:56 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: kbuild-all, pasha.tatashin, steven.sistare, daniel.m.jordan,
	linux-kernel, jeffrey.t.kirsher, intel-wired-lan, netdev, gregkh,
	alexander.duyck, tobin

Fixes: 6d54ba128905 ("drivers core: multi-threading device shutdown")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---
 core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 110d797..666c163 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2874,7 +2874,7 @@ static int device_shutdown_child_task(void *data);
  */
 static atomic_t device_root_tasks_finished;
 static atomic_t device_root_tasks_started;
-struct completion device_root_tasks_done;
+static struct completion device_root_tasks_done;
 
 /**
  * Shutdown device tree with root started in dev. If dev has no children

^ permalink raw reply related	[flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/1] drivers core: multi-threading device shutdown
  2018-05-05 15:40 ` [Intel-wired-lan] " Pavel Tatashin
@ 2018-05-23 10:40 ` Pavel Machek
  -1 siblings, 0 replies; 12+ messages in thread
From: Pavel Machek @ 2018-05-23 10:40 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: steven.sistare, daniel.m.jordan, linux-kernel, jeffrey.t.kirsher,
	intel-wired-lan, netdev, gregkh, alexander.duyck, tobin

[-- Attachment #1: Type: text/plain, Size: 1260 bytes --]

Hi!

> When system is rebooted, halted or kexeced device_shutdown() is
> called.
>
> This function shuts down every single device by calling either:
>
>	dev->bus->shutdown(dev)
>	dev->driver->shutdown(dev)

...

> Finally, with ixgbe_shutdown() it takes 0.37 for each device, but that time
> is spread all over the place, with bigger offenders:
>
> ixgbe_shutdown
>  __ixgbe_shutdown
>   ixgbe_close_suspend
>    ixgbe_down
>     ixgbe_init_hw_generic
>      ixgbe_reset_hw_X540
>       msleep(100);			0.104483472
>       ixgbe_get_san_mac_addr_generic	0.048414851
>       ixgbe_get_wwn_prefix_generic	0.048409893
>      ixgbe_start_hw_X540
>       ixgbe_start_hw_generic
>        ixgbe_clear_hw_cntrs_generic	0.048581502
>        ixgbe_setup_fc_generic		0.024225800

ixgbe is a network card, right? So ... it does not have any persistent
state and no moving parts, and there's no reason we could not "just
power it down"?

> /*

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/1] drivers core: multi-threading device shutdown
2018-05-23 10:40 ` [Intel-wired-lan] " Pavel Machek
@ 2018-05-23 11:31 ` Pavel Tatashin
-1 siblings, 0 replies; 12+ messages in thread
From: Pavel Tatashin @ 2018-05-23 11:31 UTC
To: pavel
Cc: Steven Sistare, Daniel Jordan, LKML, jeffrey.t.kirsher, intel-wired-lan, netdev, gregkh, Alexander Duyck, tobin

Hi Pavel,

Thank you for looking at this patch. BTW, version 5 is out. The latest
thread is anchored here:
http://lkml.kernel.org/r/20180516024004.28977-1-pasha.tatashin@oracle.com

> ixgbe is a network card, right? So ... it does not have any persistent
> state and no moving parts, and there's no reason we could not "just
> power it down"?

True, but the same path is used for both a regular reboot and a kexec
reboot. In the latter case we want to make sure that devices are
quiesced: they must not send any interrupts or do any DMA activity, and
should basically be in the same state as when the system was first cold
booted, so that the driver initialization functions can probe and bring
the devices back up. My understanding is that because we do not want to
diverge the regular reboot and kexec reboot paths, we do this for both
cases.
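The v2 changelog in this thread explains why the scheme was reworked: a single thread cannot hold more than MAX_LOCK_DEPTH (48 by default) locks, so instead of the main thread pre-locking every device, each worker thread takes only its own device's lock. A hedged userspace sketch of that shape, not the actual drivers/base/core.c code: `struct device`, `shutdown_task`, and the sibling/child layout here are simplified stand-ins, and setting `shut_down` stands in for calling dev->driver->shutdown(dev). Children are shut down (in parallel) before their parent, and no thread ever holds more than one lock.

```c
#include <pthread.h>
#include <stdlib.h>

/* Simplified device-tree node; not the kernel's struct device. */
struct device {
	pthread_mutex_t lock;
	struct device *child;   /* first child */
	struct device *sibling; /* next sibling */
	int shut_down;
};

static void *shutdown_task(void *arg);

/* Spawn one thread per child and wait for all of them: children
 * must be fully shut down before their parent is. */
static void shutdown_children(struct device *dev)
{
	struct device *c;
	pthread_t *tids;
	size_t n = 0;

	for (c = dev->child; c; c = c->sibling)
		n++;
	if (!n)
		return;
	tids = calloc(n, sizeof(*tids));
	n = 0;
	for (c = dev->child; c; c = c->sibling)
		pthread_create(&tids[n++], NULL, shutdown_task, c);
	while (n--)
		pthread_join(tids[n], NULL);
	free(tids);
}

/* Each thread locks only the one device it owns, so the per-thread
 * lock count never approaches a MAX_LOCK_DEPTH-style limit. */
static void *shutdown_task(void *arg)
{
	struct device *dev = arg;

	shutdown_children(dev);
	pthread_mutex_lock(&dev->lock);
	dev->shut_down = 1;	/* stand-in for the shutdown callback */
	pthread_mutex_unlock(&dev->lock);
	return NULL;
}
```

The design point is that the lock-depth limit is a per-task property: distributing the locking across threads sidesteps it without weakening the ordering guarantee (children before parents) that the joins enforce.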
end of thread [~2018-05-23 11:32 UTC | newest]

Thread overview: 12+ messages

2018-05-05 15:40 [PATCH v2 0/1] multi-threading device shutdown Pavel Tatashin
2018-05-05 15:40 ` [Intel-wired-lan] " Pavel Tatashin
2018-05-05 15:40 ` [PATCH v2 1/1] drivers core: " Pavel Tatashin
2018-05-05 15:40 ` [Intel-wired-lan] " Pavel Tatashin
2018-05-06  1:56 ` kbuild test robot
2018-05-06  1:56 ` kbuild test robot
2018-05-06  1:56 ` [RFC PATCH] drivers core: device_root_tasks_done can be static kbuild test robot
2018-05-06  1:56 ` [Intel-wired-lan] " kbuild test robot
2018-05-23 10:40 ` [PATCH v2 1/1] drivers core: multi-threading device shutdown Pavel Machek
2018-05-23 10:40 ` [Intel-wired-lan] " Pavel Machek
2018-05-23 11:31 ` Pavel Tatashin
2018-05-23 11:31 ` [Intel-wired-lan] " Pavel Tatashin