[PATCH 0/3 RFC] Driver core / ACPI: Add offline/online for graceful hot-removal of devices
From: Rafael J. Wysocki @ 2013-04-29 12:23 UTC
  To: Greg Kroah-Hartman, Toshi Kani
  Cc: ACPI Devel Mailing List, LKML, isimatu.yasuaki, vasilis.liaskovitis

Hi,

It has been argued a number of times that in some cases, if a device cannot
be gracefully removed from the system, it shouldn't be removed from it at all,
because that may lead to a kernel crash.  In particular, that will happen if a
memory module holding kernel memory is removed, but also removing the last CPU
in the system may not be a good idea.  [And I can imagine a few other cases
like that.]

The kernel currently only supports "forced" hot-remove, which cannot be stopped
once started, so users have no choice but to try to hot-remove stuff and see
whether or not that crashes the kernel, which is kind of unpleasant.  That seems
to be based on the "the user knows better" argument, according to which users
triggering device hot-removal should really know what they are doing, so the
kernel doesn't have to worry about that.  However, that pretty much isn't the
case for memory modules, for instance, because users have no way to see
whether or not any kernel memory has been allocated from a given module.

There have been a few attempts to address this issue, but none of them has
gained broader acceptance.  The following 3 patches are the heart of a new
proposal based on the idea of introducing device_offline() and
device_online() operations along the lines of the existing CPU offline/online
mechanism (or, rather, to extend the CPU offline/online so that analogous
operations are available for other devices).  The way it is supposed to work is
that device_offline() will fail if the given device cannot be gracefully
removed from the system (in the kernel's view).  Once it succeeds, though, the
device won't be used any more until either it is removed, or device_online() is
run for it.  That will allow the ACPI device hot-remove code, for example,
to avoid triggering a non-reversible removal procedure for devices that cannot
be removed gracefully.
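
To make the intended semantics concrete, here is a minimal userspace sketch in
C.  It is not the code from patch [1/3]: struct toy_device, the busy flag,
toy_device_use() and the specific error codes (-EBUSY, -ENODEV) are all
assumptions made for illustration.  It only models the contract described
above: offline fails when graceful removal is not possible, and a device that
went offline is not used again until it is put back online.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; NOT the kernel's struct device. */
struct toy_device {
	bool offline;	/* true once the device has been taken offline */
	bool busy;	/* hypothetical: "cannot be removed gracefully" */
};

/* Returns 0 on success, -EBUSY if the device cannot go offline gracefully. */
int toy_device_offline(struct toy_device *dev)
{
	assert(dev != NULL);
	if (dev->offline)
		return 0;		/* nothing to do (assumed behavior) */
	if (dev->busy)
		return -EBUSY;		/* refuse instead of risking a crash */
	dev->offline = true;
	return 0;
}

/* Makes the device usable again; always succeeds in this toy model. */
int toy_device_online(struct toy_device *dev)
{
	assert(dev != NULL);
	dev->offline = false;
	return 0;
}

/* Any attempt to use an offline device is rejected. */
int toy_device_use(const struct toy_device *dev)
{
	assert(dev != NULL);
	return dev->offline ? -ENODEV : 0;
}
```

In this model a caller that gets 0 back from toy_device_offline() can go on to
remove the device, while a caller that gets -EBUSY leaves it alone.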

Patch [1/3] introduces device_offline() and device_online() as outlined above.
The .offline() and .online() callbacks are only added at the bus type level for
now, because that should be sufficient to cover the memory and CPU use cases.
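
The bus-type-level placement of the callbacks could look roughly like the
following userspace model.  Again, this is a sketch and not the actual patch:
struct toy_bus_type, struct toy_device, the users counter, the example "cpu"
callbacks and the -ENOTSUP return for buses without callbacks are all
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_device;

/* Toy bus type: the callbacks live here, not in the individual driver. */
struct toy_bus_type {
	const char *name;
	int (*offline)(struct toy_device *dev);	/* may be NULL */
	int (*online)(struct toy_device *dev);	/* may be NULL */
};

struct toy_device {
	const struct toy_bus_type *bus;
	bool offline;
	int users;	/* hypothetical stand-in for "still needed" */
};

/* Dispatch to the bus type; only update state if the callback succeeds. */
int toy_device_offline(struct toy_device *dev)
{
	int ret;

	assert(dev != NULL && dev->bus != NULL);
	if (!dev->bus->offline || !dev->bus->online)
		return -ENOTSUP;	/* assumed: bus has no offline support */
	if (dev->offline)
		return 0;
	ret = dev->bus->offline(dev);
	if (ret == 0)
		dev->offline = true;
	return ret;
}

int toy_device_online(struct toy_device *dev)
{
	int ret;

	assert(dev != NULL && dev->bus != NULL);
	if (!dev->bus->offline || !dev->bus->online)
		return -ENOTSUP;
	if (!dev->offline)
		return 0;
	ret = dev->bus->online(dev);
	if (ret == 0)
		dev->offline = false;
	return ret;
}

/* Example callbacks for a made-up "cpu" bus. */
int toy_cpu_offline(struct toy_device *dev)
{
	return dev->users > 0 ? -EBUSY : 0;
}

int toy_cpu_online(struct toy_device *dev)
{
	(void)dev;
	return 0;
}

const struct toy_bus_type toy_cpu_bus = {
	.name = "cpu",
	.offline = toy_cpu_offline,
	.online = toy_cpu_online,
};

/* A bus type that provides neither callback. */
const struct toy_bus_type toy_plain_bus = { .name = "plain" };
```

Keeping the callbacks in the bus type means every device on that bus gets the
same offline/online behavior without each driver having to provide its own.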

Patch [2/3] modifies the CPU hotplug support code to use device_offline() and
device_online() to support the sysfs 'online' attribute for CPUs.

Patch [3/3] changes the ACPI device hot-remove code to use device_offline()
for checking if graceful removal of devices is possible.  The way it does that
is to walk the list of "physical" companion devices for each struct acpi_device
involved in the operation and call device_offline() for each of them.  If any
of the device_offline() calls fails (and the hot-removal is not "forced", which
is an option), the removal procedure (which is not reversible) is simply not
carried out.
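
The check described for patch [3/3] can be modeled in userspace along these
lines.  This is only a sketch under stated assumptions: struct toy_phys_dev,
toy_offline() and toy_hot_remove_check() are hypothetical names, an array
stands in for the list of "physical" companion devices, and putting the
already-offlined devices back online on failure is my assumption rather than
something stated above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one "physical" companion device. */
struct toy_phys_dev {
	bool removable;	/* hypothetical: can it go offline gracefully? */
	bool offline;
};

static int toy_offline(struct toy_phys_dev *d)
{
	if (!d->removable)
		return -EBUSY;
	d->offline = true;
	return 0;
}

/*
 * Try to offline all n companions.  Returns 0 if the (irreversible)
 * removal may proceed, or a negative error code if it must not.
 * With forced == true, offline failures are ignored.
 */
int toy_hot_remove_check(struct toy_phys_dev *devs, size_t n, bool forced)
{
	size_t i;

	assert(devs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		int ret = toy_offline(&devs[i]);

		if (ret && !forced) {
			/* Assumed rollback: undo the offlines done so far. */
			while (i-- > 0)
				devs[i].offline = false;
			return ret;
		}
	}
	return 0;
}
```

The point of the two-phase shape is that nothing irreversible happens until
every companion device has agreed to go offline, or the caller has explicitly
asked for a forced removal.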

Of some concern is that device_offline() (and possibly device_online()) is
called under physical_node_lock of the corresponding struct acpi_device, which
introduces an ordering dependency between that lock and device locks for the
"physical" devices, but I didn't see any cleaner way to do that (I guess it
is avoidable at the expense of added complexity, but for now it's just better
to make the code as clean as possible IMO).

The next step will be to modify the ACPI processor driver to use the new
mechanism.  Unfortunately, this isn't really straightforward, because it
requires untangling some event handling functionality from hotplug support
code, but I don't see any fundamental obstacles to that at the moment.  Then,
the same approach may be applied to memory hotplug and possibly other devices
in the future.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Thread overview: 105+ messages
2013-04-29 12:23 [PATCH 0/3 RFC] Driver core / ACPI: Add offline/online for graceful hot-removal of devices Rafael J. Wysocki
2013-04-29 12:26 ` [PATCH 1/3 RFC] Driver core: Add offline/online device operations Rafael J. Wysocki
2013-04-29 23:10   ` Greg Kroah-Hartman
2013-04-30 11:59     ` Rafael J. Wysocki
2013-04-30 15:32       ` Greg Kroah-Hartman
2013-04-30 20:05         ` Rafael J. Wysocki
2013-04-30 23:38   ` Toshi Kani
2013-05-02  0:58     ` Rafael J. Wysocki
2013-05-02 23:29       ` Toshi Kani
2013-05-03 11:48         ` Rafael J. Wysocki
2013-04-29 12:28 ` [PATCH 2/3 RFC] Driver core: Use generic offline/online for CPU offline/online Rafael J. Wysocki
2013-04-29 23:11   ` Greg Kroah-Hartman
2013-04-30 12:01     ` Rafael J. Wysocki
2013-04-30 15:27       ` Greg Kroah-Hartman
2013-04-30 20:06         ` Rafael J. Wysocki
2013-04-30 23:42   ` Toshi Kani
2013-05-01 14:49     ` Rafael J. Wysocki
2013-05-01 20:07       ` Toshi Kani
2013-05-02  0:26         ` Rafael J. Wysocki
2013-04-29 12:29 ` [PATCH 3/3 RFC] ACPI / hotplug: Use device offline/online for graceful hot-removal Rafael J. Wysocki
2013-04-30 23:49   ` Toshi Kani
2013-05-01 15:05     ` Rafael J. Wysocki
2013-05-01 20:20       ` Toshi Kani
2013-05-02  0:53         ` Rafael J. Wysocki
2013-05-02 12:26 ` [PATCH 0/4] Driver core / ACPI: Add offline/online for graceful hot-removal of devices Rafael J. Wysocki
2013-05-02 12:27   ` [PATCH 1/4] Driver core: Add offline/online device operations Rafael J. Wysocki
2013-05-02 13:57     ` Greg Kroah-Hartman
2013-05-02 23:11     ` Toshi Kani
2013-05-02 23:36       ` Rafael J. Wysocki
2013-05-02 23:23         ` Toshi Kani
2013-05-02 12:28   ` [PATCH 2/4] Driver core: Use generic offline/online for CPU offline/online Rafael J. Wysocki
2013-05-02 13:57     ` Greg Kroah-Hartman
2013-05-02 12:29   ` [PATCH 3/4] ACPI / hotplug: Use device offline/online for graceful hot-removal Rafael J. Wysocki
2013-05-02 12:31   ` [PATCH 4/4] ACPI / processor: Use common hotplug infrastructure Rafael J. Wysocki
2013-05-02 13:59     ` Greg Kroah-Hartman
2013-05-02 23:20     ` Toshi Kani
2013-05-03 12:05       ` Rafael J. Wysocki
2013-05-03 12:21         ` Rafael J. Wysocki
2013-05-03 18:27         ` Toshi Kani
2013-05-03 19:31           ` Rafael J. Wysocki
2013-05-03 19:34             ` Toshi Kani
2013-05-04  1:01   ` [PATCH 0/3 RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
2013-05-04  1:03     ` [PATCH 1/3 RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes Rafael J. Wysocki
2013-05-04  1:04     ` [PATCH 2/3 RFC] Driver core: Introduce types of device "online" Rafael J. Wysocki
2013-05-04  1:06     ` [PATCH 3/3 RFC] Driver core: Introduce offline/online callbacks for memory blocks Rafael J. Wysocki
2013-05-04 11:11     ` [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
2013-05-04 11:12       ` [PATCH 1/2 v2, RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes Rafael J. Wysocki
2013-05-21  6:50         ` Tang Chen
2013-05-04 11:21       ` [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks Rafael J. Wysocki
2013-05-06 16:28         ` Vasilis Liaskovitis
2013-05-07  0:59           ` Rafael J. Wysocki
2013-05-07 10:59             ` Vasilis Liaskovitis
2013-05-07 12:11               ` Rafael J. Wysocki
2013-05-07 21:03                 ` Toshi Kani
2013-05-07 22:10                   ` Rafael J. Wysocki
2013-05-07 22:45                     ` Toshi Kani
2013-05-07 23:17                       ` Rafael J. Wysocki
2013-05-07 23:59                         ` Toshi Kani
2013-05-08  0:24                           ` Rafael J. Wysocki
2013-05-08  0:37                             ` Toshi Kani
2013-05-08 11:53                               ` Rafael J. Wysocki
2013-05-08 14:38                                 ` Toshi Kani
2013-05-06 17:20         ` Greg Kroah-Hartman
2013-05-06 19:46           ` Rafael J. Wysocki
2013-05-21  6:37         ` Tang Chen
2013-05-21 11:15           ` Rafael J. Wysocki
2013-05-22  4:45             ` Tang Chen
2013-05-22 10:42               ` Rafael J. Wysocki
2013-05-22 22:06               ` [PATCH] Driver core / memory: Simplify __memory_block_change_state() Rafael J. Wysocki
2013-05-22 22:14                 ` Greg Kroah-Hartman
2013-05-22 23:29                   ` Rafael J. Wysocki
2013-05-23  4:37                 ` Tang Chen
2013-05-06 10:48       ` [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
