All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1] driver core: check for dead devices before onlining/offlining
@ 2020-01-20 10:49 David Hildenbrand
  2020-01-24  9:00 ` Greg Kroah-Hartman
  2020-01-24 17:14 ` David Hildenbrand
  0 siblings, 2 replies; 13+ messages in thread
From: David Hildenbrand @ 2020-01-20 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Greg Kroah-Hartman, Rafael J. Wysocki,
	Suzuki K Poulose, Saravana Kannan, Heikki Krogerus, Dan Williams,
	Michal Hocko

We can have rare cases where the removal of a device races with
somebody trying to online it (esp. via sysfs). We can simply check
if the device is already removed or getting removed under the dev->lock.

E.g., right now, if memory block devices are removed (remove_memory()),
we do a:

remove_memory() -> lock_device_hotplug() -> mem_hotplug_begin() ->
lock_device() -> dev->dead = true

Somebody coming via sysfs (/sys/devices/system/memory/memoryX/online)
triggers a:

lock_device_hotplug_sysfs() -> device_online() -> lock_device() ...

So if we made it just before the lock_device_hotplug_sysfs() but get
delayed until remove_memory() released all locks, we will continue
taking locks and trying to online the device - which is then a zombie
device.

Note that at least the memory onlining path seems to be protected by
checking if all memory sections are still present (something we can then
get rid of). We do have other sysfs attributes
(e.g., /sys/devices/system/memory/memoryX/valid_zones) that don't do any
such locking yet and might race with memory removal in a similar way. For
these users, we can then do a

device_lock(dev);
if (!device_is_dead(dev)) {
	/* magic /*
}
device_unlock(dev);

Introduce and use device_is_dead() right away.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Saravana Kannan <saravanak@google.com>
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---

Am I missing any obvious mechanism in the device core that handles
something like this already? (especially also for other sysfs attributes?)

---
 drivers/base/core.c    | 13 +++++++++++--
 drivers/base/dd.c      |  6 +++---
 include/linux/device.h |  1 +
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 42a672456432..01cd06eeb513 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2952,7 +2952,9 @@ int device_offline(struct device *dev)
 		return ret;
 
 	device_lock(dev);
-	if (device_supports_offline(dev)) {
+	if (device_is_dead(dev))
+		ret = -ENXIO;
+	else if (device_supports_offline(dev)) {
 		if (dev->offline) {
 			ret = 1;
 		} else {
@@ -2983,7 +2985,9 @@ int device_online(struct device *dev)
 	int ret = 0;
 
 	device_lock(dev);
-	if (device_supports_offline(dev)) {
+	if (device_is_dead(dev))
+		ret = -ENXIO;
+	else if (device_supports_offline(dev)) {
 		if (dev->offline) {
 			ret = dev->bus->online(dev);
 			if (!ret) {
@@ -2999,6 +3003,11 @@ int device_online(struct device *dev)
 	return ret;
 }
 
+bool device_is_dead(struct device *dev)
+{
+	return dev->p->dead;
+}
+
 struct root_device {
 	struct device dev;
 	struct module *owner;
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index b25bcab2a26b..b1d7361720af 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -848,7 +848,7 @@ static void __device_attach_async_helper(void *_dev, async_cookie_t cookie)
 	 * and deferred probe processing happens all at once with
 	 * multiple threads.
 	 */
-	if (dev->p->dead || dev->driver)
+	if (device_is_dead(dev) || dev->driver)
 		goto out_unlock;
 
 	if (dev->parent)
@@ -994,7 +994,7 @@ int device_driver_attach(struct device_driver *drv, struct device *dev)
 	 * If device has been removed or someone has already successfully
 	 * bound a driver before us just skip the driver probe call.
 	 */
-	if (!dev->p->dead && !dev->driver)
+	if (!device_is_dead(dev) && !dev->driver)
 		ret = driver_probe_device(drv, dev);
 
 	__device_driver_unlock(dev, dev->parent);
@@ -1016,7 +1016,7 @@ static void __driver_attach_async_helper(void *_dev, async_cookie_t cookie)
 	 * If device has been removed or someone has already successfully
 	 * bound a driver before us just skip the driver probe call.
 	 */
-	if (!dev->p->dead && !dev->driver)
+	if (!device_is_dead(dev) && !dev->driver)
 		ret = driver_probe_device(drv, dev);
 
 	__device_driver_unlock(dev, dev->parent);
diff --git a/include/linux/device.h b/include/linux/device.h
index ce6db68c3f29..acea6a14daa1 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -832,6 +832,7 @@ extern void unlock_device_hotplug(void);
 extern int lock_device_hotplug_sysfs(void);
 extern int device_offline(struct device *dev);
 extern int device_online(struct device *dev);
+extern bool device_is_dead(struct device *dev);
 extern void set_primary_fwnode(struct device *dev, struct fwnode_handle *fwnode);
 extern void set_secondary_fwnode(struct device *dev, struct fwnode_handle *fwnode);
 void device_set_of_node_from_dev(struct device *dev, const struct device *dev2);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] driver core: check for dead devices before onlining/offlining
  2020-01-20 10:49 [PATCH v1] driver core: check for dead devices before onlining/offlining David Hildenbrand
@ 2020-01-24  9:00 ` Greg Kroah-Hartman
  2020-01-24  9:09   ` David Hildenbrand
  2020-01-24 17:14 ` David Hildenbrand
  1 sibling, 1 reply; 13+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-24  9:00 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, Rafael J. Wysocki, Suzuki K Poulose,
	Saravana Kannan, Heikki Krogerus, Dan Williams, Michal Hocko

On Mon, Jan 20, 2020 at 11:49:09AM +0100, David Hildenbrand wrote:
> We can have rare cases where the removal of a device races with
> somebody trying to online it (esp. via sysfs). We can simply check
> if the device is already removed or getting removed under the dev->lock.
> 
> E.g., right now, if memory block devices are removed (remove_memory()),
> we do a:
> 
> remove_memory() -> lock_device_hotplug() -> mem_hotplug_begin() ->
> lock_device() -> dev->dead = true
> 
> Somebody coming via sysfs (/sys/devices/system/memory/memoryX/online)
> triggers a:
> 
> lock_device_hotplug_sysfs() -> device_online() -> lock_device() ...
> 
> So if we made it just before the lock_device_hotplug_sysfs() but get
> delayed until remove_memory() released all locks, we will continue
> taking locks and trying to online the device - which is then a zombie
> device.
> 
> Note that at least the memory onlining path seems to be protected by
> checking if all memory sections are still present (something we can then
> get rid of). We do have other sysfs attributes
> (e.g., /sys/devices/system/memory/memoryX/valid_zones) that don't do any
> such locking yet and might race with memory removal in a similar way. For
> these users, we can then do a
> 
> device_lock(dev);
> if (!device_is_dead(dev)) {
> 	/* magic /*
> }
> device_unlock(dev);
> 
> Introduce and use device_is_dead() right away.
> 
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: Saravana Kannan <saravanak@google.com>
> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
> 
> Am I missing any obvious mechanism in the device core that handles
> something like this already? (especially also for other sysfs attributes?)

So is a sysfs attribute causing the device itself to go away?  We have
problems with that in the past, look at how the scsi layer handled it, I
think there's a specific call you should be making instead of trying to
rely on this "dead" flag.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] driver core: check for dead devices before onlining/offlining
  2020-01-24  9:00 ` Greg Kroah-Hartman
@ 2020-01-24  9:09   ` David Hildenbrand
  2020-01-24  9:12     ` Greg Kroah-Hartman
  2020-01-24  9:38     ` Rafael J. Wysocki
  0 siblings, 2 replies; 13+ messages in thread
From: David Hildenbrand @ 2020-01-24  9:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Rafael J. Wysocki, Suzuki K Poulose,
	Saravana Kannan, Heikki Krogerus, Dan Williams, Michal Hocko

On 24.01.20 10:00, Greg Kroah-Hartman wrote:
> On Mon, Jan 20, 2020 at 11:49:09AM +0100, David Hildenbrand wrote:
>> We can have rare cases where the removal of a device races with
>> somebody trying to online it (esp. via sysfs). We can simply check
>> if the device is already removed or getting removed under the dev->lock.
>>
>> E.g., right now, if memory block devices are removed (remove_memory()),
>> we do a:
>>
>> remove_memory() -> lock_device_hotplug() -> mem_hotplug_begin() ->
>> lock_device() -> dev->dead = true
>>
>> Somebody coming via sysfs (/sys/devices/system/memory/memoryX/online)
>> triggers a:
>>
>> lock_device_hotplug_sysfs() -> device_online() -> lock_device() ...
>>
>> So if we made it just before the lock_device_hotplug_sysfs() but get
>> delayed until remove_memory() released all locks, we will continue
>> taking locks and trying to online the device - which is then a zombie
>> device.
>>
>> Note that at least the memory onlining path seems to be protected by
>> checking if all memory sections are still present (something we can then
>> get rid of). We do have other sysfs attributes
>> (e.g., /sys/devices/system/memory/memoryX/valid_zones) that don't do any
>> such locking yet and might race with memory removal in a similar way. For
>> these users, we can then do a
>>
>> device_lock(dev);
>> if (!device_is_dead(dev)) {
>> 	/* magic /*
>> }
>> device_unlock(dev);
>>
>> Introduce and use device_is_dead() right away.
>>
>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
>> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
>> Cc: Saravana Kannan <saravanak@google.com>
>> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Michal Hocko <mhocko@kernel.org>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>
>> Am I missing any obvious mechanism in the device core that handles
>> something like this already? (especially also for other sysfs attributes?)
> 
> So is a sysfs attribute causing the device itself to go away?  We have

nope, removal is triggered via the driver, not via a sysfs attribute.

Regarding this patch: Is there anything prohibiting the possible
scenario I document above (IOW, is this patch applicable, or is there
another way to fence it properly (e.g., the "specific call" you mentioned))?

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] driver core: check for dead devices before onlining/offlining
  2020-01-24  9:09   ` David Hildenbrand
@ 2020-01-24  9:12     ` Greg Kroah-Hartman
  2020-01-24 13:31       ` David Hildenbrand
  2020-01-24  9:38     ` Rafael J. Wysocki
  1 sibling, 1 reply; 13+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-24  9:12 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, Rafael J. Wysocki, Suzuki K Poulose,
	Saravana Kannan, Heikki Krogerus, Dan Williams, Michal Hocko

On Fri, Jan 24, 2020 at 10:09:03AM +0100, David Hildenbrand wrote:
> On 24.01.20 10:00, Greg Kroah-Hartman wrote:
> > On Mon, Jan 20, 2020 at 11:49:09AM +0100, David Hildenbrand wrote:
> >> We can have rare cases where the removal of a device races with
> >> somebody trying to online it (esp. via sysfs). We can simply check
> >> if the device is already removed or getting removed under the dev->lock.
> >>
> >> E.g., right now, if memory block devices are removed (remove_memory()),
> >> we do a:
> >>
> >> remove_memory() -> lock_device_hotplug() -> mem_hotplug_begin() ->
> >> lock_device() -> dev->dead = true
> >>
> >> Somebody coming via sysfs (/sys/devices/system/memory/memoryX/online)
> >> triggers a:
> >>
> >> lock_device_hotplug_sysfs() -> device_online() -> lock_device() ...
> >>
> >> So if we made it just before the lock_device_hotplug_sysfs() but get
> >> delayed until remove_memory() released all locks, we will continue
> >> taking locks and trying to online the device - which is then a zombie
> >> device.
> >>
> >> Note that at least the memory onlining path seems to be protected by
> >> checking if all memory sections are still present (something we can then
> >> get rid of). We do have other sysfs attributes
> >> (e.g., /sys/devices/system/memory/memoryX/valid_zones) that don't do any
> >> such locking yet and might race with memory removal in a similar way. For
> >> these users, we can then do a
> >>
> >> device_lock(dev);
> >> if (!device_is_dead(dev)) {
> >> 	/* magic /*
> >> }
> >> device_unlock(dev);
> >>
> >> Introduce and use device_is_dead() right away.
> >>
> >> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> >> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> >> Cc: Saravana Kannan <saravanak@google.com>
> >> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
> >> Cc: Dan Williams <dan.j.williams@intel.com>
> >> Cc: Michal Hocko <mhocko@kernel.org>
> >> Signed-off-by: David Hildenbrand <david@redhat.com>
> >> ---
> >>
> >> Am I missing any obvious mechanism in the device core that handles
> >> something like this already? (especially also for other sysfs attributes?)
> > 
> > So is a sysfs attribute causing the device itself to go away?  We have
> 
> nope, removal is triggered via the driver, not via a sysfs attribute.

But the idea is the same, it comes from the driver, not the driver core.

> Regarding this patch: Is there anything prohibiting the possible
> scenario I document above (IOW, is this patch applicable, or is there
> another way to fence it properly (e.g., the "specific call" you mentioned))?

I think it's the same thing, look at how scsi does it.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] driver core: check for dead devices before onlining/offlining
  2020-01-24  9:09   ` David Hildenbrand
  2020-01-24  9:12     ` Greg Kroah-Hartman
@ 2020-01-24  9:38     ` Rafael J. Wysocki
  2020-01-24 13:29       ` David Hildenbrand
  1 sibling, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2020-01-24  9:38 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Greg Kroah-Hartman, Linux Kernel Mailing List, Rafael J. Wysocki,
	Suzuki K Poulose, Saravana Kannan, Heikki Krogerus, Dan Williams,
	Michal Hocko

On Fri, Jan 24, 2020 at 10:09 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 24.01.20 10:00, Greg Kroah-Hartman wrote:
> > On Mon, Jan 20, 2020 at 11:49:09AM +0100, David Hildenbrand wrote:
> >> We can have rare cases where the removal of a device races with
> >> somebody trying to online it (esp. via sysfs). We can simply check
> >> if the device is already removed or getting removed under the dev->lock.
> >>
> >> E.g., right now, if memory block devices are removed (remove_memory()),
> >> we do a:
> >>
> >> remove_memory() -> lock_device_hotplug() -> mem_hotplug_begin() ->
> >> lock_device() -> dev->dead = true
> >>
> >> Somebody coming via sysfs (/sys/devices/system/memory/memoryX/online)
> >> triggers a:
> >>
> >> lock_device_hotplug_sysfs() -> device_online() -> lock_device() ...
> >>
> >> So if we made it just before the lock_device_hotplug_sysfs() but get
> >> delayed until remove_memory() released all locks, we will continue
> >> taking locks and trying to online the device - which is then a zombie
> >> device.
> >>
> >> Note that at least the memory onlining path seems to be protected by
> >> checking if all memory sections are still present (something we can then
> >> get rid of). We do have other sysfs attributes
> >> (e.g., /sys/devices/system/memory/memoryX/valid_zones) that don't do any
> >> such locking yet and might race with memory removal in a similar way. For
> >> these users, we can then do a
> >>
> >> device_lock(dev);
> >> if (!device_is_dead(dev)) {
> >>      /* magic /*
> >> }
> >> device_unlock(dev);
> >>
> >> Introduce and use device_is_dead() right away.
> >>
> >> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> >> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> >> Cc: Saravana Kannan <saravanak@google.com>
> >> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
> >> Cc: Dan Williams <dan.j.williams@intel.com>
> >> Cc: Michal Hocko <mhocko@kernel.org>
> >> Signed-off-by: David Hildenbrand <david@redhat.com>
> >> ---
> >>
> >> Am I missing any obvious mechanism in the device core that handles
> >> something like this already? (especially also for other sysfs attributes?)
> >
> > So is a sysfs attribute causing the device itself to go away?  We have
>
> nope, removal is triggered via the driver, not via a sysfs attribute.
>
> Regarding this patch: Is there anything prohibiting the possible
> scenario I document above (IOW, is this patch applicable, or is there
> another way to fence it properly (e.g., the "specific call" you mentioned))?

For the devices that support online/offline (like CPUs in memory), the
idea is that calling device_del() on them while they are in use may
cause problems like data loss to occur and note that device_del()
itself cannot fail (because it needs to handle surprise removal too).
However, offline can fail, so the rule of thumb is to do the offline
(and handle the errors possibly returned by it) and only call
device_del() if that is successful.

Of course, if surprise removal is possible, offline is kind of
pointless, but if it is supported anyway it should return success when
the device is physically not present already.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] driver core: check for dead devices before onlining/offlining
  2020-01-24  9:38     ` Rafael J. Wysocki
@ 2020-01-24 13:29       ` David Hildenbrand
  0 siblings, 0 replies; 13+ messages in thread
From: David Hildenbrand @ 2020-01-24 13:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, Linux Kernel Mailing List, Suzuki K Poulose,
	Saravana Kannan, Heikki Krogerus, Dan Williams, Michal Hocko

On 24.01.20 10:38, Rafael J. Wysocki wrote:
> On Fri, Jan 24, 2020 at 10:09 AM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 24.01.20 10:00, Greg Kroah-Hartman wrote:
>>> On Mon, Jan 20, 2020 at 11:49:09AM +0100, David Hildenbrand wrote:
>>>> We can have rare cases where the removal of a device races with
>>>> somebody trying to online it (esp. via sysfs). We can simply check
>>>> if the device is already removed or getting removed under the dev->lock.
>>>>
>>>> E.g., right now, if memory block devices are removed (remove_memory()),
>>>> we do a:
>>>>
>>>> remove_memory() -> lock_device_hotplug() -> mem_hotplug_begin() ->
>>>> lock_device() -> dev->dead = true
>>>>
>>>> Somebody coming via sysfs (/sys/devices/system/memory/memoryX/online)
>>>> triggers a:
>>>>
>>>> lock_device_hotplug_sysfs() -> device_online() -> lock_device() ...
>>>>
>>>> So if we made it just before the lock_device_hotplug_sysfs() but get
>>>> delayed until remove_memory() released all locks, we will continue
>>>> taking locks and trying to online the device - which is then a zombie
>>>> device.
>>>>
>>>> Note that at least the memory onlining path seems to be protected by
>>>> checking if all memory sections are still present (something we can then
>>>> get rid of). We do have other sysfs attributes
>>>> (e.g., /sys/devices/system/memory/memoryX/valid_zones) that don't do any
>>>> such locking yet and might race with memory removal in a similar way. For
>>>> these users, we can then do a
>>>>
>>>> device_lock(dev);
>>>> if (!device_is_dead(dev)) {
>>>>      /* magic /*
>>>> }
>>>> device_unlock(dev);
>>>>
>>>> Introduce and use device_is_dead() right away.
>>>>
>>>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>>> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
>>>> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>> Cc: Saravana Kannan <saravanak@google.com>
>>>> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
>>>> Cc: Dan Williams <dan.j.williams@intel.com>
>>>> Cc: Michal Hocko <mhocko@kernel.org>
>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>> ---
>>>>
>>>> Am I missing any obvious mechanism in the device core that handles
>>>> something like this already? (especially also for other sysfs attributes?)
>>>
>>> So is a sysfs attribute causing the device itself to go away?  We have
>>
>> nope, removal is triggered via the driver, not via a sysfs attribute.
>>
>> Regarding this patch: Is there anything prohibiting the possible
>> scenario I document above (IOW, is this patch applicable, or is there
>> another way to fence it properly (e.g., the "specific call" you mentioned))?
> 
> For the devices that support online/offline (like CPUs in memory), the
> idea is that calling device_del() on them while they are in use may
> cause problems like data loss to occur and note that device_del()
> itself cannot fail (because it needs to handle surprise removal too).
> However, offline can fail, so the rule of thumb is to do the offline
> (and handle the errors possibly returned by it) and only call
> device_del() if that is successful.
> 
> Of course, if surprise removal is possible, offline is kind of
> pointless, but if it is supported anyway it should return success when
> the device is physically not present already.

Just to clarify: This is not about removing devices that are still
online (surprise removal). This is about possible race with user space
*while* removing an offline device. I think Greg pointed me into the
right direction regarding avoiding that ...


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] driver core: check for dead devices before onlining/offlining
  2020-01-24  9:12     ` Greg Kroah-Hartman
@ 2020-01-24 13:31       ` David Hildenbrand
  2020-01-24 13:48         ` David Hildenbrand
  0 siblings, 1 reply; 13+ messages in thread
From: David Hildenbrand @ 2020-01-24 13:31 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Rafael J. Wysocki, Suzuki K Poulose,
	Saravana Kannan, Heikki Krogerus, Dan Williams, Michal Hocko

On 24.01.20 10:12, Greg Kroah-Hartman wrote:
> On Fri, Jan 24, 2020 at 10:09:03AM +0100, David Hildenbrand wrote:
>> On 24.01.20 10:00, Greg Kroah-Hartman wrote:
>>> On Mon, Jan 20, 2020 at 11:49:09AM +0100, David Hildenbrand wrote:
>>>> We can have rare cases where the removal of a device races with
>>>> somebody trying to online it (esp. via sysfs). We can simply check
>>>> if the device is already removed or getting removed under the dev->lock.
>>>>
>>>> E.g., right now, if memory block devices are removed (remove_memory()),
>>>> we do a:
>>>>
>>>> remove_memory() -> lock_device_hotplug() -> mem_hotplug_begin() ->
>>>> lock_device() -> dev->dead = true
>>>>
>>>> Somebody coming via sysfs (/sys/devices/system/memory/memoryX/online)
>>>> triggers a:
>>>>
>>>> lock_device_hotplug_sysfs() -> device_online() -> lock_device() ...
>>>>
>>>> So if we made it just before the lock_device_hotplug_sysfs() but get
>>>> delayed until remove_memory() released all locks, we will continue
>>>> taking locks and trying to online the device - which is then a zombie
>>>> device.
>>>>
>>>> Note that at least the memory onlining path seems to be protected by
>>>> checking if all memory sections are still present (something we can then
>>>> get rid of). We do have other sysfs attributes
>>>> (e.g., /sys/devices/system/memory/memoryX/valid_zones) that don't do any
>>>> such locking yet and might race with memory removal in a similar way. For
>>>> these users, we can then do a
>>>>
>>>> device_lock(dev);
>>>> if (!device_is_dead(dev)) {
>>>> 	/* magic /*
>>>> }
>>>> device_unlock(dev);
>>>>
>>>> Introduce and use device_is_dead() right away.
>>>>
>>>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>>> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
>>>> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>> Cc: Saravana Kannan <saravanak@google.com>
>>>> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
>>>> Cc: Dan Williams <dan.j.williams@intel.com>
>>>> Cc: Michal Hocko <mhocko@kernel.org>
>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>> ---
>>>>
>>>> Am I missing any obvious mechanism in the device core that handles
>>>> something like this already? (especially also for other sysfs attributes?)
>>>
>>> So is a sysfs attribute causing the device itself to go away?  We have
>>
>> nope, removal is triggered via the driver, not via a sysfs attribute.
> 
> But the idea is the same, it comes from the driver, not the driver core.
> 
>> Regarding this patch: Is there anything prohibiting the possible
>> scenario I document above (IOW, is this patch applicable, or is there
>> another way to fence it properly (e.g., the "specific call" you mentioned))?
> 
> I think it's the same thing, look at how scsi does it.

I think you are talking about doing a "transport_remove_device(dev)"
before doing the "device_del(dev)", combined with proper locking.

Will look into that for the memory subsystem ...

Thanks!

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] driver core: check for dead devices before onlining/offlining
  2020-01-24 13:31       ` David Hildenbrand
@ 2020-01-24 13:48         ` David Hildenbrand
  2020-01-24 16:31           ` David Hildenbrand
  0 siblings, 1 reply; 13+ messages in thread
From: David Hildenbrand @ 2020-01-24 13:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Rafael J. Wysocki, Suzuki K Poulose,
	Saravana Kannan, Heikki Krogerus, Dan Williams, Michal Hocko

On 24.01.20 14:31, David Hildenbrand wrote:
> On 24.01.20 10:12, Greg Kroah-Hartman wrote:
>> On Fri, Jan 24, 2020 at 10:09:03AM +0100, David Hildenbrand wrote:
>>> On 24.01.20 10:00, Greg Kroah-Hartman wrote:
>>>> On Mon, Jan 20, 2020 at 11:49:09AM +0100, David Hildenbrand wrote:
>>>>> We can have rare cases where the removal of a device races with
>>>>> somebody trying to online it (esp. via sysfs). We can simply check
>>>>> if the device is already removed or getting removed under the dev->lock.
>>>>>
>>>>> E.g., right now, if memory block devices are removed (remove_memory()),
>>>>> we do a:
>>>>>
>>>>> remove_memory() -> lock_device_hotplug() -> mem_hotplug_begin() ->
>>>>> lock_device() -> dev->dead = true
>>>>>
>>>>> Somebody coming via sysfs (/sys/devices/system/memory/memoryX/online)
>>>>> triggers a:
>>>>>
>>>>> lock_device_hotplug_sysfs() -> device_online() -> lock_device() ...
>>>>>
>>>>> So if we made it just before the lock_device_hotplug_sysfs() but get
>>>>> delayed until remove_memory() released all locks, we will continue
>>>>> taking locks and trying to online the device - which is then a zombie
>>>>> device.
>>>>>
>>>>> Note that at least the memory onlining path seems to be protected by
>>>>> checking if all memory sections are still present (something we can then
>>>>> get rid of). We do have other sysfs attributes
>>>>> (e.g., /sys/devices/system/memory/memoryX/valid_zones) that don't do any
>>>>> such locking yet and might race with memory removal in a similar way. For
>>>>> these users, we can then do a
>>>>>
>>>>> device_lock(dev);
>>>>> if (!device_is_dead(dev)) {
>>>>> 	/* magic /*
>>>>> }
>>>>> device_unlock(dev);
>>>>>
>>>>> Introduce and use device_is_dead() right away.
>>>>>
>>>>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>>>> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
>>>>> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>>> Cc: Saravana Kannan <saravanak@google.com>
>>>>> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
>>>>> Cc: Dan Williams <dan.j.williams@intel.com>
>>>>> Cc: Michal Hocko <mhocko@kernel.org>
>>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>>> ---
>>>>>
>>>>> Am I missing any obvious mechanism in the device core that handles
>>>>> something like this already? (especially also for other sysfs attributes?)
>>>>
>>>> So is a sysfs attribute causing the device itself to go away?  We have
>>>
>>> nope, removal is triggered via the driver, not via a sysfs attribute.
>>
>> But the idea is the same, it comes from the driver, not the driver core.
>>
>>> Regarding this patch: Is there anything prohibiting the possible
>>> scenario I document above (IOW, is this patch applicable, or is there
>>> another way to fence it properly (e.g., the "specific call" you mentioned))?
>>
>> I think it's the same thing, look at how scsi does it.
> 
> I think you are talking about doing a "transport_remove_device(dev)"
> before doing the "device_del(dev)", combined with proper locking.
> 
> Will look into that for the memory subsystem ...

... looking into transports, it most probably does not apply here, hmm ...

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] driver core: check for dead devices before onlining/offlining
  2020-01-24 13:48         ` David Hildenbrand
@ 2020-01-24 16:31           ` David Hildenbrand
  0 siblings, 0 replies; 13+ messages in thread
From: David Hildenbrand @ 2020-01-24 16:31 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Rafael J. Wysocki, Suzuki K Poulose,
	Saravana Kannan, Heikki Krogerus, Dan Williams, Michal Hocko

On 24.01.20 14:48, David Hildenbrand wrote:
> On 24.01.20 14:31, David Hildenbrand wrote:
>> On 24.01.20 10:12, Greg Kroah-Hartman wrote:
>>> On Fri, Jan 24, 2020 at 10:09:03AM +0100, David Hildenbrand wrote:
>>>> On 24.01.20 10:00, Greg Kroah-Hartman wrote:
>>>>> On Mon, Jan 20, 2020 at 11:49:09AM +0100, David Hildenbrand wrote:
>>>>>> We can have rare cases where the removal of a device races with
>>>>>> somebody trying to online it (esp. via sysfs). We can simply check
>>>>>> if the device is already removed or getting removed under the dev->lock.
>>>>>>
>>>>>> E.g., right now, if memory block devices are removed (remove_memory()),
>>>>>> we do a:
>>>>>>
>>>>>> remove_memory() -> lock_device_hotplug() -> mem_hotplug_begin() ->
>>>>>> lock_device() -> dev->dead = true
>>>>>>
>>>>>> Somebody coming via sysfs (/sys/devices/system/memory/memoryX/online)
>>>>>> triggers a:
>>>>>>
>>>>>> lock_device_hotplug_sysfs() -> device_online() -> lock_device() ...
>>>>>>
>>>>>> So if we made it just before the lock_device_hotplug_sysfs() but get
>>>>>> delayed until remove_memory() released all locks, we will continue
>>>>>> taking locks and trying to online the device - which is then a zombie
>>>>>> device.
>>>>>>
>>>>>> Note that at least the memory onlining path seems to be protected by
>>>>>> checking if all memory sections are still present (something we can then
>>>>>> get rid of). We do have other sysfs attributes
>>>>>> (e.g., /sys/devices/system/memory/memoryX/valid_zones) that don't do any
>>>>>> such locking yet and might race with memory removal in a similar way. For
>>>>>> these users, we can then do a
>>>>>>
>>>>>> device_lock(dev);
>>>>>> if (!device_is_dead(dev)) {
>>>>>> 	/* magic /*
>>>>>> }
>>>>>> device_unlock(dev);
>>>>>>
>>>>>> Introduce and use device_is_dead() right away.
>>>>>>
>>>>>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>>>>> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
>>>>>> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>>>> Cc: Saravana Kannan <saravanak@google.com>
>>>>>> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
>>>>>> Cc: Dan Williams <dan.j.williams@intel.com>
>>>>>> Cc: Michal Hocko <mhocko@kernel.org>
>>>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>>>> ---
>>>>>>
>>>>>> Am I missing any obvious mechanism in the device core that handles
>>>>>> something like this already? (especially also for other sysfs attributes?)
>>>>>
>>>>> So is a sysfs attribute causing the device itself to go away?  We have
>>>>
>>>> nope, removal is triggered via the driver, not via a sysfs attribute.
>>>
>>> But the idea is the same, it comes from the driver, not the driver core.
>>>
>>>> Regarding this patch: Is there anything prohibiting the possible
>>>> scenario I document above (IOW, is this patch applicable, or is there
>>>> another way to fence it properly (e.g., the "specific call" you mentioned))?
>>>
>>> I think it's the same thing, look at how scsi does it.
>>
>> I think you are talking about doing a "transport_remove_device(dev)"
>> before doing the "device_del(dev)", combined with proper locking.
>>
>> Will look into that for the memory subsystem ...
> 
> ... looking into transports, it most probably does not apply here, hmm ...
> 

/me digging through random code

I think you were referring to

1. sysfs_break_active_protection()
2. sysfs_unbreak_active_protection()

Which can be used to implement a self-removal of devices via attributes
(e.g., sdev_store_delete())

I don't see yet how that would solve the race between some driver
removing an offline device and someone being stuck in an attribute,
waiting for a lock - but maybe I am looking at the wrong function again :)

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] driver core: check for dead devices before onlining/offlining
  2020-01-20 10:49 [PATCH v1] driver core: check for dead devices before onlining/offlining David Hildenbrand
  2020-01-24  9:00 ` Greg Kroah-Hartman
@ 2020-01-24 17:14 ` David Hildenbrand
  2020-01-24 18:40   ` Dan Williams
  1 sibling, 1 reply; 13+ messages in thread
From: David Hildenbrand @ 2020-01-24 17:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Suzuki K Poulose,
	Saravana Kannan, Heikki Krogerus, Dan Williams, Michal Hocko

On 20.01.20 11:49, David Hildenbrand wrote:
> We can have rare cases where the removal of a device races with
> somebody trying to online it (esp. via sysfs). We can simply check
> if the device is already removed or getting removed under the dev->lock.
> 
> E.g., right now, if memory block devices are removed (remove_memory()),
> we do a:
> 
> remove_memory() -> lock_device_hotplug() -> mem_hotplug_begin() ->
> lock_device() -> dev->dead = true
> 
> Somebody coming via sysfs (/sys/devices/system/memory/memoryX/online)
> triggers a:
> 
> lock_device_hotplug_sysfs() -> device_online() -> lock_device() ...
> 
> So if we made it just before the lock_device_hotplug_sysfs() but get
> delayed until remove_memory() released all locks, we will continue
> taking locks and trying to online the device - which is then a zombie
> device.
> 
> Note that at least the memory onlining path seems to be protected by
> checking if all memory sections are still present (something we can then
> get rid of). We do have other sysfs attributes
> (e.g., /sys/devices/system/memory/memoryX/valid_zones) that don't do any
> such locking yet and might race with memory removal in a similar way. For
> these users, we can then do a
> 
> device_lock(dev);
> if (!device_is_dead(dev)) {
> 	/* magic /*
> }
> device_unlock(dev);
> 
> Introduce and use device_is_dead() right away.
> 

So, I just added the following:

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 01cd06eeb513..49c4d8671073 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1567,6 +1567,7 @@ static ssize_t online_store(struct device *dev,
struct device_attribute *attr,
        if (ret < 0)
                return ret;

+       msleep(10000);
        ret = lock_device_hotplug_sysfs();
        if (ret)
                return ret;

Then triggered
	echo 1 > /sys/devices/system/memory/memory51/online
And quickly afterwards unplugged the DIMM.

Good news is that we get (after 10 seconds)
	sh: echo: write error: No such device

Reason is that unplug will not finish before all sysfs attributes have
been exited by other threads. Therefore, the device_hotplug_lock will
remain held by the removing thread. The thread stuck in the sysfs
attribute will fail to trylock the device_hotplug_lock and return.

Other sysfs attributes that don't do a lock_device_hotplug_sysfs() might
have to do a trylock/lock on the device_lock to synchronize properly.

Summary: This patch is not necessary.

-- 
Thanks,

David / dhildenb


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] driver core: check for dead devices before onlining/offlining
  2020-01-24 17:14 ` David Hildenbrand
@ 2020-01-24 18:40   ` Dan Williams
  2020-01-24 18:58     ` David Hildenbrand
  0 siblings, 1 reply; 13+ messages in thread
From: Dan Williams @ 2020-01-24 18:40 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Linux Kernel Mailing List, Greg Kroah-Hartman, Rafael J. Wysocki,
	Suzuki K Poulose, Saravana Kannan, Heikki Krogerus, Michal Hocko

On Fri, Jan 24, 2020 at 9:14 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 20.01.20 11:49, David Hildenbrand wrote:
> > We can have rare cases where the removal of a device races with
> > somebody trying to online it (esp. via sysfs). We can simply check
> > if the device is already removed or getting removed under the dev->lock.
> >
> > E.g., right now, if memory block devices are removed (remove_memory()),
> > we do a:
> >
> > remove_memory() -> lock_device_hotplug() -> mem_hotplug_begin() ->
> > lock_device() -> dev->dead = true
> >
> > Somebody coming via sysfs (/sys/devices/system/memory/memoryX/online)
> > triggers a:
> >
> > lock_device_hotplug_sysfs() -> device_online() -> lock_device() ...
> >
> > So if we made it just before the lock_device_hotplug_sysfs() but get
> > delayed until remove_memory() released all locks, we will continue
> > taking locks and trying to online the device - which is then a zombie
> > device.
> >
> > Note that at least the memory onlining path seems to be protected by
> > checking if all memory sections are still present (something we can then
> > get rid of). We do have other sysfs attributes
> > (e.g., /sys/devices/system/memory/memoryX/valid_zones) that don't do any
> > such locking yet and might race with memory removal in a similar way. For
> > these users, we can then do a
> >
> > device_lock(dev);
> > if (!device_is_dead(dev)) {
> >       /* magic /*
> > }
> > device_unlock(dev);
> >
> > Introduce and use device_is_dead() right away.
> >
>
> So, I just added the following:
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 01cd06eeb513..49c4d8671073 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -1567,6 +1567,7 @@ static ssize_t online_store(struct device *dev,
> struct device_attribute *attr,
>         if (ret < 0)
>                 return ret;
>
> +       msleep(10000);
>         ret = lock_device_hotplug_sysfs();
>         if (ret)
>                 return ret;
>
> Then triggered
>         echo 1 > /sys/devices/system/memory/memory51/online
> And quickly afterwards unplugged the DIMM.
>
> Good news is that we get (after 10 seconds)
>         sh: echo: write error: No such device
>
> Reason is that unplug will not finish before all sysfs attributes have
> been exited by other threads.

The unplug thread gets blocked for 10 seconds waiting for this thread
in online_store() to exit?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] driver core: check for dead devices before onlining/offlining
  2020-01-24 18:40   ` Dan Williams
@ 2020-01-24 18:58     ` David Hildenbrand
  2020-01-24 22:36       ` Dan Williams
  0 siblings, 1 reply; 13+ messages in thread
From: David Hildenbrand @ 2020-01-24 18:58 UTC (permalink / raw)
  To: Dan Williams
  Cc: David Hildenbrand, Linux Kernel Mailing List, Greg Kroah-Hartman,
	Rafael J. Wysocki, Suzuki K Poulose, Saravana Kannan,
	Heikki Krogerus, Michal Hocko



> Am 24.01.2020 um 19:41 schrieb Dan Williams <dan.j.williams@intel.com>:
> 
> On Fri, Jan 24, 2020 at 9:14 AM David Hildenbrand <david@redhat.com> wrote:
>> 
>>> On 20.01.20 11:49, David Hildenbrand wrote:
>>> We can have rare cases where the removal of a device races with
>>> somebody trying to online it (esp. via sysfs). We can simply check
>>> if the device is already removed or getting removed under the dev->lock.
>>> 
>>> E.g., right now, if memory block devices are removed (remove_memory()),
>>> we do a:
>>> 
>>> remove_memory() -> lock_device_hotplug() -> mem_hotplug_begin() ->
>>> lock_device() -> dev->dead = true
>>> 
>>> Somebody coming via sysfs (/sys/devices/system/memory/memoryX/online)
>>> triggers a:
>>> 
>>> lock_device_hotplug_sysfs() -> device_online() -> lock_device() ...
>>> 
>>> So if we made it just before the lock_device_hotplug_sysfs() but get
>>> delayed until remove_memory() released all locks, we will continue
>>> taking locks and trying to online the device - which is then a zombie
>>> device.
>>> 
>>> Note that at least the memory onlining path seems to be protected by
>>> checking if all memory sections are still present (something we can then
>>> get rid of). We do have other sysfs attributes
>>> (e.g., /sys/devices/system/memory/memoryX/valid_zones) that don't do any
>>> such locking yet and might race with memory removal in a similar way. For
>>> these users, we can then do a
>>> 
>>> device_lock(dev);
>>> if (!device_is_dead(dev)) {
>>>      /* magic /*
>>> }
>>> device_unlock(dev);
>>> 
>>> Introduce and use device_is_dead() right away.
>>> 
>> 
>> So, I just added the following:
>> 
>> diff --git a/drivers/base/core.c b/drivers/base/core.c
>> index 01cd06eeb513..49c4d8671073 100644
>> --- a/drivers/base/core.c
>> +++ b/drivers/base/core.c
>> @@ -1567,6 +1567,7 @@ static ssize_t online_store(struct device *dev,
>> struct device_attribute *attr,
>>        if (ret < 0)
>>                return ret;
>> 
>> +       msleep(10000);
>>        ret = lock_device_hotplug_sysfs();
>>        if (ret)
>>                return ret;
>> 
>> Then triggered
>>        echo 1 > /sys/devices/system/memory/memory51/online
>> And quickly afterwards unplugged the DIMM.
>> 
>> Good news is that we get (after 10 seconds)
>>        sh: echo: write error: No such device
>> 
>> Reason is that unplug will not finish before all sysfs attributes have
>> been exited by other threads.
> 
> The unplug thread gets blocked for 10 seconds waiting for this thread
> in online_store() to exit?
> 

Yes, that‘s what I observed.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1] driver core: check for dead devices before onlining/offlining
  2020-01-24 18:58     ` David Hildenbrand
@ 2020-01-24 22:36       ` Dan Williams
  0 siblings, 0 replies; 13+ messages in thread
From: Dan Williams @ 2020-01-24 22:36 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Linux Kernel Mailing List, Greg Kroah-Hartman, Rafael J. Wysocki,
	Suzuki K Poulose, Saravana Kannan, Heikki Krogerus, Michal Hocko

On Fri, Jan 24, 2020 at 10:58 AM David Hildenbrand <david@redhat.com> wrote:
>
>
>
> > Am 24.01.2020 um 19:41 schrieb Dan Williams <dan.j.williams@intel.com>:
> >
> > On Fri, Jan 24, 2020 at 9:14 AM David Hildenbrand <david@redhat.com> wrote:
> >>
> >>> On 20.01.20 11:49, David Hildenbrand wrote:
> >>> We can have rare cases where the removal of a device races with
> >>> somebody trying to online it (esp. via sysfs). We can simply check
> >>> if the device is already removed or getting removed under the dev->lock.
> >>>
> >>> E.g., right now, if memory block devices are removed (remove_memory()),
> >>> we do a:
> >>>
> >>> remove_memory() -> lock_device_hotplug() -> mem_hotplug_begin() ->
> >>> lock_device() -> dev->dead = true
> >>>
> >>> Somebody coming via sysfs (/sys/devices/system/memory/memoryX/online)
> >>> triggers a:
> >>>
> >>> lock_device_hotplug_sysfs() -> device_online() -> lock_device() ...
> >>>
> >>> So if we made it just before the lock_device_hotplug_sysfs() but get
> >>> delayed until remove_memory() released all locks, we will continue
> >>> taking locks and trying to online the device - which is then a zombie
> >>> device.
> >>>
> >>> Note that at least the memory onlining path seems to be protected by
> >>> checking if all memory sections are still present (something we can then
> >>> get rid of). We do have other sysfs attributes
> >>> (e.g., /sys/devices/system/memory/memoryX/valid_zones) that don't do any
> >>> such locking yet and might race with memory removal in a similar way. For
> >>> these users, we can then do a
> >>>
> >>> device_lock(dev);
> >>> if (!device_is_dead(dev)) {
> >>>      /* magic /*
> >>> }
> >>> device_unlock(dev);
> >>>
> >>> Introduce and use device_is_dead() right away.
> >>>
> >>
> >> So, I just added the following:
> >>
> >> diff --git a/drivers/base/core.c b/drivers/base/core.c
> >> index 01cd06eeb513..49c4d8671073 100644
> >> --- a/drivers/base/core.c
> >> +++ b/drivers/base/core.c
> >> @@ -1567,6 +1567,7 @@ static ssize_t online_store(struct device *dev,
> >> struct device_attribute *attr,
> >>        if (ret < 0)
> >>                return ret;
> >>
> >> +       msleep(10000);
> >>        ret = lock_device_hotplug_sysfs();
> >>        if (ret)
> >>                return ret;
> >>
> >> Then triggered
> >>        echo 1 > /sys/devices/system/memory/memory51/online
> >> And quickly afterwards unplugged the DIMM.
> >>
> >> Good news is that we get (after 10 seconds)
> >>        sh: echo: write error: No such device
> >>
> >> Reason is that unplug will not finish before all sysfs attributes have
> >> been exited by other threads.
> >
> > The unplug thread gets blocked for 10 seconds waiting for this thread
> > in online_store() to exit?
> >
>
> Yes, that‘s what I observed.

Nice. This obviously wasn't in my mental model of what sysfs/kernfs guaranteed.

/me digs a bit...

So it's kernfs_drain() that is saving us. That waits for all active
nodes to deactivate and that guarantees that removal can be used as a
barrier. It also indicates that the lockdep splat I saw can't trigger
a deadlock unless mem_hotplug_begin() is being acquired within a
memblock-sysfs ops handler that is trying to remove memory. In all
cases I can audit it's always a 3rd-party sysfs attribute triggering
that removal.

So, simply moving remove_memory_block_devices() outside of
mem_hotplug_begin() should be a sufficient fix, no other invalidation
data needed. I'll respin that fix.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-01-24 22:36 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-20 10:49 [PATCH v1] driver core: check for dead devices before onlining/offlining David Hildenbrand
2020-01-24  9:00 ` Greg Kroah-Hartman
2020-01-24  9:09   ` David Hildenbrand
2020-01-24  9:12     ` Greg Kroah-Hartman
2020-01-24 13:31       ` David Hildenbrand
2020-01-24 13:48         ` David Hildenbrand
2020-01-24 16:31           ` David Hildenbrand
2020-01-24  9:38     ` Rafael J. Wysocki
2020-01-24 13:29       ` David Hildenbrand
2020-01-24 17:14 ` David Hildenbrand
2020-01-24 18:40   ` Dan Williams
2020-01-24 18:58     ` David Hildenbrand
2020-01-24 22:36       ` Dan Williams

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.