linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] mm: add private lock to serialize memory hotplug operations
@ 2017-03-09 13:06 Heiko Carstens
  2017-03-09 13:06 ` [PATCH 1/2] " Heiko Carstens
  2017-03-09 13:06 ` [PATCH 2/2] drivers core: remove assert_held_device_hotplug() Heiko Carstens
  0 siblings, 2 replies; 15+ messages in thread
From: Heiko Carstens @ 2017-03-09 13:06 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, linux-s390, Dan Williams, Michal Hocko,
	Rafael J . Wysocki, Vladimir Davydov, Ben Hutchings,
	Gerald Schaefer, Martin Schwidefsky, Sebastian Ott,
	Heiko Carstens

These two patches are supposed to hopefully fix a memory hotplug
problem reported by Sebastian Ott.

Heiko Carstens (2):
  mm: add private lock to serialize memory hotplug operations
  drivers core: remove assert_held_device_hotplug()

 drivers/base/core.c    | 5 -----
 include/linux/device.h | 1 -
 kernel/memremap.c      | 4 ----
 mm/memory_hotplug.c    | 6 +++++-
 4 files changed, 5 insertions(+), 11 deletions(-)

-- 
2.8.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] mm: add private lock to serialize memory hotplug operations
  2017-03-09 13:06 [PATCH 0/2] mm: add private lock to serialize memory hotplug operations Heiko Carstens
@ 2017-03-09 13:06 ` Heiko Carstens
  2017-03-09 13:39   ` Rafael J. Wysocki
  2017-03-09 13:06 ` [PATCH 2/2] drivers core: remove assert_held_device_hotplug() Heiko Carstens
  1 sibling, 1 reply; 15+ messages in thread
From: Heiko Carstens @ 2017-03-09 13:06 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, linux-s390, Dan Williams, Michal Hocko,
	Rafael J . Wysocki, Vladimir Davydov, Ben Hutchings,
	Gerald Schaefer, Martin Schwidefsky, Sebastian Ott,
	Heiko Carstens

Commit bfc8c90139eb ("mem-hotplug: implement get/put_online_mems")
introduced new functions get/put_online_mems() and
mem_hotplug_begin/end() in order to allow similar semantics for memory
hotplug like for cpu hotplug.

The corresponding functions for cpu hotplug are get/put_online_cpus()
and cpu_hotplug_begin/done() for cpu hotplug.

The commit however missed to introduce functions that would serialize
memory hotplug operations like they are done for cpu hotplug with
cpu_maps_update_begin/done().

This basically leaves mem_hotplug.active_writer unprotected and allows
concurrent writers to modify it, which may lead to problems as
outlined by commit f931ab479dd2 ("mm: fix devm_memremap_pages crash,
use mem_hotplug_{begin, done}").

That commit was extended again with commit b5d24fda9c3d ("mm,
devm_memremap_pages: hold device_hotplug lock over mem_hotplug_{begin,
done}") which serializes memory hotplug operations for some call
sites by using the device_hotplug lock.

In addition with commit 3fc21924100b ("mm: validate device_hotplug is
held for memory hotplug") a sanity check was added to
mem_hotplug_begin() to verify that the device_hotplug lock is held.

This in turn triggers the following warning on s390:

WARNING: CPU: 6 PID: 1 at drivers/base/core.c:643 assert_held_device_hotplug+0x4a/0x58
 Call Trace:
  assert_held_device_hotplug+0x40/0x58)
  mem_hotplug_begin+0x34/0xc8
  add_memory_resource+0x7e/0x1f8
  add_memory+0xda/0x130
  add_memory_merged+0x15c/0x178
  sclp_detect_standby_memory+0x2ae/0x2f8
  do_one_initcall+0xa2/0x150
  kernel_init_freeable+0x228/0x2d8
  kernel_init+0x2a/0x140
  kernel_thread_starter+0x6/0xc

One possible fix would be to add more lock_device_hotplug() and
unlock_device_hotplug() calls around each call site of
mem_hotplug_begin/end(). But that would give the device_hotplug lock
additional semantics it better should not have (serialize memory
hotplug operations).

Instead add a new memory_add_remove_lock which has the similar
semantics like cpu_add_remove_lock for cpu hotplug.

To keep things hopefully a bit easier the lock will be locked and
unlocked within the mem_hotplug_begin/end() functions.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
---
 kernel/memremap.c   | 4 ----
 mm/memory_hotplug.c | 6 +++++-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/memremap.c b/kernel/memremap.c
index 06123234f118..07e85e5229da 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -247,11 +247,9 @@ static void devm_memremap_pages_release(struct device *dev, void *data)
 	align_start = res->start & ~(SECTION_SIZE - 1);
 	align_size = ALIGN(resource_size(res), SECTION_SIZE);
 
-	lock_device_hotplug();
 	mem_hotplug_begin();
 	arch_remove_memory(align_start, align_size);
 	mem_hotplug_done();
-	unlock_device_hotplug();
 
 	untrack_pfn(NULL, PHYS_PFN(align_start), align_size);
 	pgmap_radix_release(res);
@@ -364,11 +362,9 @@ void *devm_memremap_pages(struct device *dev, struct resource *res,
 	if (error)
 		goto err_pfn_remap;
 
-	lock_device_hotplug();
 	mem_hotplug_begin();
 	error = arch_add_memory(nid, align_start, align_size, true);
 	mem_hotplug_done();
-	unlock_device_hotplug();
 	if (error)
 		goto err_add_memory;
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 295479b792ec..6fa7208bcd56 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -125,9 +125,12 @@ void put_online_mems(void)
 
 }
 
+/* Serializes write accesses to mem_hotplug.active_writer. */
+static DEFINE_MUTEX(memory_add_remove_lock);
+
 void mem_hotplug_begin(void)
 {
-	assert_held_device_hotplug();
+	mutex_lock(&memory_add_remove_lock);
 
 	mem_hotplug.active_writer = current;
 
@@ -147,6 +150,7 @@ void mem_hotplug_done(void)
 	mem_hotplug.active_writer = NULL;
 	mutex_unlock(&mem_hotplug.lock);
 	memhp_lock_release();
+	mutex_unlock(&memory_add_remove_lock);
 }
 
 /* add this memory to iomem resource */
-- 
2.8.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] drivers core: remove assert_held_device_hotplug()
  2017-03-09 13:06 [PATCH 0/2] mm: add private lock to serialize memory hotplug operations Heiko Carstens
  2017-03-09 13:06 ` [PATCH 1/2] " Heiko Carstens
@ 2017-03-09 13:06 ` Heiko Carstens
  2017-03-09 19:11   ` Dan Williams
  1 sibling, 1 reply; 15+ messages in thread
From: Heiko Carstens @ 2017-03-09 13:06 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, linux-s390, Dan Williams, Michal Hocko,
	Rafael J . Wysocki, Vladimir Davydov, Ben Hutchings,
	Gerald Schaefer, Martin Schwidefsky, Sebastian Ott,
	Heiko Carstens

The last caller of assert_held_device_hotplug() is gone, so remove it again.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
---
 drivers/base/core.c    | 5 -----
 include/linux/device.h | 1 -
 2 files changed, 6 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 684bda4d14a1..6bb60fb6a30b 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -639,11 +639,6 @@ int lock_device_hotplug_sysfs(void)
 	return restart_syscall();
 }
 
-void assert_held_device_hotplug(void)
-{
-	lockdep_assert_held(&device_hotplug_lock);
-}
-
 #ifdef CONFIG_BLOCK
 static inline int device_is_not_partition(struct device *dev)
 {
diff --git a/include/linux/device.h b/include/linux/device.h
index 30c4570e928d..9ef518af5515 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -1140,7 +1140,6 @@ static inline bool device_supports_offline(struct device *dev)
 extern void lock_device_hotplug(void);
 extern void unlock_device_hotplug(void);
 extern int lock_device_hotplug_sysfs(void);
-void assert_held_device_hotplug(void);
 extern int device_offline(struct device *dev);
 extern int device_online(struct device *dev);
 extern void set_primary_fwnode(struct device *dev, struct fwnode_handle *fwnode);
-- 
2.8.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] mm: add private lock to serialize memory hotplug operations
  2017-03-09 13:06 ` [PATCH 1/2] " Heiko Carstens
@ 2017-03-09 13:39   ` Rafael J. Wysocki
  2017-03-09 18:10     ` Dan Williams
  0 siblings, 1 reply; 15+ messages in thread
From: Rafael J. Wysocki @ 2017-03-09 13:39 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Andrew Morton, linux-mm, linux-kernel, linux-s390, Dan Williams,
	Michal Hocko, Vladimir Davydov, Ben Hutchings, Gerald Schaefer,
	Martin Schwidefsky, Sebastian Ott

On Thursday, March 09, 2017 02:06:15 PM Heiko Carstens wrote:
> Commit bfc8c90139eb ("mem-hotplug: implement get/put_online_mems")
> introduced new functions get/put_online_mems() and
> mem_hotplug_begin/end() in order to allow similar semantics for memory
> hotplug like for cpu hotplug.
> 
> The corresponding functions for cpu hotplug are get/put_online_cpus()
> and cpu_hotplug_begin/done() for cpu hotplug.
> 
> The commit however missed to introduce functions that would serialize
> memory hotplug operations like they are done for cpu hotplug with
> cpu_maps_update_begin/done().
> 
> This basically leaves mem_hotplug.active_writer unprotected and allows
> concurrent writers to modify it, which may lead to problems as
> outlined by commit f931ab479dd2 ("mm: fix devm_memremap_pages crash,
> use mem_hotplug_{begin, done}").
> 
> That commit was extended again with commit b5d24fda9c3d ("mm,
> devm_memremap_pages: hold device_hotplug lock over mem_hotplug_{begin,
> done}") which serializes memory hotplug operations for some call
> sites by using the device_hotplug lock.
> 
> In addition with commit 3fc21924100b ("mm: validate device_hotplug is
> held for memory hotplug") a sanity check was added to
> mem_hotplug_begin() to verify that the device_hotplug lock is held.

Admittedly, I haven't looked at all of the code paths involved in detail yet,
but there's one concern regarding lock/unlock_device_hotplug().

The actual main purpose of it is to ensure safe removal of devices in cases
when they cannot be removed separately, like when a whole CPU package
(including possibly an entire NUMA node with memory and all) is removed.

One of the code paths doing that is acpi_scan_hot_remove() which first
tries to offline devices slated for removal and then finally removes them.

The reason why this needs to be done in two stages is because the offlining
can fail, in which case we will fail the entire operation, while the final
removal step is, well, final (meaning that the devices are gone after it no
matter what).

This is done under device_hotplug_lock, so that the devices that were taken
offline in stage 1 cannot be brought back online before stage 2 is carried
out entirely, which surely would be bad if it happened.

Now, I'm not sure if removing lock/unlock_device_hotplug() from the code in
question actually affects this mechanism, but this in case it does, it is one
thing to double check before going ahead with this patch.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] mm: add private lock to serialize memory hotplug operations
  2017-03-09 13:39   ` Rafael J. Wysocki
@ 2017-03-09 18:10     ` Dan Williams
  2017-03-09 22:15       ` Rafael J. Wysocki
  0 siblings, 1 reply; 15+ messages in thread
From: Dan Williams @ 2017-03-09 18:10 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Heiko Carstens, Andrew Morton, Linux MM, linux-kernel,
	linux-s390, Michal Hocko, Vladimir Davydov, Ben Hutchings,
	Gerald Schaefer, Martin Schwidefsky, Sebastian Ott

On Thu, Mar 9, 2017 at 5:39 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Thursday, March 09, 2017 02:06:15 PM Heiko Carstens wrote:
>> Commit bfc8c90139eb ("mem-hotplug: implement get/put_online_mems")
>> introduced new functions get/put_online_mems() and
>> mem_hotplug_begin/end() in order to allow similar semantics for memory
>> hotplug like for cpu hotplug.
>>
>> The corresponding functions for cpu hotplug are get/put_online_cpus()
>> and cpu_hotplug_begin/done() for cpu hotplug.
>>
>> The commit however missed to introduce functions that would serialize
>> memory hotplug operations like they are done for cpu hotplug with
>> cpu_maps_update_begin/done().
>>
>> This basically leaves mem_hotplug.active_writer unprotected and allows
>> concurrent writers to modify it, which may lead to problems as
>> outlined by commit f931ab479dd2 ("mm: fix devm_memremap_pages crash,
>> use mem_hotplug_{begin, done}").
>>
>> That commit was extended again with commit b5d24fda9c3d ("mm,
>> devm_memremap_pages: hold device_hotplug lock over mem_hotplug_{begin,
>> done}") which serializes memory hotplug operations for some call
>> sites by using the device_hotplug lock.
>>
>> In addition with commit 3fc21924100b ("mm: validate device_hotplug is
>> held for memory hotplug") a sanity check was added to
>> mem_hotplug_begin() to verify that the device_hotplug lock is held.
>
> Admittedly, I haven't looked at all of the code paths involved in detail yet,
> but there's one concern regarding lock/unlock_device_hotplug().
>
> The actual main purpose of it is to ensure safe removal of devices in cases
> when they cannot be removed separately, like when a whole CPU package
> (including possibly an entire NUMA node with memory and all) is removed.
>
> One of the code paths doing that is acpi_scan_hot_remove() which first
> tries to offline devices slated for removal and then finally removes them.
>
> The reason why this needs to be done in two stages is because the offlining
> can fail, in which case we will fail the entire operation, while the final
> removal step is, well, final (meaning that the devices are gone after it no
> matter what).
>
> This is done under device_hotplug_lock, so that the devices that were taken
> offline in stage 1 cannot be brought back online before stage 2 is carried
> out entirely, which surely would be bad if it happened.
>
> Now, I'm not sure if removing lock/unlock_device_hotplug() from the code in
> question actually affects this mechanism, but this in case it does, it is one
> thing to double check before going ahead with this patch.
>

I *think* we're ok in this case because unplugging the CPU package
that contains a persistent memory device will trigger
devm_memremap_pages() to call arch_remove_memory(). Removing a pmem
device can't fail. It may be held off while pages are pinned for DMA
memory, but it will eventually complete.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] drivers core: remove assert_held_device_hotplug()
  2017-03-09 13:06 ` [PATCH 2/2] drivers core: remove assert_held_device_hotplug() Heiko Carstens
@ 2017-03-09 19:11   ` Dan Williams
  0 siblings, 0 replies; 15+ messages in thread
From: Dan Williams @ 2017-03-09 19:11 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Andrew Morton, Linux MM, linux-kernel, linux-s390, Michal Hocko,
	Rafael J . Wysocki, Vladimir Davydov, Ben Hutchings,
	Gerald Schaefer, Martin Schwidefsky, Sebastian Ott

On Thu, Mar 9, 2017 at 5:06 AM, Heiko Carstens
<heiko.carstens@de.ibm.com> wrote:
> The last caller of assert_held_device_hotplug() is gone, so remove it again.
>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> Cc: Ben Hutchings <ben@decadent.org.uk>
> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>

Acked-by: Dan Williams <dan.j.williams@intel.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] mm: add private lock to serialize memory hotplug operations
  2017-03-09 18:10     ` Dan Williams
@ 2017-03-09 22:15       ` Rafael J. Wysocki
  2017-03-09 22:22         ` Rafael J. Wysocki
  2017-03-09 22:33         ` Dan Williams
  0 siblings, 2 replies; 15+ messages in thread
From: Rafael J. Wysocki @ 2017-03-09 22:15 UTC (permalink / raw)
  To: Dan Williams
  Cc: Heiko Carstens, Andrew Morton, Linux MM, linux-kernel,
	linux-s390, Michal Hocko, Vladimir Davydov, Ben Hutchings,
	Gerald Schaefer, Martin Schwidefsky, Sebastian Ott

On Thursday, March 09, 2017 10:10:31 AM Dan Williams wrote:
> On Thu, Mar 9, 2017 at 5:39 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Thursday, March 09, 2017 02:06:15 PM Heiko Carstens wrote:
> >> Commit bfc8c90139eb ("mem-hotplug: implement get/put_online_mems")
> >> introduced new functions get/put_online_mems() and
> >> mem_hotplug_begin/end() in order to allow similar semantics for memory
> >> hotplug like for cpu hotplug.
> >>
> >> The corresponding functions for cpu hotplug are get/put_online_cpus()
> >> and cpu_hotplug_begin/done() for cpu hotplug.
> >>
> >> The commit however missed to introduce functions that would serialize
> >> memory hotplug operations like they are done for cpu hotplug with
> >> cpu_maps_update_begin/done().
> >>
> >> This basically leaves mem_hotplug.active_writer unprotected and allows
> >> concurrent writers to modify it, which may lead to problems as
> >> outlined by commit f931ab479dd2 ("mm: fix devm_memremap_pages crash,
> >> use mem_hotplug_{begin, done}").
> >>
> >> That commit was extended again with commit b5d24fda9c3d ("mm,
> >> devm_memremap_pages: hold device_hotplug lock over mem_hotplug_{begin,
> >> done}") which serializes memory hotplug operations for some call
> >> sites by using the device_hotplug lock.
> >>
> >> In addition with commit 3fc21924100b ("mm: validate device_hotplug is
> >> held for memory hotplug") a sanity check was added to
> >> mem_hotplug_begin() to verify that the device_hotplug lock is held.
> >
> > Admittedly, I haven't looked at all of the code paths involved in detail yet,
> > but there's one concern regarding lock/unlock_device_hotplug().
> >
> > The actual main purpose of it is to ensure safe removal of devices in cases
> > when they cannot be removed separately, like when a whole CPU package
> > (including possibly an entire NUMA node with memory and all) is removed.
> >
> > One of the code paths doing that is acpi_scan_hot_remove() which first
> > tries to offline devices slated for removal and then finally removes them.
> >
> > The reason why this needs to be done in two stages is because the offlining
> > can fail, in which case we will fail the entire operation, while the final
> > removal step is, well, final (meaning that the devices are gone after it no
> > matter what).
> >
> > This is done under device_hotplug_lock, so that the devices that were taken
> > offline in stage 1 cannot be brought back online before stage 2 is carried
> > out entirely, which surely would be bad if it happened.
> >
> > Now, I'm not sure if removing lock/unlock_device_hotplug() from the code in
> > question actually affects this mechanism, but this in case it does, it is one
> > thing to double check before going ahead with this patch.
> >
> 
> I *think* we're ok in this case because unplugging the CPU package
> that contains a persistent memory device will trigger
> devm_memremap_pages() to call arch_remove_memory(). Removing a pmem
> device can't fail. It may be held off while pages are pinned for DMA
> memory, but it will eventually complete.

What about the offlining, though?  Is it guaranteed that no memory from those
ranges will go back online after the acpi_scan_try_to_offline() call in
acpi_scan_hot_remove()?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] mm: add private lock to serialize memory hotplug operations
  2017-03-09 22:15       ` Rafael J. Wysocki
@ 2017-03-09 22:22         ` Rafael J. Wysocki
  2017-03-09 22:37           ` Dan Williams
  2017-03-09 22:33         ` Dan Williams
  1 sibling, 1 reply; 15+ messages in thread
From: Rafael J. Wysocki @ 2017-03-09 22:22 UTC (permalink / raw)
  To: Dan Williams
  Cc: Heiko Carstens, Andrew Morton, Linux MM, linux-kernel,
	linux-s390, Michal Hocko, Vladimir Davydov, Ben Hutchings,
	Gerald Schaefer, Martin Schwidefsky, Sebastian Ott

On Thursday, March 09, 2017 11:15:47 PM Rafael J. Wysocki wrote:
> On Thursday, March 09, 2017 10:10:31 AM Dan Williams wrote:
> > On Thu, Mar 9, 2017 at 5:39 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > > On Thursday, March 09, 2017 02:06:15 PM Heiko Carstens wrote:
> > >> Commit bfc8c90139eb ("mem-hotplug: implement get/put_online_mems")
> > >> introduced new functions get/put_online_mems() and
> > >> mem_hotplug_begin/end() in order to allow similar semantics for memory
> > >> hotplug like for cpu hotplug.
> > >>
> > >> The corresponding functions for cpu hotplug are get/put_online_cpus()
> > >> and cpu_hotplug_begin/done() for cpu hotplug.
> > >>
> > >> The commit however missed to introduce functions that would serialize
> > >> memory hotplug operations like they are done for cpu hotplug with
> > >> cpu_maps_update_begin/done().
> > >>
> > >> This basically leaves mem_hotplug.active_writer unprotected and allows
> > >> concurrent writers to modify it, which may lead to problems as
> > >> outlined by commit f931ab479dd2 ("mm: fix devm_memremap_pages crash,
> > >> use mem_hotplug_{begin, done}").
> > >>
> > >> That commit was extended again with commit b5d24fda9c3d ("mm,
> > >> devm_memremap_pages: hold device_hotplug lock over mem_hotplug_{begin,
> > >> done}") which serializes memory hotplug operations for some call
> > >> sites by using the device_hotplug lock.
> > >>
> > >> In addition with commit 3fc21924100b ("mm: validate device_hotplug is
> > >> held for memory hotplug") a sanity check was added to
> > >> mem_hotplug_begin() to verify that the device_hotplug lock is held.
> > >
> > > Admittedly, I haven't looked at all of the code paths involved in detail yet,
> > > but there's one concern regarding lock/unlock_device_hotplug().
> > >
> > > The actual main purpose of it is to ensure safe removal of devices in cases
> > > when they cannot be removed separately, like when a whole CPU package
> > > (including possibly an entire NUMA node with memory and all) is removed.
> > >
> > > One of the code paths doing that is acpi_scan_hot_remove() which first
> > > tries to offline devices slated for removal and then finally removes them.
> > >
> > > The reason why this needs to be done in two stages is because the offlining
> > > can fail, in which case we will fail the entire operation, while the final
> > > removal step is, well, final (meaning that the devices are gone after it no
> > > matter what).
> > >
> > > This is done under device_hotplug_lock, so that the devices that were taken
> > > offline in stage 1 cannot be brought back online before stage 2 is carried
> > > out entirely, which surely would be bad if it happened.
> > >
> > > Now, I'm not sure if removing lock/unlock_device_hotplug() from the code in
> > > question actually affects this mechanism, but this in case it does, it is one
> > > thing to double check before going ahead with this patch.
> > >
> > 
> > I *think* we're ok in this case because unplugging the CPU package
> > that contains a persistent memory device will trigger
> > devm_memremap_pages() to call arch_remove_memory(). Removing a pmem
> > device can't fail. It may be held off while pages are pinned for DMA
> > memory, but it will eventually complete.
> 
> What about the offlining, though?  Is it guaranteed that no memory from those
> ranges will go back online after the acpi_scan_try_to_offline() call in
> acpi_scan_hot_remove()?

My point is that after the acpi_evaluate_ej0() in acpi_scan_hot_remove() the
hardware is physically gone, so if anything is still doing DMA to that memory at
that point, then the user is going to be unhappy.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] mm: add private lock to serialize memory hotplug operations
  2017-03-09 22:15       ` Rafael J. Wysocki
  2017-03-09 22:22         ` Rafael J. Wysocki
@ 2017-03-09 22:33         ` Dan Williams
  2017-03-09 22:34           ` Rafael J. Wysocki
  1 sibling, 1 reply; 15+ messages in thread
From: Dan Williams @ 2017-03-09 22:33 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Heiko Carstens, Andrew Morton, Linux MM, linux-kernel,
	linux-s390, Michal Hocko, Vladimir Davydov, Ben Hutchings,
	Gerald Schaefer, Martin Schwidefsky, Sebastian Ott

On Thu, Mar 9, 2017 at 2:15 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Thursday, March 09, 2017 10:10:31 AM Dan Williams wrote:
>> On Thu, Mar 9, 2017 at 5:39 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> > On Thursday, March 09, 2017 02:06:15 PM Heiko Carstens wrote:
>> >> Commit bfc8c90139eb ("mem-hotplug: implement get/put_online_mems")
>> >> introduced new functions get/put_online_mems() and
>> >> mem_hotplug_begin/end() in order to allow similar semantics for memory
>> >> hotplug like for cpu hotplug.
>> >>
>> >> The corresponding functions for cpu hotplug are get/put_online_cpus()
>> >> and cpu_hotplug_begin/done() for cpu hotplug.
>> >>
>> >> The commit however missed to introduce functions that would serialize
>> >> memory hotplug operations like they are done for cpu hotplug with
>> >> cpu_maps_update_begin/done().
>> >>
>> >> This basically leaves mem_hotplug.active_writer unprotected and allows
>> >> concurrent writers to modify it, which may lead to problems as
>> >> outlined by commit f931ab479dd2 ("mm: fix devm_memremap_pages crash,
>> >> use mem_hotplug_{begin, done}").
>> >>
>> >> That commit was extended again with commit b5d24fda9c3d ("mm,
>> >> devm_memremap_pages: hold device_hotplug lock over mem_hotplug_{begin,
>> >> done}") which serializes memory hotplug operations for some call
>> >> sites by using the device_hotplug lock.
>> >>
>> >> In addition with commit 3fc21924100b ("mm: validate device_hotplug is
>> >> held for memory hotplug") a sanity check was added to
>> >> mem_hotplug_begin() to verify that the device_hotplug lock is held.
>> >
>> > Admittedly, I haven't looked at all of the code paths involved in detail yet,
>> > but there's one concern regarding lock/unlock_device_hotplug().
>> >
>> > The actual main purpose of it is to ensure safe removal of devices in cases
>> > when they cannot be removed separately, like when a whole CPU package
>> > (including possibly an entire NUMA node with memory and all) is removed.
>> >
>> > One of the code paths doing that is acpi_scan_hot_remove() which first
>> > tries to offline devices slated for removal and then finally removes them.
>> >
>> > The reason why this needs to be done in two stages is because the offlining
>> > can fail, in which case we will fail the entire operation, while the final
>> > removal step is, well, final (meaning that the devices are gone after it no
>> > matter what).
>> >
>> > This is done under device_hotplug_lock, so that the devices that were taken
>> > offline in stage 1 cannot be brought back online before stage 2 is carried
>> > out entirely, which surely would be bad if it happened.
>> >
>> > Now, I'm not sure if removing lock/unlock_device_hotplug() from the code in
>> > question actually affects this mechanism, but this in case it does, it is one
>> > thing to double check before going ahead with this patch.
>> >
>>
>> I *think* we're ok in this case because unplugging the CPU package
>> that contains a persistent memory device will trigger
>> devm_memremap_pages() to call arch_remove_memory(). Removing a pmem
>> device can't fail. It may be held off while pages are pinned for DMA
>> memory, but it will eventually complete.
>
> What about the offlining, though?  Is it guaranteed that no memory from those
> ranges will go back online after the acpi_scan_try_to_offline() call in
> acpi_scan_hot_remove()?

The memory described by devm_memremap_pages() is never "onlined" to
the core mm. We're only using arch_add_memory() to get a linear
mapping and page structures. The rest of memory hotplug is skipped,
and this ZONE_DEVICE memory is otherwise hidden from the core mm.

Are ACPI devices disabled by this point? For example, If we have
disabled the nfit bus device (_HID ACPI0012) then the associated child
pmem device(s) will be gone and not coming back.

Now, that said, the ACPI0012 bus device is global for the entire
system. So we'd need more plumbing to target the pmem on a given
socket without touching the others.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] mm: add private lock to serialize memory hotplug operations
  2017-03-09 22:33         ` Dan Williams
@ 2017-03-09 22:34           ` Rafael J. Wysocki
  2017-03-13 18:57             ` Heiko Carstens
  0 siblings, 1 reply; 15+ messages in thread
From: Rafael J. Wysocki @ 2017-03-09 22:34 UTC (permalink / raw)
  To: Dan Williams
  Cc: Heiko Carstens, Andrew Morton, Linux MM, linux-kernel,
	linux-s390, Michal Hocko, Vladimir Davydov, Ben Hutchings,
	Gerald Schaefer, Martin Schwidefsky, Sebastian Ott

On Thursday, March 09, 2017 02:33:43 PM Dan Williams wrote:
> On Thu, Mar 9, 2017 at 2:15 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Thursday, March 09, 2017 10:10:31 AM Dan Williams wrote:
> >> On Thu, Mar 9, 2017 at 5:39 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >> > On Thursday, March 09, 2017 02:06:15 PM Heiko Carstens wrote:
> >> >> Commit bfc8c90139eb ("mem-hotplug: implement get/put_online_mems")
> >> >> introduced new functions get/put_online_mems() and
> >> >> mem_hotplug_begin/end() in order to allow similar semantics for memory
> >> >> hotplug like for cpu hotplug.
> >> >>
> >> >> The corresponding functions for cpu hotplug are get/put_online_cpus()
> >> >> and cpu_hotplug_begin/done() for cpu hotplug.
> >> >>
> >> >> The commit however missed to introduce functions that would serialize
> >> >> memory hotplug operations like they are done for cpu hotplug with
> >> >> cpu_maps_update_begin/done().
> >> >>
> >> >> This basically leaves mem_hotplug.active_writer unprotected and allows
> >> >> concurrent writers to modify it, which may lead to problems as
> >> >> outlined by commit f931ab479dd2 ("mm: fix devm_memremap_pages crash,
> >> >> use mem_hotplug_{begin, done}").
> >> >>
> >> >> That commit was extended again with commit b5d24fda9c3d ("mm,
> >> >> devm_memremap_pages: hold device_hotplug lock over mem_hotplug_{begin,
> >> >> done}") which serializes memory hotplug operations for some call
> >> >> sites by using the device_hotplug lock.
> >> >>
> >> >> In addition with commit 3fc21924100b ("mm: validate device_hotplug is
> >> >> held for memory hotplug") a sanity check was added to
> >> >> mem_hotplug_begin() to verify that the device_hotplug lock is held.
> >> >
> >> > Admittedly, I haven't looked at all of the code paths involved in detail yet,
> >> > but there's one concern regarding lock/unlock_device_hotplug().
> >> >
> >> > The actual main purpose of it is to ensure safe removal of devices in cases
> >> > when they cannot be removed separately, like when a whole CPU package
> >> > (including possibly an entire NUMA node with memory and all) is removed.
> >> >
> >> > One of the code paths doing that is acpi_scan_hot_remove() which first
> >> > tries to offline devices slated for removal and then finally removes them.
> >> >
> >> > The reason why this needs to be done in two stages is because the offlining
> >> > can fail, in which case we will fail the entire operation, while the final
> >> > removal step is, well, final (meaning that the devices are gone after it no
> >> > matter what).
> >> >
> >> > This is done under device_hotplug_lock, so that the devices that were taken
> >> > offline in stage 1 cannot be brought back online before stage 2 is carried
> >> > out entirely, which surely would be bad if it happened.
> >> >
> >> > Now, I'm not sure if removing lock/unlock_device_hotplug() from the code in
> >> > question actually affects this mechanism, but this in case it does, it is one
> >> > thing to double check before going ahead with this patch.
> >> >
> >>
> >> I *think* we're ok in this case because unplugging the CPU package
> >> that contains a persistent memory device will trigger
> >> devm_memremap_pages() to call arch_remove_memory(). Removing a pmem
> >> device can't fail. It may be held off while pages are pinned for DMA
> >> memory, but it will eventually complete.
> >
> > What about the offlining, though?  Is it guaranteed that no memory from those
> > ranges will go back online after the acpi_scan_try_to_offline() call in
> > acpi_scan_hot_remove()?
> 
> The memory described by devm_memremap_pages() is never "onlined" to
> the core mm. We're only using arch_add_memory() to get a linear
> mapping and page structures. The rest of memory hotplug is skipped,
> and this ZONE_DEVICE memory is otherwise hidden from the core mm.

OK, that should be fine then.

> Are ACPI devices disabled by this point? For example, If we have
> disabled the nfit bus device (_HID ACPI0012) then the associated child
> pmem device(s) will be gone and not coming back.

We call acpi_bus_trim() on the root of the subtree in question before calling
acpi_evaluat_ej0(), so the driver's ->remove() should be called before that,
but it can't leave any delayed works behind.

> Now, that said, the ACPI0012 bus device is global for the entire
> system. So we'd need more plumbing to target the pmem on a given
> socket without touching the others.

Well, it's all a bit academic at this point AFAICS.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] mm: add private lock to serialize memory hotplug operations
  2017-03-09 22:22         ` Rafael J. Wysocki
@ 2017-03-09 22:37           ` Dan Williams
  2017-03-09 22:43             ` Rafael J. Wysocki
  0 siblings, 1 reply; 15+ messages in thread
From: Dan Williams @ 2017-03-09 22:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Heiko Carstens, Andrew Morton, Linux MM, linux-kernel,
	linux-s390, Michal Hocko, Vladimir Davydov, Ben Hutchings,
	Gerald Schaefer, Martin Schwidefsky, Sebastian Ott

On Thu, Mar 9, 2017 at 2:22 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Thursday, March 09, 2017 11:15:47 PM Rafael J. Wysocki wrote:
>> On Thursday, March 09, 2017 10:10:31 AM Dan Williams wrote:
>> > On Thu, Mar 9, 2017 at 5:39 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
[..]
>> > I *think* we're ok in this case because unplugging the CPU package
>> > that contains a persistent memory device will trigger
>> > devm_memremap_pages() to call arch_remove_memory(). Removing a pmem
>> > device can't fail. It may be held off while pages are pinned for DMA
>> > memory, but it will eventually complete.
>>
>> What about the offlining, though?  Is it guaranteed that no memory from those
>> ranges will go back online after the acpi_scan_try_to_offline() call in
>> acpi_scan_hot_remove()?
>
> My point is that after the acpi_evaluate_ej0() in acpi_scan_hot_remove() the
> hardware is physically gone, so if anything is still doing DMA to that memory at
> that point, then the user is going to be unhappy.

Hmm, ACPI 6.1 does not have any text about what _EJ0 means for ACPI0012.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] mm: add private lock to serialize memory hotplug operations
  2017-03-09 22:37           ` Dan Williams
@ 2017-03-09 22:43             ` Rafael J. Wysocki
  0 siblings, 0 replies; 15+ messages in thread
From: Rafael J. Wysocki @ 2017-03-09 22:43 UTC (permalink / raw)
  To: Dan Williams
  Cc: Heiko Carstens, Andrew Morton, Linux MM, linux-kernel,
	linux-s390, Michal Hocko, Vladimir Davydov, Ben Hutchings,
	Gerald Schaefer, Martin Schwidefsky, Sebastian Ott

On Thursday, March 09, 2017 02:37:55 PM Dan Williams wrote:
> On Thu, Mar 9, 2017 at 2:22 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Thursday, March 09, 2017 11:15:47 PM Rafael J. Wysocki wrote:
> >> On Thursday, March 09, 2017 10:10:31 AM Dan Williams wrote:
> >> > On Thu, Mar 9, 2017 at 5:39 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> [..]
> >> > I *think* we're ok in this case because unplugging the CPU package
> >> > that contains a persistent memory device will trigger
> >> > devm_memremap_pages() to call arch_remove_memory(). Removing a pmem
> >> > device can't fail. It may be held off while pages are pinned for DMA
> >> > memory, but it will eventually complete.
> >>
> >> What about the offlining, though?  Is it guaranteed that no memory from those
> >> ranges will go back online after the acpi_scan_try_to_offline() call in
> >> acpi_scan_hot_remove()?
> >
> > My point is that after the acpi_evaluate_ej0() in acpi_scan_hot_remove() the
> > hardware is physically gone, so if anything is still doing DMA to that memory at
> > that point, then the user is going to be unhappy.
> 
> Hmm, ACPI 6.1 does not have any text about what _EJ0 means for ACPI0012.

ACPI0012 is exceptional, but in general _EJ0 does not have to be present under
a particular device for it to be affected.  It can be under the device's parent, for
example, in which case the entire subtree under a device with _EJ0 goes away in
one go.  And that very well may mean disconnect at the physical level (voltage
goes away IOW).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] mm: add private lock to serialize memory hotplug operations
  2017-03-09 22:34           ` Rafael J. Wysocki
@ 2017-03-13 18:57             ` Heiko Carstens
  2017-03-13 19:44               ` Dan Williams
  0 siblings, 1 reply; 15+ messages in thread
From: Heiko Carstens @ 2017-03-13 18:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Dan Williams, Andrew Morton, Linux MM, linux-kernel, linux-s390,
	Michal Hocko, Vladimir Davydov, Ben Hutchings, Gerald Schaefer,
	Martin Schwidefsky, Sebastian Ott

On Thu, Mar 09, 2017 at 11:34:44PM +0100, Rafael J. Wysocki wrote:
> > The memory described by devm_memremap_pages() is never "onlined" to
> > the core mm. We're only using arch_add_memory() to get a linear
> > mapping and page structures. The rest of memory hotplug is skipped,
> > and this ZONE_DEVICE memory is otherwise hidden from the core mm.
> 
> OK, that should be fine then.

So, does that mean that the patch is ok as it is? If so, it would be good
to get an Ack from both, you and Dan, please.

Thanks,
Heiko

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] mm: add private lock to serialize memory hotplug operations
  2017-03-13 18:57             ` Heiko Carstens
@ 2017-03-13 19:44               ` Dan Williams
  2017-03-13 21:15                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 15+ messages in thread
From: Dan Williams @ 2017-03-13 19:44 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Rafael J. Wysocki, Andrew Morton, Linux MM, linux-kernel,
	linux-s390, Michal Hocko, Vladimir Davydov, Ben Hutchings,
	Gerald Schaefer, Martin Schwidefsky, Sebastian Ott

On Mon, Mar 13, 2017 at 11:57 AM, Heiko Carstens
<heiko.carstens@de.ibm.com> wrote:
> On Thu, Mar 09, 2017 at 11:34:44PM +0100, Rafael J. Wysocki wrote:
>> > The memory described by devm_memremap_pages() is never "onlined" to
>> > the core mm. We're only using arch_add_memory() to get a linear
>> > mapping and page structures. The rest of memory hotplug is skipped,
>> > and this ZONE_DEVICE memory is otherwise hidden from the core mm.
>>
>> OK, that should be fine then.
>
> So, does that mean that the patch is ok as it is? If so, it would be good
> to get an Ack from both, you and Dan, please.

Acked-by: Dan Williams <dan.j.williams@intel.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] mm: add private lock to serialize memory hotplug operations
  2017-03-13 19:44               ` Dan Williams
@ 2017-03-13 21:15                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 15+ messages in thread
From: Rafael J. Wysocki @ 2017-03-13 21:15 UTC (permalink / raw)
  To: Dan Williams, Heiko Carstens
  Cc: Andrew Morton, Linux MM, linux-kernel, linux-s390, Michal Hocko,
	Vladimir Davydov, Ben Hutchings, Gerald Schaefer,
	Martin Schwidefsky, Sebastian Ott

On Monday, March 13, 2017 12:44:25 PM Dan Williams wrote:
> On Mon, Mar 13, 2017 at 11:57 AM, Heiko Carstens
> <heiko.carstens@de.ibm.com> wrote:
> > On Thu, Mar 09, 2017 at 11:34:44PM +0100, Rafael J. Wysocki wrote:
> >> > The memory described by devm_memremap_pages() is never "onlined" to
> >> > the core mm. We're only using arch_add_memory() to get a linear
> >> > mapping and page structures. The rest of memory hotplug is skipped,
> >> > and this ZONE_DEVICE memory is otherwise hidden from the core mm.
> >>
> >> OK, that should be fine then.
> >
> > So, does that mean that the patch is ok as it is? If so, it would be good
> > to get an Ack from both, you and Dan, please.
> 
> Acked-by: Dan Williams <dan.j.williams@intel.com>

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-03-13 21:21 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-09 13:06 [PATCH 0/2] mm: add private lock to serialize memory hotplug operations Heiko Carstens
2017-03-09 13:06 ` [PATCH 1/2] " Heiko Carstens
2017-03-09 13:39   ` Rafael J. Wysocki
2017-03-09 18:10     ` Dan Williams
2017-03-09 22:15       ` Rafael J. Wysocki
2017-03-09 22:22         ` Rafael J. Wysocki
2017-03-09 22:37           ` Dan Williams
2017-03-09 22:43             ` Rafael J. Wysocki
2017-03-09 22:33         ` Dan Williams
2017-03-09 22:34           ` Rafael J. Wysocki
2017-03-13 18:57             ` Heiko Carstens
2017-03-13 19:44               ` Dan Williams
2017-03-13 21:15                 ` Rafael J. Wysocki
2017-03-09 13:06 ` [PATCH 2/2] drivers core: remove assert_held_device_hotplug() Heiko Carstens
2017-03-09 19:11   ` Dan Williams

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).