All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
@ 2018-08-21 10:44 ` David Hildenbrand
  0 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: Kate Stewart, Michal Hocko, linux-doc, Benjamin Herrenschmidt,
	Balbir Singh, Heiko Carstens, Paul Mackerras, Rashmica Gupta,
	Boris Ostrovsky, Michael Neuling, Stephen Hemminger,
	Jonathan Corbet, Michael Ellerman, David Hildenbrand,
	Pavel Tatashin, linux-acpi, xen-devel, Len Brown, Haiyang Zhang,
	YASUAKI ISHIMATSU, Nathan Fontenot, Dan Williams, Joonsoo Kim

This is the same approach as in the first RFC, but this time without
exporting device_hotplug_lock (requested by Greg) and with some more
details and documentation regarding locking. Tested only on x86 so far.

--------------------------------------------------------------------------

Reading through the code and studying how mem_hotplug_lock is to be used,
I noticed that there are two places where we can end up calling
device_online()/device_offline() - online_pages()/offline_pages() without
the mem_hotplug_lock. And there are other places where we call
device_online()/device_offline() without the device_hotplug_lock.

While e.g.
	echo "online" > /sys/devices/system/memory/memory9/state
is fine, e.g.
	echo 1 > /sys/devices/system/memory/memory9/online
Will not take the mem_hotplug_lock. However the device_lock() and
device_hotplug_lock.

E.g. via memory_probe_store(), we can end up calling
add_memory()->online_pages() without the device_hotplug_lock. So we can
have concurrent callers in online_pages(). We e.g. touch in online_pages()
basically unprotected zone->present_pages then.

Looks like there is a longer history to that (see Patch #2 for details),
and fixing it to work the way it was intended is not really possible. We
would e.g. have to take the mem_hotplug_lock in device/base/core.c, which
sounds wrong.

Summary: We had a lock inversion on mem_hotplug_lock and device_lock().
More details can be found in patch 3 and patch 6.

I propose the general rules (documentation added in patch 6):

1. add_memory/add_memory_resource() must only be called with
   device_hotplug_lock.
2. remove_memory() must only be called with device_hotplug_lock. This is
   already documented and holds for all callers.
3. device_online()/device_offline() must only be called with
   device_hotplug_lock. This is already documented and true for now in core
   code. Other callers (related to memory hotplug) have to be fixed up.
4. mem_hotplug_lock is taken inside of add_memory/remove_memory/
   online_pages/offline_pages.

To me, this looks way cleaner than what we have right now (and easier to
verify). And looking at the documentation of remove_memory, using
lock_device_hotplug also for add_memory() feels natural.


RFC -> RFCv2:
- Don't export device_hotplug_lock, provide proper remove_memory/add_memory
  wrappers.
- Split up the patches a bit.
- Try to improve powernv memtrace locking
- Add some documentation for locking that matches my knowledge

David Hildenbrand (6):
  mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
  mm/memory_hotplug: make add_memory() take the device_hotplug_lock
  mm/memory_hotplug: fix online/offline_pages called w.o.
    mem_hotplug_lock
  powerpc/powernv: hold device_hotplug_lock when calling device_online()
  powerpc/powernv: hold device_hotplug_lock in memtrace_offline_pages()
  memory-hotplug.txt: Add some details about locking internals

 Documentation/memory-hotplug.txt              | 39 +++++++++++-
 arch/powerpc/platforms/powernv/memtrace.c     | 14 +++--
 .../platforms/pseries/hotplug-memory.c        |  8 +--
 drivers/acpi/acpi_memhotplug.c                |  4 +-
 drivers/base/memory.c                         | 22 +++----
 drivers/xen/balloon.c                         |  3 +
 include/linux/memory_hotplug.h                |  4 +-
 mm/memory_hotplug.c                           | 59 +++++++++++++++----
 8 files changed, 115 insertions(+), 38 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
@ 2018-08-21 10:44 ` David Hildenbrand
  0 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, David Hildenbrand, Andrew Morton, Balbir Singh,
	Benjamin Herrenschmidt, Boris Ostrovsky, Dan Williams,
	Greg Kroah-Hartman, Haiyang Zhang, Heiko Carstens, John Allen,
	Jonathan Corbet, Joonsoo Kim, Juergen Gross, Kate Stewart,
	K. Y. Srinivasan, Len Brown, Martin Schwidefsky,
	Mathieu Malaterre, Michael Ellerman, Michael Neuling,
	Michal Hocko, Nathan Fontenot, Oscar Salvador, Paul Mackerras,
	Pavel Tatashin, Philippe Ombredanne, Rafael J. Wysocki,
	Rashmica Gupta, Stephen Hemminger, Thomas Gleixner,
	Vlastimil Babka, YASUAKI ISHIMATSU

This is the same approach as in the first RFC, but this time without
exporting device_hotplug_lock (requested by Greg) and with some more
details and documentation regarding locking. Tested only on x86 so far.

--------------------------------------------------------------------------

Reading through the code and studying how mem_hotplug_lock is to be used,
I noticed that there are two places where we can end up calling
device_online()/device_offline() - online_pages()/offline_pages() without
the mem_hotplug_lock. And there are other places where we call
device_online()/device_offline() without the device_hotplug_lock.

While e.g.
	echo "online" > /sys/devices/system/memory/memory9/state
is fine, e.g.
	echo 1 > /sys/devices/system/memory/memory9/online
Will not take the mem_hotplug_lock. However the device_lock() and
device_hotplug_lock.

E.g. via memory_probe_store(), we can end up calling
add_memory()->online_pages() without the device_hotplug_lock. So we can
have concurrent callers in online_pages(). We e.g. touch in online_pages()
basically unprotected zone->present_pages then.

Looks like there is a longer history to that (see Patch #2 for details),
and fixing it to work the way it was intended is not really possible. We
would e.g. have to take the mem_hotplug_lock in device/base/core.c, which
sounds wrong.

Summary: We had a lock inversion on mem_hotplug_lock and device_lock().
More details can be found in patch 3 and patch 6.

I propose the general rules (documentation added in patch 6):

1. add_memory/add_memory_resource() must only be called with
   device_hotplug_lock.
2. remove_memory() must only be called with device_hotplug_lock. This is
   already documented and holds for all callers.
3. device_online()/device_offline() must only be called with
   device_hotplug_lock. This is already documented and true for now in core
   code. Other callers (related to memory hotplug) have to be fixed up.
4. mem_hotplug_lock is taken inside of add_memory/remove_memory/
   online_pages/offline_pages.

To me, this looks way cleaner than what we have right now (and easier to
verify). And looking at the documentation of remove_memory, using
lock_device_hotplug also for add_memory() feels natural.


RFC -> RFCv2:
- Don't export device_hotplug_lock, provide proper remove_memory/add_memory
  wrappers.
- Split up the patches a bit.
- Try to improve powernv memtrace locking
- Add some documentation for locking that matches my knowledge

David Hildenbrand (6):
  mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
  mm/memory_hotplug: make add_memory() take the device_hotplug_lock
  mm/memory_hotplug: fix online/offline_pages called w.o.
    mem_hotplug_lock
  powerpc/powernv: hold device_hotplug_lock when calling device_online()
  powerpc/powernv: hold device_hotplug_lock in memtrace_offline_pages()
  memory-hotplug.txt: Add some details about locking internals

 Documentation/memory-hotplug.txt              | 39 +++++++++++-
 arch/powerpc/platforms/powernv/memtrace.c     | 14 +++--
 .../platforms/pseries/hotplug-memory.c        |  8 +--
 drivers/acpi/acpi_memhotplug.c                |  4 +-
 drivers/base/memory.c                         | 22 +++----
 drivers/xen/balloon.c                         |  3 +
 include/linux/memory_hotplug.h                |  4 +-
 mm/memory_hotplug.c                           | 59 +++++++++++++++----
 8 files changed, 115 insertions(+), 38 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 1/6] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
  2018-08-21 10:44 ` David Hildenbrand
@ 2018-08-21 10:44   ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, David Hildenbrand, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Rafael J. Wysocki, Len Brown, Rashmica Gupta,
	Michael Neuling, Balbir Singh, Nathan Fontenot, John Allen,
	Andrew Morton, Michal Hocko, Dan Williams, Joonsoo Kim

remove_memory() is exported right now but requires the
device_hotplug_lock, which is not exported. So let's provide a variant
that takes the lock and only export that one.

The lock is already held in
	arch/powerpc/platforms/pseries/hotplug-memory.c
	drivers/acpi/acpi_memhotplug.c
So, let's use the locked variant.

The lock is not held (but taken in)
	arch/powerpc/platforms/powernv/memtrace.c
So let's keep using the (now) locked variant.

Apart from that, there are not other users in the tree.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: Mathieu Malaterre <malat@debian.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/powernv/memtrace.c       | 2 --
 arch/powerpc/platforms/pseries/hotplug-memory.c | 6 +++---
 drivers/acpi/acpi_memhotplug.c                  | 2 +-
 include/linux/memory_hotplug.h                  | 3 ++-
 mm/memory_hotplug.c                             | 9 ++++++++-
 5 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
index 51dc398ae3f7..8f1cd4f3bfd5 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -90,9 +90,7 @@ static bool memtrace_offline_pages(u32 nid, u64 start_pfn, u64 nr_pages)
 	walk_memory_range(start_pfn, end_pfn, (void *)MEM_OFFLINE,
 			  change_memblock_state);
 
-	lock_device_hotplug();
 	remove_memory(nid, start_pfn << PAGE_SHIFT, nr_pages << PAGE_SHIFT);
-	unlock_device_hotplug();
 
 	return true;
 }
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index c1578f54c626..b3f54466e25f 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -334,7 +334,7 @@ static int pseries_remove_memblock(unsigned long base, unsigned int memblock_siz
 	nid = memory_add_physaddr_to_nid(base);
 
 	for (i = 0; i < sections_per_block; i++) {
-		remove_memory(nid, base, MIN_MEMORY_BLOCK_SIZE);
+		__remove_memory(nid, base, MIN_MEMORY_BLOCK_SIZE);
 		base += MIN_MEMORY_BLOCK_SIZE;
 	}
 
@@ -423,7 +423,7 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb)
 	block_sz = pseries_memory_block_size();
 	nid = memory_add_physaddr_to_nid(lmb->base_addr);
 
-	remove_memory(nid, lmb->base_addr, block_sz);
+	__remove_memory(nid, lmb->base_addr, block_sz);
 
 	/* Update memory regions for memory remove */
 	memblock_remove(lmb->base_addr, block_sz);
@@ -710,7 +710,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
 
 	rc = dlpar_online_lmb(lmb);
 	if (rc) {
-		remove_memory(nid, lmb->base_addr, block_sz);
+		__remove_memory(nid, lmb->base_addr, block_sz);
 		dlpar_remove_device_tree_lmb(lmb);
 	} else {
 		lmb->flags |= DRCONF_MEM_ASSIGNED;
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index 6b0d3ef7309c..811148415993 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -282,7 +282,7 @@ static void acpi_memory_remove_memory(struct acpi_memory_device *mem_device)
 			nid = memory_add_physaddr_to_nid(info->start_addr);
 
 		acpi_unbind_memory_blocks(info);
-		remove_memory(nid, info->start_addr, info->length);
+		__remove_memory(nid, info->start_addr, info->length);
 		list_del(&info->list);
 		kfree(info);
 	}
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 34a28227068d..1f096852f479 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -301,6 +301,7 @@ extern bool is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
 extern void try_offline_node(int nid);
 extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
 extern void remove_memory(int nid, u64 start, u64 size);
+extern void __remove_memory(int nid, u64 start, u64 size);
 
 #else
 static inline bool is_mem_section_removable(unsigned long pfn,
@@ -317,6 +318,7 @@ static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 }
 
 static inline void remove_memory(int nid, u64 start, u64 size) {}
+static inline void __remove_memory(int nid, u64 start, u64 size) {}
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
 extern void __ref free_area_init_core_hotplug(int nid);
@@ -330,7 +332,6 @@ extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 		unsigned long nr_pages, struct vmem_altmap *altmap);
 extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
 extern bool is_memblock_offlined(struct memory_block *mem);
-extern void remove_memory(int nid, u64 start, u64 size);
 extern int sparse_add_one_section(struct pglist_data *pgdat,
 		unsigned long start_pfn, struct vmem_altmap *altmap);
 extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 9eea6e809a4e..898e13d4d87d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1872,7 +1872,7 @@ EXPORT_SYMBOL(try_offline_node);
  * and online/offline operations before this call, as required by
  * try_offline_node().
  */
-void __ref remove_memory(int nid, u64 start, u64 size)
+void __ref __remove_memory(int nid, u64 start, u64 size)
 {
 	int ret;
 
@@ -1901,5 +1901,12 @@ void __ref remove_memory(int nid, u64 start, u64 size)
 
 	mem_hotplug_done();
 }
+
+void __ref remove_memory(int nid, u64 start, u64 size)
+{
+	lock_device_hotplug();
+	__remove_memory(nid, start, size);
+	unlock_device_hotplug();
+}
 EXPORT_SYMBOL_GPL(remove_memory);
 #endif /* CONFIG_MEMORY_HOTREMOVE */
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 1/6] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
@ 2018-08-21 10:44   ` David Hildenbrand
  0 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, David Hildenbrand, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Rafael J. Wysocki, Len Brown, Rashmica Gupta,
	Michael Neuling, Balbir Singh, Nathan Fontenot, John Allen,
	Andrew Morton, Michal Hocko, Dan Williams, Joonsoo Kim,
	Vlastimil Babka, Pavel Tatashin, Greg Kroah-Hartman,
	Oscar Salvador, YASUAKI ISHIMATSU, Mathieu Malaterre

remove_memory() is exported right now but requires the
device_hotplug_lock, which is not exported. So let's provide a variant
that takes the lock and only export that one.

The lock is already held in
	arch/powerpc/platforms/pseries/hotplug-memory.c
	drivers/acpi/acpi_memhotplug.c
So, let's use the locked variant.

The lock is not held (but taken in)
	arch/powerpc/platforms/powernv/memtrace.c
So let's keep using the (now) locked variant.

Apart from that, there are not other users in the tree.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: Mathieu Malaterre <malat@debian.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/powernv/memtrace.c       | 2 --
 arch/powerpc/platforms/pseries/hotplug-memory.c | 6 +++---
 drivers/acpi/acpi_memhotplug.c                  | 2 +-
 include/linux/memory_hotplug.h                  | 3 ++-
 mm/memory_hotplug.c                             | 9 ++++++++-
 5 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
index 51dc398ae3f7..8f1cd4f3bfd5 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -90,9 +90,7 @@ static bool memtrace_offline_pages(u32 nid, u64 start_pfn, u64 nr_pages)
 	walk_memory_range(start_pfn, end_pfn, (void *)MEM_OFFLINE,
 			  change_memblock_state);
 
-	lock_device_hotplug();
 	remove_memory(nid, start_pfn << PAGE_SHIFT, nr_pages << PAGE_SHIFT);
-	unlock_device_hotplug();
 
 	return true;
 }
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index c1578f54c626..b3f54466e25f 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -334,7 +334,7 @@ static int pseries_remove_memblock(unsigned long base, unsigned int memblock_siz
 	nid = memory_add_physaddr_to_nid(base);
 
 	for (i = 0; i < sections_per_block; i++) {
-		remove_memory(nid, base, MIN_MEMORY_BLOCK_SIZE);
+		__remove_memory(nid, base, MIN_MEMORY_BLOCK_SIZE);
 		base += MIN_MEMORY_BLOCK_SIZE;
 	}
 
@@ -423,7 +423,7 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb)
 	block_sz = pseries_memory_block_size();
 	nid = memory_add_physaddr_to_nid(lmb->base_addr);
 
-	remove_memory(nid, lmb->base_addr, block_sz);
+	__remove_memory(nid, lmb->base_addr, block_sz);
 
 	/* Update memory regions for memory remove */
 	memblock_remove(lmb->base_addr, block_sz);
@@ -710,7 +710,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
 
 	rc = dlpar_online_lmb(lmb);
 	if (rc) {
-		remove_memory(nid, lmb->base_addr, block_sz);
+		__remove_memory(nid, lmb->base_addr, block_sz);
 		dlpar_remove_device_tree_lmb(lmb);
 	} else {
 		lmb->flags |= DRCONF_MEM_ASSIGNED;
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index 6b0d3ef7309c..811148415993 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -282,7 +282,7 @@ static void acpi_memory_remove_memory(struct acpi_memory_device *mem_device)
 			nid = memory_add_physaddr_to_nid(info->start_addr);
 
 		acpi_unbind_memory_blocks(info);
-		remove_memory(nid, info->start_addr, info->length);
+		__remove_memory(nid, info->start_addr, info->length);
 		list_del(&info->list);
 		kfree(info);
 	}
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 34a28227068d..1f096852f479 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -301,6 +301,7 @@ extern bool is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
 extern void try_offline_node(int nid);
 extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
 extern void remove_memory(int nid, u64 start, u64 size);
+extern void __remove_memory(int nid, u64 start, u64 size);
 
 #else
 static inline bool is_mem_section_removable(unsigned long pfn,
@@ -317,6 +318,7 @@ static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 }
 
 static inline void remove_memory(int nid, u64 start, u64 size) {}
+static inline void __remove_memory(int nid, u64 start, u64 size) {}
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
 extern void __ref free_area_init_core_hotplug(int nid);
@@ -330,7 +332,6 @@ extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 		unsigned long nr_pages, struct vmem_altmap *altmap);
 extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
 extern bool is_memblock_offlined(struct memory_block *mem);
-extern void remove_memory(int nid, u64 start, u64 size);
 extern int sparse_add_one_section(struct pglist_data *pgdat,
 		unsigned long start_pfn, struct vmem_altmap *altmap);
 extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 9eea6e809a4e..898e13d4d87d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1872,7 +1872,7 @@ EXPORT_SYMBOL(try_offline_node);
  * and online/offline operations before this call, as required by
  * try_offline_node().
  */
-void __ref remove_memory(int nid, u64 start, u64 size)
+void __ref __remove_memory(int nid, u64 start, u64 size)
 {
 	int ret;
 
@@ -1901,5 +1901,12 @@ void __ref remove_memory(int nid, u64 start, u64 size)
 
 	mem_hotplug_done();
 }
+
+void __ref remove_memory(int nid, u64 start, u64 size)
+{
+	lock_device_hotplug();
+	__remove_memory(nid, start, size);
+	unlock_device_hotplug();
+}
 EXPORT_SYMBOL_GPL(remove_memory);
 #endif /* CONFIG_MEMORY_HOTREMOVE */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 1/6] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
  2018-08-21 10:44 ` David Hildenbrand
  (?)
@ 2018-08-21 10:44 ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: Michal Hocko, linux-doc, Benjamin Herrenschmidt, Balbir Singh,
	Paul Mackerras, Rashmica Gupta, Michael Neuling,
	Michael Ellerman, David Hildenbrand, Pavel Tatashin, linux-acpi,
	xen-devel, Len Brown, YASUAKI ISHIMATSU, Nathan Fontenot,
	Dan Williams, Joonsoo Kim, Vlastimil Babka, Oscar Salvador,
	Mathieu Malaterre, Greg Kroah-Hartman, Rafael J. Wysocki,
	linux-kernel, John Allen

remove_memory() is exported right now but requires the
device_hotplug_lock, which is not exported. So let's provide a variant
that takes the lock and only export that one.

The lock is already held in
	arch/powerpc/platforms/pseries/hotplug-memory.c
	drivers/acpi/acpi_memhotplug.c
So, let's use the locked variant.

The lock is not held (but taken in)
	arch/powerpc/platforms/powernv/memtrace.c
So let's keep using the (now) locked variant.

Apart from that, there are not other users in the tree.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: Mathieu Malaterre <malat@debian.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/powernv/memtrace.c       | 2 --
 arch/powerpc/platforms/pseries/hotplug-memory.c | 6 +++---
 drivers/acpi/acpi_memhotplug.c                  | 2 +-
 include/linux/memory_hotplug.h                  | 3 ++-
 mm/memory_hotplug.c                             | 9 ++++++++-
 5 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
index 51dc398ae3f7..8f1cd4f3bfd5 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -90,9 +90,7 @@ static bool memtrace_offline_pages(u32 nid, u64 start_pfn, u64 nr_pages)
 	walk_memory_range(start_pfn, end_pfn, (void *)MEM_OFFLINE,
 			  change_memblock_state);
 
-	lock_device_hotplug();
 	remove_memory(nid, start_pfn << PAGE_SHIFT, nr_pages << PAGE_SHIFT);
-	unlock_device_hotplug();
 
 	return true;
 }
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index c1578f54c626..b3f54466e25f 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -334,7 +334,7 @@ static int pseries_remove_memblock(unsigned long base, unsigned int memblock_siz
 	nid = memory_add_physaddr_to_nid(base);
 
 	for (i = 0; i < sections_per_block; i++) {
-		remove_memory(nid, base, MIN_MEMORY_BLOCK_SIZE);
+		__remove_memory(nid, base, MIN_MEMORY_BLOCK_SIZE);
 		base += MIN_MEMORY_BLOCK_SIZE;
 	}
 
@@ -423,7 +423,7 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb)
 	block_sz = pseries_memory_block_size();
 	nid = memory_add_physaddr_to_nid(lmb->base_addr);
 
-	remove_memory(nid, lmb->base_addr, block_sz);
+	__remove_memory(nid, lmb->base_addr, block_sz);
 
 	/* Update memory regions for memory remove */
 	memblock_remove(lmb->base_addr, block_sz);
@@ -710,7 +710,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
 
 	rc = dlpar_online_lmb(lmb);
 	if (rc) {
-		remove_memory(nid, lmb->base_addr, block_sz);
+		__remove_memory(nid, lmb->base_addr, block_sz);
 		dlpar_remove_device_tree_lmb(lmb);
 	} else {
 		lmb->flags |= DRCONF_MEM_ASSIGNED;
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index 6b0d3ef7309c..811148415993 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -282,7 +282,7 @@ static void acpi_memory_remove_memory(struct acpi_memory_device *mem_device)
 			nid = memory_add_physaddr_to_nid(info->start_addr);
 
 		acpi_unbind_memory_blocks(info);
-		remove_memory(nid, info->start_addr, info->length);
+		__remove_memory(nid, info->start_addr, info->length);
 		list_del(&info->list);
 		kfree(info);
 	}
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 34a28227068d..1f096852f479 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -301,6 +301,7 @@ extern bool is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
 extern void try_offline_node(int nid);
 extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
 extern void remove_memory(int nid, u64 start, u64 size);
+extern void __remove_memory(int nid, u64 start, u64 size);
 
 #else
 static inline bool is_mem_section_removable(unsigned long pfn,
@@ -317,6 +318,7 @@ static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 }
 
 static inline void remove_memory(int nid, u64 start, u64 size) {}
+static inline void __remove_memory(int nid, u64 start, u64 size) {}
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
 extern void __ref free_area_init_core_hotplug(int nid);
@@ -330,7 +332,6 @@ extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 		unsigned long nr_pages, struct vmem_altmap *altmap);
 extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
 extern bool is_memblock_offlined(struct memory_block *mem);
-extern void remove_memory(int nid, u64 start, u64 size);
 extern int sparse_add_one_section(struct pglist_data *pgdat,
 		unsigned long start_pfn, struct vmem_altmap *altmap);
 extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 9eea6e809a4e..898e13d4d87d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1872,7 +1872,7 @@ EXPORT_SYMBOL(try_offline_node);
  * and online/offline operations before this call, as required by
  * try_offline_node().
  */
-void __ref remove_memory(int nid, u64 start, u64 size)
+void __ref __remove_memory(int nid, u64 start, u64 size)
 {
 	int ret;
 
@@ -1901,5 +1901,12 @@ void __ref remove_memory(int nid, u64 start, u64 size)
 
 	mem_hotplug_done();
 }
+
+void __ref remove_memory(int nid, u64 start, u64 size)
+{
+	lock_device_hotplug();
+	__remove_memory(nid, start, size);
+	unlock_device_hotplug();
+}
 EXPORT_SYMBOL_GPL(remove_memory);
 #endif /* CONFIG_MEMORY_HOTREMOVE */
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 2/6] mm/memory_hotplug: make add_memory() take the device_hotplug_lock
  2018-08-21 10:44 ` David Hildenbrand
@ 2018-08-21 10:44   ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, David Hildenbrand, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Rafael J. Wysocki, Len Brown,
	Greg Kroah-Hartman, Boris Ostrovsky, Juergen Gross,
	Nathan Fontenot, John Allen, Andrew Morton, Michal Hocko,
	Dan Williams, Joonsoo Kim

add_memory() currently does not take the device_hotplug_lock, however
is aleady called under the lock from
	arch/powerpc/platforms/pseries/hotplug-memory.c
	drivers/acpi/acpi_memhotplug.c
to synchronize against CPU hot-remove and similar.

In general, we should hold the device_hotplug_lock when adding memory
to synchronize against online/offline request (e.g. from user space) -
which already resulted in lock inversions due to device_lock() and
mem_hotplug_lock - see 30467e0b3be ("mm, hotplug: fix concurrent memory
hot-add deadlock"). add_memory()/add_memory_resource() will create memory
block devices, so this really feels like the right thing to do.

Holding the device_hotplug_lock makes sure that a memory block device
can really only be accessed (e.g. via .online/.state) from user space,
once the memory has been fully added to the system.

The lock is not held yet in
	drivers/xen/balloon.c
	arch/powerpc/platforms/powernv/memtrace.c
	drivers/s390/char/sclp_cmd.c
	drivers/hv/hv_balloon.c
So, let's either use the locked variants or take the lock.

Don't export add_memory_resource(), as it once was exported to be used
by XEN, which is never built as a module. If somebody requires it, we
also have to export a locked variant (as device_hotplug_lock is never
exported).

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 .../platforms/pseries/hotplug-memory.c        |  2 +-
 drivers/acpi/acpi_memhotplug.c                |  2 +-
 drivers/base/memory.c                         |  9 ++++++--
 drivers/xen/balloon.c                         |  3 +++
 include/linux/memory_hotplug.h                |  1 +
 mm/memory_hotplug.c                           | 22 ++++++++++++++++---
 6 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index b3f54466e25f..2e6f41dc103a 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -702,7 +702,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
 	nid = memory_add_physaddr_to_nid(lmb->base_addr);
 
 	/* Add the memory */
-	rc = add_memory(nid, lmb->base_addr, block_sz);
+	rc = __add_memory(nid, lmb->base_addr, block_sz);
 	if (rc) {
 		dlpar_remove_device_tree_lmb(lmb);
 		return rc;
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index 811148415993..8fe0960ea572 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -228,7 +228,7 @@ static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
 		if (node < 0)
 			node = memory_add_physaddr_to_nid(info->start_addr);
 
-		result = add_memory(node, info->start_addr, info->length);
+		result = __add_memory(node, info->start_addr, info->length);
 
 		/*
 		 * If the memory block has been used by the kernel, add_memory()
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index c8a1cb0b6136..5b0375be7f65 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -521,15 +521,20 @@ memory_probe_store(struct device *dev, struct device_attribute *attr,
 	if (phys_addr & ((pages_per_block << PAGE_SHIFT) - 1))
 		return -EINVAL;
 
+	ret = lock_device_hotplug_sysfs();
+	if (ret)
+		goto out;
+
 	nid = memory_add_physaddr_to_nid(phys_addr);
-	ret = add_memory(nid, phys_addr,
-			 MIN_MEMORY_BLOCK_SIZE * sections_per_block);
+	ret = __add_memory(nid, phys_addr,
+			   MIN_MEMORY_BLOCK_SIZE * sections_per_block);
 
 	if (ret)
 		goto out;
 
 	ret = count;
 out:
+	unlock_device_hotplug();
 	return ret;
 }
 
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index e12bb256036f..6bab019a82b1 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -395,7 +395,10 @@ static enum bp_state reserve_additional_memory(void)
 	 * callers drop the mutex before trying again.
 	 */
 	mutex_unlock(&balloon_mutex);
+	/* add_memory_resource() requires the device_hotplug lock */
+	lock_device_hotplug();
 	rc = add_memory_resource(nid, resource, memhp_auto_online);
+	unlock_device_hotplug();
 	mutex_lock(&balloon_mutex);
 
 	if (rc) {
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 1f096852f479..ffd9cd10fcf3 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -324,6 +324,7 @@ static inline void __remove_memory(int nid, u64 start, u64 size) {}
 extern void __ref free_area_init_core_hotplug(int nid);
 extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
 		void *arg, int (*func)(struct memory_block *, void *));
+extern int __add_memory(int nid, u64 start, u64 size);
 extern int add_memory(int nid, u64 start, u64 size);
 extern int add_memory_resource(int nid, struct resource *resource, bool online);
 extern int arch_add_memory(int nid, u64 start, u64 size,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 898e13d4d87d..e2b5c751e3ea 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1111,7 +1111,12 @@ static int online_memory_block(struct memory_block *mem, void *arg)
 	return device_online(&mem->dev);
 }
 
-/* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
+/*
+ * NOTE: The caller must call lock_device_hotplug() to serialize hotplug
+ * and online/offline operations (triggered e.g. by sysfs).
+ *
+ * we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG
+ */
 int __ref add_memory_resource(int nid, struct resource *res, bool online)
 {
 	u64 start, size;
@@ -1180,9 +1185,9 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
 	mem_hotplug_done();
 	return ret;
 }
-EXPORT_SYMBOL_GPL(add_memory_resource);
 
-int __ref add_memory(int nid, u64 start, u64 size)
+/* requires device_hotplug_lock, see add_memory_resource() */
+int __ref __add_memory(int nid, u64 start, u64 size)
 {
 	struct resource *res;
 	int ret;
@@ -1196,6 +1201,17 @@ int __ref add_memory(int nid, u64 start, u64 size)
 		release_memory_resource(res);
 	return ret;
 }
+
+int add_memory(int nid, u64 start, u64 size)
+{
+	int rc;
+
+	lock_device_hotplug();
+	rc = __add_memory(nid, start, size);
+	unlock_device_hotplug();
+
+	return rc;
+}
 EXPORT_SYMBOL_GPL(add_memory);
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 2/6] mm/memory_hotplug: make add_memory() take the device_hotplug_lock
@ 2018-08-21 10:44   ` David Hildenbrand
  0 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, David Hildenbrand, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Rafael J. Wysocki, Len Brown,
	Greg Kroah-Hartman, Boris Ostrovsky, Juergen Gross,
	Nathan Fontenot, John Allen, Andrew Morton, Michal Hocko,
	Dan Williams, Joonsoo Kim, Vlastimil Babka, Oscar Salvador,
	Mathieu Malaterre, Pavel Tatashin, YASUAKI ISHIMATSU

add_memory() currently does not take the device_hotplug_lock, however
is aleady called under the lock from
	arch/powerpc/platforms/pseries/hotplug-memory.c
	drivers/acpi/acpi_memhotplug.c
to synchronize against CPU hot-remove and similar.

In general, we should hold the device_hotplug_lock when adding memory
to synchronize against online/offline request (e.g. from user space) -
which already resulted in lock inversions due to device_lock() and
mem_hotplug_lock - see 30467e0b3be ("mm, hotplug: fix concurrent memory
hot-add deadlock"). add_memory()/add_memory_resource() will create memory
block devices, so this really feels like the right thing to do.

Holding the device_hotplug_lock makes sure that a memory block device
can really only be accessed (e.g. via .online/.state) from user space,
once the memory has been fully added to the system.

The lock is not held yet in
	drivers/xen/balloon.c
	arch/powerpc/platforms/powernv/memtrace.c
	drivers/s390/char/sclp_cmd.c
	drivers/hv/hv_balloon.c
So, let's either use the locked variants or take the lock.

Don't export add_memory_resource(), as it once was exported to be used
by XEN, which is never built as a module. If somebody requires it, we
also have to export a locked variant (as device_hotplug_lock is never
exported).

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 .../platforms/pseries/hotplug-memory.c        |  2 +-
 drivers/acpi/acpi_memhotplug.c                |  2 +-
 drivers/base/memory.c                         |  9 ++++++--
 drivers/xen/balloon.c                         |  3 +++
 include/linux/memory_hotplug.h                |  1 +
 mm/memory_hotplug.c                           | 22 ++++++++++++++++---
 6 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index b3f54466e25f..2e6f41dc103a 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -702,7 +702,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
 	nid = memory_add_physaddr_to_nid(lmb->base_addr);
 
 	/* Add the memory */
-	rc = add_memory(nid, lmb->base_addr, block_sz);
+	rc = __add_memory(nid, lmb->base_addr, block_sz);
 	if (rc) {
 		dlpar_remove_device_tree_lmb(lmb);
 		return rc;
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index 811148415993..8fe0960ea572 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -228,7 +228,7 @@ static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
 		if (node < 0)
 			node = memory_add_physaddr_to_nid(info->start_addr);
 
-		result = add_memory(node, info->start_addr, info->length);
+		result = __add_memory(node, info->start_addr, info->length);
 
 		/*
 		 * If the memory block has been used by the kernel, add_memory()
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index c8a1cb0b6136..5b0375be7f65 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -521,15 +521,20 @@ memory_probe_store(struct device *dev, struct device_attribute *attr,
 	if (phys_addr & ((pages_per_block << PAGE_SHIFT) - 1))
 		return -EINVAL;
 
+	ret = lock_device_hotplug_sysfs();
+	if (ret)
+		goto out;
+
 	nid = memory_add_physaddr_to_nid(phys_addr);
-	ret = add_memory(nid, phys_addr,
-			 MIN_MEMORY_BLOCK_SIZE * sections_per_block);
+	ret = __add_memory(nid, phys_addr,
+			   MIN_MEMORY_BLOCK_SIZE * sections_per_block);
 
 	if (ret)
 		goto out;
 
 	ret = count;
 out:
+	unlock_device_hotplug();
 	return ret;
 }
 
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index e12bb256036f..6bab019a82b1 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -395,7 +395,10 @@ static enum bp_state reserve_additional_memory(void)
 	 * callers drop the mutex before trying again.
 	 */
 	mutex_unlock(&balloon_mutex);
+	/* add_memory_resource() requires the device_hotplug lock */
+	lock_device_hotplug();
 	rc = add_memory_resource(nid, resource, memhp_auto_online);
+	unlock_device_hotplug();
 	mutex_lock(&balloon_mutex);
 
 	if (rc) {
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 1f096852f479..ffd9cd10fcf3 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -324,6 +324,7 @@ static inline void __remove_memory(int nid, u64 start, u64 size) {}
 extern void __ref free_area_init_core_hotplug(int nid);
 extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
 		void *arg, int (*func)(struct memory_block *, void *));
+extern int __add_memory(int nid, u64 start, u64 size);
 extern int add_memory(int nid, u64 start, u64 size);
 extern int add_memory_resource(int nid, struct resource *resource, bool online);
 extern int arch_add_memory(int nid, u64 start, u64 size,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 898e13d4d87d..e2b5c751e3ea 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1111,7 +1111,12 @@ static int online_memory_block(struct memory_block *mem, void *arg)
 	return device_online(&mem->dev);
 }
 
-/* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
+/*
+ * NOTE: The caller must call lock_device_hotplug() to serialize hotplug
+ * and online/offline operations (triggered e.g. by sysfs).
+ *
+ * we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG
+ */
 int __ref add_memory_resource(int nid, struct resource *res, bool online)
 {
 	u64 start, size;
@@ -1180,9 +1185,9 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
 	mem_hotplug_done();
 	return ret;
 }
-EXPORT_SYMBOL_GPL(add_memory_resource);
 
-int __ref add_memory(int nid, u64 start, u64 size)
+/* requires device_hotplug_lock, see add_memory_resource() */
+int __ref __add_memory(int nid, u64 start, u64 size)
 {
 	struct resource *res;
 	int ret;
@@ -1196,6 +1201,17 @@ int __ref add_memory(int nid, u64 start, u64 size)
 		release_memory_resource(res);
 	return ret;
 }
+
+int add_memory(int nid, u64 start, u64 size)
+{
+	int rc;
+
+	lock_device_hotplug();
+	rc = __add_memory(nid, start, size);
+	unlock_device_hotplug();
+
+	return rc;
+}
 EXPORT_SYMBOL_GPL(add_memory);
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 2/6] mm/memory_hotplug: make add_memory() take the device_hotplug_lock
  2018-08-21 10:44 ` David Hildenbrand
                   ` (3 preceding siblings ...)
  (?)
@ 2018-08-21 10:44 ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: Michal Hocko, linux-doc, Benjamin Herrenschmidt, Paul Mackerras,
	Dan Williams, Michael Ellerman, David Hildenbrand,
	Pavel Tatashin, linux-acpi, xen-devel, Len Brown,
	YASUAKI ISHIMATSU, Nathan Fontenot, Boris Ostrovsky, Joonsoo Kim,
	Vlastimil Babka, Oscar Salvador, Juergen Gross,
	Mathieu Malaterre, Greg Kroah-Hartman, Rafael J. Wysocki,
	linux-kernel, John Allen, devel

add_memory() currently does not take the device_hotplug_lock, however
is aleady called under the lock from
	arch/powerpc/platforms/pseries/hotplug-memory.c
	drivers/acpi/acpi_memhotplug.c
to synchronize against CPU hot-remove and similar.

In general, we should hold the device_hotplug_lock when adding memory
to synchronize against online/offline request (e.g. from user space) -
which already resulted in lock inversions due to device_lock() and
mem_hotplug_lock - see 30467e0b3be ("mm, hotplug: fix concurrent memory
hot-add deadlock"). add_memory()/add_memory_resource() will create memory
block devices, so this really feels like the right thing to do.

Holding the device_hotplug_lock makes sure that a memory block device
can really only be accessed (e.g. via .online/.state) from user space,
once the memory has been fully added to the system.

The lock is not held yet in
	drivers/xen/balloon.c
	arch/powerpc/platforms/powernv/memtrace.c
	drivers/s390/char/sclp_cmd.c
	drivers/hv/hv_balloon.c
So, let's either use the locked variants or take the lock.

Don't export add_memory_resource(), as it once was exported to be used
by XEN, which is never built as a module. If somebody requires it, we
also have to export a locked variant (as device_hotplug_lock is never
exported).

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 .../platforms/pseries/hotplug-memory.c        |  2 +-
 drivers/acpi/acpi_memhotplug.c                |  2 +-
 drivers/base/memory.c                         |  9 ++++++--
 drivers/xen/balloon.c                         |  3 +++
 include/linux/memory_hotplug.h                |  1 +
 mm/memory_hotplug.c                           | 22 ++++++++++++++++---
 6 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index b3f54466e25f..2e6f41dc103a 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -702,7 +702,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
 	nid = memory_add_physaddr_to_nid(lmb->base_addr);
 
 	/* Add the memory */
-	rc = add_memory(nid, lmb->base_addr, block_sz);
+	rc = __add_memory(nid, lmb->base_addr, block_sz);
 	if (rc) {
 		dlpar_remove_device_tree_lmb(lmb);
 		return rc;
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index 811148415993..8fe0960ea572 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -228,7 +228,7 @@ static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
 		if (node < 0)
 			node = memory_add_physaddr_to_nid(info->start_addr);
 
-		result = add_memory(node, info->start_addr, info->length);
+		result = __add_memory(node, info->start_addr, info->length);
 
 		/*
 		 * If the memory block has been used by the kernel, add_memory()
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index c8a1cb0b6136..5b0375be7f65 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -521,15 +521,20 @@ memory_probe_store(struct device *dev, struct device_attribute *attr,
 	if (phys_addr & ((pages_per_block << PAGE_SHIFT) - 1))
 		return -EINVAL;
 
+	ret = lock_device_hotplug_sysfs();
+	if (ret)
+		goto out;
+
 	nid = memory_add_physaddr_to_nid(phys_addr);
-	ret = add_memory(nid, phys_addr,
-			 MIN_MEMORY_BLOCK_SIZE * sections_per_block);
+	ret = __add_memory(nid, phys_addr,
+			   MIN_MEMORY_BLOCK_SIZE * sections_per_block);
 
 	if (ret)
 		goto out;
 
 	ret = count;
 out:
+	unlock_device_hotplug();
 	return ret;
 }
 
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index e12bb256036f..6bab019a82b1 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -395,7 +395,10 @@ static enum bp_state reserve_additional_memory(void)
 	 * callers drop the mutex before trying again.
 	 */
 	mutex_unlock(&balloon_mutex);
+	/* add_memory_resource() requires the device_hotplug lock */
+	lock_device_hotplug();
 	rc = add_memory_resource(nid, resource, memhp_auto_online);
+	unlock_device_hotplug();
 	mutex_lock(&balloon_mutex);
 
 	if (rc) {
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 1f096852f479..ffd9cd10fcf3 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -324,6 +324,7 @@ static inline void __remove_memory(int nid, u64 start, u64 size) {}
 extern void __ref free_area_init_core_hotplug(int nid);
 extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
 		void *arg, int (*func)(struct memory_block *, void *));
+extern int __add_memory(int nid, u64 start, u64 size);
 extern int add_memory(int nid, u64 start, u64 size);
 extern int add_memory_resource(int nid, struct resource *resource, bool online);
 extern int arch_add_memory(int nid, u64 start, u64 size,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 898e13d4d87d..e2b5c751e3ea 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1111,7 +1111,12 @@ static int online_memory_block(struct memory_block *mem, void *arg)
 	return device_online(&mem->dev);
 }
 
-/* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
+/*
+ * NOTE: The caller must call lock_device_hotplug() to serialize hotplug
+ * and online/offline operations (triggered e.g. by sysfs).
+ *
+ * we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG
+ */
 int __ref add_memory_resource(int nid, struct resource *res, bool online)
 {
 	u64 start, size;
@@ -1180,9 +1185,9 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
 	mem_hotplug_done();
 	return ret;
 }
-EXPORT_SYMBOL_GPL(add_memory_resource);
 
-int __ref add_memory(int nid, u64 start, u64 size)
+/* requires device_hotplug_lock, see add_memory_resource() */
+int __ref __add_memory(int nid, u64 start, u64 size)
 {
 	struct resource *res;
 	int ret;
@@ -1196,6 +1201,17 @@ int __ref add_memory(int nid, u64 start, u64 size)
 		release_memory_resource(res);
 	return ret;
 }
+
+int add_memory(int nid, u64 start, u64 size)
+{
+	int rc;
+
+	lock_device_hotplug();
+	rc = __add_memory(nid, start, size);
+	unlock_device_hotplug();
+
+	return rc;
+}
 EXPORT_SYMBOL_GPL(add_memory);
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
  2018-08-21 10:44 ` David Hildenbrand
@ 2018-08-21 10:44   ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, David Hildenbrand, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Rafael J. Wysocki, Len Brown,
	Greg Kroah-Hartman, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Martin Schwidefsky, Heiko Carstens,
	Boris Ostrovsky, Juergen Gross, Rashmica

There seem to be some problems as result of 30467e0b3be ("mm, hotplug:
fix concurrent memory hot-add deadlock"), which tried to fix a possible
lock inversion reported and discussed in [1] due to the two locks
	a) device_lock()
	b) mem_hotplug_lock

While add_memory() first takes b), followed by a) during
bus_probe_device(), onlining of memory from user space first took b),
followed by a), exposing a possible deadlock.

In [1], and it was decided to not make use of device_hotplug_lock, but
rather to enforce a locking order.

The problems I spotted related to this:

1. Memory block device attributes: While .state first calls
   mem_hotplug_begin() and the calls device_online() - which takes
   device_lock() - .online does no longer call mem_hotplug_begin(), so
   effectively calls online_pages() without mem_hotplug_lock.

2. device_online() should be called under device_hotplug_lock, however
   onlining memory during add_memory() does not take care of that.

In addition, I think there is also something wrong about the locking in

3. arch/powerpc/platforms/powernv/memtrace.c calls offline_pages()
   without locks. This was introduced after 30467e0b3be. And skimming over
   the code, I assume it could need some more care in regards to locking
   (e.g. device_online() called without device_hotplug_lock - but I'll
   not touch that for now).

Now that we hold the device_hotplug_lock when
- adding memory (e.g. via add_memory()/add_memory_resource())
- removing memory (e.g. via remove_memory())
- device_online()/device_offline()

We can move mem_hotplug_lock usage back into
online_pages()/offline_pages().

Why is mem_hotplug_lock still needed? Essentially to make
get_online_mems()/put_online_mems() be very fast (relying on
device_hotplug_lock would be very slow), and to serialize against
addition of memory that does not create memory block devices (hmm).

[1] http://driverdev.linuxdriverproject.org/pipermail/ driverdev-devel/
    2015-February/065324.html

This patch is partly based on a patch by Vitaly Kuznetsov.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: Mathieu Malaterre <malat@debian.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/base/memory.c | 13 +------------
 mm/memory_hotplug.c   | 28 ++++++++++++++++++++--------
 2 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 5b0375be7f65..04be13539eb8 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -228,7 +228,6 @@ static bool pages_correctly_probed(unsigned long start_pfn)
 /*
  * MEMORY_HOTPLUG depends on SPARSEMEM in mm/Kconfig, so it is
  * OK to have direct references to sparsemem variables in here.
- * Must already be protected by mem_hotplug_begin().
  */
 static int
 memory_block_action(unsigned long phys_index, unsigned long action, int online_type)
@@ -294,7 +293,6 @@ static int memory_subsys_online(struct device *dev)
 	if (mem->online_type < 0)
 		mem->online_type = MMOP_ONLINE_KEEP;
 
-	/* Already under protection of mem_hotplug_begin() */
 	ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE);
 
 	/* clear online_type */
@@ -341,19 +339,11 @@ store_mem_state(struct device *dev,
 		goto err;
 	}
 
-	/*
-	 * Memory hotplug needs to hold mem_hotplug_begin() for probe to find
-	 * the correct memory block to online before doing device_online(dev),
-	 * which will take dev->mutex.  Take the lock early to prevent an
-	 * inversion, memory_subsys_online() callbacks will be implemented by
-	 * assuming it's already protected.
-	 */
-	mem_hotplug_begin();
-
 	switch (online_type) {
 	case MMOP_ONLINE_KERNEL:
 	case MMOP_ONLINE_MOVABLE:
 	case MMOP_ONLINE_KEEP:
+		/* mem->online_type is protected by device_hotplug_lock */
 		mem->online_type = online_type;
 		ret = device_online(&mem->dev);
 		break;
@@ -364,7 +354,6 @@ store_mem_state(struct device *dev,
 		ret = -EINVAL; /* should never happen */
 	}
 
-	mem_hotplug_done();
 err:
 	unlock_device_hotplug();
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index e2b5c751e3ea..a2c6c87d83f3 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -881,7 +881,6 @@ static struct zone * __meminit move_pfn_range(int online_type, int nid,
 	return zone;
 }
 
-/* Must be protected by mem_hotplug_begin() or a device_lock */
 int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_type)
 {
 	unsigned long flags;
@@ -893,6 +892,8 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 	struct memory_notify arg;
 	struct memory_block *mem;
 
+	mem_hotplug_begin();
+
 	/*
 	 * We can't use pfn_to_nid() because nid might be stored in struct page
 	 * which is not yet initialized. Instead, we find nid from memory block.
@@ -957,6 +958,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 
 	if (onlined_pages)
 		memory_notify(MEM_ONLINE, &arg);
+	mem_hotplug_done();
 	return 0;
 
 failed_addition:
@@ -964,6 +966,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 		 (unsigned long long) pfn << PAGE_SHIFT,
 		 (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
 	memory_notify(MEM_CANCEL_ONLINE, &arg);
+	mem_hotplug_done();
 	return ret;
 }
 #endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
@@ -1168,20 +1171,20 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
 	/* create new memmap entry */
 	firmware_map_add_hotplug(start, start + size, "System RAM");
 
+	/* device_online() will take the lock when calling online_pages() */
+	mem_hotplug_done();
+
 	/* online pages if requested */
 	if (online)
 		walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1),
 				  NULL, online_memory_block);
 
-	goto out;
-
+	return ret;
 error:
 	/* rollback pgdat allocation and others */
 	if (new_node)
 		rollback_node_hotadd(nid);
 	memblock_remove(start, size);
-
-out:
 	mem_hotplug_done();
 	return ret;
 }
@@ -1621,10 +1624,16 @@ static int __ref __offline_pages(unsigned long start_pfn,
 		return -EINVAL;
 	if (!IS_ALIGNED(end_pfn, pageblock_nr_pages))
 		return -EINVAL;
+
+	mem_hotplug_begin();
+
 	/* This makes hotplug much easier...and readable.
 	   we assume this for now. .*/
-	if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start, &valid_end))
+	if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start,
+				  &valid_end)) {
+		mem_hotplug_done();
 		return -EINVAL;
+	}
 
 	zone = page_zone(pfn_to_page(valid_start));
 	node = zone_to_nid(zone);
@@ -1633,8 +1642,10 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	/* set above range as isolated */
 	ret = start_isolate_page_range(start_pfn, end_pfn,
 				       MIGRATE_MOVABLE, true);
-	if (ret)
+	if (ret) {
+		mem_hotplug_done();
 		return ret;
+	}
 
 	arg.start_pfn = start_pfn;
 	arg.nr_pages = nr_pages;
@@ -1705,6 +1716,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	writeback_set_ratelimit();
 
 	memory_notify(MEM_OFFLINE, &arg);
+	mem_hotplug_done();
 	return 0;
 
 failed_removal:
@@ -1714,10 +1726,10 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 	/* pushback to free area */
 	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
+	mem_hotplug_done();
 	return ret;
 }
 
-/* Must be protected by mem_hotplug_begin() or a device_lock */
 int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 {
 	return __offline_pages(start_pfn, start_pfn + nr_pages);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
@ 2018-08-21 10:44   ` David Hildenbrand
  0 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, David Hildenbrand, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Rafael J. Wysocki, Len Brown,
	Greg Kroah-Hartman, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Martin Schwidefsky, Heiko Carstens,
	Boris Ostrovsky, Juergen Gross, Rashmica Gupta, Michael Neuling,
	Balbir Singh, Kate Stewart, Thomas Gleixner, Philippe Ombredanne,
	Andrew Morton, Michal Hocko, Pavel Tatashin, Vlastimil Babka,
	Dan Williams, Oscar Salvador, YASUAKI ISHIMATSU,
	Mathieu Malaterre

There seem to be some problems as result of 30467e0b3be ("mm, hotplug:
fix concurrent memory hot-add deadlock"), which tried to fix a possible
lock inversion reported and discussed in [1] due to the two locks
	a) device_lock()
	b) mem_hotplug_lock

While add_memory() first takes b), followed by a) during
bus_probe_device(), onlining of memory from user space first took b),
followed by a), exposing a possible deadlock.

In [1], and it was decided to not make use of device_hotplug_lock, but
rather to enforce a locking order.

The problems I spotted related to this:

1. Memory block device attributes: While .state first calls
   mem_hotplug_begin() and the calls device_online() - which takes
   device_lock() - .online does no longer call mem_hotplug_begin(), so
   effectively calls online_pages() without mem_hotplug_lock.

2. device_online() should be called under device_hotplug_lock, however
   onlining memory during add_memory() does not take care of that.

In addition, I think there is also something wrong about the locking in

3. arch/powerpc/platforms/powernv/memtrace.c calls offline_pages()
   without locks. This was introduced after 30467e0b3be. And skimming over
   the code, I assume it could need some more care in regards to locking
   (e.g. device_online() called without device_hotplug_lock - but I'll
   not touch that for now).

Now that we hold the device_hotplug_lock when
- adding memory (e.g. via add_memory()/add_memory_resource())
- removing memory (e.g. via remove_memory())
- device_online()/device_offline()

We can move mem_hotplug_lock usage back into
online_pages()/offline_pages().

Why is mem_hotplug_lock still needed? Essentially to make
get_online_mems()/put_online_mems() be very fast (relying on
device_hotplug_lock would be very slow), and to serialize against
addition of memory that does not create memory block devices (hmm).

[1] http://driverdev.linuxdriverproject.org/pipermail/ driverdev-devel/
    2015-February/065324.html

This patch is partly based on a patch by Vitaly Kuznetsov.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: Mathieu Malaterre <malat@debian.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/base/memory.c | 13 +------------
 mm/memory_hotplug.c   | 28 ++++++++++++++++++++--------
 2 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 5b0375be7f65..04be13539eb8 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -228,7 +228,6 @@ static bool pages_correctly_probed(unsigned long start_pfn)
 /*
  * MEMORY_HOTPLUG depends on SPARSEMEM in mm/Kconfig, so it is
  * OK to have direct references to sparsemem variables in here.
- * Must already be protected by mem_hotplug_begin().
  */
 static int
 memory_block_action(unsigned long phys_index, unsigned long action, int online_type)
@@ -294,7 +293,6 @@ static int memory_subsys_online(struct device *dev)
 	if (mem->online_type < 0)
 		mem->online_type = MMOP_ONLINE_KEEP;
 
-	/* Already under protection of mem_hotplug_begin() */
 	ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE);
 
 	/* clear online_type */
@@ -341,19 +339,11 @@ store_mem_state(struct device *dev,
 		goto err;
 	}
 
-	/*
-	 * Memory hotplug needs to hold mem_hotplug_begin() for probe to find
-	 * the correct memory block to online before doing device_online(dev),
-	 * which will take dev->mutex.  Take the lock early to prevent an
-	 * inversion, memory_subsys_online() callbacks will be implemented by
-	 * assuming it's already protected.
-	 */
-	mem_hotplug_begin();
-
 	switch (online_type) {
 	case MMOP_ONLINE_KERNEL:
 	case MMOP_ONLINE_MOVABLE:
 	case MMOP_ONLINE_KEEP:
+		/* mem->online_type is protected by device_hotplug_lock */
 		mem->online_type = online_type;
 		ret = device_online(&mem->dev);
 		break;
@@ -364,7 +354,6 @@ store_mem_state(struct device *dev,
 		ret = -EINVAL; /* should never happen */
 	}
 
-	mem_hotplug_done();
 err:
 	unlock_device_hotplug();
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index e2b5c751e3ea..a2c6c87d83f3 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -881,7 +881,6 @@ static struct zone * __meminit move_pfn_range(int online_type, int nid,
 	return zone;
 }
 
-/* Must be protected by mem_hotplug_begin() or a device_lock */
 int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_type)
 {
 	unsigned long flags;
@@ -893,6 +892,8 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 	struct memory_notify arg;
 	struct memory_block *mem;
 
+	mem_hotplug_begin();
+
 	/*
 	 * We can't use pfn_to_nid() because nid might be stored in struct page
 	 * which is not yet initialized. Instead, we find nid from memory block.
@@ -957,6 +958,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 
 	if (onlined_pages)
 		memory_notify(MEM_ONLINE, &arg);
+	mem_hotplug_done();
 	return 0;
 
 failed_addition:
@@ -964,6 +966,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 		 (unsigned long long) pfn << PAGE_SHIFT,
 		 (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
 	memory_notify(MEM_CANCEL_ONLINE, &arg);
+	mem_hotplug_done();
 	return ret;
 }
 #endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
@@ -1168,20 +1171,20 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
 	/* create new memmap entry */
 	firmware_map_add_hotplug(start, start + size, "System RAM");
 
+	/* device_online() will take the lock when calling online_pages() */
+	mem_hotplug_done();
+
 	/* online pages if requested */
 	if (online)
 		walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1),
 				  NULL, online_memory_block);
 
-	goto out;
-
+	return ret;
 error:
 	/* rollback pgdat allocation and others */
 	if (new_node)
 		rollback_node_hotadd(nid);
 	memblock_remove(start, size);
-
-out:
 	mem_hotplug_done();
 	return ret;
 }
@@ -1621,10 +1624,16 @@ static int __ref __offline_pages(unsigned long start_pfn,
 		return -EINVAL;
 	if (!IS_ALIGNED(end_pfn, pageblock_nr_pages))
 		return -EINVAL;
+
+	mem_hotplug_begin();
+
 	/* This makes hotplug much easier...and readable.
 	   we assume this for now. .*/
-	if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start, &valid_end))
+	if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start,
+				  &valid_end)) {
+		mem_hotplug_done();
 		return -EINVAL;
+	}
 
 	zone = page_zone(pfn_to_page(valid_start));
 	node = zone_to_nid(zone);
@@ -1633,8 +1642,10 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	/* set above range as isolated */
 	ret = start_isolate_page_range(start_pfn, end_pfn,
 				       MIGRATE_MOVABLE, true);
-	if (ret)
+	if (ret) {
+		mem_hotplug_done();
 		return ret;
+	}
 
 	arg.start_pfn = start_pfn;
 	arg.nr_pages = nr_pages;
@@ -1705,6 +1716,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	writeback_set_ratelimit();
 
 	memory_notify(MEM_OFFLINE, &arg);
+	mem_hotplug_done();
 	return 0;
 
 failed_removal:
@@ -1714,10 +1726,10 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 	/* pushback to free area */
 	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
+	mem_hotplug_done();
 	return ret;
 }
 
-/* Must be protected by mem_hotplug_begin() or a device_lock */
 int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 {
 	return __offline_pages(start_pfn, start_pfn + nr_pages);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
  2018-08-21 10:44 ` David Hildenbrand
                   ` (5 preceding siblings ...)
  (?)
@ 2018-08-21 10:44 ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: Kate Stewart, Michal Hocko, linux-doc, Benjamin Herrenschmidt,
	Balbir Singh, Heiko Carstens, Paul Mackerras, Rashmica Gupta,
	K. Y. Srinivasan, Thomas Gleixner, Michael Neuling,
	Stephen Hemminger, Michael Ellerman, David Hildenbrand,
	Pavel Tatashin, linux-acpi, xen-devel, Len Brown, Haiyang Zhang,
	Dan Williams, YASUAKI ISHIMATSU, Boris Ostrovsky,
	Vlastimil Babka

There seem to be some problems as result of 30467e0b3be ("mm, hotplug:
fix concurrent memory hot-add deadlock"), which tried to fix a possible
lock inversion reported and discussed in [1] due to the two locks
	a) device_lock()
	b) mem_hotplug_lock

While add_memory() first takes b), followed by a) during
bus_probe_device(), onlining of memory from user space first took b),
followed by a), exposing a possible deadlock.

In [1], and it was decided to not make use of device_hotplug_lock, but
rather to enforce a locking order.

The problems I spotted related to this:

1. Memory block device attributes: While .state first calls
   mem_hotplug_begin() and the calls device_online() - which takes
   device_lock() - .online does no longer call mem_hotplug_begin(), so
   effectively calls online_pages() without mem_hotplug_lock.

2. device_online() should be called under device_hotplug_lock, however
   onlining memory during add_memory() does not take care of that.

In addition, I think there is also something wrong about the locking in

3. arch/powerpc/platforms/powernv/memtrace.c calls offline_pages()
   without locks. This was introduced after 30467e0b3be. And skimming over
   the code, I assume it could need some more care in regards to locking
   (e.g. device_online() called without device_hotplug_lock - but I'll
   not touch that for now).

Now that we hold the device_hotplug_lock when
- adding memory (e.g. via add_memory()/add_memory_resource())
- removing memory (e.g. via remove_memory())
- device_online()/device_offline()

We can move mem_hotplug_lock usage back into
online_pages()/offline_pages().

Why is mem_hotplug_lock still needed? Essentially to make
get_online_mems()/put_online_mems() be very fast (relying on
device_hotplug_lock would be very slow), and to serialize against
addition of memory that does not create memory block devices (hmm).

[1] http://driverdev.linuxdriverproject.org/pipermail/ driverdev-devel/
    2015-February/065324.html

This patch is partly based on a patch by Vitaly Kuznetsov.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: Mathieu Malaterre <malat@debian.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/base/memory.c | 13 +------------
 mm/memory_hotplug.c   | 28 ++++++++++++++++++++--------
 2 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 5b0375be7f65..04be13539eb8 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -228,7 +228,6 @@ static bool pages_correctly_probed(unsigned long start_pfn)
 /*
  * MEMORY_HOTPLUG depends on SPARSEMEM in mm/Kconfig, so it is
  * OK to have direct references to sparsemem variables in here.
- * Must already be protected by mem_hotplug_begin().
  */
 static int
 memory_block_action(unsigned long phys_index, unsigned long action, int online_type)
@@ -294,7 +293,6 @@ static int memory_subsys_online(struct device *dev)
 	if (mem->online_type < 0)
 		mem->online_type = MMOP_ONLINE_KEEP;
 
-	/* Already under protection of mem_hotplug_begin() */
 	ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE);
 
 	/* clear online_type */
@@ -341,19 +339,11 @@ store_mem_state(struct device *dev,
 		goto err;
 	}
 
-	/*
-	 * Memory hotplug needs to hold mem_hotplug_begin() for probe to find
-	 * the correct memory block to online before doing device_online(dev),
-	 * which will take dev->mutex.  Take the lock early to prevent an
-	 * inversion, memory_subsys_online() callbacks will be implemented by
-	 * assuming it's already protected.
-	 */
-	mem_hotplug_begin();
-
 	switch (online_type) {
 	case MMOP_ONLINE_KERNEL:
 	case MMOP_ONLINE_MOVABLE:
 	case MMOP_ONLINE_KEEP:
+		/* mem->online_type is protected by device_hotplug_lock */
 		mem->online_type = online_type;
 		ret = device_online(&mem->dev);
 		break;
@@ -364,7 +354,6 @@ store_mem_state(struct device *dev,
 		ret = -EINVAL; /* should never happen */
 	}
 
-	mem_hotplug_done();
 err:
 	unlock_device_hotplug();
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index e2b5c751e3ea..a2c6c87d83f3 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -881,7 +881,6 @@ static struct zone * __meminit move_pfn_range(int online_type, int nid,
 	return zone;
 }
 
-/* Must be protected by mem_hotplug_begin() or a device_lock */
 int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_type)
 {
 	unsigned long flags;
@@ -893,6 +892,8 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 	struct memory_notify arg;
 	struct memory_block *mem;
 
+	mem_hotplug_begin();
+
 	/*
 	 * We can't use pfn_to_nid() because nid might be stored in struct page
 	 * which is not yet initialized. Instead, we find nid from memory block.
@@ -957,6 +958,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 
 	if (onlined_pages)
 		memory_notify(MEM_ONLINE, &arg);
+	mem_hotplug_done();
 	return 0;
 
 failed_addition:
@@ -964,6 +966,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 		 (unsigned long long) pfn << PAGE_SHIFT,
 		 (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
 	memory_notify(MEM_CANCEL_ONLINE, &arg);
+	mem_hotplug_done();
 	return ret;
 }
 #endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
@@ -1168,20 +1171,20 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
 	/* create new memmap entry */
 	firmware_map_add_hotplug(start, start + size, "System RAM");
 
+	/* device_online() will take the lock when calling online_pages() */
+	mem_hotplug_done();
+
 	/* online pages if requested */
 	if (online)
 		walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1),
 				  NULL, online_memory_block);
 
-	goto out;
-
+	return ret;
 error:
 	/* rollback pgdat allocation and others */
 	if (new_node)
 		rollback_node_hotadd(nid);
 	memblock_remove(start, size);
-
-out:
 	mem_hotplug_done();
 	return ret;
 }
@@ -1621,10 +1624,16 @@ static int __ref __offline_pages(unsigned long start_pfn,
 		return -EINVAL;
 	if (!IS_ALIGNED(end_pfn, pageblock_nr_pages))
 		return -EINVAL;
+
+	mem_hotplug_begin();
+
 	/* This makes hotplug much easier...and readable.
 	   we assume this for now. .*/
-	if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start, &valid_end))
+	if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start,
+				  &valid_end)) {
+		mem_hotplug_done();
 		return -EINVAL;
+	}
 
 	zone = page_zone(pfn_to_page(valid_start));
 	node = zone_to_nid(zone);
@@ -1633,8 +1642,10 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	/* set above range as isolated */
 	ret = start_isolate_page_range(start_pfn, end_pfn,
 				       MIGRATE_MOVABLE, true);
-	if (ret)
+	if (ret) {
+		mem_hotplug_done();
 		return ret;
+	}
 
 	arg.start_pfn = start_pfn;
 	arg.nr_pages = nr_pages;
@@ -1705,6 +1716,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	writeback_set_ratelimit();
 
 	memory_notify(MEM_OFFLINE, &arg);
+	mem_hotplug_done();
 	return 0;
 
 failed_removal:
@@ -1714,10 +1726,10 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 	/* pushback to free area */
 	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
+	mem_hotplug_done();
 	return ret;
 }
 
-/* Must be protected by mem_hotplug_begin() or a device_lock */
 int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 {
 	return __offline_pages(start_pfn, start_pfn + nr_pages);
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 4/6] powerpc/powernv: hold device_hotplug_lock when calling device_online()
  2018-08-21 10:44 ` David Hildenbrand
@ 2018-08-21 10:44   ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: Michael Neuling, linux-doc, Benjamin Herrenschmidt, Balbir Singh,
	David Hildenbrand, linux-kernel, linux-acpi, Paul Mackerras,
	Michael Ellerman, xen-devel, devel, Rashmica Gupta, linuxppc-dev

device_online() should be called with device_hotplug_lock() held.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/powernv/memtrace.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
index 8f1cd4f3bfd5..ef7181d4fe68 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -229,9 +229,11 @@ static int memtrace_online(void)
 		 * we need to online the memory ourselves.
 		 */
 		if (!memhp_auto_online) {
+			lock_device_hotplug();
 			walk_memory_range(PFN_DOWN(ent->start),
 					  PFN_UP(ent->start + ent->size - 1),
 					  NULL, online_mem_block);
+			unlock_device_hotplug();
 		}
 
 		/*
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 4/6] powerpc/powernv: hold device_hotplug_lock when calling device_online()
@ 2018-08-21 10:44   ` David Hildenbrand
  0 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, David Hildenbrand, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Rashmica Gupta, Balbir Singh, Michael Neuling

device_online() should be called with device_hotplug_lock() held.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/powernv/memtrace.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
index 8f1cd4f3bfd5..ef7181d4fe68 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -229,9 +229,11 @@ static int memtrace_online(void)
 		 * we need to online the memory ourselves.
 		 */
 		if (!memhp_auto_online) {
+			lock_device_hotplug();
 			walk_memory_range(PFN_DOWN(ent->start),
 					  PFN_UP(ent->start + ent->size - 1),
 					  NULL, online_mem_block);
+			unlock_device_hotplug();
 		}
 
 		/*
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 4/6] powerpc/powernv: hold device_hotplug_lock when calling device_online()
  2018-08-21 10:44 ` David Hildenbrand
                   ` (6 preceding siblings ...)
  (?)
@ 2018-08-21 10:44 ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: Michael Neuling, linux-doc, Benjamin Herrenschmidt, Balbir Singh,
	David Hildenbrand, linux-kernel, linux-acpi, Paul Mackerras,
	Michael Ellerman, xen-devel, devel, Rashmica Gupta, linuxppc-dev

device_online() should be called with device_hotplug_lock() held.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/powernv/memtrace.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
index 8f1cd4f3bfd5..ef7181d4fe68 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -229,9 +229,11 @@ static int memtrace_online(void)
 		 * we need to online the memory ourselves.
 		 */
 		if (!memhp_auto_online) {
+			lock_device_hotplug();
 			walk_memory_range(PFN_DOWN(ent->start),
 					  PFN_UP(ent->start + ent->size - 1),
 					  NULL, online_mem_block);
+			unlock_device_hotplug();
 		}
 
 		/*
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 5/6] powerpc/powernv: hold device_hotplug_lock in memtrace_offline_pages()
  2018-08-21 10:44 ` David Hildenbrand
                   ` (9 preceding siblings ...)
  (?)
@ 2018-08-21 10:44 ` David Hildenbrand
  2018-08-30 19:38     ` Pasha Tatashin
  2018-08-30 19:38   ` Pasha Tatashin
  -1 siblings, 2 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, David Hildenbrand, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Rashmica Gupta, Balbir Singh, Michael Neuling

Let's perform all checking + offlining + removing under
device_hotplug_lock, so nobody can mess with these devices via
sysfs concurrently.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/powernv/memtrace.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
index ef7181d4fe68..473e59842ec5 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -74,9 +74,13 @@ static bool memtrace_offline_pages(u32 nid, u64 start_pfn, u64 nr_pages)
 {
 	u64 end_pfn = start_pfn + nr_pages - 1;
 
+	lock_device_hotplug();
+
 	if (walk_memory_range(start_pfn, end_pfn, NULL,
-	    check_memblock_online))
+	    check_memblock_online)) {
+		unlock_device_hotplug();
 		return false;
+	}
 
 	walk_memory_range(start_pfn, end_pfn, (void *)MEM_GOING_OFFLINE,
 			  change_memblock_state);
@@ -84,14 +88,16 @@ static bool memtrace_offline_pages(u32 nid, u64 start_pfn, u64 nr_pages)
 	if (offline_pages(start_pfn, nr_pages)) {
 		walk_memory_range(start_pfn, end_pfn, (void *)MEM_ONLINE,
 				  change_memblock_state);
+		unlock_device_hotplug();
 		return false;
 	}
 
 	walk_memory_range(start_pfn, end_pfn, (void *)MEM_OFFLINE,
 			  change_memblock_state);
 
-	remove_memory(nid, start_pfn << PAGE_SHIFT, nr_pages << PAGE_SHIFT);
+	__remove_memory(nid, start_pfn << PAGE_SHIFT, nr_pages << PAGE_SHIFT);
 
+	unlock_device_hotplug();
 	return true;
 }
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 5/6] powerpc/powernv: hold device_hotplug_lock in memtrace_offline_pages()
  2018-08-21 10:44 ` David Hildenbrand
                   ` (8 preceding siblings ...)
  (?)
@ 2018-08-21 10:44 ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: Michael Neuling, linux-doc, Benjamin Herrenschmidt, Balbir Singh,
	David Hildenbrand, linux-kernel, linux-acpi, Paul Mackerras,
	Michael Ellerman, xen-devel, devel, Rashmica Gupta, linuxppc-dev

Let's perform all checking + offlining + removing under
device_hotplug_lock, so nobody can mess with these devices via
sysfs concurrently.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/platforms/powernv/memtrace.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
index ef7181d4fe68..473e59842ec5 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -74,9 +74,13 @@ static bool memtrace_offline_pages(u32 nid, u64 start_pfn, u64 nr_pages)
 {
 	u64 end_pfn = start_pfn + nr_pages - 1;
 
+	lock_device_hotplug();
+
 	if (walk_memory_range(start_pfn, end_pfn, NULL,
-	    check_memblock_online))
+	    check_memblock_online)) {
+		unlock_device_hotplug();
 		return false;
+	}
 
 	walk_memory_range(start_pfn, end_pfn, (void *)MEM_GOING_OFFLINE,
 			  change_memblock_state);
@@ -84,14 +88,16 @@ static bool memtrace_offline_pages(u32 nid, u64 start_pfn, u64 nr_pages)
 	if (offline_pages(start_pfn, nr_pages)) {
 		walk_memory_range(start_pfn, end_pfn, (void *)MEM_ONLINE,
 				  change_memblock_state);
+		unlock_device_hotplug();
 		return false;
 	}
 
 	walk_memory_range(start_pfn, end_pfn, (void *)MEM_OFFLINE,
 			  change_memblock_state);
 
-	remove_memory(nid, start_pfn << PAGE_SHIFT, nr_pages << PAGE_SHIFT);
+	__remove_memory(nid, start_pfn << PAGE_SHIFT, nr_pages << PAGE_SHIFT);
 
+	unlock_device_hotplug();
 	return true;
 }
 
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 6/6] memory-hotplug.txt: Add some details about locking internals
  2018-08-21 10:44 ` David Hildenbrand
                   ` (10 preceding siblings ...)
  (?)
@ 2018-08-21 10:44 ` David Hildenbrand
  2018-08-30 19:38   ` Pasha Tatashin
  2018-08-30 19:38     ` Pasha Tatashin
  -1 siblings, 2 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, David Hildenbrand, Jonathan Corbet, Michal Hocko,
	Andrew Morton

Let's document the magic a bit, especially why device_hotplug_lock is
required when adding/removing memory and how it all play together with
requests to online/offline memory from user space.

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 Documentation/memory-hotplug.txt | 39 +++++++++++++++++++++++++++++++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
index 7f49ebf3ddb2..03aaad7d7373 100644
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -3,7 +3,7 @@ Memory Hotplug
 ==============
 
 :Created:							Jul 28 2007
-:Updated: Add description of notifier of memory hotplug:	Oct 11 2007
+:Updated: Add some details about locking internals:		Aug 20 2018
 
 This document is about memory hotplug including how-to-use and current status.
 Because Memory Hotplug is still under development, contents of this text will
@@ -495,6 +495,43 @@ further processing of the notification queue.
 
 NOTIFY_STOP stops further processing of the notification queue.
 
+
+Locking Internals
+=================
+
+When adding/removing memory that uses memory block devices (i.e. ordinary RAM),
+the device_hotplug_lock should be held to:
+
+- synchronize against online/offline requests (e.g. via sysfs). This way, memory
+  block devices can only be accessed (.online/.state attributes) by user
+  space once memory has been fully added. And when removing memory, we
+  know nobody is in critical sections.
+- synchronize against CPU hotplug and similar (e.g. relevant for ACPI and PPC)
+
+Especially, there is a possible lock inversion that is avoided using
+device_hotplug_lock when adding memory and user space tries to online that
+memory faster than expected:
+
+- device_online() will first take the device_lock(), followed by
+  mem_hotplug_lock
+- add_memory_resource() will first take the mem_hotplug_lock, followed by
+  the device_lock() (while creating the devices, during bus_add_device()).
+
+As the device is visible to user space before taking the device_lock(), this
+can result in a lock inversion.
+
+onlining/offlining of memory should be done via device_online()/
+device_offline() - to make sure it is properly synchronized to actions
+via sysfs. Holding device_hotplug_lock is advised (to e.g. protect online_type)
+
+When adding/removing/onlining/offlining memory or adding/removing
+heterogeneous/device memory, we should always hold the mem_hotplug_lock to
+serialise memory hotplug (e.g. access to global/zone variables).
+
+In addition, mem_hotplug_lock (in contrast to device_hotplug_lock) allows
+for a quite efficient get_online_mems/put_online_mems implementation.
+
+
 Future Work
 ===========
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 6/6] memory-hotplug.txt: Add some details about locking internals
  2018-08-21 10:44 ` David Hildenbrand
                   ` (11 preceding siblings ...)
  (?)
@ 2018-08-21 10:44 ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: Jonathan Corbet, Michal Hocko, linux-doc, David Hildenbrand,
	linux-kernel, linux-acpi, xen-devel, devel, Andrew Morton,
	linuxppc-dev

Let's document the magic a bit, especially why device_hotplug_lock is
required when adding/removing memory and how it all play together with
requests to online/offline memory from user space.

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 Documentation/memory-hotplug.txt | 39 +++++++++++++++++++++++++++++++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
index 7f49ebf3ddb2..03aaad7d7373 100644
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -3,7 +3,7 @@ Memory Hotplug
 ==============
 
 :Created:							Jul 28 2007
-:Updated: Add description of notifier of memory hotplug:	Oct 11 2007
+:Updated: Add some details about locking internals:		Aug 20 2018
 
 This document is about memory hotplug including how-to-use and current status.
 Because Memory Hotplug is still under development, contents of this text will
@@ -495,6 +495,43 @@ further processing of the notification queue.
 
 NOTIFY_STOP stops further processing of the notification queue.
 
+
+Locking Internals
+=================
+
+When adding/removing memory that uses memory block devices (i.e. ordinary RAM),
+the device_hotplug_lock should be held to:
+
+- synchronize against online/offline requests (e.g. via sysfs). This way, memory
+  block devices can only be accessed (.online/.state attributes) by user
+  space once memory has been fully added. And when removing memory, we
+  know nobody is in critical sections.
+- synchronize against CPU hotplug and similar (e.g. relevant for ACPI and PPC)
+
+Especially, there is a possible lock inversion that is avoided using
+device_hotplug_lock when adding memory and user space tries to online that
+memory faster than expected:
+
+- device_online() will first take the device_lock(), followed by
+  mem_hotplug_lock
+- add_memory_resource() will first take the mem_hotplug_lock, followed by
+  the device_lock() (while creating the devices, during bus_add_device()).
+
+As the device is visible to user space before taking the device_lock(), this
+can result in a lock inversion.
+
+onlining/offlining of memory should be done via device_online()/
+device_offline() - to make sure it is properly synchronized to actions
+via sysfs. Holding device_hotplug_lock is advised (to e.g. protect online_type)
+
+When adding/removing/onlining/offlining memory or adding/removing
+heterogeneous/device memory, we should always hold the mem_hotplug_lock to
+serialise memory hotplug (e.g. access to global/zone variables).
+
+In addition, mem_hotplug_lock (in contrast to device_hotplug_lock) allows
+for a quite efficient get_online_mems/put_online_mems implementation.
+
+
 Future Work
 ===========
 
-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
  2018-08-21 10:44 ` David Hildenbrand
@ 2018-08-30 12:31   ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-30 12:31 UTC (permalink / raw)
  To: linux-mm
  Cc: Kate Stewart, Michal Hocko, linux-doc, Benjamin Herrenschmidt,
	Balbir Singh, Heiko Carstens, Paul Mackerras, Rashmica Gupta,
	Boris Ostrovsky, Michael Neuling, Stephen Hemminger,
	Jonathan Corbet, Michael Ellerman, Pavel Tatashin, linux-acpi,
	xen-devel, Len Brown, Haiyang Zhang, YASUAKI ISHIMATSU,
	Nathan Fontenot, Dan Williams, Joonsoo Kim, Vlastimil Babka

On 21.08.2018 12:44, David Hildenbrand wrote:
> This is the same approach as in the first RFC, but this time without
> exporting device_hotplug_lock (requested by Greg) and with some more
> details and documentation regarding locking. Tested only on x86 so far.
> 

I'll be on vacation for two weeks starting on Saturday. If there are no
comments I'll send this as !RFC once I return.

Thanks!

> --------------------------------------------------------------------------
> 
> Reading through the code and studying how mem_hotplug_lock is to be used,
> I noticed that there are two places where we can end up calling
> device_online()/device_offline() - online_pages()/offline_pages() without
> the mem_hotplug_lock. And there are other places where we call
> device_online()/device_offline() without the device_hotplug_lock.
> 
> While e.g.
> 	echo "online" > /sys/devices/system/memory/memory9/state
> is fine, e.g.
> 	echo 1 > /sys/devices/system/memory/memory9/online
> Will not take the mem_hotplug_lock. However the device_lock() and
> device_hotplug_lock.
> 
> E.g. via memory_probe_store(), we can end up calling
> add_memory()->online_pages() without the device_hotplug_lock. So we can
> have concurrent callers in online_pages(). We e.g. touch in online_pages()
> basically unprotected zone->present_pages then.
> 
> Looks like there is a longer history to that (see Patch #2 for details),
> and fixing it to work the way it was intended is not really possible. We
> would e.g. have to take the mem_hotplug_lock in device/base/core.c, which
> sounds wrong.
> 
> Summary: We had a lock inversion on mem_hotplug_lock and device_lock().
> More details can be found in patch 3 and patch 6.
> 
> I propose the general rules (documentation added in patch 6):
> 
> 1. add_memory/add_memory_resource() must only be called with
>    device_hotplug_lock.
> 2. remove_memory() must only be called with device_hotplug_lock. This is
>    already documented and holds for all callers.
> 3. device_online()/device_offline() must only be called with
>    device_hotplug_lock. This is already documented and true for now in core
>    code. Other callers (related to memory hotplug) have to be fixed up.
> 4. mem_hotplug_lock is taken inside of add_memory/remove_memory/
>    online_pages/offline_pages.
> 
> To me, this looks way cleaner than what we have right now (and easier to
> verify). And looking at the documentation of remove_memory, using
> lock_device_hotplug also for add_memory() feels natural.
> 
> 
> RFC -> RFCv2:
> - Don't export device_hotplug_lock, provide proper remove_memory/add_memory
>   wrappers.
> - Split up the patches a bit.
> - Try to improve powernv memtrace locking
> - Add some documentation for locking that matches my knowledge
> 
> David Hildenbrand (6):
>   mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
>   mm/memory_hotplug: make add_memory() take the device_hotplug_lock
>   mm/memory_hotplug: fix online/offline_pages called w.o.
>     mem_hotplug_lock
>   powerpc/powernv: hold device_hotplug_lock when calling device_online()
>   powerpc/powernv: hold device_hotplug_lock in memtrace_offline_pages()
>   memory-hotplug.txt: Add some details about locking internals
> 
>  Documentation/memory-hotplug.txt              | 39 +++++++++++-
>  arch/powerpc/platforms/powernv/memtrace.c     | 14 +++--
>  .../platforms/pseries/hotplug-memory.c        |  8 +--
>  drivers/acpi/acpi_memhotplug.c                |  4 +-
>  drivers/base/memory.c                         | 22 +++----
>  drivers/xen/balloon.c                         |  3 +
>  include/linux/memory_hotplug.h                |  4 +-
>  mm/memory_hotplug.c                           | 59 +++++++++++++++----
>  8 files changed, 115 insertions(+), 38 deletions(-)
> 


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
@ 2018-08-30 12:31   ` David Hildenbrand
  0 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-30 12:31 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Andrew Morton, Balbir Singh, Benjamin Herrenschmidt,
	Boris Ostrovsky, Dan Williams, Greg Kroah-Hartman, Haiyang Zhang,
	Heiko Carstens, John Allen, Jonathan Corbet, Joonsoo Kim,
	Juergen Gross, Kate Stewart, K. Y. Srinivasan, Len Brown,
	Martin Schwidefsky, Mathieu Malaterre, Michael Ellerman,
	Michael Neuling, Michal Hocko, Nathan Fontenot, Oscar Salvador,
	Paul Mackerras, Pavel Tatashin, Philippe Ombredanne,
	Rafael J. Wysocki, Rashmica Gupta, Stephen Hemminger,
	Thomas Gleixner, Vlastimil Babka, YASUAKI ISHIMATSU

On 21.08.2018 12:44, David Hildenbrand wrote:
> This is the same approach as in the first RFC, but this time without
> exporting device_hotplug_lock (requested by Greg) and with some more
> details and documentation regarding locking. Tested only on x86 so far.
> 

I'll be on vacation for two weeks starting on Saturday. If there are no
comments I'll send this as !RFC once I return.

Thanks!

> --------------------------------------------------------------------------
> 
> Reading through the code and studying how mem_hotplug_lock is to be used,
> I noticed that there are two places where we can end up calling
> device_online()/device_offline() - online_pages()/offline_pages() without
> the mem_hotplug_lock. And there are other places where we call
> device_online()/device_offline() without the device_hotplug_lock.
> 
> While e.g.
> 	echo "online" > /sys/devices/system/memory/memory9/state
> is fine, e.g.
> 	echo 1 > /sys/devices/system/memory/memory9/online
> Will not take the mem_hotplug_lock. However the device_lock() and
> device_hotplug_lock.
> 
> E.g. via memory_probe_store(), we can end up calling
> add_memory()->online_pages() without the device_hotplug_lock. So we can
> have concurrent callers in online_pages(). We e.g. touch in online_pages()
> basically unprotected zone->present_pages then.
> 
> Looks like there is a longer history to that (see Patch #2 for details),
> and fixing it to work the way it was intended is not really possible. We
> would e.g. have to take the mem_hotplug_lock in device/base/core.c, which
> sounds wrong.
> 
> Summary: We had a lock inversion on mem_hotplug_lock and device_lock().
> More details can be found in patch 3 and patch 6.
> 
> I propose the general rules (documentation added in patch 6):
> 
> 1. add_memory/add_memory_resource() must only be called with
>    device_hotplug_lock.
> 2. remove_memory() must only be called with device_hotplug_lock. This is
>    already documented and holds for all callers.
> 3. device_online()/device_offline() must only be called with
>    device_hotplug_lock. This is already documented and true for now in core
>    code. Other callers (related to memory hotplug) have to be fixed up.
> 4. mem_hotplug_lock is taken inside of add_memory/remove_memory/
>    online_pages/offline_pages.
> 
> To me, this looks way cleaner than what we have right now (and easier to
> verify). And looking at the documentation of remove_memory, using
> lock_device_hotplug also for add_memory() feels natural.
> 
> 
> RFC -> RFCv2:
> - Don't export device_hotplug_lock, provide proper remove_memory/add_memory
>   wrappers.
> - Split up the patches a bit.
> - Try to improve powernv memtrace locking
> - Add some documentation for locking that matches my knowledge
> 
> David Hildenbrand (6):
>   mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
>   mm/memory_hotplug: make add_memory() take the device_hotplug_lock
>   mm/memory_hotplug: fix online/offline_pages called w.o.
>     mem_hotplug_lock
>   powerpc/powernv: hold device_hotplug_lock when calling device_online()
>   powerpc/powernv: hold device_hotplug_lock in memtrace_offline_pages()
>   memory-hotplug.txt: Add some details about locking internals
> 
>  Documentation/memory-hotplug.txt              | 39 +++++++++++-
>  arch/powerpc/platforms/powernv/memtrace.c     | 14 +++--
>  .../platforms/pseries/hotplug-memory.c        |  8 +--
>  drivers/acpi/acpi_memhotplug.c                |  4 +-
>  drivers/base/memory.c                         | 22 +++----
>  drivers/xen/balloon.c                         |  3 +
>  include/linux/memory_hotplug.h                |  4 +-
>  mm/memory_hotplug.c                           | 59 +++++++++++++++----
>  8 files changed, 115 insertions(+), 38 deletions(-)
> 


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
  2018-08-21 10:44 ` David Hildenbrand
                   ` (13 preceding siblings ...)
  (?)
@ 2018-08-30 12:31 ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-30 12:31 UTC (permalink / raw)
  To: linux-mm
  Cc: Kate Stewart, Michal Hocko, linux-doc, Benjamin Herrenschmidt,
	Balbir Singh, Heiko Carstens, Paul Mackerras, Rashmica Gupta,
	K. Y. Srinivasan, Boris Ostrovsky, Michael Neuling,
	Stephen Hemminger, Jonathan Corbet, Michael Ellerman,
	Pavel Tatashin, linux-acpi, xen-devel, Len Brown, Haiyang Zhang,
	YASUAKI ISHIMATSU, Nathan Fontenot, Dan Williams, Joonsoo Kim

On 21.08.2018 12:44, David Hildenbrand wrote:
> This is the same approach as in the first RFC, but this time without
> exporting device_hotplug_lock (requested by Greg) and with some more
> details and documentation regarding locking. Tested only on x86 so far.
> 

I'll be on vacation for two weeks starting on Saturday. If there are no
comments I'll send this as !RFC once I return.

Thanks!

> --------------------------------------------------------------------------
> 
> Reading through the code and studying how mem_hotplug_lock is to be used,
> I noticed that there are two places where we can end up calling
> device_online()/device_offline() - online_pages()/offline_pages() without
> the mem_hotplug_lock. And there are other places where we call
> device_online()/device_offline() without the device_hotplug_lock.
> 
> While e.g.
> 	echo "online" > /sys/devices/system/memory/memory9/state
> is fine, e.g.
> 	echo 1 > /sys/devices/system/memory/memory9/online
> Will not take the mem_hotplug_lock. However the device_lock() and
> device_hotplug_lock.
> 
> E.g. via memory_probe_store(), we can end up calling
> add_memory()->online_pages() without the device_hotplug_lock. So we can
> have concurrent callers in online_pages(). We e.g. touch in online_pages()
> basically unprotected zone->present_pages then.
> 
> Looks like there is a longer history to that (see Patch #2 for details),
> and fixing it to work the way it was intended is not really possible. We
> would e.g. have to take the mem_hotplug_lock in device/base/core.c, which
> sounds wrong.
> 
> Summary: We had a lock inversion on mem_hotplug_lock and device_lock().
> More details can be found in patch 3 and patch 6.
> 
> I propose the general rules (documentation added in patch 6):
> 
> 1. add_memory/add_memory_resource() must only be called with
>    device_hotplug_lock.
> 2. remove_memory() must only be called with device_hotplug_lock. This is
>    already documented and holds for all callers.
> 3. device_online()/device_offline() must only be called with
>    device_hotplug_lock. This is already documented and true for now in core
>    code. Other callers (related to memory hotplug) have to be fixed up.
> 4. mem_hotplug_lock is taken inside of add_memory/remove_memory/
>    online_pages/offline_pages.
> 
> To me, this looks way cleaner than what we have right now (and easier to
> verify). And looking at the documentation of remove_memory, using
> lock_device_hotplug also for add_memory() feels natural.
> 
> 
> RFC -> RFCv2:
> - Don't export device_hotplug_lock, provide proper remove_memory/add_memory
>   wrappers.
> - Split up the patches a bit.
> - Try to improve powernv memtrace locking
> - Add some documentation for locking that matches my knowledge
> 
> David Hildenbrand (6):
>   mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
>   mm/memory_hotplug: make add_memory() take the device_hotplug_lock
>   mm/memory_hotplug: fix online/offline_pages called w.o.
>     mem_hotplug_lock
>   powerpc/powernv: hold device_hotplug_lock when calling device_online()
>   powerpc/powernv: hold device_hotplug_lock in memtrace_offline_pages()
>   memory-hotplug.txt: Add some details about locking internals
> 
>  Documentation/memory-hotplug.txt              | 39 +++++++++++-
>  arch/powerpc/platforms/powernv/memtrace.c     | 14 +++--
>  .../platforms/pseries/hotplug-memory.c        |  8 +--
>  drivers/acpi/acpi_memhotplug.c                |  4 +-
>  drivers/base/memory.c                         | 22 +++----
>  drivers/xen/balloon.c                         |  3 +
>  include/linux/memory_hotplug.h                |  4 +-
>  mm/memory_hotplug.c                           | 59 +++++++++++++++----
>  8 files changed, 115 insertions(+), 38 deletions(-)
> 


-- 

Thanks,

David / dhildenb

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
  2018-08-30 12:31   ` David Hildenbrand
  (?)
@ 2018-08-30 15:54     ` Pasha Tatashin
  -1 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 15:54 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Andrew Morton, Balbir Singh, Benjamin Herrenschmidt,
	Boris Ostrovsky, Dan Williams, Greg Kroah-Hartman, Haiyang Zhang,
	Heiko Carstens, John Allen, Jonathan



On 8/30/18 8:31 AM, David Hildenbrand wrote:
> On 21.08.2018 12:44, David Hildenbrand wrote:
>> This is the same approach as in the first RFC, but this time without
>> exporting device_hotplug_lock (requested by Greg) and with some more
>> details and documentation regarding locking. Tested only on x86 so far.
>>
> 
> I'll be on vacation for two weeks starting on Saturday. If there are no
> comments I'll send this as !RFC once I return.
>
I am studying this series, will send my comments later today.

Pavel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
@ 2018-08-30 15:54     ` Pasha Tatashin
  0 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 15:54 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Andrew Morton, Balbir Singh, Benjamin Herrenschmidt,
	Boris Ostrovsky, Dan Williams, Greg Kroah-Hartman, Haiyang Zhang,
	Heiko Carstens, John Allen, Jonathan Corbet, Joonsoo Kim,
	Juergen Gross, Kate Stewart, KY Srinivasan, Len Brown,
	Martin Schwidefsky, Mathieu Malaterre, Michael Ellerman,
	Michael Neuling, Michal Hocko, Nathan Fontenot, Oscar Salvador,
	Paul Mackerras, Philippe Ombredanne, Rafael J. Wysocki,
	Rashmica Gupta, Stephen Hemminger, Thomas Gleixner,
	Vlastimil Babka, YASUAKI ISHIMATSU



On 8/30/18 8:31 AM, David Hildenbrand wrote:
> On 21.08.2018 12:44, David Hildenbrand wrote:
>> This is the same approach as in the first RFC, but this time without
>> exporting device_hotplug_lock (requested by Greg) and with some more
>> details and documentation regarding locking. Tested only on x86 so far.
>>
> 
> I'll be on vacation for two weeks starting on Saturday. If there are no
> comments I'll send this as !RFC once I return.
>
I am studying this series, will send my comments later today.

Pavel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
@ 2018-08-30 15:54     ` Pasha Tatashin
  0 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 15:54 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Andrew Morton, Balbir Singh, Benjamin Herrenschmidt,
	Boris Ostrovsky, Dan Williams, Greg Kroah-Hartman, Haiyang Zhang,
	Heiko Carstens, John Allen, Jonathan Corbet, Joonsoo Kim,
	Juergen Gross, Kate Stewart, KY Srinivasan, Len Brown,
	Martin Schwidefsky, Mathieu Malaterre, Michael Ellerman,
	Michael Neuling, Michal Hocko, Nathan Fontenot, Oscar Salvador,
	Paul Mackerras, Philippe Ombredanne, Rafael J. Wysocki,
	Rashmica Gupta, Stephen Hemminger, Thomas Gleixner,
	Vlastimil Babka, YASUAKI ISHIMATSU

DQoNCk9uIDgvMzAvMTggODozMSBBTSwgRGF2aWQgSGlsZGVuYnJhbmQgd3JvdGU6DQo+IE9uIDIx
LjA4LjIwMTggMTI6NDQsIERhdmlkIEhpbGRlbmJyYW5kIHdyb3RlOg0KPj4gVGhpcyBpcyB0aGUg
c2FtZSBhcHByb2FjaCBhcyBpbiB0aGUgZmlyc3QgUkZDLCBidXQgdGhpcyB0aW1lIHdpdGhvdXQN
Cj4+IGV4cG9ydGluZyBkZXZpY2VfaG90cGx1Z19sb2NrIChyZXF1ZXN0ZWQgYnkgR3JlZykgYW5k
IHdpdGggc29tZSBtb3JlDQo+PiBkZXRhaWxzIGFuZCBkb2N1bWVudGF0aW9uIHJlZ2FyZGluZyBs
b2NraW5nLiBUZXN0ZWQgb25seSBvbiB4ODYgc28gZmFyLg0KPj4NCj4gDQo+IEknbGwgYmUgb24g
dmFjYXRpb24gZm9yIHR3byB3ZWVrcyBzdGFydGluZyBvbiBTYXR1cmRheS4gSWYgdGhlcmUgYXJl
IG5vDQo+IGNvbW1lbnRzIEknbGwgc2VuZCB0aGlzIGFzICFSRkMgb25jZSBJIHJldHVybi4NCj4N
CkkgYW0gc3R1ZHlpbmcgdGhpcyBzZXJpZXMsIHdpbGwgc2VuZCBteSBjb21tZW50cyBsYXRlciB0
b2RheS4NCg0KUGF2ZWw=

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
  2018-08-30 12:31   ` David Hildenbrand
  (?)
@ 2018-08-30 15:54   ` Pasha Tatashin
  -1 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 15:54 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: Kate Stewart, Michal Hocko, linux-doc, Benjamin Herrenschmidt,
	Balbir Singh, Heiko Carstens, Paul Mackerras, Rashmica Gupta,
	KY Srinivasan, Boris Ostrovsky, Michael Neuling,
	Stephen Hemminger, Jonathan Corbet, Michael Ellerman, linux-acpi,
	xen-devel, Len Brown, Haiyang Zhang, YASUAKI ISHIMATSU,
	Nathan Fontenot, Dan Williams



On 8/30/18 8:31 AM, David Hildenbrand wrote:
> On 21.08.2018 12:44, David Hildenbrand wrote:
>> This is the same approach as in the first RFC, but this time without
>> exporting device_hotplug_lock (requested by Greg) and with some more
>> details and documentation regarding locking. Tested only on x86 so far.
>>
> 
> I'll be on vacation for two weeks starting on Saturday. If there are no
> comments I'll send this as !RFC once I return.
>
I am studying this series, will send my comments later today.

Pavel
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 1/6] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
  2018-08-21 10:44   ` David Hildenbrand
  (?)
@ 2018-08-30 19:35     ` Pasha Tatashin
  -1 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:35 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rafael J. Wysocki, Len Brown, Rashmica Gupta, Michael Neuling,
	Balbir Singh, Nathan Fontenot, John Allen

> +
> +void __ref remove_memory(int nid, u64 start, u64 size)

Remove __ref, otherwise looks good:

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>

> +{
> +	lock_device_hotplug();
> +	__remove_memory(nid, start, size);
> +	unlock_device_hotplug();
> +}
>  EXPORT_SYMBOL_GPL(remove_memory);
>  #endif /* CONFIG_MEMORY_HOTREMOVE */
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 1/6] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
@ 2018-08-30 19:35     ` Pasha Tatashin
  0 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:35 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rafael J. Wysocki, Len Brown, Rashmica Gupta, Michael Neuling,
	Balbir Singh, Nathan Fontenot, John Allen, Andrew Morton,
	Michal Hocko, Dan Williams, Joonsoo Kim, Vlastimil Babka,
	Greg Kroah-Hartman, Oscar Salvador, YASUAKI ISHIMATSU,
	Mathieu Malaterre

> +
> +void __ref remove_memory(int nid, u64 start, u64 size)

Remove __ref, otherwise looks good:

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>

> +{
> +	lock_device_hotplug();
> +	__remove_memory(nid, start, size);
> +	unlock_device_hotplug();
> +}
>  EXPORT_SYMBOL_GPL(remove_memory);
>  #endif /* CONFIG_MEMORY_HOTREMOVE */
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 1/6] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
@ 2018-08-30 19:35     ` Pasha Tatashin
  0 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:35 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rafael J. Wysocki, Len Brown, Rashmica Gupta, Michael Neuling,
	Balbir Singh, Nathan Fontenot, John Allen, Andrew Morton,
	Michal Hocko, Dan Williams, Joonsoo Kim, Vlastimil Babka,
	Greg Kroah-Hartman, Oscar Salvador, YASUAKI ISHIMATSU,
	Mathieu Malaterre

PiArDQo+ICt2b2lkIF9fcmVmIHJlbW92ZV9tZW1vcnkoaW50IG5pZCwgdTY0IHN0YXJ0LCB1NjQg
c2l6ZSkNCg0KUmVtb3ZlIF9fcmVmLCBvdGhlcndpc2UgbG9va3MgZ29vZDoNCg0KUmV2aWV3ZWQt
Ynk6IFBhdmVsIFRhdGFzaGluIDxwYXZlbC50YXRhc2hpbkBtaWNyb3NvZnQuY29tPg0KDQo+ICt7
DQo+ICsJbG9ja19kZXZpY2VfaG90cGx1ZygpOw0KPiArCV9fcmVtb3ZlX21lbW9yeShuaWQsIHN0
YXJ0LCBzaXplKTsNCj4gKwl1bmxvY2tfZGV2aWNlX2hvdHBsdWcoKTsNCj4gK30NCj4gIEVYUE9S
VF9TWU1CT0xfR1BMKHJlbW92ZV9tZW1vcnkpOw0KPiAgI2VuZGlmIC8qIENPTkZJR19NRU1PUllf
SE9UUkVNT1ZFICovDQo+IA==

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 1/6] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
  2018-08-21 10:44   ` David Hildenbrand
  (?)
@ 2018-08-30 19:35   ` Pasha Tatashin
  -1 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:35 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: Michal Hocko, linux-doc, Benjamin Herrenschmidt, Balbir Singh,
	Paul Mackerras, Rashmica Gupta, Michael Neuling,
	Michael Ellerman, linux-acpi, xen-devel, Len Brown,
	YASUAKI ISHIMATSU, Nathan Fontenot, Dan Williams, Joonsoo Kim,
	Vlastimil Babka, Oscar Salvador, Mathieu Malaterre,
	Greg Kroah-Hartman, Rafael J. Wysocki

> +
> +void __ref remove_memory(int nid, u64 start, u64 size)

Remove __ref, otherwise looks good:

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>

> +{
> +	lock_device_hotplug();
> +	__remove_memory(nid, start, size);
> +	unlock_device_hotplug();
> +}
>  EXPORT_SYMBOL_GPL(remove_memory);
>  #endif /* CONFIG_MEMORY_HOTREMOVE */
> 
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 2/6] mm/memory_hotplug: make add_memory() take the device_hotplug_lock
  2018-08-21 10:44   ` David Hildenbrand
  (?)
@ 2018-08-30 19:36     ` Pasha Tatashin
  -1 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:36 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: Michal Hocko, linux-doc, Benjamin Herrenschmidt, Paul Mackerras,
	Dan Williams, Michael Ellerman, linux-acpi, xen-devel, Len Brown,
	YASUAKI ISHIMATSU, Nathan Fontenot, Boris Ostrovsky, Joonsoo Kim,
	Vlastimil Babka, Oscar Salvador, Juergen Gross,
	Mathieu Malaterre, Greg Kroah-Hartman, Rafael J. Wysocki,
	linux-kernel, John

On 8/21/18 6:44 AM, David Hildenbrand wrote:
> add_memory() currently does not take the device_hotplug_lock, however
> is aleady called under the lock from
> 	arch/powerpc/platforms/pseries/hotplug-memory.c
> 	drivers/acpi/acpi_memhotplug.c
> to synchronize against CPU hot-remove and similar.
> 
> In general, we should hold the device_hotplug_lock when adding memory
> to synchronize against online/offline request (e.g. from user space) -
> which already resulted in lock inversions due to device_lock() and
> mem_hotplug_lock - see 30467e0b3be ("mm, hotplug: fix concurrent memory
> hot-add deadlock"). add_memory()/add_memory_resource() will create memory
> block devices, so this really feels like the right thing to do.
> 
> Holding the device_hotplug_lock makes sure that a memory block device
> can really only be accessed (e.g. via .online/.state) from user space,
> once the memory has been fully added to the system.
> 
> The lock is not held yet in
> 	drivers/xen/balloon.c
> 	arch/powerpc/platforms/powernv/memtrace.c
> 	drivers/s390/char/sclp_cmd.c
> 	drivers/hv/hv_balloon.c
> So, let's either use the locked variants or take the lock.
> 
> Don't export add_memory_resource(), as it once was exported to be used
> by XEN, which is never built as a module. If somebody requires it, we
> also have to export a locked variant (as device_hotplug_lock is never
> exported).

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 2/6] mm/memory_hotplug: make add_memory() take the device_hotplug_lock
@ 2018-08-30 19:36     ` Pasha Tatashin
  0 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:36 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rafael J. Wysocki, Len Brown, Greg Kroah-Hartman,
	Boris Ostrovsky, Juergen Gross, Nathan Fontenot, John Allen,
	Andrew Morton, Michal Hocko, Dan Williams, Joonsoo Kim,
	Vlastimil Babka, Oscar Salvador, Mathieu Malaterre,
	YASUAKI ISHIMATSU

On 8/21/18 6:44 AM, David Hildenbrand wrote:
> add_memory() currently does not take the device_hotplug_lock, however
> is aleady called under the lock from
> 	arch/powerpc/platforms/pseries/hotplug-memory.c
> 	drivers/acpi/acpi_memhotplug.c
> to synchronize against CPU hot-remove and similar.
> 
> In general, we should hold the device_hotplug_lock when adding memory
> to synchronize against online/offline request (e.g. from user space) -
> which already resulted in lock inversions due to device_lock() and
> mem_hotplug_lock - see 30467e0b3be ("mm, hotplug: fix concurrent memory
> hot-add deadlock"). add_memory()/add_memory_resource() will create memory
> block devices, so this really feels like the right thing to do.
> 
> Holding the device_hotplug_lock makes sure that a memory block device
> can really only be accessed (e.g. via .online/.state) from user space,
> once the memory has been fully added to the system.
> 
> The lock is not held yet in
> 	drivers/xen/balloon.c
> 	arch/powerpc/platforms/powernv/memtrace.c
> 	drivers/s390/char/sclp_cmd.c
> 	drivers/hv/hv_balloon.c
> So, let's either use the locked variants or take the lock.
> 
> Don't export add_memory_resource(), as it once was exported to be used
> by XEN, which is never built as a module. If somebody requires it, we
> also have to export a locked variant (as device_hotplug_lock is never
> exported).

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 2/6] mm/memory_hotplug: make add_memory() take the device_hotplug_lock
@ 2018-08-30 19:36     ` Pasha Tatashin
  0 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:36 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rafael J. Wysocki, Len Brown, Greg Kroah-Hartman,
	Boris Ostrovsky, Juergen Gross, Nathan Fontenot, John Allen,
	Andrew Morton, Michal Hocko, Dan Williams, Joonsoo Kim,
	Vlastimil Babka, Oscar Salvador, Mathieu Malaterre,
	YASUAKI ISHIMATSU

T24gOC8yMS8xOCA2OjQ0IEFNLCBEYXZpZCBIaWxkZW5icmFuZCB3cm90ZToNCj4gYWRkX21lbW9y
eSgpIGN1cnJlbnRseSBkb2VzIG5vdCB0YWtlIHRoZSBkZXZpY2VfaG90cGx1Z19sb2NrLCBob3dl
dmVyDQo+IGlzIGFsZWFkeSBjYWxsZWQgdW5kZXIgdGhlIGxvY2sgZnJvbQ0KPiAJYXJjaC9wb3dl
cnBjL3BsYXRmb3Jtcy9wc2VyaWVzL2hvdHBsdWctbWVtb3J5LmMNCj4gCWRyaXZlcnMvYWNwaS9h
Y3BpX21lbWhvdHBsdWcuYw0KPiB0byBzeW5jaHJvbml6ZSBhZ2FpbnN0IENQVSBob3QtcmVtb3Zl
IGFuZCBzaW1pbGFyLg0KPiANCj4gSW4gZ2VuZXJhbCwgd2Ugc2hvdWxkIGhvbGQgdGhlIGRldmlj
ZV9ob3RwbHVnX2xvY2sgd2hlbiBhZGRpbmcgbWVtb3J5DQo+IHRvIHN5bmNocm9uaXplIGFnYWlu
c3Qgb25saW5lL29mZmxpbmUgcmVxdWVzdCAoZS5nLiBmcm9tIHVzZXIgc3BhY2UpIC0NCj4gd2hp
Y2ggYWxyZWFkeSByZXN1bHRlZCBpbiBsb2NrIGludmVyc2lvbnMgZHVlIHRvIGRldmljZV9sb2Nr
KCkgYW5kDQo+IG1lbV9ob3RwbHVnX2xvY2sgLSBzZWUgMzA0NjdlMGIzYmUgKCJtbSwgaG90cGx1
ZzogZml4IGNvbmN1cnJlbnQgbWVtb3J5DQo+IGhvdC1hZGQgZGVhZGxvY2siKS4gYWRkX21lbW9y
eSgpL2FkZF9tZW1vcnlfcmVzb3VyY2UoKSB3aWxsIGNyZWF0ZSBtZW1vcnkNCj4gYmxvY2sgZGV2
aWNlcywgc28gdGhpcyByZWFsbHkgZmVlbHMgbGlrZSB0aGUgcmlnaHQgdGhpbmcgdG8gZG8uDQo+
IA0KPiBIb2xkaW5nIHRoZSBkZXZpY2VfaG90cGx1Z19sb2NrIG1ha2VzIHN1cmUgdGhhdCBhIG1l
bW9yeSBibG9jayBkZXZpY2UNCj4gY2FuIHJlYWxseSBvbmx5IGJlIGFjY2Vzc2VkIChlLmcuIHZp
YSAub25saW5lLy5zdGF0ZSkgZnJvbSB1c2VyIHNwYWNlLA0KPiBvbmNlIHRoZSBtZW1vcnkgaGFz
IGJlZW4gZnVsbHkgYWRkZWQgdG8gdGhlIHN5c3RlbS4NCj4gDQo+IFRoZSBsb2NrIGlzIG5vdCBo
ZWxkIHlldCBpbg0KPiAJZHJpdmVycy94ZW4vYmFsbG9vbi5jDQo+IAlhcmNoL3Bvd2VycGMvcGxh
dGZvcm1zL3Bvd2VybnYvbWVtdHJhY2UuYw0KPiAJZHJpdmVycy9zMzkwL2NoYXIvc2NscF9jbWQu
Yw0KPiAJZHJpdmVycy9odi9odl9iYWxsb29uLmMNCj4gU28sIGxldCdzIGVpdGhlciB1c2UgdGhl
IGxvY2tlZCB2YXJpYW50cyBvciB0YWtlIHRoZSBsb2NrLg0KPiANCj4gRG9uJ3QgZXhwb3J0IGFk
ZF9tZW1vcnlfcmVzb3VyY2UoKSwgYXMgaXQgb25jZSB3YXMgZXhwb3J0ZWQgdG8gYmUgdXNlZA0K
PiBieSBYRU4sIHdoaWNoIGlzIG5ldmVyIGJ1aWx0IGFzIGEgbW9kdWxlLiBJZiBzb21lYm9keSBy
ZXF1aXJlcyBpdCwgd2UNCj4gYWxzbyBoYXZlIHRvIGV4cG9ydCBhIGxvY2tlZCB2YXJpYW50IChh
cyBkZXZpY2VfaG90cGx1Z19sb2NrIGlzIG5ldmVyDQo+IGV4cG9ydGVkKS4NCg0KUmV2aWV3ZWQt
Ynk6IFBhdmVsIFRhdGFzaGluIDxwYXZlbC50YXRhc2hpbkBtaWNyb3NvZnQuY29tPg==

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 2/6] mm/memory_hotplug: make add_memory() take the device_hotplug_lock
  2018-08-21 10:44   ` David Hildenbrand
  (?)
  (?)
@ 2018-08-30 19:36   ` Pasha Tatashin
  -1 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:36 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: Michal Hocko, linux-doc, Benjamin Herrenschmidt, Paul Mackerras,
	Dan Williams, Michael Ellerman, linux-acpi, xen-devel, Len Brown,
	YASUAKI ISHIMATSU, Nathan Fontenot, Boris Ostrovsky, Joonsoo Kim,
	Vlastimil Babka, Oscar Salvador, Juergen Gross,
	Mathieu Malaterre, Greg Kroah-Hartman, Rafael J. Wysocki,
	linux-kernel, John

On 8/21/18 6:44 AM, David Hildenbrand wrote:
> add_memory() currently does not take the device_hotplug_lock, however
> is aleady called under the lock from
> 	arch/powerpc/platforms/pseries/hotplug-memory.c
> 	drivers/acpi/acpi_memhotplug.c
> to synchronize against CPU hot-remove and similar.
> 
> In general, we should hold the device_hotplug_lock when adding memory
> to synchronize against online/offline request (e.g. from user space) -
> which already resulted in lock inversions due to device_lock() and
> mem_hotplug_lock - see 30467e0b3be ("mm, hotplug: fix concurrent memory
> hot-add deadlock"). add_memory()/add_memory_resource() will create memory
> block devices, so this really feels like the right thing to do.
> 
> Holding the device_hotplug_lock makes sure that a memory block device
> can really only be accessed (e.g. via .online/.state) from user space,
> once the memory has been fully added to the system.
> 
> The lock is not held yet in
> 	drivers/xen/balloon.c
> 	arch/powerpc/platforms/powernv/memtrace.c
> 	drivers/s390/char/sclp_cmd.c
> 	drivers/hv/hv_balloon.c
> So, let's either use the locked variants or take the lock.
> 
> Don't export add_memory_resource(), as it once was exported to be used
> by XEN, which is never built as a module. If somebody requires it, we
> also have to export a locked variant (as device_hotplug_lock is never
> exported).

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
  2018-08-21 10:44   ` David Hildenbrand
  (?)
@ 2018-08-30 19:37     ` Pasha Tatashin
  -1 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:37 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rafael J. Wysocki, Len Brown, Greg Kroah-Hartman, KY Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Martin Schwidefsky

On 8/21/18 6:44 AM, David Hildenbrand wrote:
> There seem to be some problems as result of 30467e0b3be ("mm, hotplug:
> fix concurrent memory hot-add deadlock"), which tried to fix a possible
> lock inversion reported and discussed in [1] due to the two locks
> 	a) device_lock()
> 	b) mem_hotplug_lock
> 
> While add_memory() first takes b), followed by a) during
> bus_probe_device(), onlining of memory from user space first took b),
> followed by a), exposing a possible deadlock.
> 
> In [1], and it was decided to not make use of device_hotplug_lock, but
> rather to enforce a locking order.
> 
> The problems I spotted related to this:
> 
> 1. Memory block device attributes: While .state first calls
>    mem_hotplug_begin() and the calls device_online() - which takes
>    device_lock() - .online does no longer call mem_hotplug_begin(), so
>    effectively calls online_pages() without mem_hotplug_lock.
> 
> 2. device_online() should be called under device_hotplug_lock, however
>    onlining memory during add_memory() does not take care of that.
> 
> In addition, I think there is also something wrong about the locking in
> 
> 3. arch/powerpc/platforms/powernv/memtrace.c calls offline_pages()
>    without locks. This was introduced after 30467e0b3be. And skimming over
>    the code, I assume it could need some more care in regards to locking
>    (e.g. device_online() called without device_hotplug_lock - but I'll
>    not touch that for now).
> 
> Now that we hold the device_hotplug_lock when
> - adding memory (e.g. via add_memory()/add_memory_resource())
> - removing memory (e.g. via remove_memory())
> - device_online()/device_offline()
> 
> We can move mem_hotplug_lock usage back into
> online_pages()/offline_pages().
> 
> Why is mem_hotplug_lock still needed? Essentially to make
> get_online_mems()/put_online_mems() be very fast (relying on
> device_hotplug_lock would be very slow), and to serialize against
> addition of memory that does not create memory block devices (hmm).
> 
> [1] http://driverdev.linuxdriverproject.org/pipermail/ driverdev-devel/
>     2015-February/065324.html
> 
> This patch is partly based on a patch by Vitaly Kuznetsov.

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
@ 2018-08-30 19:37     ` Pasha Tatashin
  0 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:37 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rafael J. Wysocki, Len Brown, Greg Kroah-Hartman, KY Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Martin Schwidefsky,
	Heiko Carstens, Boris Ostrovsky, Juergen Gross, Rashmica Gupta,
	Michael Neuling, Balbir Singh, Kate Stewart, Thomas Gleixner,
	Philippe Ombredanne, Andrew Morton, Michal Hocko,
	Vlastimil Babka, Dan Williams, Oscar Salvador, YASUAKI ISHIMATSU,
	Mathieu Malaterre

On 8/21/18 6:44 AM, David Hildenbrand wrote:
> There seem to be some problems as result of 30467e0b3be ("mm, hotplug:
> fix concurrent memory hot-add deadlock"), which tried to fix a possible
> lock inversion reported and discussed in [1] due to the two locks
> 	a) device_lock()
> 	b) mem_hotplug_lock
> 
> While add_memory() first takes b), followed by a) during
> bus_probe_device(), onlining of memory from user space first took b),
> followed by a), exposing a possible deadlock.
> 
> In [1], and it was decided to not make use of device_hotplug_lock, but
> rather to enforce a locking order.
> 
> The problems I spotted related to this:
> 
> 1. Memory block device attributes: While .state first calls
>    mem_hotplug_begin() and the calls device_online() - which takes
>    device_lock() - .online does no longer call mem_hotplug_begin(), so
>    effectively calls online_pages() without mem_hotplug_lock.
> 
> 2. device_online() should be called under device_hotplug_lock, however
>    onlining memory during add_memory() does not take care of that.
> 
> In addition, I think there is also something wrong about the locking in
> 
> 3. arch/powerpc/platforms/powernv/memtrace.c calls offline_pages()
>    without locks. This was introduced after 30467e0b3be. And skimming over
>    the code, I assume it could need some more care in regards to locking
>    (e.g. device_online() called without device_hotplug_lock - but I'll
>    not touch that for now).
> 
> Now that we hold the device_hotplug_lock when
> - adding memory (e.g. via add_memory()/add_memory_resource())
> - removing memory (e.g. via remove_memory())
> - device_online()/device_offline()
> 
> We can move mem_hotplug_lock usage back into
> online_pages()/offline_pages().
> 
> Why is mem_hotplug_lock still needed? Essentially to make
> get_online_mems()/put_online_mems() be very fast (relying on
> device_hotplug_lock would be very slow), and to serialize against
> addition of memory that does not create memory block devices (hmm).
> 
> [1] http://driverdev.linuxdriverproject.org/pipermail/ driverdev-devel/
>     2015-February/065324.html
> 
> This patch is partly based on a patch by Vitaly Kuznetsov.

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
@ 2018-08-30 19:37     ` Pasha Tatashin
  0 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:37 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rafael J. Wysocki, Len Brown, Greg Kroah-Hartman, KY Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Martin Schwidefsky,
	Heiko Carstens, Boris Ostrovsky, Juergen Gross, Rashmica Gupta,
	Michael Neuling, Balbir Singh, Kate Stewart, Thomas Gleixner,
	Philippe Ombredanne, Andrew Morton, Michal Hocko,
	Vlastimil Babka, Dan Williams, Oscar Salvador, YASUAKI ISHIMATSU,
	Mathieu Malaterre

T24gOC8yMS8xOCA2OjQ0IEFNLCBEYXZpZCBIaWxkZW5icmFuZCB3cm90ZToNCj4gVGhlcmUgc2Vl
bSB0byBiZSBzb21lIHByb2JsZW1zIGFzIHJlc3VsdCBvZiAzMDQ2N2UwYjNiZSAoIm1tLCBob3Rw
bHVnOg0KPiBmaXggY29uY3VycmVudCBtZW1vcnkgaG90LWFkZCBkZWFkbG9jayIpLCB3aGljaCB0
cmllZCB0byBmaXggYSBwb3NzaWJsZQ0KPiBsb2NrIGludmVyc2lvbiByZXBvcnRlZCBhbmQgZGlz
Y3Vzc2VkIGluIFsxXSBkdWUgdG8gdGhlIHR3byBsb2Nrcw0KPiAJYSkgZGV2aWNlX2xvY2soKQ0K
PiAJYikgbWVtX2hvdHBsdWdfbG9jaw0KPiANCj4gV2hpbGUgYWRkX21lbW9yeSgpIGZpcnN0IHRh
a2VzIGIpLCBmb2xsb3dlZCBieSBhKSBkdXJpbmcNCj4gYnVzX3Byb2JlX2RldmljZSgpLCBvbmxp
bmluZyBvZiBtZW1vcnkgZnJvbSB1c2VyIHNwYWNlIGZpcnN0IHRvb2sgYiksDQo+IGZvbGxvd2Vk
IGJ5IGEpLCBleHBvc2luZyBhIHBvc3NpYmxlIGRlYWRsb2NrLg0KPiANCj4gSW4gWzFdLCBhbmQg
aXQgd2FzIGRlY2lkZWQgdG8gbm90IG1ha2UgdXNlIG9mIGRldmljZV9ob3RwbHVnX2xvY2ssIGJ1
dA0KPiByYXRoZXIgdG8gZW5mb3JjZSBhIGxvY2tpbmcgb3JkZXIuDQo+IA0KPiBUaGUgcHJvYmxl
bXMgSSBzcG90dGVkIHJlbGF0ZWQgdG8gdGhpczoNCj4gDQo+IDEuIE1lbW9yeSBibG9jayBkZXZp
Y2UgYXR0cmlidXRlczogV2hpbGUgLnN0YXRlIGZpcnN0IGNhbGxzDQo+ICAgIG1lbV9ob3RwbHVn
X2JlZ2luKCkgYW5kIHRoZSBjYWxscyBkZXZpY2Vfb25saW5lKCkgLSB3aGljaCB0YWtlcw0KPiAg
ICBkZXZpY2VfbG9jaygpIC0gLm9ubGluZSBkb2VzIG5vIGxvbmdlciBjYWxsIG1lbV9ob3RwbHVn
X2JlZ2luKCksIHNvDQo+ICAgIGVmZmVjdGl2ZWx5IGNhbGxzIG9ubGluZV9wYWdlcygpIHdpdGhv
dXQgbWVtX2hvdHBsdWdfbG9jay4NCj4gDQo+IDIuIGRldmljZV9vbmxpbmUoKSBzaG91bGQgYmUg
Y2FsbGVkIHVuZGVyIGRldmljZV9ob3RwbHVnX2xvY2ssIGhvd2V2ZXINCj4gICAgb25saW5pbmcg
bWVtb3J5IGR1cmluZyBhZGRfbWVtb3J5KCkgZG9lcyBub3QgdGFrZSBjYXJlIG9mIHRoYXQuDQo+
IA0KPiBJbiBhZGRpdGlvbiwgSSB0aGluayB0aGVyZSBpcyBhbHNvIHNvbWV0aGluZyB3cm9uZyBh
Ym91dCB0aGUgbG9ja2luZyBpbg0KPiANCj4gMy4gYXJjaC9wb3dlcnBjL3BsYXRmb3Jtcy9wb3dl
cm52L21lbXRyYWNlLmMgY2FsbHMgb2ZmbGluZV9wYWdlcygpDQo+ICAgIHdpdGhvdXQgbG9ja3Mu
IFRoaXMgd2FzIGludHJvZHVjZWQgYWZ0ZXIgMzA0NjdlMGIzYmUuIEFuZCBza2ltbWluZyBvdmVy
DQo+ICAgIHRoZSBjb2RlLCBJIGFzc3VtZSBpdCBjb3VsZCBuZWVkIHNvbWUgbW9yZSBjYXJlIGlu
IHJlZ2FyZHMgdG8gbG9ja2luZw0KPiAgICAoZS5nLiBkZXZpY2Vfb25saW5lKCkgY2FsbGVkIHdp
dGhvdXQgZGV2aWNlX2hvdHBsdWdfbG9jayAtIGJ1dCBJJ2xsDQo+ICAgIG5vdCB0b3VjaCB0aGF0
IGZvciBub3cpLg0KPiANCj4gTm93IHRoYXQgd2UgaG9sZCB0aGUgZGV2aWNlX2hvdHBsdWdfbG9j
ayB3aGVuDQo+IC0gYWRkaW5nIG1lbW9yeSAoZS5nLiB2aWEgYWRkX21lbW9yeSgpL2FkZF9tZW1v
cnlfcmVzb3VyY2UoKSkNCj4gLSByZW1vdmluZyBtZW1vcnkgKGUuZy4gdmlhIHJlbW92ZV9tZW1v
cnkoKSkNCj4gLSBkZXZpY2Vfb25saW5lKCkvZGV2aWNlX29mZmxpbmUoKQ0KPiANCj4gV2UgY2Fu
IG1vdmUgbWVtX2hvdHBsdWdfbG9jayB1c2FnZSBiYWNrIGludG8NCj4gb25saW5lX3BhZ2VzKCkv
b2ZmbGluZV9wYWdlcygpLg0KPiANCj4gV2h5IGlzIG1lbV9ob3RwbHVnX2xvY2sgc3RpbGwgbmVl
ZGVkPyBFc3NlbnRpYWxseSB0byBtYWtlDQo+IGdldF9vbmxpbmVfbWVtcygpL3B1dF9vbmxpbmVf
bWVtcygpIGJlIHZlcnkgZmFzdCAocmVseWluZyBvbg0KPiBkZXZpY2VfaG90cGx1Z19sb2NrIHdv
dWxkIGJlIHZlcnkgc2xvdyksIGFuZCB0byBzZXJpYWxpemUgYWdhaW5zdA0KPiBhZGRpdGlvbiBv
ZiBtZW1vcnkgdGhhdCBkb2VzIG5vdCBjcmVhdGUgbWVtb3J5IGJsb2NrIGRldmljZXMgKGhtbSku
DQo+IA0KPiBbMV0gaHR0cDovL2RyaXZlcmRldi5saW51eGRyaXZlcnByb2plY3Qub3JnL3BpcGVy
bWFpbC8gZHJpdmVyZGV2LWRldmVsLw0KPiAgICAgMjAxNS1GZWJydWFyeS8wNjUzMjQuaHRtbA0K
PiANCj4gVGhpcyBwYXRjaCBpcyBwYXJ0bHkgYmFzZWQgb24gYSBwYXRjaCBieSBWaXRhbHkgS3V6
bmV0c292Lg0KDQpSZXZpZXdlZC1ieTogUGF2ZWwgVGF0YXNoaW4gPHBhdmVsLnRhdGFzaGluQG1p
Y3Jvc29mdC5jb20+

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
  2018-08-21 10:44   ` David Hildenbrand
  (?)
  (?)
@ 2018-08-30 19:37   ` Pasha Tatashin
  -1 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:37 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: Kate Stewart, Michal Hocko, linux-doc, Benjamin Herrenschmidt,
	Balbir Singh, Heiko Carstens, Paul Mackerras, Rashmica Gupta,
	KY Srinivasan, Thomas Gleixner, Michael Neuling,
	Stephen Hemminger, Michael Ellerman, linux-acpi, xen-devel,
	Len Brown, Haiyang Zhang, Dan Williams, YASUAKI ISHIMATSU,
	Boris Ostrovsky, Vlastimil Babka

On 8/21/18 6:44 AM, David Hildenbrand wrote:
> There seem to be some problems as result of 30467e0b3be ("mm, hotplug:
> fix concurrent memory hot-add deadlock"), which tried to fix a possible
> lock inversion reported and discussed in [1] due to the two locks
> 	a) device_lock()
> 	b) mem_hotplug_lock
> 
> While add_memory() first takes b), followed by a) during
> bus_probe_device(), onlining of memory from user space first took b),
> followed by a), exposing a possible deadlock.
> 
> In [1], and it was decided to not make use of device_hotplug_lock, but
> rather to enforce a locking order.
> 
> The problems I spotted related to this:
> 
> 1. Memory block device attributes: While .state first calls
>    mem_hotplug_begin() and the calls device_online() - which takes
>    device_lock() - .online does no longer call mem_hotplug_begin(), so
>    effectively calls online_pages() without mem_hotplug_lock.
> 
> 2. device_online() should be called under device_hotplug_lock, however
>    onlining memory during add_memory() does not take care of that.
> 
> In addition, I think there is also something wrong about the locking in
> 
> 3. arch/powerpc/platforms/powernv/memtrace.c calls offline_pages()
>    without locks. This was introduced after 30467e0b3be. And skimming over
>    the code, I assume it could need some more care in regards to locking
>    (e.g. device_online() called without device_hotplug_lock - but I'll
>    not touch that for now).
> 
> Now that we hold the device_hotplug_lock when
> - adding memory (e.g. via add_memory()/add_memory_resource())
> - removing memory (e.g. via remove_memory())
> - device_online()/device_offline()
> 
> We can move mem_hotplug_lock usage back into
> online_pages()/offline_pages().
> 
> Why is mem_hotplug_lock still needed? Essentially to make
> get_online_mems()/put_online_mems() be very fast (relying on
> device_hotplug_lock would be very slow), and to serialize against
> addition of memory that does not create memory block devices (hmm).
> 
> [1] http://driverdev.linuxdriverproject.org/pipermail/ driverdev-devel/
>     2015-February/065324.html
> 
> This patch is partly based on a patch by Vitaly Kuznetsov.

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 4/6] powerpc/powernv: hold device_hotplug_lock when calling device_online()
  2018-08-21 10:44   ` David Hildenbrand
@ 2018-08-30 19:38     ` Pasha Tatashin
  -1 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:38 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rashmica Gupta, Balbir Singh, Michael Neuling

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>

On 8/21/18 6:44 AM, David Hildenbrand wrote:
> device_online() should be called with device_hotplug_lock() held.
> 
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Rashmica Gupta <rashmica.g@gmail.com>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Michael Neuling <mikey@neuling.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  arch/powerpc/platforms/powernv/memtrace.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
> index 8f1cd4f3bfd5..ef7181d4fe68 100644
> --- a/arch/powerpc/platforms/powernv/memtrace.c
> +++ b/arch/powerpc/platforms/powernv/memtrace.c
> @@ -229,9 +229,11 @@ static int memtrace_online(void)
>  		 * we need to online the memory ourselves.
>  		 */
>  		if (!memhp_auto_online) {
> +			lock_device_hotplug();
>  			walk_memory_range(PFN_DOWN(ent->start),
>  					  PFN_UP(ent->start + ent->size - 1),
>  					  NULL, online_mem_block);
> +			unlock_device_hotplug();
>  		}
>  
>  		/*
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 4/6] powerpc/powernv: hold device_hotplug_lock when calling device_online()
@ 2018-08-30 19:38     ` Pasha Tatashin
  0 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:38 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rashmica Gupta, Balbir Singh, Michael Neuling

UmV2aWV3ZWQtYnk6IFBhdmVsIFRhdGFzaGluIDxwYXZlbC50YXRhc2hpbkBtaWNyb3NvZnQuY29t
Pg0KDQpPbiA4LzIxLzE4IDY6NDQgQU0sIERhdmlkIEhpbGRlbmJyYW5kIHdyb3RlOg0KPiBkZXZp
Y2Vfb25saW5lKCkgc2hvdWxkIGJlIGNhbGxlZCB3aXRoIGRldmljZV9ob3RwbHVnX2xvY2soKSBo
ZWxkLg0KPiANCj4gQ2M6IEJlbmphbWluIEhlcnJlbnNjaG1pZHQgPGJlbmhAa2VybmVsLmNyYXNo
aW5nLm9yZz4NCj4gQ2M6IFBhdWwgTWFja2VycmFzIDxwYXVsdXNAc2FtYmEub3JnPg0KPiBDYzog
TWljaGFlbCBFbGxlcm1hbiA8bXBlQGVsbGVybWFuLmlkLmF1Pg0KPiBDYzogUmFzaG1pY2EgR3Vw
dGEgPHJhc2htaWNhLmdAZ21haWwuY29tPg0KPiBDYzogQmFsYmlyIFNpbmdoIDxic2luZ2hhcm9y
YUBnbWFpbC5jb20+DQo+IENjOiBNaWNoYWVsIE5ldWxpbmcgPG1pa2V5QG5ldWxpbmcub3JnPg0K
PiBTaWduZWQtb2ZmLWJ5OiBEYXZpZCBIaWxkZW5icmFuZCA8ZGF2aWRAcmVkaGF0LmNvbT4NCj4g
LS0tDQo+ICBhcmNoL3Bvd2VycGMvcGxhdGZvcm1zL3Bvd2VybnYvbWVtdHJhY2UuYyB8IDIgKysN
Cj4gIDEgZmlsZSBjaGFuZ2VkLCAyIGluc2VydGlvbnMoKykNCj4gDQo+IGRpZmYgLS1naXQgYS9h
cmNoL3Bvd2VycGMvcGxhdGZvcm1zL3Bvd2VybnYvbWVtdHJhY2UuYyBiL2FyY2gvcG93ZXJwYy9w
bGF0Zm9ybXMvcG93ZXJudi9tZW10cmFjZS5jDQo+IGluZGV4IDhmMWNkNGYzYmZkNS4uZWY3MTgx
ZDRmZTY4IDEwMDY0NA0KPiAtLS0gYS9hcmNoL3Bvd2VycGMvcGxhdGZvcm1zL3Bvd2VybnYvbWVt
dHJhY2UuYw0KPiArKysgYi9hcmNoL3Bvd2VycGMvcGxhdGZvcm1zL3Bvd2VybnYvbWVtdHJhY2Uu
Yw0KPiBAQCAtMjI5LDkgKzIyOSwxMSBAQCBzdGF0aWMgaW50IG1lbXRyYWNlX29ubGluZSh2b2lk
KQ0KPiAgCQkgKiB3ZSBuZWVkIHRvIG9ubGluZSB0aGUgbWVtb3J5IG91cnNlbHZlcy4NCj4gIAkJ
ICovDQo+ICAJCWlmICghbWVtaHBfYXV0b19vbmxpbmUpIHsNCj4gKwkJCWxvY2tfZGV2aWNlX2hv
dHBsdWcoKTsNCj4gIAkJCXdhbGtfbWVtb3J5X3JhbmdlKFBGTl9ET1dOKGVudC0+c3RhcnQpLA0K
PiAgCQkJCQkgIFBGTl9VUChlbnQtPnN0YXJ0ICsgZW50LT5zaXplIC0gMSksDQo+ICAJCQkJCSAg
TlVMTCwgb25saW5lX21lbV9ibG9jayk7DQo+ICsJCQl1bmxvY2tfZGV2aWNlX2hvdHBsdWcoKTsN
Cj4gIAkJfQ0KPiAgDQo+ICAJCS8qDQo+IA==

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 4/6] powerpc/powernv: hold device_hotplug_lock when calling device_online()
  2018-08-21 10:44   ` David Hildenbrand
  (?)
  (?)
@ 2018-08-30 19:38   ` Pasha Tatashin
  -1 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:38 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: Michael Neuling, linux-doc, Benjamin Herrenschmidt, Balbir Singh,
	linux-kernel, linux-acpi, Paul Mackerras, Michael Ellerman,
	xen-devel, devel, Rashmica Gupta, linuxppc-dev

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>

On 8/21/18 6:44 AM, David Hildenbrand wrote:
> device_online() should be called with device_hotplug_lock() held.
> 
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Rashmica Gupta <rashmica.g@gmail.com>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Michael Neuling <mikey@neuling.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  arch/powerpc/platforms/powernv/memtrace.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
> index 8f1cd4f3bfd5..ef7181d4fe68 100644
> --- a/arch/powerpc/platforms/powernv/memtrace.c
> +++ b/arch/powerpc/platforms/powernv/memtrace.c
> @@ -229,9 +229,11 @@ static int memtrace_online(void)
>  		 * we need to online the memory ourselves.
>  		 */
>  		if (!memhp_auto_online) {
> +			lock_device_hotplug();
>  			walk_memory_range(PFN_DOWN(ent->start),
>  					  PFN_UP(ent->start + ent->size - 1),
>  					  NULL, online_mem_block);
> +			unlock_device_hotplug();
>  		}
>  
>  		/*
> 
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 5/6] powerpc/powernv: hold device_hotplug_lock in memtrace_offline_pages()
  2018-08-21 10:44 ` David Hildenbrand
@ 2018-08-30 19:38     ` Pasha Tatashin
  2018-08-30 19:38   ` Pasha Tatashin
  1 sibling, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:38 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rashmica Gupta, Balbir Singh, Michael Neuling

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>

On 8/21/18 6:44 AM, David Hildenbrand wrote:
> Let's perform all checking + offlining + removing under
> device_hotplug_lock, so nobody can mess with these devices via
> sysfs concurrently.
> 
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Rashmica Gupta <rashmica.g@gmail.com>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Michael Neuling <mikey@neuling.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  arch/powerpc/platforms/powernv/memtrace.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
> index ef7181d4fe68..473e59842ec5 100644
> --- a/arch/powerpc/platforms/powernv/memtrace.c
> +++ b/arch/powerpc/platforms/powernv/memtrace.c
> @@ -74,9 +74,13 @@ static bool memtrace_offline_pages(u32 nid, u64 start_pfn, u64 nr_pages)
>  {
>  	u64 end_pfn = start_pfn + nr_pages - 1;
>  
> +	lock_device_hotplug();
> +
>  	if (walk_memory_range(start_pfn, end_pfn, NULL,
> -	    check_memblock_online))
> +	    check_memblock_online)) {
> +		unlock_device_hotplug();
>  		return false;
> +	}
>  
>  	walk_memory_range(start_pfn, end_pfn, (void *)MEM_GOING_OFFLINE,
>  			  change_memblock_state);
> @@ -84,14 +88,16 @@ static bool memtrace_offline_pages(u32 nid, u64 start_pfn, u64 nr_pages)
>  	if (offline_pages(start_pfn, nr_pages)) {
>  		walk_memory_range(start_pfn, end_pfn, (void *)MEM_ONLINE,
>  				  change_memblock_state);
> +		unlock_device_hotplug();
>  		return false;
>  	}
>  
>  	walk_memory_range(start_pfn, end_pfn, (void *)MEM_OFFLINE,
>  			  change_memblock_state);
>  
> -	remove_memory(nid, start_pfn << PAGE_SHIFT, nr_pages << PAGE_SHIFT);
> +	__remove_memory(nid, start_pfn << PAGE_SHIFT, nr_pages << PAGE_SHIFT);
>  
> +	unlock_device_hotplug();
>  	return true;
>  }
>  
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 5/6] powerpc/powernv: hold device_hotplug_lock in memtrace_offline_pages()
@ 2018-08-30 19:38     ` Pasha Tatashin
  0 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:38 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rashmica Gupta, Balbir Singh, Michael Neuling

UmV2aWV3ZWQtYnk6IFBhdmVsIFRhdGFzaGluIDxwYXZlbC50YXRhc2hpbkBtaWNyb3NvZnQuY29t
Pg0KDQpPbiA4LzIxLzE4IDY6NDQgQU0sIERhdmlkIEhpbGRlbmJyYW5kIHdyb3RlOg0KPiBMZXQn
cyBwZXJmb3JtIGFsbCBjaGVja2luZyArIG9mZmxpbmluZyArIHJlbW92aW5nIHVuZGVyDQo+IGRl
dmljZV9ob3RwbHVnX2xvY2ssIHNvIG5vYm9keSBjYW4gbWVzcyB3aXRoIHRoZXNlIGRldmljZXMg
dmlhDQo+IHN5c2ZzIGNvbmN1cnJlbnRseS4NCj4gDQo+IENjOiBCZW5qYW1pbiBIZXJyZW5zY2ht
aWR0IDxiZW5oQGtlcm5lbC5jcmFzaGluZy5vcmc+DQo+IENjOiBQYXVsIE1hY2tlcnJhcyA8cGF1
bHVzQHNhbWJhLm9yZz4NCj4gQ2M6IE1pY2hhZWwgRWxsZXJtYW4gPG1wZUBlbGxlcm1hbi5pZC5h
dT4NCj4gQ2M6IFJhc2htaWNhIEd1cHRhIDxyYXNobWljYS5nQGdtYWlsLmNvbT4NCj4gQ2M6IEJh
bGJpciBTaW5naCA8YnNpbmdoYXJvcmFAZ21haWwuY29tPg0KPiBDYzogTWljaGFlbCBOZXVsaW5n
IDxtaWtleUBuZXVsaW5nLm9yZz4NCj4gU2lnbmVkLW9mZi1ieTogRGF2aWQgSGlsZGVuYnJhbmQg
PGRhdmlkQHJlZGhhdC5jb20+DQo+IC0tLQ0KPiAgYXJjaC9wb3dlcnBjL3BsYXRmb3Jtcy9wb3dl
cm52L21lbXRyYWNlLmMgfCAxMCArKysrKysrKy0tDQo+ICAxIGZpbGUgY2hhbmdlZCwgOCBpbnNl
cnRpb25zKCspLCAyIGRlbGV0aW9ucygtKQ0KPiANCj4gZGlmZiAtLWdpdCBhL2FyY2gvcG93ZXJw
Yy9wbGF0Zm9ybXMvcG93ZXJudi9tZW10cmFjZS5jIGIvYXJjaC9wb3dlcnBjL3BsYXRmb3Jtcy9w
b3dlcm52L21lbXRyYWNlLmMNCj4gaW5kZXggZWY3MTgxZDRmZTY4Li40NzNlNTk4NDJlYzUgMTAw
NjQ0DQo+IC0tLSBhL2FyY2gvcG93ZXJwYy9wbGF0Zm9ybXMvcG93ZXJudi9tZW10cmFjZS5jDQo+
ICsrKyBiL2FyY2gvcG93ZXJwYy9wbGF0Zm9ybXMvcG93ZXJudi9tZW10cmFjZS5jDQo+IEBAIC03
NCw5ICs3NCwxMyBAQCBzdGF0aWMgYm9vbCBtZW10cmFjZV9vZmZsaW5lX3BhZ2VzKHUzMiBuaWQs
IHU2NCBzdGFydF9wZm4sIHU2NCBucl9wYWdlcykNCj4gIHsNCj4gIAl1NjQgZW5kX3BmbiA9IHN0
YXJ0X3BmbiArIG5yX3BhZ2VzIC0gMTsNCj4gIA0KPiArCWxvY2tfZGV2aWNlX2hvdHBsdWcoKTsN
Cj4gKw0KPiAgCWlmICh3YWxrX21lbW9yeV9yYW5nZShzdGFydF9wZm4sIGVuZF9wZm4sIE5VTEws
DQo+IC0JICAgIGNoZWNrX21lbWJsb2NrX29ubGluZSkpDQo+ICsJICAgIGNoZWNrX21lbWJsb2Nr
X29ubGluZSkpIHsNCj4gKwkJdW5sb2NrX2RldmljZV9ob3RwbHVnKCk7DQo+ICAJCXJldHVybiBm
YWxzZTsNCj4gKwl9DQo+ICANCj4gIAl3YWxrX21lbW9yeV9yYW5nZShzdGFydF9wZm4sIGVuZF9w
Zm4sICh2b2lkICopTUVNX0dPSU5HX09GRkxJTkUsDQo+ICAJCQkgIGNoYW5nZV9tZW1ibG9ja19z
dGF0ZSk7DQo+IEBAIC04NCwxNCArODgsMTYgQEAgc3RhdGljIGJvb2wgbWVtdHJhY2Vfb2ZmbGlu
ZV9wYWdlcyh1MzIgbmlkLCB1NjQgc3RhcnRfcGZuLCB1NjQgbnJfcGFnZXMpDQo+ICAJaWYgKG9m
ZmxpbmVfcGFnZXMoc3RhcnRfcGZuLCBucl9wYWdlcykpIHsNCj4gIAkJd2Fsa19tZW1vcnlfcmFu
Z2Uoc3RhcnRfcGZuLCBlbmRfcGZuLCAodm9pZCAqKU1FTV9PTkxJTkUsDQo+ICAJCQkJICBjaGFu
Z2VfbWVtYmxvY2tfc3RhdGUpOw0KPiArCQl1bmxvY2tfZGV2aWNlX2hvdHBsdWcoKTsNCj4gIAkJ
cmV0dXJuIGZhbHNlOw0KPiAgCX0NCj4gIA0KPiAgCXdhbGtfbWVtb3J5X3JhbmdlKHN0YXJ0X3Bm
biwgZW5kX3BmbiwgKHZvaWQgKilNRU1fT0ZGTElORSwNCj4gIAkJCSAgY2hhbmdlX21lbWJsb2Nr
X3N0YXRlKTsNCj4gIA0KPiAtCXJlbW92ZV9tZW1vcnkobmlkLCBzdGFydF9wZm4gPDwgUEFHRV9T
SElGVCwgbnJfcGFnZXMgPDwgUEFHRV9TSElGVCk7DQo+ICsJX19yZW1vdmVfbWVtb3J5KG5pZCwg
c3RhcnRfcGZuIDw8IFBBR0VfU0hJRlQsIG5yX3BhZ2VzIDw8IFBBR0VfU0hJRlQpOw0KPiAgDQo+
ICsJdW5sb2NrX2RldmljZV9ob3RwbHVnKCk7DQo+ICAJcmV0dXJuIHRydWU7DQo+ICB9DQo+ICAN
Cj4g

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 5/6] powerpc/powernv: hold device_hotplug_lock in memtrace_offline_pages()
  2018-08-21 10:44 ` David Hildenbrand
  2018-08-30 19:38     ` Pasha Tatashin
@ 2018-08-30 19:38   ` Pasha Tatashin
  1 sibling, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:38 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: Michael Neuling, linux-doc, Benjamin Herrenschmidt, Balbir Singh,
	linux-kernel, linux-acpi, Paul Mackerras, Michael Ellerman,
	xen-devel, devel, Rashmica Gupta, linuxppc-dev

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>

On 8/21/18 6:44 AM, David Hildenbrand wrote:
> Let's perform all checking + offlining + removing under
> device_hotplug_lock, so nobody can mess with these devices via
> sysfs concurrently.
> 
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Rashmica Gupta <rashmica.g@gmail.com>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Michael Neuling <mikey@neuling.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  arch/powerpc/platforms/powernv/memtrace.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
> index ef7181d4fe68..473e59842ec5 100644
> --- a/arch/powerpc/platforms/powernv/memtrace.c
> +++ b/arch/powerpc/platforms/powernv/memtrace.c
> @@ -74,9 +74,13 @@ static bool memtrace_offline_pages(u32 nid, u64 start_pfn, u64 nr_pages)
>  {
>  	u64 end_pfn = start_pfn + nr_pages - 1;
>  
> +	lock_device_hotplug();
> +
>  	if (walk_memory_range(start_pfn, end_pfn, NULL,
> -	    check_memblock_online))
> +	    check_memblock_online)) {
> +		unlock_device_hotplug();
>  		return false;
> +	}
>  
>  	walk_memory_range(start_pfn, end_pfn, (void *)MEM_GOING_OFFLINE,
>  			  change_memblock_state);
> @@ -84,14 +88,16 @@ static bool memtrace_offline_pages(u32 nid, u64 start_pfn, u64 nr_pages)
>  	if (offline_pages(start_pfn, nr_pages)) {
>  		walk_memory_range(start_pfn, end_pfn, (void *)MEM_ONLINE,
>  				  change_memblock_state);
> +		unlock_device_hotplug();
>  		return false;
>  	}
>  
>  	walk_memory_range(start_pfn, end_pfn, (void *)MEM_OFFLINE,
>  			  change_memblock_state);
>  
> -	remove_memory(nid, start_pfn << PAGE_SHIFT, nr_pages << PAGE_SHIFT);
> +	__remove_memory(nid, start_pfn << PAGE_SHIFT, nr_pages << PAGE_SHIFT);
>  
> +	unlock_device_hotplug();
>  	return true;
>  }
>  
> 
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 6/6] memory-hotplug.txt: Add some details about locking internals
  2018-08-21 10:44 ` [PATCH RFCv2 6/6] memory-hotplug.txt: Add some details about locking internals David Hildenbrand
  2018-08-30 19:38   ` Pasha Tatashin
@ 2018-08-30 19:38     ` Pasha Tatashin
  1 sibling, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:38 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: Michal Hocko, linux-doc, Jonathan Corbet, linux-kernel,
	linux-acpi, xen-devel, devel, Andrew Morton, linuxppc-dev


Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>

On 8/21/18 6:44 AM, David Hildenbrand wrote:
> Let's document the magic a bit, especially why device_hotplug_lock is
> required when adding/removing memory and how it all play together with
> requests to online/offline memory from user space.
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 6/6] memory-hotplug.txt: Add some details about locking internals
@ 2018-08-30 19:38     ` Pasha Tatashin
  0 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:38 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Jonathan Corbet, Michal Hocko, Andrew Morton


Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>

On 8/21/18 6:44 AM, David Hildenbrand wrote:
> Let's document the magic a bit, especially why device_hotplug_lock is
> required when adding/removing memory and how it all play together with
> requests to online/offline memory from user space.
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 6/6] memory-hotplug.txt: Add some details about locking internals
@ 2018-08-30 19:38     ` Pasha Tatashin
  0 siblings, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:38 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Jonathan Corbet, Michal Hocko, Andrew Morton

DQpSZXZpZXdlZC1ieTogUGF2ZWwgVGF0YXNoaW4gPHBhdmVsLnRhdGFzaGluQG1pY3Jvc29mdC5j
b20+DQoNCk9uIDgvMjEvMTggNjo0NCBBTSwgRGF2aWQgSGlsZGVuYnJhbmQgd3JvdGU6DQo+IExl
dCdzIGRvY3VtZW50IHRoZSBtYWdpYyBhIGJpdCwgZXNwZWNpYWxseSB3aHkgZGV2aWNlX2hvdHBs
dWdfbG9jayBpcw0KPiByZXF1aXJlZCB3aGVuIGFkZGluZy9yZW1vdmluZyBtZW1vcnkgYW5kIGhv
dyBpdCBhbGwgcGxheSB0b2dldGhlciB3aXRoDQo+IHJlcXVlc3RzIHRvIG9ubGluZS9vZmZsaW5l
IG1lbW9yeSBmcm9tIHVzZXIgc3BhY2UuDQo+IA==

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 6/6] memory-hotplug.txt: Add some details about locking internals
  2018-08-21 10:44 ` [PATCH RFCv2 6/6] memory-hotplug.txt: Add some details about locking internals David Hildenbrand
@ 2018-08-30 19:38   ` Pasha Tatashin
  2018-08-30 19:38     ` Pasha Tatashin
  1 sibling, 0 replies; 65+ messages in thread
From: Pasha Tatashin @ 2018-08-30 19:38 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: Michal Hocko, linux-doc, Jonathan Corbet, linux-kernel,
	linux-acpi, xen-devel, devel, Andrew Morton, linuxppc-dev


Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>

On 8/21/18 6:44 AM, David Hildenbrand wrote:
> Let's document the magic a bit, especially why device_hotplug_lock is
> required when adding/removing memory and how it all play together with
> requests to online/offline memory from user space.
> 
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 1/6] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
  2018-08-30 19:35     ` Pasha Tatashin
@ 2018-08-31 13:12       ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-31 13:12 UTC (permalink / raw)
  To: Pasha Tatashin, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rafael J. Wysocki, Len Brown, Rashmica Gupta, Michael Neuling,
	Balbir Singh, Nathan Fontenot, John Allen

On 30.08.2018 21:35, Pasha Tatashin wrote:
>> +
>> +void __ref remove_memory(int nid, u64 start, u64 size)
> 
> Remove __ref, otherwise looks good:

Indeed, will do.

Thanks for the review. Will resend in two weeks when I'm back from vacation.

Cheers!

> 
> Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
> 
>> +{
>> +	lock_device_hotplug();
>> +	__remove_memory(nid, start, size);
>> +	unlock_device_hotplug();
>> +}
>>  EXPORT_SYMBOL_GPL(remove_memory);
>>  #endif /* CONFIG_MEMORY_HOTREMOVE */


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 1/6] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
@ 2018-08-31 13:12       ` David Hildenbrand
  0 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-31 13:12 UTC (permalink / raw)
  To: Pasha Tatashin, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rafael J. Wysocki, Len Brown, Rashmica Gupta, Michael Neuling,
	Balbir Singh, Nathan Fontenot, John Allen, Andrew Morton,
	Michal Hocko, Dan Williams, Joonsoo Kim, Vlastimil Babka,
	Greg Kroah-Hartman, Oscar Salvador, YASUAKI ISHIMATSU,
	Mathieu Malaterre

On 30.08.2018 21:35, Pasha Tatashin wrote:
>> +
>> +void __ref remove_memory(int nid, u64 start, u64 size)
> 
> Remove __ref, otherwise looks good:

Indeed, will do.

Thanks for the review. Will resend in two weeks when I'm back from vacation.

Cheers!

> 
> Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
> 
>> +{
>> +	lock_device_hotplug();
>> +	__remove_memory(nid, start, size);
>> +	unlock_device_hotplug();
>> +}
>>  EXPORT_SYMBOL_GPL(remove_memory);
>>  #endif /* CONFIG_MEMORY_HOTREMOVE */


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 1/6] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
  2018-08-30 19:35     ` Pasha Tatashin
                       ` (2 preceding siblings ...)
  (?)
@ 2018-08-31 13:12     ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-31 13:12 UTC (permalink / raw)
  To: Pasha Tatashin, linux-mm
  Cc: Michal Hocko, linux-doc, Benjamin Herrenschmidt, Balbir Singh,
	Paul Mackerras, Rashmica Gupta, Michael Neuling,
	Michael Ellerman, linux-acpi, xen-devel, Len Brown,
	YASUAKI ISHIMATSU, Nathan Fontenot, Dan Williams, Joonsoo Kim,
	Vlastimil Babka, Oscar Salvador, Mathieu Malaterre,
	Greg Kroah-Hartman, Rafael J. Wysocki

On 30.08.2018 21:35, Pasha Tatashin wrote:
>> +
>> +void __ref remove_memory(int nid, u64 start, u64 size)
> 
> Remove __ref, otherwise looks good:

Indeed, will do.

Thanks for the review. Will resend in two weeks when I'm back from vacation.

Cheers!

> 
> Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
> 
>> +{
>> +	lock_device_hotplug();
>> +	__remove_memory(nid, start, size);
>> +	unlock_device_hotplug();
>> +}
>>  EXPORT_SYMBOL_GPL(remove_memory);
>>  #endif /* CONFIG_MEMORY_HOTREMOVE */


-- 

Thanks,

David / dhildenb

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
  2018-08-21 10:44 ` David Hildenbrand
@ 2018-08-31 20:54   ` Oscar Salvador
  -1 siblings, 0 replies; 65+ messages in thread
From: Oscar Salvador @ 2018-08-31 20:54 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Kate Stewart, Michal Hocko, linux-doc, Benjamin Herrenschmidt,
	Balbir Singh, Heiko Carstens, linux-mm, Paul Mackerras,
	Rashmica Gupta, Boris Ostrovsky, Michael Neuling,
	Stephen Hemminger, Jonathan Corbet, Michael Ellerman,
	Pavel Tatashin, linux-acpi, xen-devel, Len Brown, Haiyang Zhang,
	YASUAKI ISHIMATSU, Nathan Fontenot, Dan Williams, Joonsoo Kim,
	Vlastimil

On Tue, Aug 21, 2018 at 12:44:12PM +0200, David Hildenbrand wrote:
> This is the same approach as in the first RFC, but this time without
> exporting device_hotplug_lock (requested by Greg) and with some more
> details and documentation regarding locking. Tested only on x86 so far.

Hi David,

I would like to review this but I am on vacation, so I will not be able to get to it
soon.
I plan to do it once I am back.

Thanks
-- 
Oscar Salvador
SUSE L3

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
@ 2018-08-31 20:54   ` Oscar Salvador
  0 siblings, 0 replies; 65+ messages in thread
From: Oscar Salvador @ 2018-08-31 20:54 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-mm, linux-kernel, linux-doc, linuxppc-dev, linux-acpi,
	xen-devel, devel, Andrew Morton, Balbir Singh,
	Benjamin Herrenschmidt, Boris Ostrovsky, Dan Williams,
	Greg Kroah-Hartman, Haiyang Zhang, Heiko Carstens, John Allen,
	Jonathan Corbet, Joonsoo Kim, Juergen Gross, Kate Stewart,
	K. Y. Srinivasan, Len Brown, Martin Schwidefsky,
	Mathieu Malaterre, Michael Ellerman, Michael Neuling,
	Michal Hocko, Nathan Fontenot, Oscar Salvador, Paul Mackerras,
	Pavel Tatashin, Philippe Ombredanne, Rafael J. Wysocki,
	Rashmica Gupta, Stephen Hemminger, Thomas Gleixner,
	Vlastimil Babka, YASUAKI ISHIMATSU

On Tue, Aug 21, 2018 at 12:44:12PM +0200, David Hildenbrand wrote:
> This is the same approach as in the first RFC, but this time without
> exporting device_hotplug_lock (requested by Greg) and with some more
> details and documentation regarding locking. Tested only on x86 so far.

Hi David,

I would like to review this but I am on vacation, so I will not be able to get to it
soon.
I plan to do it once I am back.

Thanks
-- 
Oscar Salvador
SUSE L3

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
  2018-08-21 10:44 ` David Hildenbrand
                   ` (14 preceding siblings ...)
  (?)
@ 2018-08-31 20:54 ` Oscar Salvador
  -1 siblings, 0 replies; 65+ messages in thread
From: Oscar Salvador @ 2018-08-31 20:54 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Kate Stewart, Michal Hocko, linux-doc, Benjamin Herrenschmidt,
	Balbir Singh, Heiko Carstens, linux-mm, Paul Mackerras,
	Rashmica Gupta, K. Y. Srinivasan, Boris Ostrovsky,
	Michael Neuling, Stephen Hemminger, Jonathan Corbet,
	Michael Ellerman, Pavel Tatashin, linux-acpi, xen-devel,
	Len Brown, Haiyang Zhang, YASUAKI ISHIMATSU, Nathan Fontenot,
	Dan Williams, Joonsoo

On Tue, Aug 21, 2018 at 12:44:12PM +0200, David Hildenbrand wrote:
> This is the same approach as in the first RFC, but this time without
> exporting device_hotplug_lock (requested by Greg) and with some more
> details and documentation regarding locking. Tested only on x86 so far.

Hi David,

I would like to review this but I am on vacation, so I will not be able to get to it
soon.
I plan to do it once I am back.

Thanks
-- 
Oscar Salvador
SUSE L3

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
  2018-08-31 20:54   ` Oscar Salvador
@ 2018-09-01 14:03     ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-09-01 14:03 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: Kate Stewart, Michal Hocko, linux-doc, Benjamin Herrenschmidt,
	Balbir Singh, Heiko Carstens, linux-mm, Paul Mackerras,
	Rashmica Gupta, Boris Ostrovsky, Michael Neuling,
	Stephen Hemminger, Jonathan Corbet, Michael Ellerman,
	Pavel Tatashin, linux-acpi, xen-devel, Len Brown, Haiyang Zhang,
	YASUAKI ISHIMATSU, Nathan Fontenot, Dan Williams, Joonsoo Kim,
	Vlastimil

On 31.08.2018 22:54, Oscar Salvador wrote:
> On Tue, Aug 21, 2018 at 12:44:12PM +0200, David Hildenbrand wrote:
>> This is the same approach as in the first RFC, but this time without
>> exporting device_hotplug_lock (requested by Greg) and with some more
>> details and documentation regarding locking. Tested only on x86 so far.
> 
> Hi David,
> 
> I would like to review this but I am on vacation, so I will not be able to get to it
> soon.
> I plan to do it once I am back.

Sure, I won't be resending within next two weeks either way, as I am
also on vacation.

Have a nice vacation!

> 
> Thanks
> 


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
@ 2018-09-01 14:03     ` David Hildenbrand
  0 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-09-01 14:03 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: linux-mm, linux-kernel, linux-doc, linuxppc-dev, linux-acpi,
	xen-devel, devel, Andrew Morton, Balbir Singh,
	Benjamin Herrenschmidt, Boris Ostrovsky, Dan Williams,
	Greg Kroah-Hartman, Haiyang Zhang, Heiko Carstens, John Allen,
	Jonathan Corbet, Joonsoo Kim, Juergen Gross, Kate Stewart,
	K. Y. Srinivasan, Len Brown, Martin Schwidefsky,
	Mathieu Malaterre, Michael Ellerman, Michael Neuling,
	Michal Hocko, Nathan Fontenot, Oscar Salvador, Paul Mackerras,
	Pavel Tatashin, Philippe Ombredanne, Rafael J. Wysocki,
	Rashmica Gupta, Stephen Hemminger, Thomas Gleixner,
	Vlastimil Babka, YASUAKI ISHIMATSU

On 31.08.2018 22:54, Oscar Salvador wrote:
> On Tue, Aug 21, 2018 at 12:44:12PM +0200, David Hildenbrand wrote:
>> This is the same approach as in the first RFC, but this time without
>> exporting device_hotplug_lock (requested by Greg) and with some more
>> details and documentation regarding locking. Tested only on x86 so far.
> 
> Hi David,
> 
> I would like to review this but I am on vacation, so I will not be able to get to it
> soon.
> I plan to do it once I am back.

Sure, I won't be resending within next two weeks either way, as I am
also on vacation.

Have a nice vacation!

> 
> Thanks
> 


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
  2018-08-31 20:54   ` Oscar Salvador
  (?)
@ 2018-09-01 14:03   ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-09-01 14:03 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: Kate Stewart, Michal Hocko, linux-doc, Benjamin Herrenschmidt,
	Balbir Singh, Heiko Carstens, linux-mm, Paul Mackerras,
	Rashmica Gupta, K. Y. Srinivasan, Boris Ostrovsky,
	Michael Neuling, Stephen Hemminger, Jonathan Corbet,
	Michael Ellerman, Pavel Tatashin, linux-acpi, xen-devel,
	Len Brown, Haiyang Zhang, YASUAKI ISHIMATSU, Nathan Fontenot,
	Dan Williams, Joonsoo

On 31.08.2018 22:54, Oscar Salvador wrote:
> On Tue, Aug 21, 2018 at 12:44:12PM +0200, David Hildenbrand wrote:
>> This is the same approach as in the first RFC, but this time without
>> exporting device_hotplug_lock (requested by Greg) and with some more
>> details and documentation regarding locking. Tested only on x86 so far.
> 
> Hi David,
> 
> I would like to review this but I am on vacation, so I will not be able to get to it
> soon.
> I plan to do it once I am back.

Sure, I won't be resending within next two weeks either way, as I am
also on vacation.

Have a nice vacation!

> 
> Thanks
> 


-- 

Thanks,

David / dhildenb

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
  2018-08-21 10:44   ` David Hildenbrand
@ 2018-09-03  0:36     ` Rashmica
  -1 siblings, 0 replies; 65+ messages in thread
From: Rashmica @ 2018-09-03  0:36 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: Kate Stewart, Michal Hocko, linux-doc, Benjamin Herrenschmidt,
	Balbir Singh, Heiko Carstens, Paul Mackerras, K. Y. Srinivasan,
	Thomas Gleixner, Michael Neuling, Stephen Hemminger,
	Michael Ellerman, Pavel Tatashin, linux-acpi, xen-devel,
	Len Brown, Haiyang Zhang, Dan Williams, YASUAKI ISHIMATSU,
	Boris Ostrovsky, Vlastimil Babka, Oscar Salvador, Juergen Gross,
	Math


[-- Attachment #1.1: Type: text/plain, Size: 1721 bytes --]

Hi David,


On 21/08/18 20:44, David Hildenbrand wrote:

> There seem to be some problems as result of 30467e0b3be ("mm, hotplug:
> fix concurrent memory hot-add deadlock"), which tried to fix a possible
> lock inversion reported and discussed in [1] due to the two locks
> 	a) device_lock()
> 	b) mem_hotplug_lock
>
> While add_memory() first takes b), followed by a) during
> bus_probe_device(), onlining of memory from user space first took b),
> followed by a), exposing a possible deadlock.

Do you mean "onlining of memory from user space first took a),
followed by b)"? 

> In [1], and it was decided to not make use of device_hotplug_lock, but
> rather to enforce a locking order.
>
> The problems I spotted related to this:
>
> 1. Memory block device attributes: While .state first calls
>    mem_hotplug_begin() and the calls device_online() - which takes
>    device_lock() - .online does no longer call mem_hotplug_begin(), so
>    effectively calls online_pages() without mem_hotplug_lock.
>
> 2. device_online() should be called under device_hotplug_lock, however
>    onlining memory during add_memory() does not take care of that.
>
> In addition, I think there is also something wrong about the locking in
>
> 3. arch/powerpc/platforms/powernv/memtrace.c calls offline_pages()
>    without locks. This was introduced after 30467e0b3be. And skimming over
>    the code, I assume it could need some more care in regards to locking
>    (e.g. device_online() called without device_hotplug_lock - but I'll
>    not touch that for now).

Can you mention that you fixed this in later patches?


The series looks good to me. Feel free to add my reviewed-by:

Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>


[-- Attachment #1.2: Type: text/html, Size: 2277 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
@ 2018-09-03  0:36     ` Rashmica
  0 siblings, 0 replies; 65+ messages in thread
From: Rashmica @ 2018-09-03  0:36 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rafael J. Wysocki, Len Brown, Greg Kroah-Hartman,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Martin Schwidefsky, Heiko Carstens, Boris Ostrovsky,
	Juergen Gross, Michael Neuling, Balbir Singh, Kate Stewart,
	Thomas Gleixner, Philippe Ombredanne, Andrew Morton,
	Michal Hocko, Pavel Tatashin, Vlastimil Babka, Dan Williams,
	Oscar Salvador, YASUAKI ISHIMATSU, Mathieu Malaterre

[-- Attachment #1: Type: text/plain, Size: 1721 bytes --]

Hi David,


On 21/08/18 20:44, David Hildenbrand wrote:

> There seem to be some problems as result of 30467e0b3be ("mm, hotplug:
> fix concurrent memory hot-add deadlock"), which tried to fix a possible
> lock inversion reported and discussed in [1] due to the two locks
> 	a) device_lock()
> 	b) mem_hotplug_lock
>
> While add_memory() first takes b), followed by a) during
> bus_probe_device(), onlining of memory from user space first took b),
> followed by a), exposing a possible deadlock.

Do you mean "onlining of memory from user space first took a),
followed by b)"? 

> In [1], and it was decided to not make use of device_hotplug_lock, but
> rather to enforce a locking order.
>
> The problems I spotted related to this:
>
> 1. Memory block device attributes: While .state first calls
>    mem_hotplug_begin() and the calls device_online() - which takes
>    device_lock() - .online does no longer call mem_hotplug_begin(), so
>    effectively calls online_pages() without mem_hotplug_lock.
>
> 2. device_online() should be called under device_hotplug_lock, however
>    onlining memory during add_memory() does not take care of that.
>
> In addition, I think there is also something wrong about the locking in
>
> 3. arch/powerpc/platforms/powernv/memtrace.c calls offline_pages()
>    without locks. This was introduced after 30467e0b3be. And skimming over
>    the code, I assume it could need some more care in regards to locking
>    (e.g. device_online() called without device_hotplug_lock - but I'll
>    not touch that for now).

Can you mention that you fixed this in later patches?


The series looks good to me. Feel free to add my reviewed-by:

Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>


[-- Attachment #2: Type: text/html, Size: 2277 bytes --]

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
  2018-09-03  0:36     ` Rashmica
@ 2018-09-17  7:32       ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-09-17  7:32 UTC (permalink / raw)
  To: Rashmica, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rafael J. Wysocki, Len Brown, Greg Kroah-Hartman,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Martin Schwidefsky, Heiko Carstens, Boris Ostrovsky,
	Juergen Gross, Michael Neuling, Balbir

Am 03.09.18 um 02:36 schrieb Rashmica:
> Hi David,
> 
> 
> On 21/08/18 20:44, David Hildenbrand wrote:
> 
>> There seem to be some problems as result of 30467e0b3be ("mm, hotplug:
>> fix concurrent memory hot-add deadlock"), which tried to fix a possible
>> lock inversion reported and discussed in [1] due to the two locks
>> 	a) device_lock()
>> 	b) mem_hotplug_lock
>>
>> While add_memory() first takes b), followed by a) during
>> bus_probe_device(), onlining of memory from user space first took b),
>> followed by a), exposing a possible deadlock.
> 
> Do you mean "onlining of memory from user space first took a),
> followed by b)"? 

Very right, thanks.

> 
>> In [1], and it was decided to not make use of device_hotplug_lock, but
>> rather to enforce a locking order.
>>
>> The problems I spotted related to this:
>>
>> 1. Memory block device attributes: While .state first calls
>>    mem_hotplug_begin() and the calls device_online() - which takes
>>    device_lock() - .online does no longer call mem_hotplug_begin(), so
>>    effectively calls online_pages() without mem_hotplug_lock.
>>
>> 2. device_online() should be called under device_hotplug_lock, however
>>    onlining memory during add_memory() does not take care of that.
>>
>> In addition, I think there is also something wrong about the locking in
>>
>> 3. arch/powerpc/platforms/powernv/memtrace.c calls offline_pages()
>>    without locks. This was introduced after 30467e0b3be. And skimming over
>>    the code, I assume it could need some more care in regards to locking
>>    (e.g. device_online() called without device_hotplug_lock - but I'll
>>    not touch that for now).
> 
> Can you mention that you fixed this in later patches?

Sure!

> 
> 
> The series looks good to me. Feel free to add my reviewed-by:
> 
> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
> 

Thanks, r-b only for this patch or all of the series?

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
@ 2018-09-17  7:32       ` David Hildenbrand
  0 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-09-17  7:32 UTC (permalink / raw)
  To: Rashmica, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rafael J. Wysocki, Len Brown, Greg Kroah-Hartman,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Martin Schwidefsky, Heiko Carstens, Boris Ostrovsky,
	Juergen Gross, Michael Neuling, Balbir Singh, Kate Stewart,
	Thomas Gleixner, Philippe Ombredanne, Andrew Morton,
	Michal Hocko, Pavel Tatashin, Vlastimil Babka, Dan Williams,
	Oscar Salvador, YASUAKI ISHIMATSU, Mathieu Malaterre

Am 03.09.18 um 02:36 schrieb Rashmica:
> Hi David,
> 
> 
> On 21/08/18 20:44, David Hildenbrand wrote:
> 
>> There seem to be some problems as result of 30467e0b3be ("mm, hotplug:
>> fix concurrent memory hot-add deadlock"), which tried to fix a possible
>> lock inversion reported and discussed in [1] due to the two locks
>> 	a) device_lock()
>> 	b) mem_hotplug_lock
>>
>> While add_memory() first takes b), followed by a) during
>> bus_probe_device(), onlining of memory from user space first took b),
>> followed by a), exposing a possible deadlock.
> 
> Do you mean "onlining of memory from user space first took a),
> followed by b)"? 

Very right, thanks.

> 
>> In [1], and it was decided to not make use of device_hotplug_lock, but
>> rather to enforce a locking order.
>>
>> The problems I spotted related to this:
>>
>> 1. Memory block device attributes: While .state first calls
>>    mem_hotplug_begin() and the calls device_online() - which takes
>>    device_lock() - .online does no longer call mem_hotplug_begin(), so
>>    effectively calls online_pages() without mem_hotplug_lock.
>>
>> 2. device_online() should be called under device_hotplug_lock, however
>>    onlining memory during add_memory() does not take care of that.
>>
>> In addition, I think there is also something wrong about the locking in
>>
>> 3. arch/powerpc/platforms/powernv/memtrace.c calls offline_pages()
>>    without locks. This was introduced after 30467e0b3be. And skimming over
>>    the code, I assume it could need some more care in regards to locking
>>    (e.g. device_online() called without device_hotplug_lock - but I'll
>>    not touch that for now).
> 
> Can you mention that you fixed this in later patches?

Sure!

> 
> 
> The series looks good to me. Feel free to add my reviewed-by:
> 
> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
> 

Thanks, r-b only for this patch or all of the series?

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
  2018-09-03  0:36     ` Rashmica
  (?)
@ 2018-09-17  7:32     ` David Hildenbrand
  -1 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-09-17  7:32 UTC (permalink / raw)
  To: Rashmica, linux-mm
  Cc: Kate Stewart, Michal Hocko, linux-doc, Benjamin Herrenschmidt,
	Balbir Singh, Heiko Carstens, Paul Mackerras, K. Y. Srinivasan,
	Thomas Gleixner, Michael Neuling, Stephen Hemminger,
	Michael Ellerman, Pavel Tatashin, linux-acpi, xen-devel,
	Len Brown, Haiyang Zhang, Dan Williams, YASUAKI ISHIMATSU,
	Boris Ostrovsky, Vlastimil Babka, Oscar Salvador, Juergen Gross,
	Math

Am 03.09.18 um 02:36 schrieb Rashmica:
> Hi David,
> 
> 
> On 21/08/18 20:44, David Hildenbrand wrote:
> 
>> There seem to be some problems as result of 30467e0b3be ("mm, hotplug:
>> fix concurrent memory hot-add deadlock"), which tried to fix a possible
>> lock inversion reported and discussed in [1] due to the two locks
>> 	a) device_lock()
>> 	b) mem_hotplug_lock
>>
>> While add_memory() first takes b), followed by a) during
>> bus_probe_device(), onlining of memory from user space first took b),
>> followed by a), exposing a possible deadlock.
> 
> Do you mean "onlining of memory from user space first took a),
> followed by b)"? 

Very right, thanks.

> 
>> In [1], and it was decided to not make use of device_hotplug_lock, but
>> rather to enforce a locking order.
>>
>> The problems I spotted related to this:
>>
>> 1. Memory block device attributes: While .state first calls
>>    mem_hotplug_begin() and the calls device_online() - which takes
>>    device_lock() - .online does no longer call mem_hotplug_begin(), so
>>    effectively calls online_pages() without mem_hotplug_lock.
>>
>> 2. device_online() should be called under device_hotplug_lock, however
>>    onlining memory during add_memory() does not take care of that.
>>
>> In addition, I think there is also something wrong about the locking in
>>
>> 3. arch/powerpc/platforms/powernv/memtrace.c calls offline_pages()
>>    without locks. This was introduced after 30467e0b3be. And skimming over
>>    the code, I assume it could need some more care in regards to locking
>>    (e.g. device_online() called without device_hotplug_lock - but I'll
>>    not touch that for now).
> 
> Can you mention that you fixed this in later patches?

Sure!

> 
> 
> The series looks good to me. Feel free to add my reviewed-by:
> 
> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
> 

Thanks, r-b only for this patch or all of the series?

-- 

Thanks,

David / dhildenb

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
  2018-09-17  7:32       ` David Hildenbrand
@ 2018-09-25  1:26         ` Rashmica Gupta
  -1 siblings, 0 replies; 65+ messages in thread
From: Rashmica Gupta @ 2018-09-25  1:26 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: Kate Stewart, Michal Hocko, linux-doc, Benjamin Herrenschmidt,
	Balbir Singh, Heiko Carstens, Paul Mackerras, Thomas Gleixner,
	Michael Neuling, Stephen Hemminger, Michael Ellerman,
	Pavel Tatashin, linux-acpi, xen-devel, Len Brown, Haiyang Zhang,
	Dan Williams, YASUAKI ISHIMATSU, Boris Ostrovsky,
	Vlastimil Babka, Oscar Salvador, Juergen Gross,
	Mathieu Malaterre, Greg

On Mon, 2018-09-17 at 09:32 +0200, David Hildenbrand wrote:
> Am 03.09.18 um 02:36 schrieb Rashmica:
> > Hi David,
> > 
> > 
> > On 21/08/18 20:44, David Hildenbrand wrote:
> > 
> > > There seem to be some problems as result of 30467e0b3be ("mm,
> > > hotplug:
> > > fix concurrent memory hot-add deadlock"), which tried to fix a
> > > possible
> > > lock inversion reported and discussed in [1] due to the two locks
> > > 	a) device_lock()
> > > 	b) mem_hotplug_lock
> > > 
> > > While add_memory() first takes b), followed by a) during
> > > bus_probe_device(), onlining of memory from user space first took
> > > b),
> > > followed by a), exposing a possible deadlock.
> > 
> > Do you mean "onlining of memory from user space first took a),
> > followed by b)"? 
> 
> Very right, thanks.
> 
> > 
> > > In [1], and it was decided to not make use of
> > > device_hotplug_lock, but
> > > rather to enforce a locking order.
> > > 
> > > The problems I spotted related to this:
> > > 
> > > 1. Memory block device attributes: While .state first calls
> > >    mem_hotplug_begin() and the calls device_online() - which
> > > takes
> > >    device_lock() - .online does no longer call
> > > mem_hotplug_begin(), so
> > >    effectively calls online_pages() without mem_hotplug_lock.
> > > 
> > > 2. device_online() should be called under device_hotplug_lock,
> > > however
> > >    onlining memory during add_memory() does not take care of
> > > that.
> > > 
> > > In addition, I think there is also something wrong about the
> > > locking in
> > > 
> > > 3. arch/powerpc/platforms/powernv/memtrace.c calls
> > > offline_pages()
> > >    without locks. This was introduced after 30467e0b3be. And
> > > skimming over
> > >    the code, I assume it could need some more care in regards to
> > > locking
> > >    (e.g. device_online() called without device_hotplug_lock - but
> > > I'll
> > >    not touch that for now).
> > 
> > Can you mention that you fixed this in later patches?
> 
> Sure!
> 
> > 
> > 
> > The series looks good to me. Feel free to add my reviewed-by:
> > 
> > Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
> > 
> 
> Thanks, r-b only for this patch or all of the series?

Sorry, I somehow missed this. To all of the series.
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
@ 2018-09-25  1:26         ` Rashmica Gupta
  0 siblings, 0 replies; 65+ messages in thread
From: Rashmica Gupta @ 2018-09-25  1:26 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, linux-doc, linuxppc-dev, linux-acpi, xen-devel,
	devel, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rafael J. Wysocki, Len Brown, Greg Kroah-Hartman,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Martin Schwidefsky, Heiko Carstens, Boris Ostrovsky,
	Juergen Gross, Michael Neuling, Balbir Singh, Kate Stewart,
	Thomas Gleixner, Philippe Ombredanne, Andrew Morton,
	Michal Hocko, Pavel Tatashin, Vlastimil Babka, Dan Williams,
	Oscar Salvador, YASUAKI ISHIMATSU, Mathieu Malaterre

On Mon, 2018-09-17 at 09:32 +0200, David Hildenbrand wrote:
> Am 03.09.18 um 02:36 schrieb Rashmica:
> > Hi David,
> > 
> > 
> > On 21/08/18 20:44, David Hildenbrand wrote:
> > 
> > > There seem to be some problems as result of 30467e0b3be ("mm,
> > > hotplug:
> > > fix concurrent memory hot-add deadlock"), which tried to fix a
> > > possible
> > > lock inversion reported and discussed in [1] due to the two locks
> > > 	a) device_lock()
> > > 	b) mem_hotplug_lock
> > > 
> > > While add_memory() first takes b), followed by a) during
> > > bus_probe_device(), onlining of memory from user space first took
> > > b),
> > > followed by a), exposing a possible deadlock.
> > 
> > Do you mean "onlining of memory from user space first took a),
> > followed by b)"? 
> 
> Very right, thanks.
> 
> > 
> > > In [1], and it was decided to not make use of
> > > device_hotplug_lock, but
> > > rather to enforce a locking order.
> > > 
> > > The problems I spotted related to this:
> > > 
> > > 1. Memory block device attributes: While .state first calls
> > >    mem_hotplug_begin() and the calls device_online() - which
> > > takes
> > >    device_lock() - .online does no longer call
> > > mem_hotplug_begin(), so
> > >    effectively calls online_pages() without mem_hotplug_lock.
> > > 
> > > 2. device_online() should be called under device_hotplug_lock,
> > > however
> > >    onlining memory during add_memory() does not take care of
> > > that.
> > > 
> > > In addition, I think there is also something wrong about the
> > > locking in
> > > 
> > > 3. arch/powerpc/platforms/powernv/memtrace.c calls
> > > offline_pages()
> > >    without locks. This was introduced after 30467e0b3be. And
> > > skimming over
> > >    the code, I assume it could need some more care in regards to
> > > locking
> > >    (e.g. device_online() called without device_hotplug_lock - but
> > > I'll
> > >    not touch that for now).
> > 
> > Can you mention that you fixed this in later patches?
> 
> Sure!
> 
> > 
> > 
> > The series looks good to me. Feel free to add my reviewed-by:
> > 
> > Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
> > 
> 
> Thanks, r-b only for this patch or all of the series?

Sorry, I somehow missed this. To all of the series.
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
  2018-09-17  7:32       ` David Hildenbrand
  (?)
  (?)
@ 2018-09-25  1:26       ` Rashmica Gupta
  -1 siblings, 0 replies; 65+ messages in thread
From: Rashmica Gupta @ 2018-09-25  1:26 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: Kate Stewart, Michal Hocko, linux-doc, Benjamin Herrenschmidt,
	Balbir Singh, Heiko Carstens, Paul Mackerras, K. Y. Srinivasan,
	Thomas Gleixner, Michael Neuling, Stephen Hemminger,
	Michael Ellerman, Pavel Tatashin, linux-acpi, xen-devel,
	Len Brown, Haiyang Zhang, Dan Williams, YASUAKI ISHIMATSU,
	Boris Ostrovsky, Vlastimil Babka, Oscar Salvador, Juergen Gross,
	Math

On Mon, 2018-09-17 at 09:32 +0200, David Hildenbrand wrote:
> Am 03.09.18 um 02:36 schrieb Rashmica:
> > Hi David,
> > 
> > 
> > On 21/08/18 20:44, David Hildenbrand wrote:
> > 
> > > There seem to be some problems as result of 30467e0b3be ("mm,
> > > hotplug:
> > > fix concurrent memory hot-add deadlock"), which tried to fix a
> > > possible
> > > lock inversion reported and discussed in [1] due to the two locks
> > > 	a) device_lock()
> > > 	b) mem_hotplug_lock
> > > 
> > > While add_memory() first takes b), followed by a) during
> > > bus_probe_device(), onlining of memory from user space first took
> > > b),
> > > followed by a), exposing a possible deadlock.
> > 
> > Do you mean "onlining of memory from user space first took a),
> > followed by b)"? 
> 
> Very right, thanks.
> 
> > 
> > > In [1], and it was decided to not make use of
> > > device_hotplug_lock, but
> > > rather to enforce a locking order.
> > > 
> > > The problems I spotted related to this:
> > > 
> > > 1. Memory block device attributes: While .state first calls
> > >    mem_hotplug_begin() and the calls device_online() - which
> > > takes
> > >    device_lock() - .online does no longer call
> > > mem_hotplug_begin(), so
> > >    effectively calls online_pages() without mem_hotplug_lock.
> > > 
> > > 2. device_online() should be called under device_hotplug_lock,
> > > however
> > >    onlining memory during add_memory() does not take care of
> > > that.
> > > 
> > > In addition, I think there is also something wrong about the
> > > locking in
> > > 
> > > 3. arch/powerpc/platforms/powernv/memtrace.c calls
> > > offline_pages()
> > >    without locks. This was introduced after 30467e0b3be. And
> > > skimming over
> > >    the code, I assume it could need some more care in regards to
> > > locking
> > >    (e.g. device_online() called without device_hotplug_lock - but
> > > I'll
> > >    not touch that for now).
> > 
> > Can you mention that you fixed this in later patches?
> 
> Sure!
> 
> > 
> > 
> > The series looks good to me. Feel free to add my reviewed-by:
> > 
> > Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
> > 
> 
> Thanks, r-b only for this patch or all of the series?

Sorry, I somehow missed this. To all of the series.
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
@ 2018-08-21 10:44 David Hildenbrand
  0 siblings, 0 replies; 65+ messages in thread
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: Kate Stewart, Michal Hocko, linux-doc, Benjamin Herrenschmidt,
	Balbir Singh, Heiko Carstens, Paul Mackerras, Rashmica Gupta,
	K. Y. Srinivasan, Boris Ostrovsky, Michael Neuling,
	Stephen Hemminger, Jonathan Corbet, Michael Ellerman,
	David Hildenbrand, Pavel Tatashin, linux-acpi, xen-devel,
	Len Brown, Haiyang Zhang, YASUAKI ISHIMATSU, Nathan Fontenot,
	Dan Williams

This is the same approach as in the first RFC, but this time without
exporting device_hotplug_lock (requested by Greg) and with some more
details and documentation regarding locking. Tested only on x86 so far.

--------------------------------------------------------------------------

Reading through the code and studying how mem_hotplug_lock is to be used,
I noticed that there are two places where we can end up calling
device_online()/device_offline() - online_pages()/offline_pages() without
the mem_hotplug_lock. And there are other places where we call
device_online()/device_offline() without the device_hotplug_lock.

While e.g.
	echo "online" > /sys/devices/system/memory/memory9/state
is fine, e.g.
	echo 1 > /sys/devices/system/memory/memory9/online
Will not take the mem_hotplug_lock. However the device_lock() and
device_hotplug_lock.

E.g. via memory_probe_store(), we can end up calling
add_memory()->online_pages() without the device_hotplug_lock. So we can
have concurrent callers in online_pages(). We e.g. touch in online_pages()
basically unprotected zone->present_pages then.

Looks like there is a longer history to that (see Patch #2 for details),
and fixing it to work the way it was intended is not really possible. We
would e.g. have to take the mem_hotplug_lock in device/base/core.c, which
sounds wrong.

Summary: We had a lock inversion on mem_hotplug_lock and device_lock().
More details can be found in patch 3 and patch 6.

I propose the general rules (documentation added in patch 6):

1. add_memory/add_memory_resource() must only be called with
   device_hotplug_lock.
2. remove_memory() must only be called with device_hotplug_lock. This is
   already documented and holds for all callers.
3. device_online()/device_offline() must only be called with
   device_hotplug_lock. This is already documented and true for now in core
   code. Other callers (related to memory hotplug) have to be fixed up.
4. mem_hotplug_lock is taken inside of add_memory/remove_memory/
   online_pages/offline_pages.

To me, this looks way cleaner than what we have right now (and easier to
verify). And looking at the documentation of remove_memory, using
lock_device_hotplug also for add_memory() feels natural.


RFC -> RFCv2:
- Don't export device_hotplug_lock, provide proper remove_memory/add_memory
  wrappers.
- Split up the patches a bit.
- Try to improve powernv memtrace locking
- Add some documentation for locking that matches my knowledge

David Hildenbrand (6):
  mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
  mm/memory_hotplug: make add_memory() take the device_hotplug_lock
  mm/memory_hotplug: fix online/offline_pages called w.o.
    mem_hotplug_lock
  powerpc/powernv: hold device_hotplug_lock when calling device_online()
  powerpc/powernv: hold device_hotplug_lock in memtrace_offline_pages()
  memory-hotplug.txt: Add some details about locking internals

 Documentation/memory-hotplug.txt              | 39 +++++++++++-
 arch/powerpc/platforms/powernv/memtrace.c     | 14 +++--
 .../platforms/pseries/hotplug-memory.c        |  8 +--
 drivers/acpi/acpi_memhotplug.c                |  4 +-
 drivers/base/memory.c                         | 22 +++----
 drivers/xen/balloon.c                         |  3 +
 include/linux/memory_hotplug.h                |  4 +-
 mm/memory_hotplug.c                           | 59 +++++++++++++++----
 8 files changed, 115 insertions(+), 38 deletions(-)

-- 
2.17.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

end of thread, other threads:[~2018-09-25  1:27 UTC | newest]

Thread overview: 65+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-21 10:44 [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock David Hildenbrand
2018-08-21 10:44 ` David Hildenbrand
2018-08-21 10:44 ` [PATCH RFCv2 1/6] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock David Hildenbrand
2018-08-21 10:44 ` David Hildenbrand
2018-08-21 10:44   ` David Hildenbrand
2018-08-30 19:35   ` Pasha Tatashin
2018-08-30 19:35   ` Pasha Tatashin
2018-08-30 19:35     ` Pasha Tatashin
2018-08-30 19:35     ` Pasha Tatashin
2018-08-31 13:12     ` David Hildenbrand
2018-08-31 13:12       ` David Hildenbrand
2018-08-31 13:12     ` David Hildenbrand
2018-08-21 10:44 ` [PATCH RFCv2 2/6] mm/memory_hotplug: make add_memory() " David Hildenbrand
2018-08-21 10:44   ` David Hildenbrand
2018-08-30 19:36   ` Pasha Tatashin
2018-08-30 19:36     ` Pasha Tatashin
2018-08-30 19:36     ` Pasha Tatashin
2018-08-30 19:36   ` Pasha Tatashin
2018-08-21 10:44 ` David Hildenbrand
2018-08-21 10:44 ` [PATCH RFCv2 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock David Hildenbrand
2018-08-21 10:44   ` David Hildenbrand
2018-08-30 19:37   ` Pasha Tatashin
2018-08-30 19:37     ` Pasha Tatashin
2018-08-30 19:37     ` Pasha Tatashin
2018-08-30 19:37   ` Pasha Tatashin
2018-09-03  0:36   ` Rashmica
2018-09-03  0:36     ` Rashmica
2018-09-17  7:32     ` David Hildenbrand
2018-09-17  7:32     ` David Hildenbrand
2018-09-17  7:32       ` David Hildenbrand
2018-09-25  1:26       ` Rashmica Gupta
2018-09-25  1:26         ` Rashmica Gupta
2018-09-25  1:26       ` Rashmica Gupta
2018-08-21 10:44 ` David Hildenbrand
2018-08-21 10:44 ` [PATCH RFCv2 4/6] powerpc/powernv: hold device_hotplug_lock when calling device_online() David Hildenbrand
2018-08-21 10:44 ` David Hildenbrand
2018-08-21 10:44   ` David Hildenbrand
2018-08-30 19:38   ` Pasha Tatashin
2018-08-30 19:38     ` Pasha Tatashin
2018-08-30 19:38   ` Pasha Tatashin
2018-08-21 10:44 ` [PATCH RFCv2 5/6] powerpc/powernv: hold device_hotplug_lock in memtrace_offline_pages() David Hildenbrand
2018-08-21 10:44 ` David Hildenbrand
2018-08-30 19:38   ` Pasha Tatashin
2018-08-30 19:38     ` Pasha Tatashin
2018-08-30 19:38   ` Pasha Tatashin
2018-08-21 10:44 ` [PATCH RFCv2 6/6] memory-hotplug.txt: Add some details about locking internals David Hildenbrand
2018-08-30 19:38   ` Pasha Tatashin
2018-08-30 19:38   ` Pasha Tatashin
2018-08-30 19:38     ` Pasha Tatashin
2018-08-30 19:38     ` Pasha Tatashin
2018-08-21 10:44 ` David Hildenbrand
2018-08-30 12:31 ` [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock David Hildenbrand
2018-08-30 12:31   ` David Hildenbrand
2018-08-30 15:54   ` Pasha Tatashin
2018-08-30 15:54   ` Pasha Tatashin
2018-08-30 15:54     ` Pasha Tatashin
2018-08-30 15:54     ` Pasha Tatashin
2018-08-30 12:31 ` David Hildenbrand
2018-08-31 20:54 ` Oscar Salvador
2018-08-31 20:54 ` Oscar Salvador
2018-08-31 20:54   ` Oscar Salvador
2018-09-01 14:03   ` David Hildenbrand
2018-09-01 14:03   ` David Hildenbrand
2018-09-01 14:03     ` David Hildenbrand
  -- strict thread matches above, loose matches on Subject: below --
2018-08-21 10:44 David Hildenbrand

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.