* [PATCH RFCv2 0/6] mm: online/offline_pages called w.o. mem_hotplug_lock
From: David Hildenbrand @ 2018-08-21 10:44 UTC (permalink / raw)
  To: linux-mm
  Cc: Kate Stewart, Michal Hocko, linux-doc, Benjamin Herrenschmidt,
	Balbir Singh, Heiko Carstens, Paul Mackerras, Rashmica Gupta,
	Boris Ostrovsky, Michael Neuling, Stephen Hemminger,
	Jonathan Corbet, Michael Ellerman, David Hildenbrand,
	Pavel Tatashin, linux-acpi, xen-devel, Len Brown, Haiyang Zhang,
	YASUAKI ISHIMATSU, Nathan Fontenot, Dan Williams, Joonsoo Kim

This is the same approach as in the first RFC, but this time without
exporting device_hotplug_lock (requested by Greg) and with some more
details and documentation regarding locking. Tested only on x86 so far.

--------------------------------------------------------------------------

Reading through the code and studying how mem_hotplug_lock is to be used,
I noticed that there are two places where we can end up calling
device_online()/device_offline(), and thereby online_pages()/
offline_pages(), without the mem_hotplug_lock. And there are other places
where we call device_online()/device_offline() without the
device_hotplug_lock.

While e.g.
	echo "online" > /sys/devices/system/memory/memory9/state
is fine, e.g.
	echo 1 > /sys/devices/system/memory/memory9/online
will not take the mem_hotplug_lock. It only takes the device_lock() and
the device_hotplug_lock.

E.g. via memory_probe_store(), we can end up calling
add_memory()->online_pages() without the device_hotplug_lock. So we can
have concurrent callers in online_pages(), which then e.g. modify
zone->present_pages basically unprotected.
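
To illustrate why the concurrent callers matter, here is a minimal
userspace sketch (not kernel code). It assumes a pthread mutex as a
stand-in for mem_hotplug_lock and a toy struct for the zone; all model_*
names and the MODEL_* constants are made up for this example. Two callers
update present_pages, and the total is only guaranteed to be exact because
each read-modify-write happens under the lock.

```c
#include <pthread.h>
#include <stddef.h>

/* Userspace stand-ins; none of the model_* names exist in the kernel. */
struct model_zone {
	unsigned long present_pages;
};

#define MODEL_BLOCKS      1000UL   /* blocks onlined by each caller */
#define MODEL_BLOCK_PAGES 32768UL  /* pages per block, arbitrary value */

static pthread_mutex_t model_mem_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static struct model_zone zone;

/* Model of online_pages(): the counter update is done under the lock. */
static void model_online_pages(unsigned long nr_pages)
{
	pthread_mutex_lock(&model_mem_hotplug_lock);
	zone.present_pages += nr_pages;
	pthread_mutex_unlock(&model_mem_hotplug_lock);
}

/* One concurrent caller, e.g. a sysfs write or a probe request. */
static void *model_caller(void *unused)
{
	(void)unused;
	for (unsigned long i = 0; i < MODEL_BLOCKS; i++)
		model_online_pages(MODEL_BLOCK_PAGES);
	return NULL;
}

/* Run two concurrent callers and return the final page count. */
static unsigned long model_run_two_callers(void)
{
	pthread_t a, b;

	zone.present_pages = 0;
	pthread_create(&a, NULL, model_caller, NULL);
	pthread_create(&b, NULL, model_caller, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return zone.present_pages;
}
```

Calling model_run_two_callers() from a small main() (built with -pthread)
returns 2 * MODEL_BLOCKS * MODEL_BLOCK_PAGES. Dropping the lock/unlock pair
in model_online_pages() turns the update into a data race, which is the
situation described above.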

Looks like there is a longer history to that (see Patch #2 for details),
and fixing it to work the way it was intended is not really possible. We
would e.g. have to take the mem_hotplug_lock in drivers/base/core.c, which
sounds wrong.

Summary: We had a lock inversion on mem_hotplug_lock and device_lock().
More details can be found in patch 3 and patch 6.

I propose the general rules (documentation added in patch 6):

1. add_memory/add_memory_resource() must only be called with
   device_hotplug_lock.
2. remove_memory() must only be called with device_hotplug_lock. This is
   already documented and holds for all callers.
3. device_online()/device_offline() must only be called with
   device_hotplug_lock. This is already documented and true for now in core
   code. Other callers (related to memory hotplug) have to be fixed up.
4. mem_hotplug_lock is taken inside of add_memory/remove_memory/
   online_pages/offline_pages.

To me, this looks way cleaner than what we have right now (and easier to
verify). And looking at the documentation of remove_memory, using
lock_device_hotplug also for add_memory() feels natural.
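
The nesting implied by rules 1 and 4 can be sketched in userspace as
follows. This is only a model under stated assumptions: pthread mutexes
stand in for device_hotplug_lock and mem_hotplug_lock, the model_* names
are hypothetical, and the outer_held flag is simple bookkeeping that is
only meaningful in a single-threaded demo. It is not the code from the
patches.

```c
#include <pthread.h>
#include <stdbool.h>

/* Stand-ins for the two kernel locks; the names are hypothetical. */
static pthread_mutex_t model_device_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t model_mem_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;

/* Bookkeeping for the demo only: "somebody holds the outer lock". */
static bool outer_held;
static unsigned long model_nr_blocks;

static void model_lock_device_hotplug(void)
{
	pthread_mutex_lock(&model_device_hotplug_lock);
	outer_held = true;
}

static void model_unlock_device_hotplug(void)
{
	outer_held = false;
	pthread_mutex_unlock(&model_device_hotplug_lock);
}

/*
 * Rule 1: must be called with the outer lock held.
 * Rule 4: the inner lock is taken in here, never by the caller.
 */
static int model_add_memory_locked(void)
{
	if (!outer_held)
		return -1;	/* caller violated rule 1 */

	pthread_mutex_lock(&model_mem_hotplug_lock);
	model_nr_blocks++;
	pthread_mutex_unlock(&model_mem_hotplug_lock);
	return 0;
}

/* Wrapper for callers that do not hold the outer lock yet. */
static int model_add_memory(void)
{
	int ret;

	model_lock_device_hotplug();
	ret = model_add_memory_locked();
	model_unlock_device_hotplug();
	return ret;
}
```

With this shape the lock order is always outer lock first, inner lock
second, so there is a single order to verify. A caller that already holds
the outer lock uses the _locked variant, everybody else uses the wrapper.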


RFC -> RFCv2:
- Don't export device_hotplug_lock, provide proper remove_memory/add_memory
  wrappers.
- Split up the patches a bit.
- Try to improve powernv memtrace locking.
- Add some documentation for locking that matches my knowledge.

David Hildenbrand (6):
  mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
  mm/memory_hotplug: make add_memory() take the device_hotplug_lock
  mm/memory_hotplug: fix online/offline_pages called w.o.
    mem_hotplug_lock
  powerpc/powernv: hold device_hotplug_lock when calling device_online()
  powerpc/powernv: hold device_hotplug_lock in memtrace_offline_pages()
  memory-hotplug.txt: Add some details about locking internals

 Documentation/memory-hotplug.txt              | 39 +++++++++++-
 arch/powerpc/platforms/powernv/memtrace.c     | 14 +++--
 .../platforms/pseries/hotplug-memory.c        |  8 +--
 drivers/acpi/acpi_memhotplug.c                |  4 +-
 drivers/base/memory.c                         | 22 +++----
 drivers/xen/balloon.c                         |  3 +
 include/linux/memory_hotplug.h                |  4 +-
 mm/memory_hotplug.c                           | 59 +++++++++++++++----
 8 files changed, 115 insertions(+), 38 deletions(-)

-- 
2.17.1
