All of lore.kernel.org
* [patch V2 0/2] mm/memory_hotplug: Cure potential deadlocks vs. cpu hotplug lock
@ 2017-07-04  9:32 ` Thomas Gleixner
  0 siblings, 0 replies; 34+ messages in thread
From: Thomas Gleixner @ 2017-07-04  9:32 UTC (permalink / raw)
  To: LKML
  Cc: linux-mm, Andrey Ryabinin, Michal Hocko, Andrew Morton,
	Vlastimil Babka, Vladimir Davydov, Peter Zijlstra

Andrey reported a potential deadlock with the memory hotplug lock and the
cpu hotplug lock.

The following series addresses this by reworking the memory hotplug locking
and fixing up the potential deadlock scenarios.

Applies against Linus' head; all preliminaries are already merged there.

Thanks,

	tglx
---
 include/linux/swap.h |    1 
 mm/memory_hotplug.c  |   89 ++++++++-------------------------------------------
 mm/page_alloc.c      |    2 -
 mm/swap.c            |   11 ++++--
 4 files changed, 25 insertions(+), 78 deletions(-)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [patch V2 1/2] mm: swap: Provide lru_add_drain_all_cpuslocked()
  2017-07-04  9:32 ` Thomas Gleixner
@ 2017-07-04  9:32   ` Thomas Gleixner
  -1 siblings, 0 replies; 34+ messages in thread
From: Thomas Gleixner @ 2017-07-04  9:32 UTC (permalink / raw)
  To: LKML
  Cc: linux-mm, Andrey Ryabinin, Michal Hocko, Andrew Morton,
	Vlastimil Babka, Vladimir Davydov, Peter Zijlstra

[-- Attachment #1: mm--swap--Provide-lru_add_drain_cpuslocked--.patch --]
[-- Type: text/plain, Size: 2274 bytes --]

The rework of the cpu hotplug locking unearthed potential deadlocks with
the memory hotplug locking code.

The solution for these is to rework the memory hotplug locking code as well
and take the cpu hotplug lock before the memory hotplug lock in
mem_hotplug_begin(), but this will cause a recursive locking of the cpu
hotplug lock when the memory hotplug code calls lru_add_drain_all().

Split out the inner workings of lru_add_drain_all() into
lru_add_drain_all_cpuslocked() so this function can be invoked from the
memory hotplug code with the cpu hotplug lock held.
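The split described above can be sketched in plain userspace C. This is a hypothetical analogue, not kernel API: a pthread mutex stands in for the non-recursive cpu hotplug lock, and a counter stands in for the per-cpu draining work; the names mirror the kernel ones for readability only.

```c
#include <assert.h>
#include <pthread.h>

/* Stand-in for the (non-recursive) cpu hotplug lock */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static int drain_count;

/* _cpuslocked variant: the caller must already hold hotplug_lock */
static void drain_all_cpuslocked(void)
{
	drain_count++;		/* stands in for the real draining work */
}

/* public variant: takes the lock itself, then delegates */
static void drain_all(void)
{
	pthread_mutex_lock(&hotplug_lock);
	drain_all_cpuslocked();
	pthread_mutex_unlock(&hotplug_lock);
}

/* a caller that, like mem_hotplug_begin(), already holds the lock */
static void hotplug_path(void)
{
	pthread_mutex_lock(&hotplug_lock);
	drain_all_cpuslocked();	/* calling drain_all() here would deadlock */
	pthread_mutex_unlock(&hotplug_lock);
}
```

The pattern is the same as in the patch: one function owns the work, the public wrapper only adds the lock acquisition, and lock-holding callers use the `_cpuslocked()` form directly.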

Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
---
 include/linux/swap.h |    1 +
 mm/swap.c            |   11 ++++++++---
 2 files changed, 9 insertions(+), 3 deletions(-)

--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -277,6 +277,7 @@ extern void mark_page_accessed(struct pa
 extern void lru_add_drain(void);
 extern void lru_add_drain_cpu(int cpu);
 extern void lru_add_drain_all(void);
+extern void lru_add_drain_all_cpuslocked(void);
 extern void rotate_reclaimable_page(struct page *page);
 extern void deactivate_file_page(struct page *page);
 extern void mark_page_lazyfree(struct page *page);
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -687,7 +687,7 @@ static void lru_add_drain_per_cpu(struct
 
 static DEFINE_PER_CPU(struct work_struct, lru_add_drain_work);
 
-void lru_add_drain_all(void)
+void lru_add_drain_all_cpuslocked(void)
 {
 	static DEFINE_MUTEX(lock);
 	static struct cpumask has_work;
@@ -701,7 +701,6 @@ void lru_add_drain_all(void)
 		return;
 
 	mutex_lock(&lock);
-	get_online_cpus();
 	cpumask_clear(&has_work);
 
 	for_each_online_cpu(cpu) {
@@ -721,10 +720,16 @@ void lru_add_drain_all(void)
 	for_each_cpu(cpu, &has_work)
 		flush_work(&per_cpu(lru_add_drain_work, cpu));
 
-	put_online_cpus();
 	mutex_unlock(&lock);
 }
 
+void lru_add_drain_all(void)
+{
+	get_online_cpus();
+	lru_add_drain_all_cpuslocked();
+	put_online_cpus();
+}
+
 /**
  * release_pages - batched put_page()
  * @pages: array of pages to release


* [patch V2 2/2] mm/memory-hotplug: Switch locking to a percpu rwsem
  2017-07-04  9:32 ` Thomas Gleixner
@ 2017-07-04  9:32   ` Thomas Gleixner
  -1 siblings, 0 replies; 34+ messages in thread
From: Thomas Gleixner @ 2017-07-04  9:32 UTC (permalink / raw)
  To: LKML
  Cc: linux-mm, Andrey Ryabinin, Michal Hocko, Andrew Morton,
	Vlastimil Babka, Vladimir Davydov, Peter Zijlstra

[-- Attachment #1: mmmemory-hotplug_Switch_locking_to_a_percpu_rwsem.patch --]
[-- Type: text/plain, Size: 5444 bytes --]

Andrey reported a potential deadlock with the memory hotplug lock and the
cpu hotplug lock.

The reason is that memory hotplug takes the memory hotplug lock and then
calls stop_machine(), which calls get_online_cpus(). That is the reverse of
the get_online_cpus(); get_online_mems(); lock order in mm/slab_common.c.
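Spelled out as the two call chains (a sketch of the inversion lockdep can now see):

```
memory hotplug path                     slab path (mm/slab_common.c)
-------------------                     ----------------------------
mem_hotplug_begin()
  -> mem hotplug lock         (A)       get_online_cpus()         (B)
stop_machine()                          get_online_mems()         (A)
  -> get_online_cpus()        (B)
```

One path takes A then B, the other takes B then A: a classic ABBA deadlock once both run concurrently.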

The problem has been there forever. The reason it was never reported is
that the cpu hotplug locking used a homebrewed recursive reader-writer
semaphore construct which, due to the recursion, evaded full lockdep
coverage. The memory hotplug code copied that construct verbatim and
therefore has the same issues.

Three steps to fix this:

1) Convert the memory hotplug locking to a per-cpu rwsem so that potential
   issues get reported properly by lockdep.

2) Lock the online cpus in mem_hotplug_begin() before taking the memory
   hotplug rwsem, and use stop_machine_cpuslocked() in the page_alloc code
   to avoid recursive locking.

3) The cpu hotplug locking in #2 causes a recursive locking of the cpu
   hotplug lock via __offline_pages() -> lru_add_drain_all(). Solve this by
   invoking lru_add_drain_all_cpuslocked() instead.

Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
---
 mm/memory_hotplug.c |   89 ++++++++--------------------------------------------
 mm/page_alloc.c     |    2 -
 2 files changed, 16 insertions(+), 75 deletions(-)

--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -52,32 +52,17 @@ static void generic_online_page(struct p
 static online_page_callback_t online_page_callback = generic_online_page;
 static DEFINE_MUTEX(online_page_callback_lock);
 
-/* The same as the cpu_hotplug lock, but for memory hotplug. */
-static struct {
-	struct task_struct *active_writer;
-	struct mutex lock; /* Synchronizes accesses to refcount, */
-	/*
-	 * Also blocks the new readers during
-	 * an ongoing mem hotplug operation.
-	 */
-	int refcount;
+DEFINE_STATIC_PERCPU_RWSEM(mem_hotplug_lock);
 
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	struct lockdep_map dep_map;
-#endif
-} mem_hotplug = {
-	.active_writer = NULL,
-	.lock = __MUTEX_INITIALIZER(mem_hotplug.lock),
-	.refcount = 0,
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	.dep_map = {.name = "mem_hotplug.lock" },
-#endif
-};
+void get_online_mems(void)
+{
+	percpu_down_read(&mem_hotplug_lock);
+}
 
-/* Lockdep annotations for get/put_online_mems() and mem_hotplug_begin/end() */
-#define memhp_lock_acquire_read() lock_map_acquire_read(&mem_hotplug.dep_map)
-#define memhp_lock_acquire()      lock_map_acquire(&mem_hotplug.dep_map)
-#define memhp_lock_release()      lock_map_release(&mem_hotplug.dep_map)
+void put_online_mems(void)
+{
+	percpu_up_read(&mem_hotplug_lock);
+}
 
 #ifndef CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE
 bool memhp_auto_online;
@@ -97,60 +82,16 @@ static int __init setup_memhp_default_st
 }
 __setup("memhp_default_state=", setup_memhp_default_state);
 
-void get_online_mems(void)
-{
-	might_sleep();
-	if (mem_hotplug.active_writer == current)
-		return;
-	memhp_lock_acquire_read();
-	mutex_lock(&mem_hotplug.lock);
-	mem_hotplug.refcount++;
-	mutex_unlock(&mem_hotplug.lock);
-
-}
-
-void put_online_mems(void)
-{
-	if (mem_hotplug.active_writer == current)
-		return;
-	mutex_lock(&mem_hotplug.lock);
-
-	if (WARN_ON(!mem_hotplug.refcount))
-		mem_hotplug.refcount++; /* try to fix things up */
-
-	if (!--mem_hotplug.refcount && unlikely(mem_hotplug.active_writer))
-		wake_up_process(mem_hotplug.active_writer);
-	mutex_unlock(&mem_hotplug.lock);
-	memhp_lock_release();
-
-}
-
-/* Serializes write accesses to mem_hotplug.active_writer. */
-static DEFINE_MUTEX(memory_add_remove_lock);
-
 void mem_hotplug_begin(void)
 {
-	mutex_lock(&memory_add_remove_lock);
-
-	mem_hotplug.active_writer = current;
-
-	memhp_lock_acquire();
-	for (;;) {
-		mutex_lock(&mem_hotplug.lock);
-		if (likely(!mem_hotplug.refcount))
-			break;
-		__set_current_state(TASK_UNINTERRUPTIBLE);
-		mutex_unlock(&mem_hotplug.lock);
-		schedule();
-	}
+	cpus_read_lock();
+	percpu_down_write(&mem_hotplug_lock);
 }
 
 void mem_hotplug_done(void)
 {
-	mem_hotplug.active_writer = NULL;
-	mutex_unlock(&mem_hotplug.lock);
-	memhp_lock_release();
-	mutex_unlock(&memory_add_remove_lock);
+	percpu_up_write(&mem_hotplug_lock);
+	cpus_read_unlock();
 }
 
 /* add this memory to iomem resource */
@@ -1919,7 +1860,7 @@ static int __ref __offline_pages(unsigne
 		goto failed_removal;
 	ret = 0;
 	if (drain) {
-		lru_add_drain_all();
+		lru_add_drain_all_cpuslocked();
 		cond_resched();
 		drain_all_pages(zone);
 	}
@@ -1940,7 +1881,7 @@ static int __ref __offline_pages(unsigne
 		}
 	}
 	/* drain all zone's lru pagevec, this is asynchronous... */
-	lru_add_drain_all();
+	lru_add_drain_all_cpuslocked();
 	yield();
 	/* drain pcp pages, this is synchronous. */
 	drain_all_pages(zone);
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5216,7 +5216,7 @@ void __ref build_all_zonelists(pg_data_t
 #endif
 		/* we have to stop all cpus to guarantee there is no user
 		   of zonelist */
-		stop_machine(__build_all_zonelists, pgdat, NULL);
+		stop_machine_cpuslocked(__build_all_zonelists, pgdat, NULL);
 		/* cpuset refresh routine should be here */
 	}
 	vm_total_pages = nr_free_pagecache_pages();


* Re: [patch V2 1/2] mm: swap: Provide lru_add_drain_all_cpuslocked()
  2017-07-04  9:32   ` Thomas Gleixner
@ 2017-07-04 10:58     ` Michal Hocko
  -1 siblings, 0 replies; 34+ messages in thread
From: Michal Hocko @ 2017-07-04 10:58 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, linux-mm, Andrey Ryabinin, Andrew Morton, Vlastimil Babka,
	Vladimir Davydov, Peter Zijlstra

On Tue 04-07-17 11:32:33, Thomas Gleixner wrote:
> The rework of the cpu hotplug locking unearthed potential deadlocks with
> the memory hotplug locking code.
> 
> The solution for these is to rework the memory hotplug locking code as well
> and take the cpu hotplug lock before the memory hotplug lock in
> mem_hotplug_begin(), but this will cause a recursive locking of the cpu
> hotplug lock when the memory hotplug code calls lru_add_drain_all().
> 
> Split out the inner workings of lru_add_drain_all() into
> lru_add_drain_all_cpuslocked() so this function can be invoked from the
> memory hotplug code with the cpu hotplug lock held.

You have added the callers in the later patch in the series, AFAICS, which
is OK, but I think it would be better to have them in this patch already.
Nothing earth-shattering (maybe a rebase artifact).

> Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: linux-mm@kvack.org
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  include/linux/swap.h |    1 +
>  mm/swap.c            |   11 ++++++++---
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -277,6 +277,7 @@ extern void mark_page_accessed(struct pa
>  extern void lru_add_drain(void);
>  extern void lru_add_drain_cpu(int cpu);
>  extern void lru_add_drain_all(void);
> +extern void lru_add_drain_all_cpuslocked(void);
>  extern void rotate_reclaimable_page(struct page *page);
>  extern void deactivate_file_page(struct page *page);
>  extern void mark_page_lazyfree(struct page *page);
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -687,7 +687,7 @@ static void lru_add_drain_per_cpu(struct
>  
>  static DEFINE_PER_CPU(struct work_struct, lru_add_drain_work);
>  
> -void lru_add_drain_all(void)
> +void lru_add_drain_all_cpuslocked(void)
>  {
>  	static DEFINE_MUTEX(lock);
>  	static struct cpumask has_work;
> @@ -701,7 +701,6 @@ void lru_add_drain_all(void)
>  		return;
>  
>  	mutex_lock(&lock);
> -	get_online_cpus();
>  	cpumask_clear(&has_work);
>  
>  	for_each_online_cpu(cpu) {
> @@ -721,10 +720,16 @@ void lru_add_drain_all(void)
>  	for_each_cpu(cpu, &has_work)
>  		flush_work(&per_cpu(lru_add_drain_work, cpu));
>  
> -	put_online_cpus();
>  	mutex_unlock(&lock);
>  }
>  
> +void lru_add_drain_all(void)
> +{
> +	get_online_cpus();
> +	lru_add_drain_all_cpuslocked();
> +	put_online_cpus();
> +}
> +
>  /**
>   * release_pages - batched put_page()
>   * @pages: array of pages to release
> 

-- 
Michal Hocko
SUSE Labs


* Re: [patch V2 2/2] mm/memory-hotplug: Switch locking to a percpu rwsem
  2017-07-04  9:32   ` Thomas Gleixner
@ 2017-07-04 10:59     ` Michal Hocko
  -1 siblings, 0 replies; 34+ messages in thread
From: Michal Hocko @ 2017-07-04 10:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, linux-mm, Andrey Ryabinin, Andrew Morton, Vlastimil Babka,
	Vladimir Davydov, Peter Zijlstra

On Tue 04-07-17 11:32:34, Thomas Gleixner wrote:
> Andrey reported a potential deadlock with the memory hotplug lock and the
> cpu hotplug lock.
> 
> The reason is that memory hotplug takes the memory hotplug lock and then
> calls stop_machine(), which calls get_online_cpus(). That is the reverse of
> the get_online_cpus(); get_online_mems(); lock order in mm/slab_common.c.
> 
> The problem has been there forever. The reason it was never reported is
> that the cpu hotplug locking used a homebrewed recursive reader-writer
> semaphore construct which, due to the recursion, evaded full lockdep
> coverage. The memory hotplug code copied that construct verbatim and
> therefore has the same issues.
> 
> Three steps to fix this:
> 
> 1) Convert the memory hotplug locking to a per-cpu rwsem so that potential
>    issues get reported properly by lockdep.
> 
> 2) Lock the online cpus in mem_hotplug_begin() before taking the memory
>    hotplug rwsem, and use stop_machine_cpuslocked() in the page_alloc code
>    to avoid recursive locking.
> 
> 3) The cpu hotplug locking in #2 causes a recursive locking of the cpu
>    hotplug lock via __offline_pages() -> lru_add_drain_all(). Solve this by
>    invoking lru_add_drain_all_cpuslocked() instead.
> 
> Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: linux-mm@kvack.org
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memory_hotplug.c |   89 ++++++++--------------------------------------------
>  mm/page_alloc.c     |    2 -
>  2 files changed, 16 insertions(+), 75 deletions(-)
> 
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -52,32 +52,17 @@ static void generic_online_page(struct p
>  static online_page_callback_t online_page_callback = generic_online_page;
>  static DEFINE_MUTEX(online_page_callback_lock);
>  
> -/* The same as the cpu_hotplug lock, but for memory hotplug. */
> -static struct {
> -	struct task_struct *active_writer;
> -	struct mutex lock; /* Synchronizes accesses to refcount, */
> -	/*
> -	 * Also blocks the new readers during
> -	 * an ongoing mem hotplug operation.
> -	 */
> -	int refcount;
> +DEFINE_STATIC_PERCPU_RWSEM(mem_hotplug_lock);
>  
> -#ifdef CONFIG_DEBUG_LOCK_ALLOC
> -	struct lockdep_map dep_map;
> -#endif
> -} mem_hotplug = {
> -	.active_writer = NULL,
> -	.lock = __MUTEX_INITIALIZER(mem_hotplug.lock),
> -	.refcount = 0,
> -#ifdef CONFIG_DEBUG_LOCK_ALLOC
> -	.dep_map = {.name = "mem_hotplug.lock" },
> -#endif
> -};
> +void get_online_mems(void)
> +{
> +	percpu_down_read(&mem_hotplug_lock);
> +}
>  
* Re: [patch V2 2/2] mm/memory-hotplug: Switch locking to a percpu rwsem
@ 2017-07-04 10:59     ` Michal Hocko
  0 siblings, 0 replies; 34+ messages in thread
From: Michal Hocko @ 2017-07-04 10:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, linux-mm, Andrey Ryabinin, Andrew Morton, Vlastimil Babka,
	Vladimir Davydov, Peter Zijlstra

On Tue 04-07-17 11:32:34, Thomas Gleixner wrote:
> Andrey reported a potential deadlock with the memory hotplug lock and the
> cpu hotplug lock.
> 
> The reason is that memory hotplug takes the memory hotplug lock and then
> calls stop_machine() which calls get_online_cpus(). That's the reverse lock
> order to get_online_cpus(); get_online_mems(); in mm/slab_common.c.
> 
> The problem has been there forever. The reason why this was never reported
> is that the cpu hotplug locking had this homebrewed recursive reader/writer
> semaphore construct which, due to the recursion, evaded the full lockdep
> coverage. The memory hotplug code copied that construct verbatim and
> therefore has similar issues.
> 
> Three steps to fix this:
> 
> 1) Convert the memory hotplug locking to a per cpu rwsem so the potential
>    issues get reported properly by lockdep.
> 
> 2) Lock the online cpus in mem_hotplug_begin() before taking the memory
>    hotplug rwsem and use stop_machine_cpuslocked() in the page_alloc code
>    to avoid recursive locking.
> 
> 3) The cpu hotplug locking in #2 causes a recursive locking of the cpu
>    hotplug lock via __offline_pages() -> lru_add_drain_all(). Solve this by
>    invoking lru_add_drain_all_cpuslocked() instead.
> 
> Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: linux-mm@kvack.org
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memory_hotplug.c |   89 ++++++++--------------------------------------------
>  mm/page_alloc.c     |    2 -
>  2 files changed, 16 insertions(+), 75 deletions(-)
> 
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -52,32 +52,17 @@ static void generic_online_page(struct p
>  static online_page_callback_t online_page_callback = generic_online_page;
>  static DEFINE_MUTEX(online_page_callback_lock);
>  
> -/* The same as the cpu_hotplug lock, but for memory hotplug. */
> -static struct {
> -	struct task_struct *active_writer;
> -	struct mutex lock; /* Synchronizes accesses to refcount, */
> -	/*
> -	 * Also blocks the new readers during
> -	 * an ongoing mem hotplug operation.
> -	 */
> -	int refcount;
> +DEFINE_STATIC_PERCPU_RWSEM(mem_hotplug_lock);
>  
> -#ifdef CONFIG_DEBUG_LOCK_ALLOC
> -	struct lockdep_map dep_map;
> -#endif
> -} mem_hotplug = {
> -	.active_writer = NULL,
> -	.lock = __MUTEX_INITIALIZER(mem_hotplug.lock),
> -	.refcount = 0,
> -#ifdef CONFIG_DEBUG_LOCK_ALLOC
> -	.dep_map = {.name = "mem_hotplug.lock" },
> -#endif
> -};
> +void get_online_mems(void)
> +{
> +	percpu_down_read(&mem_hotplug_lock);
> +}
>  
> -/* Lockdep annotations for get/put_online_mems() and mem_hotplug_begin/end() */
> -#define memhp_lock_acquire_read() lock_map_acquire_read(&mem_hotplug.dep_map)
> -#define memhp_lock_acquire()      lock_map_acquire(&mem_hotplug.dep_map)
> -#define memhp_lock_release()      lock_map_release(&mem_hotplug.dep_map)
> +void put_online_mems(void)
> +{
> +	percpu_up_read(&mem_hotplug_lock);
> +}
>  
>  #ifndef CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE
>  bool memhp_auto_online;
> @@ -97,60 +82,16 @@ static int __init setup_memhp_default_st
>  }
>  __setup("memhp_default_state=", setup_memhp_default_state);
>  
> -void get_online_mems(void)
> -{
> -	might_sleep();
> -	if (mem_hotplug.active_writer == current)
> -		return;
> -	memhp_lock_acquire_read();
> -	mutex_lock(&mem_hotplug.lock);
> -	mem_hotplug.refcount++;
> -	mutex_unlock(&mem_hotplug.lock);
> -
> -}
> -
> -void put_online_mems(void)
> -{
> -	if (mem_hotplug.active_writer == current)
> -		return;
> -	mutex_lock(&mem_hotplug.lock);
> -
> -	if (WARN_ON(!mem_hotplug.refcount))
> -		mem_hotplug.refcount++; /* try to fix things up */
> -
> -	if (!--mem_hotplug.refcount && unlikely(mem_hotplug.active_writer))
> -		wake_up_process(mem_hotplug.active_writer);
> -	mutex_unlock(&mem_hotplug.lock);
> -	memhp_lock_release();
> -
> -}
> -
> -/* Serializes write accesses to mem_hotplug.active_writer. */
> -static DEFINE_MUTEX(memory_add_remove_lock);
> -
>  void mem_hotplug_begin(void)
>  {
> -	mutex_lock(&memory_add_remove_lock);
> -
> -	mem_hotplug.active_writer = current;
> -
> -	memhp_lock_acquire();
> -	for (;;) {
> -		mutex_lock(&mem_hotplug.lock);
> -		if (likely(!mem_hotplug.refcount))
> -			break;
> -		__set_current_state(TASK_UNINTERRUPTIBLE);
> -		mutex_unlock(&mem_hotplug.lock);
> -		schedule();
> -	}
> +	cpus_read_lock();
> +	percpu_down_write(&mem_hotplug_lock);
>  }
>  
>  void mem_hotplug_done(void)
>  {
> -	mem_hotplug.active_writer = NULL;
> -	mutex_unlock(&mem_hotplug.lock);
> -	memhp_lock_release();
> -	mutex_unlock(&memory_add_remove_lock);
> +	percpu_up_write(&mem_hotplug_lock);
> +	cpus_read_unlock();
>  }
>  
>  /* add this memory to iomem resource */
> @@ -1919,7 +1860,7 @@ static int __ref __offline_pages(unsigne
>  		goto failed_removal;
>  	ret = 0;
>  	if (drain) {
> -		lru_add_drain_all();
> +		lru_add_drain_all_cpuslocked();
>  		cond_resched();
>  		drain_all_pages(zone);
>  	}
> @@ -1940,7 +1881,7 @@ static int __ref __offline_pages(unsigne
>  		}
>  	}
>  	/* drain all zone's lru pagevec, this is asynchronous... */
> -	lru_add_drain_all();
> +	lru_add_drain_all_cpuslocked();
>  	yield();
>  	/* drain pcp pages, this is synchronous. */
>  	drain_all_pages(zone);
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5216,7 +5216,7 @@ void __ref build_all_zonelists(pg_data_t
>  #endif
>  		/* we have to stop all cpus to guarantee there is no user
>  		   of zonelist */
> -		stop_machine(__build_all_zonelists, pgdat, NULL);
> +		stop_machine_cpuslocked(__build_all_zonelists, pgdat, NULL);
>  		/* cpuset refresh routine should be here */
>  	}
>  	vm_total_pages = nr_free_pagecache_pages();
> 

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [patch V2 1/2] mm: swap: Provide lru_add_drain_all_cpuslocked()
  2017-07-04  9:32   ` Thomas Gleixner
@ 2017-07-04 12:07     ` Vlastimil Babka
  -1 siblings, 0 replies; 34+ messages in thread
From: Vlastimil Babka @ 2017-07-04 12:07 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: linux-mm, Andrey Ryabinin, Michal Hocko, Andrew Morton,
	Vladimir Davydov, Peter Zijlstra

On 07/04/2017 11:32 AM, Thomas Gleixner wrote:
> The rework of the cpu hotplug locking unearthed potential deadlocks with
> the memory hotplug locking code.
> 
> The solution for these is to rework the memory hotplug locking code as well
> and take the cpu hotplug lock before the memory hotplug lock in
> mem_hotplug_begin(), but this will cause a recursive locking of the cpu
> hotplug lock when the memory hotplug code calls lru_add_drain_all().
> 
> Split out the inner workings of lru_add_drain_all() into
> lru_add_drain_all_cpuslocked() so this function can be invoked from the
> memory hotplug code with the cpu hotplug lock held.
> 
> Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: linux-mm@kvack.org
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

A question below.

> ---
>  include/linux/swap.h |    1 +
>  mm/swap.c            |   11 ++++++++---
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -277,6 +277,7 @@ extern void mark_page_accessed(struct pa
>  extern void lru_add_drain(void);
>  extern void lru_add_drain_cpu(int cpu);
>  extern void lru_add_drain_all(void);
> +extern void lru_add_drain_all_cpuslocked(void);
>  extern void rotate_reclaimable_page(struct page *page);
>  extern void deactivate_file_page(struct page *page);
>  extern void mark_page_lazyfree(struct page *page);
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -687,7 +687,7 @@ static void lru_add_drain_per_cpu(struct
>  
>  static DEFINE_PER_CPU(struct work_struct, lru_add_drain_work);
>  
> -void lru_add_drain_all(void)
> +void lru_add_drain_all_cpuslocked(void)
>  {
>  	static DEFINE_MUTEX(lock);
>  	static struct cpumask has_work;
> @@ -701,7 +701,6 @@ void lru_add_drain_all(void)
>  		return;
>  
>  	mutex_lock(&lock);
> -	get_online_cpus();

Is there an assertion check that we are locked that could be put in
e.g. VM_WARN_ON_ONCE()?

>  	cpumask_clear(&has_work);
>  
>  	for_each_online_cpu(cpu) {
> @@ -721,10 +720,16 @@ void lru_add_drain_all(void)
>  	for_each_cpu(cpu, &has_work)
>  		flush_work(&per_cpu(lru_add_drain_work, cpu));
>  
> -	put_online_cpus();
>  	mutex_unlock(&lock);
>  }
>  
> +void lru_add_drain_all(void)
> +{
> +	get_online_cpus();
> +	lru_add_drain_all_cpuslocked();
> +	put_online_cpus();
> +}
> +
>  /**
>   * release_pages - batched put_page()
>   * @pages: array of pages to release
> 
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread


* Re: [patch V2 2/2] mm/memory-hotplug: Switch locking to a percpu rwsem
  2017-07-04  9:32   ` Thomas Gleixner
@ 2017-07-04 12:10     ` Vlastimil Babka
  -1 siblings, 0 replies; 34+ messages in thread
From: Vlastimil Babka @ 2017-07-04 12:10 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: linux-mm, Andrey Ryabinin, Michal Hocko, Andrew Morton,
	Vladimir Davydov, Peter Zijlstra

On 07/04/2017 11:32 AM, Thomas Gleixner wrote:
> Andrey reported a potential deadlock with the memory hotplug lock and the
> cpu hotplug lock.
> 
> The reason is that memory hotplug takes the memory hotplug lock and then
> calls stop_machine() which calls get_online_cpus(). That's the reverse lock
> > order to get_online_cpus(); get_online_mems(); in mm/slab_common.c.
> 
> The problem has been there forever. The reason why this was never reported
> > is that the cpu hotplug locking had this homebrewed recursive reader/writer
> > semaphore construct which, due to the recursion, evaded the full lockdep
> > coverage. The memory hotplug code copied that construct verbatim and
> > therefore has similar issues.
> 
> Three steps to fix this:
> 
> 1) Convert the memory hotplug locking to a per cpu rwsem so the potential
> >    issues get reported properly by lockdep.
> 
> 2) Lock the online cpus in mem_hotplug_begin() before taking the memory
>    hotplug rwsem and use stop_machine_cpuslocked() in the page_alloc code
>    and use to avoid recursive locking.

     ^ s/and use // ?

> 
> > 3) The cpu hotplug locking in #2 causes a recursive locking of the cpu
>    hotplug lock via __offline_pages() -> lru_add_drain_all(). Solve this by
>    invoking lru_add_drain_all_cpuslocked() instead.
> 
> Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: linux-mm@kvack.org
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/memory_hotplug.c |   89 ++++++++--------------------------------------------
>  mm/page_alloc.c     |    2 -
>  2 files changed, 16 insertions(+), 75 deletions(-)

Nice! Glad to see the crazy code go.

^ permalink raw reply	[flat|nested] 34+ messages in thread


* Re: [patch V2 1/2] mm: swap: Provide lru_add_drain_all_cpuslocked()
  2017-07-04 12:07     ` Vlastimil Babka
@ 2017-07-04 12:35       ` Thomas Gleixner
  -1 siblings, 0 replies; 34+ messages in thread
From: Thomas Gleixner @ 2017-07-04 12:35 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: LKML, linux-mm, Andrey Ryabinin, Michal Hocko, Andrew Morton,
	Vladimir Davydov, Peter Zijlstra

On Tue, 4 Jul 2017, Vlastimil Babka wrote:
> >  
> > -void lru_add_drain_all(void)
> > +void lru_add_drain_all_cpuslocked(void)
> >  {
> >  	static DEFINE_MUTEX(lock);
> >  	static struct cpumask has_work;
> > @@ -701,7 +701,6 @@ void lru_add_drain_all(void)
> >  		return;
> >  
> >  	mutex_lock(&lock);
> > -	get_online_cpus();
> 
> Is there an assertion check that we are locked that could be put in
> e.g. VM_WARN_ON_ONCE()?

There is a lockdep assertion lockdep_assert_cpus_held() which could be
used. Forgot to add it.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 34+ messages in thread


* Re: [patch V2 1/2] mm: swap: Provide lru_add_drain_all_cpuslocked()
  2017-07-04 10:58     ` Michal Hocko
@ 2017-07-04 12:48       ` Thomas Gleixner
  -1 siblings, 0 replies; 34+ messages in thread
From: Thomas Gleixner @ 2017-07-04 12:48 UTC (permalink / raw)
  To: Michal Hocko
  Cc: LKML, linux-mm, Andrey Ryabinin, Andrew Morton, Vlastimil Babka,
	Vladimir Davydov, Peter Zijlstra

On Tue, 4 Jul 2017, Michal Hocko wrote:
> On Tue 04-07-17 11:32:33, Thomas Gleixner wrote:
> > The rework of the cpu hotplug locking unearthed potential deadlocks with
> > the memory hotplug locking code.
> > 
> > The solution for these is to rework the memory hotplug locking code as well
> > and take the cpu hotplug lock before the memory hotplug lock in
> > mem_hotplug_begin(), but this will cause a recursive locking of the cpu
> > hotplug lock when the memory hotplug code calls lru_add_drain_all().
> > 
> > Split out the inner workings of lru_add_drain_all() into
> > lru_add_drain_all_cpuslocked() so this function can be invoked from the
> > memory hotplug code with the cpu hotplug lock held.
> 
> You have added callers in the later patch in the series AFAICS which
> is OK but I think it would be better to have them in this patch
> already. Nothing earth shattering (maybe a rebase artifact).

The requirement for changing that comes with the extra hotplug locking in
mem_hotplug_begin(). That is required to establish the proper lock order
and then causes the recursive locking in the next patch. Adding the caller
here would be wrong, because then lru_add_drain_all_cpuslocked() would be
called unprotected. Hens and eggs as usual :)

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 34+ messages in thread


* Re: [patch V2 2/2] mm/memory-hotplug: Switch locking to a percpu rwsem
  2017-07-04 12:10     ` Vlastimil Babka
@ 2017-07-04 12:49       ` Thomas Gleixner
  -1 siblings, 0 replies; 34+ messages in thread
From: Thomas Gleixner @ 2017-07-04 12:49 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: LKML, linux-mm, Andrey Ryabinin, Michal Hocko, Andrew Morton,
	Vladimir Davydov, Peter Zijlstra

On Tue, 4 Jul 2017, Vlastimil Babka wrote:

> On 07/04/2017 11:32 AM, Thomas Gleixner wrote:
> > Andrey reported a potential deadlock with the memory hotplug lock and the
> > cpu hotplug lock.
> > 
> > The reason is that memory hotplug takes the memory hotplug lock and then
> > calls stop_machine() which calls get_online_cpus(). That's the reverse lock
> > order to get_online_cpus(); get_online_mems(); in mm/slub_common.c
> > 
> > The problem has been there forever. The reason why this was never reported
> > is that the cpu hotplug locking had this homebrewn recursive reader writer
> > semaphore construct which due to the recursion evaded the full lock dep
> > coverage. The memory hotplug code copied that construct verbatim and
> > therefor has similar issues.
> > 
> > Three steps to fix this:
> > 
> > 1) Convert the memory hotplug locking to a per cpu rwsem so the potential
> >    issues get reported proper by lockdep.
> > 
> > 2) Lock the online cpus in mem_hotplug_begin() before taking the memory
> >    hotplug rwsem and use stop_machine_cpuslocked() in the page_alloc code
> >    and use to avoid recursive locking.
> 
>      ^ s/and use // ?

Ooops, yes.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 34+ messages in thread


* Re: [patch V2 1/2] mm: swap: Provide lru_add_drain_all_cpuslocked()
  2017-07-04 12:48       ` Thomas Gleixner
@ 2017-07-04 12:52         ` Michal Hocko
  -1 siblings, 0 replies; 34+ messages in thread
From: Michal Hocko @ 2017-07-04 12:52 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, linux-mm, Andrey Ryabinin, Andrew Morton, Vlastimil Babka,
	Vladimir Davydov, Peter Zijlstra

On Tue 04-07-17 14:48:56, Thomas Gleixner wrote:
> On Tue, 4 Jul 2017, Michal Hocko wrote:
> > On Tue 04-07-17 11:32:33, Thomas Gleixner wrote:
> > > The rework of the cpu hotplug locking unearthed potential deadlocks with
> > > the memory hotplug locking code.
> > > 
> > > The solution for these is to rework the memory hotplug locking code as well
> > > and take the cpu hotplug lock before the memory hotplug lock in
> > > mem_hotplug_begin(), but this will cause a recursive locking of the cpu
> > > hotplug lock when the memory hotplug code calls lru_add_drain_all().
> > > 
> > > Split out the inner workings of lru_add_drain_all() into
> > > lru_add_drain_all_cpuslocked() so this function can be invoked from the
> > > memory hotplug code with the cpu hotplug lock held.
> > 
> > You have added callers in the later patch in the series AFAICS which
> > is OK but I think it would be better to have them in this patch
> > already. Nothing earth shattering (maybe a rebase artifact).
> 
> The requirement for changing that comes with the extra hotplug locking in
> mem_hotplug_begin(). That is required to establish the proper lock order
> and then causes the recursive locking in the next patch. Adding the caller
> here would be wrong, because then lru_add_drain_all_cpuslocked() would be
> called unprotected. Hens and eggs as usual :)

Yeah, you are right. My bad, I should have noticed that.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 34+ messages in thread


* Re: [patch V2 2/2] mm/memory-hotplug: Switch locking to a percpu rwsem
  2017-07-04  9:32   ` Thomas Gleixner
@ 2017-07-04 15:01     ` Davidlohr Bueso
  -1 siblings, 0 replies; 34+ messages in thread
From: Davidlohr Bueso @ 2017-07-04 15:01 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, linux-mm, Andrey Ryabinin, Michal Hocko, Andrew Morton,
	Vlastimil Babka, Vladimir Davydov, Peter Zijlstra

On Tue, 04 Jul 2017, Thomas Gleixner wrote:

>Andrey reported a potential deadlock with the memory hotplug lock and the
>cpu hotplug lock.
>
>The reason is that memory hotplug takes the memory hotplug lock and then
>calls stop_machine() which calls get_online_cpus(). That's the reverse lock
>order to get_online_cpus(); get_online_mems(); in mm/slab_common.c.
>
>The problem has been there forever. The reason why this was never reported
>is that the cpu hotplug locking had a homebrewed recursive reader/writer
>semaphore construct which, due to the recursion, evaded full lockdep
>coverage. The memory hotplug code copied that construct verbatim and
>therefore has similar issues.
>
>Three steps to fix this:
>
>1) Convert the memory hotplug locking to a per-cpu rwsem so the potential
>   issues get reported properly by lockdep.

I particularly like how well mem hotplug is suited to a pcpu-rwsem.
As a side effect you end up optimizing get/put_online_mems() at the cost
of more overhead for the actual hotplug operation, which is rare and of less
performance importance.

Thanks,
Davidlohr


* Re: [patch V2 2/2] mm/memory-hotplug: Switch locking to a percpu rwsem
  2017-07-04 15:01     ` Davidlohr Bueso
@ 2017-07-04 15:22       ` Davidlohr Bueso
  -1 siblings, 0 replies; 34+ messages in thread
From: Davidlohr Bueso @ 2017-07-04 15:22 UTC (permalink / raw)
  To: Thomas Gleixner, LKML, linux-mm, Andrey Ryabinin, Michal Hocko,
	Andrew Morton, Vlastimil Babka, Vladimir Davydov, Peter Zijlstra

On Tue, 04 Jul 2017, Davidlohr Bueso wrote:

>As a side effect you end up optimizing get/put_online_mems() at the cost
>of more overhead for the actual hotplug operation, which is rare and of less
>performance importance.

So nm this, the reader side actually gets _more_ expensive with pcpu-rwsems
due to at least two full barriers for each get/put operation.

Thanks,
Davidlohr


* Re: [patch V2 2/2] mm/memory-hotplug: Switch locking to a percpu rwsem
  2017-07-04 15:22       ` Davidlohr Bueso
@ 2017-07-04 15:32         ` Thomas Gleixner
  -1 siblings, 0 replies; 34+ messages in thread
From: Thomas Gleixner @ 2017-07-04 15:32 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: LKML, linux-mm, Andrey Ryabinin, Michal Hocko, Andrew Morton,
	Vlastimil Babka, Vladimir Davydov, Peter Zijlstra

On Tue, 4 Jul 2017, Davidlohr Bueso wrote:
> On Tue, 04 Jul 2017, Davidlohr Bueso wrote:
> 
> > As a side effect you end up optimizing get/put_online_mems() at the cost
> > of more overhead for the actual hotplug operation, which is rare and of less
> > performance importance.
> 
> So nm this, the reader side actually gets _more_ expensive with pcpu-rwsems
> due to at least two full barriers for each get/put operation.

Compared to a mutex_lock/unlock() pair on a global mutex ....


* Re: [patch V2 2/2] mm/memory-hotplug: Switch locking to a percpu rwsem
  2017-07-04 15:32         ` Thomas Gleixner
@ 2017-07-04 15:42           ` Davidlohr Bueso
  -1 siblings, 0 replies; 34+ messages in thread
From: Davidlohr Bueso @ 2017-07-04 15:42 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, linux-mm, Andrey Ryabinin, Michal Hocko, Andrew Morton,
	Vlastimil Babka, Vladimir Davydov, Peter Zijlstra

On Tue, 04 Jul 2017, Thomas Gleixner wrote:

>On Tue, 4 Jul 2017, Davidlohr Bueso wrote:
>> On Tue, 04 Jul 2017, Davidlohr Bueso wrote:
>>
>> > As a side effect you end up optimizing get/put_online_mems() at the cost
>> > of more overhead for the actual hotplug operation, which is rare and of less
>> > performance importance.
>>
>> So nm this, the reader side actually gets _more_ expensive with pcpu-rwsems
>> due to at least two full barriers for each get/put operation.
>
>Compared to a mutex_lock/unlock() pair on a global mutex ....

Ah, right, I was thrown off by the:

    if (mem_hotplug.active_writer == current)
       return;

check, which is only true within mem_hotplug_begin()/done(). So normally
we'd take the lock, which was my first impression. Sorry for the noise.

Thanks,
Davidlohr
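
The old construct referred to above can be modeled in userspace (hypothetical
simplified shape; the real code used a refcount and waitqueue, but the
writer-recursion check is the part that hid the lock dependency from
lockdep):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static struct {
	pthread_mutex_t lock;
	pthread_t active_writer;
	bool writer_set;
	int refcount;
} mem_hotplug = { .lock = PTHREAD_MUTEX_INITIALIZER };

static bool current_is_active_writer(void)
{
	return mem_hotplug.writer_set &&
	       pthread_equal(mem_hotplug.active_writer, pthread_self());
}

static void get_online_mems(void)
{
	if (current_is_active_writer())
		return;		/* writer re-entering as reader: no lock taken */
	pthread_mutex_lock(&mem_hotplug.lock);
	mem_hotplug.refcount++;
	pthread_mutex_unlock(&mem_hotplug.lock);
}

static void put_online_mems(void)
{
	if (current_is_active_writer())
		return;
	pthread_mutex_lock(&mem_hotplug.lock);
	mem_hotplug.refcount--;
	pthread_mutex_unlock(&mem_hotplug.lock);
}

static void mem_hotplug_begin(void)
{
	pthread_mutex_lock(&mem_hotplug.lock);
	mem_hotplug.active_writer = pthread_self();
	mem_hotplug.writer_set = true;
}

static void mem_hotplug_done(void)
{
	mem_hotplug.writer_set = false;
	pthread_mutex_unlock(&mem_hotplug.lock);
}
```

Inside mem_hotplug_begin()/done(), get/put_online_mems() return without
touching the lock at all, which is exactly why lockdep never saw the
recursive acquisition.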


* Re: [patch V2 0/2] mm/memory_hotplug: Cure potential deadlocks vs. cpu hotplug lock
  2017-07-04  9:32 ` Thomas Gleixner
@ 2017-07-05 21:53   ` Andrew Morton
  -1 siblings, 0 replies; 34+ messages in thread
From: Andrew Morton @ 2017-07-05 21:53 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, linux-mm, Andrey Ryabinin, Michal Hocko, Vlastimil Babka,
	Vladimir Davydov, Peter Zijlstra

On Tue, 04 Jul 2017 11:32:32 +0200 Thomas Gleixner <tglx@linutronix.de> wrote:

> Andrey reported a potential deadlock with the memory hotplug lock and the
> cpu hotplug lock.
> 
> The following series addresses this by reworking the memory hotplug locking
> and fixing up the potential deadlock scenarios.

Do you think we should squeeze this into 4.13-rc1, or can we afford to
take the more cautious route?


* Re: [patch V2 0/2] mm/memory_hotplug: Cure potential deadlocks vs. cpu hotplug lock
  2017-07-05 21:53   ` Andrew Morton
@ 2017-07-06  6:34     ` Thomas Gleixner
  -1 siblings, 0 replies; 34+ messages in thread
From: Thomas Gleixner @ 2017-07-06  6:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: LKML, linux-mm, Andrey Ryabinin, Michal Hocko, Vlastimil Babka,
	Vladimir Davydov, Peter Zijlstra

On Wed, 5 Jul 2017, Andrew Morton wrote:
> On Tue, 04 Jul 2017 11:32:32 +0200 Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> > Andrey reported a potential deadlock with the memory hotplug lock and the
> > cpu hotplug lock.
> > 
> > The following series addresses this by reworking the memory hotplug locking
> > and fixing up the potential deadlock scenarios.
> 
> Do you think we should squeeze this into 4.13-rc1, or can we afford to
> take the more cautious route?

The deadlocks are real and the lockdep splats are triggering on Linus head,
so it should go into 4.13-rc1 if possible.

Thanks,

	tglx


Thread overview: 34+ messages
2017-07-04  9:32 [patch V2 0/2] mm/memory_hotplug: Cure potential deadlocks vs. cpu hotplug lock Thomas Gleixner
2017-07-04  9:32 ` [patch V2 1/2] mm: swap: Provide lru_add_drain_all_cpuslocked() Thomas Gleixner
2017-07-04 10:58   ` Michal Hocko
2017-07-04 12:48     ` Thomas Gleixner
2017-07-04 12:52       ` Michal Hocko
2017-07-04 12:07   ` Vlastimil Babka
2017-07-04 12:35     ` Thomas Gleixner
2017-07-04  9:32 ` [patch V2 2/2] mm/memory-hotplug: Switch locking to a percpu rwsem Thomas Gleixner
2017-07-04 10:59   ` Michal Hocko
2017-07-04 12:10   ` Vlastimil Babka
2017-07-04 12:49     ` Thomas Gleixner
2017-07-04 15:01   ` Davidlohr Bueso
2017-07-04 15:22     ` Davidlohr Bueso
2017-07-04 15:32       ` Thomas Gleixner
2017-07-04 15:42         ` Davidlohr Bueso
2017-07-05 21:53 ` [patch V2 0/2] mm/memory_hotplug: Cure potential deadlocks vs. cpu hotplug lock Andrew Morton
2017-07-06  6:34   ` Thomas Gleixner
