Linux-mm Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH -next v3] mm/hotplug: silence a lockdep splat with printk()
@ 2020-01-15 17:29 Qian Cai
  2020-01-16 14:28 ` Michal Hocko
  0 siblings, 1 reply; 8+ messages in thread
From: Qian Cai @ 2020-01-15 17:29 UTC (permalink / raw)
  To: akpm
  Cc: mhocko, sergey.senozhatsky.work, pmladek, rostedt, peterz, david,
	linux-mm, linux-kernel, Qian Cai

It is guaranteed to trigger a lockdep splat if calling printk() with
zone->lock held because there are many places (tty, console drivers,
debugobjects etc) would allocate some memory with another lock
held which is proved to be difficult to fix them all.

A common workaround until the onging effort to make all printk() as
deferred happens is to use printk_deferred() in those places similar to
the recent commit [1] merged into the random and -next trees, but memory
offline will call dump_page() which needs to be deferred after the lock.

So change has_unmovable_pages() so that it no longer calls dump_page()
itself - instead it returns a "struct page *" of the unmovable page back
to the caller so that in the case of a has_unmovable_pages() failure,
the caller can call dump_page() after releasing zone->lock. Also, make
dump_page() is able to report a CMA page as well, so the reason string
from has_unmovable_pages() can be removed.

While at it, remove a similar but unnecessary debug-only printk() as
well. A few sample lockdep splats can be founnd here [2].

[1] https://lore.kernel.org/lkml/1573679785-21068-1-git-send-email-cai@lca.pw/
[2] https://lore.kernel.org/lkml/7CD27FC6-CFFF-4519-A57D-85179E9815FE@lca.pw/

Signed-off-by: Qian Cai <cai@lca.pw>
---

v3: Rebase to next-20200115 for the mm/debug change and update some
    comments thanks to Michal.

v2: Improve the commit log and report CMA in dump_page() per Andrew.
    has_unmovable_pages() returns a "struct page *" to the caller.

 include/linux/page-isolation.h |  4 ++--
 mm/debug.c                     |  4 +++-
 mm/memory_hotplug.c            |  6 ++++--
 mm/page_alloc.c                | 22 +++++++++-------------
 mm/page_isolation.c            | 11 ++++++++++-
 5 files changed, 28 insertions(+), 19 deletions(-)

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 148e65a9c606..da043ae86488 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -33,8 +33,8 @@ static inline bool is_migrate_isolate(int migratetype)
 #define MEMORY_OFFLINE	0x1
 #define REPORT_FAILURE	0x2
 
-bool has_unmovable_pages(struct zone *zone, struct page *page, int migratetype,
-			 int flags);
+struct page *has_unmovable_pages(struct zone *zone, struct page *page, int
+				 migratetype, int flags);
 void set_pageblock_migratetype(struct page *page, int migratetype);
 int move_freepages_block(struct zone *zone, struct page *page,
 				int migratetype, int *num_movable);
diff --git a/mm/debug.c b/mm/debug.c
index 6a52316af839..784f9da711b0 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -46,6 +46,7 @@ void __dump_page(struct page *page, const char *reason)
 {
 	struct address_space *mapping;
 	bool page_poisoned = PagePoisoned(page);
+	bool page_cma = is_migrate_cma_page(page);
 	int mapcount;
 	char *type = "";
 
@@ -92,7 +93,8 @@ void __dump_page(struct page *page, const char *reason)
 	}
 	BUILD_BUG_ON(ARRAY_SIZE(pageflag_names) != __NR_PAGEFLAGS + 1);
 
-	pr_warn("%sflags: %#lx(%pGp)\n", type, page->flags, &page->flags);
+	pr_warn("%sflags: %#lx(%pGp)%s", type, page->flags, &page->flags,
+		page_cma ? " CMA\n" : "\n");
 
 hex_only:
 	print_hex_dump(KERN_WARNING, "raw: ", DUMP_PREFIX_NONE, 32,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7a6de9b0dcab..06e7dd3eb9a9 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1148,8 +1148,10 @@ static bool is_pageblock_removable_nolock(unsigned long pfn)
 	if (!zone_spans_pfn(zone, pfn))
 		return false;
 
-	return !has_unmovable_pages(zone, page, MIGRATE_MOVABLE,
-				    MEMORY_OFFLINE);
+	if (has_unmovable_pages(zone, page, MIGRATE_MOVABLE, MEMORY_OFFLINE))
+		return false;
+
+	return true;
 }
 
 /* Checks if this range of memory is likely to be hot-removable. */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e56cd1f33242..e90140e879e6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8202,13 +8202,16 @@ void *__init alloc_large_system_hash(const char *tablename,
  * MIGRATE_MOVABLE block might include unmovable pages. And __PageMovable
  * check without lock_page also may miss some movable non-lru pages at
  * race condition. So you can't expect this function should be exact.
+ *
+ * It returns a page without holding a reference. It should be safe here
+ * because the page cannot go away because it is unmovable, but it must not to
+ * be used for anything else rather than dumping its state.
  */
-bool has_unmovable_pages(struct zone *zone, struct page *page, int migratetype,
-			 int flags)
+struct page *has_unmovable_pages(struct zone *zone, struct page *page,
+				 int migratetype, int flags)
 {
 	unsigned long iter = 0;
 	unsigned long pfn = page_to_pfn(page);
-	const char *reason = "unmovable page";
 
 	/*
 	 * TODO we could make this much more efficient by not checking every
@@ -8225,9 +8228,8 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int migratetype,
 		 * so consider them movable here.
 		 */
 		if (is_migrate_cma(migratetype))
-			return false;
+			return NULL;
 
-		reason = "CMA page";
 		goto unmovable;
 	}
 
@@ -8302,12 +8304,10 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int migratetype,
 		 */
 		goto unmovable;
 	}
-	return false;
+	return NULL;
 unmovable:
 	WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE);
-	if (flags & REPORT_FAILURE)
-		dump_page(pfn_to_page(pfn + iter), reason);
-	return true;
+	return pfn_to_page(pfn + iter);
 }
 
 #ifdef CONFIG_CONTIG_ALLOC
@@ -8711,10 +8711,6 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
 		BUG_ON(!PageBuddy(page));
 		order = page_order(page);
 		offlined_pages += 1 << order;
-#ifdef CONFIG_DEBUG_VM
-		pr_info("remove from free list %lx %d %lx\n",
-			pfn, 1 << order, end_pfn);
-#endif
 		del_page_from_free_area(page, &zone->free_area[order]);
 		pfn += (1 << order);
 	}
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 1f8b9dfecbe8..f3af65bac1e0 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -20,6 +20,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
 	struct zone *zone;
 	unsigned long flags;
 	int ret = -EBUSY;
+	struct page *unmovable = NULL;
 
 	zone = page_zone(page);
 
@@ -37,7 +38,8 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
 	 * FIXME: Now, memory hotplug doesn't call shrink_slab() by itself.
 	 * We just check MOVABLE pages.
 	 */
-	if (!has_unmovable_pages(zone, page, migratetype, isol_flags)) {
+	unmovable = has_unmovable_pages(zone, page, migratetype, isol_flags);
+	if (!unmovable) {
 		unsigned long nr_pages;
 		int mt = get_pageblock_migratetype(page);
 
@@ -54,6 +56,13 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
 	spin_unlock_irqrestore(&zone->lock, flags);
 	if (!ret)
 		drain_all_pages(zone);
+	else if (isol_flags & REPORT_FAILURE && unmovable)
+		/*
+		 * printk() with zone->lock held will guarantee to trigger a
+		 * lockdep splat, so defer it here.
+		 */
+		dump_page(unmovable, "unmovable page");
+
 	return ret;
 }
 
-- 
2.21.0 (Apple Git-122.2)



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH -next v3] mm/hotplug: silence a lockdep splat with printk()
  2020-01-15 17:29 [PATCH -next v3] mm/hotplug: silence a lockdep splat with printk() Qian Cai
@ 2020-01-16 14:28 ` Michal Hocko
  2020-01-16 14:53   ` Qian Cai
  0 siblings, 1 reply; 8+ messages in thread
From: Michal Hocko @ 2020-01-16 14:28 UTC (permalink / raw)
  To: Qian Cai
  Cc: akpm, sergey.senozhatsky.work, pmladek, rostedt, peterz, david,
	linux-mm, linux-kernel

On Wed 15-01-20 12:29:16, Qian Cai wrote:
> It is guaranteed to trigger a lockdep splat if calling printk() with
> zone->lock held because there are many places (tty, console drivers,
> debugobjects etc) would allocate some memory with another lock
> held which is proved to be difficult to fix them all.

I am still not happy with the above much. What would say about something
like below instead?
"
It is not that hard to trigger lockdep splats by calling printk from
under zone->lock. Most of them are false positives caused by lock chains
introduced early in the boot process and they do not cause any real
problems. There are some console drivers which do allocate from the
printk context as well and those should be fixed. In any case false
positives are not that trivial to workaround and it is far from optimal
to lose lockdep functionality for something that is a non-issue.
<An example of such a false positive goes here>
"

> A common workaround until the onging effort to make all printk() as
> deferred happens is to use printk_deferred() in those places similar to
> the recent commit [1] merged into the random and -next trees, but memory
> offline will call dump_page() which needs to be deferred after the lock.

I would remove this paragraph altogether. The real problem is that we do
not really want to make dump_page deferred.

> So change has_unmovable_pages() so that it no longer calls dump_page()
> itself - instead it returns a "struct page *" of the unmovable page back
> to the caller so that in the case of a has_unmovable_pages() failure,
> the caller can call dump_page() after releasing zone->lock.

Again this begs for some more explanation why this is ok. Something like
the following:
"
Even though has_unmovable_pages doesn't hold any reference to the
returned page this should be reasonably safe for the purpose of
reporting the page (dump_page) because it cannot be hotremoved. The
state of the page might change but that is the case even with the
existing code as zone->lock only plays role for free pages.
"

> Also, make
> dump_page() is able to report a CMA page as well, so the reason string
> from has_unmovable_pages() can be removed.
> 
> While at it, remove a similar but unnecessary debug-only printk() as
> well. A few sample lockdep splats can be founnd here [2].
> 
> [1] https://lore.kernel.org/lkml/1573679785-21068-1-git-send-email-cai@lca.pw/
> [2] https://lore.kernel.org/lkml/7CD27FC6-CFFF-4519-A57D-85179E9815FE@lca.pw/
> 
> Signed-off-by: Qian Cai <cai@lca.pw>

With improved changelog and after addressing one more thing below, feel
free to add
Acked-by: Michal Hocko <mhocko@suse.com>

Thanks for working on this. I have to say I have disliked the initial
version because they were really hacky. This one can be reasoned about
at least. has_unmovable_pages returning an unmovable page actually makes
some conceptual sense to me.

[...]
> @@ -8302,12 +8304,10 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int migratetype,
>  		 */
>  		goto unmovable;
>  	}
> -	return false;
> +	return NULL;
>  unmovable:
>  	WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE);

You want to move this WARN_ON as well. Not only because it is using
printk as well but also because we do not really want to trigger the
warning from userspace via is_mem_section_removable. Maybe worth a patch
on its own?
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH -next v3] mm/hotplug: silence a lockdep splat with printk()
  2020-01-16 14:28 ` Michal Hocko
@ 2020-01-16 14:53   ` Qian Cai
  2020-01-16 15:54     ` Michal Hocko
  0 siblings, 1 reply; 8+ messages in thread
From: Qian Cai @ 2020-01-16 14:53 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Sergey Senozhatsky, pmladek, rostedt, peterz,
	david, linux-mm, linux-kernel



> On Jan 16, 2020, at 9:28 AM, Michal Hocko <mhocko@kernel.org> wrote:
> 
> On Wed 15-01-20 12:29:16, Qian Cai wrote:
>> It is guaranteed to trigger a lockdep splat if calling printk() with
>> zone->lock held because there are many places (tty, console drivers,
>> debugobjects etc) would allocate some memory with another lock
>> held which is proved to be difficult to fix them all.
> 
> I am still not happy with the above much. What would say about something
> like below instead?
> "
> It is not that hard to trigger lockdep splats by calling printk from
> under zone->lock. Most of them are false positives caused by lock chains
> introduced early in the boot process and they do not cause any real
> problems. There are some console drivers which do allocate from the
> printk context as well and those should be fixed. In any case false
> positives are not that trivial to workaround and it is far from optimal
> to lose lockdep functionality for something that is a non-issue.
> <An example of such a false positive goes here>
> "

I feel like I repeated myself too many times. A call trace for one lock dependency
is sometimes from early boot process because lockdep will save the first one it
encountered, but it does not mean the lock dependency will only not happen in
early boot. I spent some time to study those early boot call traces in the given
lockdep splats, and it looks to me the lock dependency is also possible after
the boot.

> 
>> A common workaround until the onging effort to make all printk() as
>> deferred happens is to use printk_deferred() in those places similar to
>> the recent commit [1] merged into the random and -next trees, but memory
>> offline will call dump_page() which needs to be deferred after the lock.
> 
> I would remove this paragraph altogether. The real problem is that we do
> not really want to make dump_page deferred.
> 
>> So change has_unmovable_pages() so that it no longer calls dump_page()
>> itself - instead it returns a "struct page *" of the unmovable page back
>> to the caller so that in the case of a has_unmovable_pages() failure,
>> the caller can call dump_page() after releasing zone->lock.
> 
> Again this begs for some more explanation why this is ok. Something like
> the following:
> "
> Even though has_unmovable_pages doesn't hold any reference to the
> returned page this should be reasonably safe for the purpose of
> reporting the page (dump_page) because it cannot be hotremoved. The
> state of the page might change but that is the case even with the
> existing code as zone->lock only plays role for free pages.
> "

Looks like a better version. Maybe Andrew could squash that in?

> 
>> Also, make
>> dump_page() is able to report a CMA page as well, so the reason string
>> from has_unmovable_pages() can be removed.
>> 
>> While at it, remove a similar but unnecessary debug-only printk() as
>> well. A few sample lockdep splats can be founnd here [2].
>> 
>> [1] https://lore.kernel.org/lkml/1573679785-21068-1-git-send-email-cai@lca.pw/
>> [2] https://lore.kernel.org/lkml/7CD27FC6-CFFF-4519-A57D-85179E9815FE@lca.pw/
>> 
>> Signed-off-by: Qian Cai <cai@lca.pw>
> 
> With improved changelog and after addressing one more thing below, feel
> free to add
> Acked-by: Michal Hocko <mhocko@suse.com>
> 
> Thanks for working on this. I have to say I have disliked the initial
> version because they were really hacky. This one can be reasoned about
> at least. has_unmovable_pages returning an unmovable page actually makes
> some conceptual sense to me.
> 
> [...]
>> @@ -8302,12 +8304,10 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int migratetype,
>> 		 */
>> 		goto unmovable;
>> 	}
>> -	return false;
>> +	return NULL;
>> unmovable:
>> 	WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE);
> 
> You want to move this WARN_ON as well. Not only because it is using
> printk as well but also because we do not really want to trigger the
> warning from userspace via is_mem_section_removable. Maybe worth a patch
> on its own?

Yes, I have never triggered the warning there before, so I don’t think that is a problem
belong to here.



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH -next v3] mm/hotplug: silence a lockdep splat with printk()
  2020-01-16 14:53   ` Qian Cai
@ 2020-01-16 15:54     ` Michal Hocko
  2020-01-16 16:04       ` David Hildenbrand
  2020-01-16 16:05       ` Qian Cai
  0 siblings, 2 replies; 8+ messages in thread
From: Michal Hocko @ 2020-01-16 15:54 UTC (permalink / raw)
  To: Qian Cai
  Cc: Andrew Morton, Sergey Senozhatsky, pmladek, rostedt, peterz,
	david, linux-mm, linux-kernel

On Thu 16-01-20 09:53:13, Qian Cai wrote:
> 
> 
> > On Jan 16, 2020, at 9:28 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > 
> > On Wed 15-01-20 12:29:16, Qian Cai wrote:
> >> It is guaranteed to trigger a lockdep splat if calling printk() with
> >> zone->lock held because there are many places (tty, console drivers,
> >> debugobjects etc) would allocate some memory with another lock
> >> held which is proved to be difficult to fix them all.
> > 
> > I am still not happy with the above much. What would say about something
> > like below instead?
> > "
> > It is not that hard to trigger lockdep splats by calling printk from
> > under zone->lock. Most of them are false positives caused by lock chains
> > introduced early in the boot process and they do not cause any real
> > problems. There are some console drivers which do allocate from the
> > printk context as well and those should be fixed. In any case false
> > positives are not that trivial to workaround and it is far from optimal
> > to lose lockdep functionality for something that is a non-issue.
> > <An example of such a false positive goes here>
> > "
> 
> I feel like I repeated myself too many times. A call trace for one lock dependency
> is sometimes from early boot process because lockdep will save the first one it
> encountered, but it does not mean the lock dependency will only not happen in
> early boot. I spent some time to study those early boot call traces in the given
> lockdep splats, and it looks to me the lock dependency is also possible after
> the boot.

Then state it explicitly with an example of the trace and explanation
that the deadlock is real. If the deadlock is real then it shouldn't be
really terribly hard to notice even without lockdep splats which get
disabled after the first false positive, right?

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH -next v3] mm/hotplug: silence a lockdep splat with printk()
  2020-01-16 15:54     ` Michal Hocko
@ 2020-01-16 16:04       ` David Hildenbrand
  2020-01-16 16:27         ` Qian Cai
  2020-01-16 16:05       ` Qian Cai
  1 sibling, 1 reply; 8+ messages in thread
From: David Hildenbrand @ 2020-01-16 16:04 UTC (permalink / raw)
  To: Michal Hocko, Qian Cai
  Cc: Andrew Morton, Sergey Senozhatsky, pmladek, rostedt, peterz,
	linux-mm, linux-kernel

On 16.01.20 16:54, Michal Hocko wrote:
> On Thu 16-01-20 09:53:13, Qian Cai wrote:
>>
>>
>>> On Jan 16, 2020, at 9:28 AM, Michal Hocko <mhocko@kernel.org> wrote:
>>>
>>> On Wed 15-01-20 12:29:16, Qian Cai wrote:
>>>> It is guaranteed to trigger a lockdep splat if calling printk() with
>>>> zone->lock held because there are many places (tty, console drivers,
>>>> debugobjects etc) would allocate some memory with another lock
>>>> held which is proved to be difficult to fix them all.
>>>
>>> I am still not happy with the above much. What would say about something
>>> like below instead?
>>> "
>>> It is not that hard to trigger lockdep splats by calling printk from
>>> under zone->lock. Most of them are false positives caused by lock chains
>>> introduced early in the boot process and they do not cause any real
>>> problems. There are some console drivers which do allocate from the
>>> printk context as well and those should be fixed. In any case false
>>> positives are not that trivial to workaround and it is far from optimal
>>> to lose lockdep functionality for something that is a non-issue.
>>> <An example of such a false positive goes here>
>>> "
>>
>> I feel like I repeated myself too many times. A call trace for one lock dependency
>> is sometimes from early boot process because lockdep will save the first one it
>> encountered, but it does not mean the lock dependency will only not happen in
>> early boot. I spent some time to study those early boot call traces in the given
>> lockdep splats, and it looks to me the lock dependency is also possible after
>> the boot.
> 
> Then state it explicitly with an example of the trace and explanation
> that the deadlock is real. If the deadlock is real then it shouldn't be
> really terribly hard to notice even without lockdep splats which get
> disabled after the first false positive, right?

I was asking myself for a long time: did anybody actually see this
deadlock in real life?

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH -next v3] mm/hotplug: silence a lockdep splat with printk()
  2020-01-16 15:54     ` Michal Hocko
  2020-01-16 16:04       ` David Hildenbrand
@ 2020-01-16 16:05       ` Qian Cai
  2020-01-16 17:43         ` Michal Hocko
  1 sibling, 1 reply; 8+ messages in thread
From: Qian Cai @ 2020-01-16 16:05 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Sergey Senozhatsky, pmladek, rostedt, peterz,
	david, linux-mm, linux-kernel



> On Jan 16, 2020, at 10:54 AM, Michal Hocko <mhocko@kernel.org> wrote:
> 
> On Thu 16-01-20 09:53:13, Qian Cai wrote:
>> 
>> 
>>> On Jan 16, 2020, at 9:28 AM, Michal Hocko <mhocko@kernel.org> wrote:
>>> 
>>> On Wed 15-01-20 12:29:16, Qian Cai wrote:
>>>> It is guaranteed to trigger a lockdep splat if calling printk() with
>>>> zone->lock held because there are many places (tty, console drivers,
>>>> debugobjects etc) would allocate some memory with another lock
>>>> held which is proved to be difficult to fix them all.
>>> 
>>> I am still not happy with the above much. What would say about something
>>> like below instead?
>>> "
>>> It is not that hard to trigger lockdep splats by calling printk from
>>> under zone->lock. Most of them are false positives caused by lock chains
>>> introduced early in the boot process and they do not cause any real
>>> problems. There are some console drivers which do allocate from the
>>> printk context as well and those should be fixed. In any case false
>>> positives are not that trivial to workaround and it is far from optimal
>>> to lose lockdep functionality for something that is a non-issue.
>>> <An example of such a false positive goes here>
>>> "
>> 
>> I feel like I repeated myself too many times. A call trace for one lock dependency
>> is sometimes from early boot process because lockdep will save the first one it
>> encountered, but it does not mean the lock dependency will only not happen in
>> early boot. I spent some time to study those early boot call traces in the given
>> lockdep splats, and it looks to me the lock dependency is also possible after
>> the boot.
> 
> Then state it explicitly with an example of the trace and explanation
> that the deadlock is real. If the deadlock is real then it shouldn't be
> really terribly hard to notice even without lockdep splats which get
> disabled after the first false positive, right?

A deadlock could be really hard to trigger though which needs a perfect
timing between multiple threads.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH -next v3] mm/hotplug: silence a lockdep splat with printk()
  2020-01-16 16:04       ` David Hildenbrand
@ 2020-01-16 16:27         ` Qian Cai
  0 siblings, 0 replies; 8+ messages in thread
From: Qian Cai @ 2020-01-16 16:27 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Michal Hocko, Andrew Morton, Sergey Senozhatsky, pmladek,
	rostedt, peterz, linux-mm, linux-kernel



> On Jan 16, 2020, at 11:04 AM, David Hildenbrand <david@redhat.com> wrote:
> 
> On 16.01.20 16:54, Michal Hocko wrote:
>> On Thu 16-01-20 09:53:13, Qian Cai wrote:
>>> 
>>> 
>>>> On Jan 16, 2020, at 9:28 AM, Michal Hocko <mhocko@kernel.org> wrote:
>>>> 
>>>> On Wed 15-01-20 12:29:16, Qian Cai wrote:
>>>>> It is guaranteed to trigger a lockdep splat if calling printk() with
>>>>> zone->lock held because there are many places (tty, console drivers,
>>>>> debugobjects etc) would allocate some memory with another lock
>>>>> held which is proved to be difficult to fix them all.
>>>> 
>>>> I am still not happy with the above much. What would say about something
>>>> like below instead?
>>>> "
>>>> It is not that hard to trigger lockdep splats by calling printk from
>>>> under zone->lock. Most of them are false positives caused by lock chains
>>>> introduced early in the boot process and they do not cause any real
>>>> problems. There are some console drivers which do allocate from the
>>>> printk context as well and those should be fixed. In any case false
>>>> positives are not that trivial to workaround and it is far from optimal
>>>> to lose lockdep functionality for something that is a non-issue.
>>>> <An example of such a false positive goes here>
>>>> "
>>> 
>>> I feel like I repeated myself too many times. A call trace for one lock dependency
>>> is sometimes from early boot process because lockdep will save the first one it
>>> encountered, but it does not mean the lock dependency will only not happen in
>>> early boot. I spent some time to study those early boot call traces in the given
>>> lockdep splats, and it looks to me the lock dependency is also possible after
>>> the boot.
>> 
>> Then state it explicitly with an example of the trace and explanation
>> that the deadlock is real. If the deadlock is real then it shouldn't be
>> really terribly hard to notice even without lockdep splats which get
>> disabled after the first false positive, right?
> 
> I was asking myself for a long time: did anybody actually see this
> deadlock in real life?

Nobody knows for sure. I think one reason is that not many people will use
memory offiline even if they do, it will mostly not be a continuous activity in
the system. debugobjects make it way easier to reproduce because it allocates
memory in random places, but then it is not all that popular.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH -next v3] mm/hotplug: silence a lockdep splat with printk()
  2020-01-16 16:05       ` Qian Cai
@ 2020-01-16 17:43         ` Michal Hocko
  0 siblings, 0 replies; 8+ messages in thread
From: Michal Hocko @ 2020-01-16 17:43 UTC (permalink / raw)
  To: Qian Cai
  Cc: Andrew Morton, Sergey Senozhatsky, pmladek, rostedt, peterz,
	david, linux-mm, linux-kernel

On Thu 16-01-20 11:05:07, Qian Cai wrote:
> 
> 
> > On Jan 16, 2020, at 10:54 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > 
> > On Thu 16-01-20 09:53:13, Qian Cai wrote:
> >> 
> >> 
> >>> On Jan 16, 2020, at 9:28 AM, Michal Hocko <mhocko@kernel.org> wrote:
> >>> 
> >>> On Wed 15-01-20 12:29:16, Qian Cai wrote:
> >>>> It is guaranteed to trigger a lockdep splat if calling printk() with
> >>>> zone->lock held because there are many places (tty, console drivers,
> >>>> debugobjects etc) would allocate some memory with another lock
> >>>> held which is proved to be difficult to fix them all.
> >>> 
> >>> I am still not happy with the above much. What would say about something
> >>> like below instead?
> >>> "
> >>> It is not that hard to trigger lockdep splats by calling printk from
> >>> under zone->lock. Most of them are false positives caused by lock chains
> >>> introduced early in the boot process and they do not cause any real
> >>> problems. There are some console drivers which do allocate from the
> >>> printk context as well and those should be fixed. In any case false
> >>> positives are not that trivial to workaround and it is far from optimal
> >>> to lose lockdep functionality for something that is a non-issue.
> >>> <An example of such a false positive goes here>
> >>> "
> >> 
> >> I feel like I repeated myself too many times. A call trace for one lock dependency
> >> is sometimes from early boot process because lockdep will save the first one it
> >> encountered, but it does not mean the lock dependency will only not happen in
> >> early boot. I spent some time to study those early boot call traces in the given
> >> lockdep splats, and it looks to me the lock dependency is also possible after
> >> the boot.
> > 
> > Then state it explicitly with an example of the trace and explanation
> > that the deadlock is real. If the deadlock is real then it shouldn't be
> > really terribly hard to notice even without lockdep splats which get
> > disabled after the first false positive, right?
> 
> A deadlock could be really hard to trigger though which needs a perfect
> timing between multiple threads.

All I am saying is: Do not speculate in changelog. Make clear arguments.
So far we have seen many false positives and that is stated in the
wording I have suggested. It is also explained why those suck. There is
also a note that _some_ consoles might indeed deadlock. Compare that to
the original changelog which doesn't really saying anything useful about
those lockdep splats.

I obviously do not insist on my wording but please make the changelog
clear on the actual problem and stick to facts.

Thanks!
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, back to index

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-15 17:29 [PATCH -next v3] mm/hotplug: silence a lockdep splat with printk() Qian Cai
2020-01-16 14:28 ` Michal Hocko
2020-01-16 14:53   ` Qian Cai
2020-01-16 15:54     ` Michal Hocko
2020-01-16 16:04       ` David Hildenbrand
2020-01-16 16:27         ` Qian Cai
2020-01-16 16:05       ` Qian Cai
2020-01-16 17:43         ` Michal Hocko

Linux-mm Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-mm/0 linux-mm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-mm linux-mm/ https://lore.kernel.org/linux-mm \
		linux-mm@kvack.org
	public-inbox-index linux-mm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kvack.linux-mm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git