All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] mm, memory_hotplug: improve memory offlining failures debugging
@ 2018-11-16  8:30 ` Michal Hocko
  0 siblings, 0 replies; 25+ messages in thread
From: Michal Hocko @ 2018-11-16  8:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oscar Salvador, Baoquan He, Anshuman Khandual, linux-mm, LKML

Hi,
this has been posted as an RFC [1]. I have screwed during rebasing so
there were few compilation issues in the previous version. I have also
integrated review feedback from Andrew and Anshuman.

I have been promissing to improve memory offlining failures debugging
for quite some time. As things stand now we get only very limited
information in the kernel log when the offlining fails. It is usually
only
[ 1984.506184] rac1 kernel: memory offlining [mem 0x82600000000-0x8267fffffff] failed
without no further details. We do not know what exactly fails and for
what reason. Whenever I was forced to debug such a failure I've always
had to do a debugging patch to tell me more. We can enable some
tracepoints but it would be much better to get a better picture without
using them.

This patch series does 2 things. The first one is to make dump_page
more usable by printing more information about the mapping patch 1.
Then it reduces the log level from emerg to warning so that this
function is usable from less critical context patch 2. Then I have
added more detailed information about the offlining failure patch 4
and finally add dump_page to isolation and offlining migration paths.
Patch 3 is a trivial cleanup.

Does this look go to you?

[1] http://lkml.kernel.org/r/20181107101830.17405-1-mhocko@kernel.org

Shortlog
Michal Hocko (5):
      mm: print more information about mapping in __dump_page
      mm: lower the printk loglevel for __dump_page messages
      mm, memory_hotplug: drop pointless block alignment checks from __offline_pages
      mm, memory_hotplug: print reason for the offlining failure
      mm, memory_hotplug: be more verbose for memory offline failures

Diffstat
 mm/debug.c          | 23 ++++++++++++++++++-----
 mm/memory_hotplug.c | 52 +++++++++++++++++++++++++++++++---------------------
 mm/page_alloc.c     |  1 +
 3 files changed, 50 insertions(+), 26 deletions(-)



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 0/5] mm, memory_hotplug: improve memory offlining failures debugging
@ 2018-11-16  8:30 ` Michal Hocko
  0 siblings, 0 replies; 25+ messages in thread
From: Michal Hocko @ 2018-11-16  8:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oscar Salvador, Baoquan He, Anshuman Khandual, linux-mm, LKML

Hi,
this has been posted as an RFC [1]. I have screwed during rebasing so
there were few compilation issues in the previous version. I have also
integrated review feedback from Andrew and Anshuman.

I have been promissing to improve memory offlining failures debugging
for quite some time. As things stand now we get only very limited
information in the kernel log when the offlining fails. It is usually
only
[ 1984.506184] rac1 kernel: memory offlining [mem 0x82600000000-0x8267fffffff] failed
without no further details. We do not know what exactly fails and for
what reason. Whenever I was forced to debug such a failure I've always
had to do a debugging patch to tell me more. We can enable some
tracepoints but it would be much better to get a better picture without
using them.

This patch series does 2 things. The first one is to make dump_page
more usable by printing more information about the mapping patch 1.
Then it reduces the log level from emerg to warning so that this
function is usable from less critical context patch 2. Then I have
added more detailed information about the offlining failure patch 4
and finally add dump_page to isolation and offlining migration paths.
Patch 3 is a trivial cleanup.

Does this look go to you?

[1] http://lkml.kernel.org/r/20181107101830.17405-1-mhocko@kernel.org

Shortlog
Michal Hocko (5):
      mm: print more information about mapping in __dump_page
      mm: lower the printk loglevel for __dump_page messages
      mm, memory_hotplug: drop pointless block alignment checks from __offline_pages
      mm, memory_hotplug: print reason for the offlining failure
      mm, memory_hotplug: be more verbose for memory offline failures

Diffstat
 mm/debug.c          | 23 ++++++++++++++++++-----
 mm/memory_hotplug.c | 52 +++++++++++++++++++++++++++++++---------------------
 mm/page_alloc.c     |  1 +
 3 files changed, 50 insertions(+), 26 deletions(-)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/5] mm: print more information about mapping in __dump_page
  2018-11-16  8:30 ` Michal Hocko
@ 2018-11-16  8:30   ` Michal Hocko
  -1 siblings, 0 replies; 25+ messages in thread
From: Michal Hocko @ 2018-11-16  8:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oscar Salvador, Baoquan He, Anshuman Khandual, linux-mm, LKML,
	Michal Hocko

From: Michal Hocko <mhocko@suse.com>

__dump_page prints the mapping pointer but that is quite unhelpful
for many reports because the pointer itself only helps to distinguish
anon/ksm mappings from other ones (because of lowest bits
set). Sometimes it would be much more helpful to know what kind of
mapping that is actually and if we know this is a file mapping then also
try to resolve the dentry name.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/debug.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/mm/debug.c b/mm/debug.c
index cdacba12e09a..a33177bfc856 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -44,6 +44,7 @@ const struct trace_print_flags vmaflag_names[] = {
 
 void __dump_page(struct page *page, const char *reason)
 {
+	struct address_space *mapping = page_mapping(page);
 	bool page_poisoned = PagePoisoned(page);
 	int mapcount;
 
@@ -70,6 +71,18 @@ void __dump_page(struct page *page, const char *reason)
 	if (PageCompound(page))
 		pr_cont(" compound_mapcount: %d", compound_mapcount(page));
 	pr_cont("\n");
+	if (PageAnon(page))
+		pr_emerg("anon ");
+	else if (PageKsm(page))
+		pr_emerg("ksm ");
+	else if (mapping) {
+		pr_emerg("%ps ", mapping->a_ops);
+		if (mapping->host->i_dentry.first) {
+			struct dentry *dentry;
+			dentry = container_of(mapping->host->i_dentry.first, struct dentry, d_u.d_alias);
+			pr_emerg("name:\"%*s\" ", dentry->d_name.len, dentry->d_name.name);
+		}
+	}
 	BUILD_BUG_ON(ARRAY_SIZE(pageflag_names) != __NR_PAGEFLAGS + 1);
 
 	pr_emerg("flags: %#lx(%pGp)\n", page->flags, &page->flags);
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 1/5] mm: print more information about mapping in __dump_page
@ 2018-11-16  8:30   ` Michal Hocko
  0 siblings, 0 replies; 25+ messages in thread
From: Michal Hocko @ 2018-11-16  8:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oscar Salvador, Baoquan He, Anshuman Khandual, linux-mm, LKML,
	Michal Hocko

From: Michal Hocko <mhocko@suse.com>

__dump_page prints the mapping pointer but that is quite unhelpful
for many reports because the pointer itself only helps to distinguish
anon/ksm mappings from other ones (because of lowest bits
set). Sometimes it would be much more helpful to know what kind of
mapping that is actually and if we know this is a file mapping then also
try to resolve the dentry name.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/debug.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/mm/debug.c b/mm/debug.c
index cdacba12e09a..a33177bfc856 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -44,6 +44,7 @@ const struct trace_print_flags vmaflag_names[] = {
 
 void __dump_page(struct page *page, const char *reason)
 {
+	struct address_space *mapping = page_mapping(page);
 	bool page_poisoned = PagePoisoned(page);
 	int mapcount;
 
@@ -70,6 +71,18 @@ void __dump_page(struct page *page, const char *reason)
 	if (PageCompound(page))
 		pr_cont(" compound_mapcount: %d", compound_mapcount(page));
 	pr_cont("\n");
+	if (PageAnon(page))
+		pr_emerg("anon ");
+	else if (PageKsm(page))
+		pr_emerg("ksm ");
+	else if (mapping) {
+		pr_emerg("%ps ", mapping->a_ops);
+		if (mapping->host->i_dentry.first) {
+			struct dentry *dentry;
+			dentry = container_of(mapping->host->i_dentry.first, struct dentry, d_u.d_alias);
+			pr_emerg("name:\"%*s\" ", dentry->d_name.len, dentry->d_name.name);
+		}
+	}
 	BUILD_BUG_ON(ARRAY_SIZE(pageflag_names) != __NR_PAGEFLAGS + 1);
 
 	pr_emerg("flags: %#lx(%pGp)\n", page->flags, &page->flags);
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/5] mm: lower the printk loglevel for __dump_page messages
  2018-11-16  8:30 ` Michal Hocko
@ 2018-11-16  8:30   ` Michal Hocko
  -1 siblings, 0 replies; 25+ messages in thread
From: Michal Hocko @ 2018-11-16  8:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oscar Salvador, Baoquan He, Anshuman Khandual, linux-mm, LKML,
	Michal Hocko

From: Michal Hocko <mhocko@suse.com>

__dump_page messages use KERN_EMERG resp. KERN_ALERT loglevel (this is
the case since 2004). Most callers of this function are really detecting
a critical page state and BUG right after. On the other hand the
function is called also from contexts which just want to inform about
the page state and those would rather not disrupt logs that much (e.g.
some systems route these messages to the normal console).

Reduce the loglevel to KERN_WARNING to make dump_page easier to reuse
for other contexts while those messages will still make it to the kernel
log in most setups. Even if the loglevel setup filters warnings away
those paths that are really critical already print the more targeted
error or panic and that should make it to the kernel log.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/debug.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/mm/debug.c b/mm/debug.c
index a33177bfc856..d18c5cea3320 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -54,7 +54,7 @@ void __dump_page(struct page *page, const char *reason)
 	 * dump_page() when detected.
 	 */
 	if (page_poisoned) {
-		pr_emerg("page:%px is uninitialized and poisoned", page);
+		pr_warn("page:%px is uninitialized and poisoned", page);
 		goto hex_only;
 	}
 
@@ -65,27 +65,27 @@ void __dump_page(struct page *page, const char *reason)
 	 */
 	mapcount = PageSlab(page) ? 0 : page_mapcount(page);
 
-	pr_emerg("page:%px count:%d mapcount:%d mapping:%px index:%#lx",
+	pr_warn("page:%px count:%d mapcount:%d mapping:%px index:%#lx",
 		  page, page_ref_count(page), mapcount,
 		  page->mapping, page_to_pgoff(page));
 	if (PageCompound(page))
 		pr_cont(" compound_mapcount: %d", compound_mapcount(page));
 	pr_cont("\n");
 	if (PageAnon(page))
-		pr_emerg("anon ");
+		pr_warn("anon ");
 	else if (PageKsm(page))
-		pr_emerg("ksm ");
+		pr_warn("ksm ");
 	else if (mapping) {
-		pr_emerg("%ps ", mapping->a_ops);
+		pr_warn("%ps ", mapping->a_ops);
 		if (mapping->host->i_dentry.first) {
 			struct dentry *dentry;
 			dentry = container_of(mapping->host->i_dentry.first, struct dentry, d_u.d_alias);
-			pr_emerg("name:\"%*s\" ", dentry->d_name.len, dentry->d_name.name);
+			pr_warn("name:\"%*s\" ", dentry->d_name.len, dentry->d_name.name);
 		}
 	}
 	BUILD_BUG_ON(ARRAY_SIZE(pageflag_names) != __NR_PAGEFLAGS + 1);
 
-	pr_emerg("flags: %#lx(%pGp)\n", page->flags, &page->flags);
+	pr_warn("flags: %#lx(%pGp)\n", page->flags, &page->flags);
 
 hex_only:
 	print_hex_dump(KERN_ALERT, "raw: ", DUMP_PREFIX_NONE, 32,
@@ -93,11 +93,11 @@ void __dump_page(struct page *page, const char *reason)
 			sizeof(struct page), false);
 
 	if (reason)
-		pr_alert("page dumped because: %s\n", reason);
+		pr_warn("page dumped because: %s\n", reason);
 
 #ifdef CONFIG_MEMCG
 	if (!page_poisoned && page->mem_cgroup)
-		pr_alert("page->mem_cgroup:%px\n", page->mem_cgroup);
+		pr_warn("page->mem_cgroup:%px\n", page->mem_cgroup);
 #endif
 }
 
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/5] mm: lower the printk loglevel for __dump_page messages
@ 2018-11-16  8:30   ` Michal Hocko
  0 siblings, 0 replies; 25+ messages in thread
From: Michal Hocko @ 2018-11-16  8:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oscar Salvador, Baoquan He, Anshuman Khandual, linux-mm, LKML,
	Michal Hocko

From: Michal Hocko <mhocko@suse.com>

__dump_page messages use KERN_EMERG resp. KERN_ALERT loglevel (this is
the case since 2004). Most callers of this function are really detecting
a critical page state and BUG right after. On the other hand the
function is called also from contexts which just want to inform about
the page state and those would rather not disrupt logs that much (e.g.
some systems route these messages to the normal console).

Reduce the loglevel to KERN_WARNING to make dump_page easier to reuse
for other contexts while those messages will still make it to the kernel
log in most setups. Even if the loglevel setup filters warnings away
those paths that are really critical already print the more targeted
error or panic and that should make it to the kernel log.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/debug.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/mm/debug.c b/mm/debug.c
index a33177bfc856..d18c5cea3320 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -54,7 +54,7 @@ void __dump_page(struct page *page, const char *reason)
 	 * dump_page() when detected.
 	 */
 	if (page_poisoned) {
-		pr_emerg("page:%px is uninitialized and poisoned", page);
+		pr_warn("page:%px is uninitialized and poisoned", page);
 		goto hex_only;
 	}
 
@@ -65,27 +65,27 @@ void __dump_page(struct page *page, const char *reason)
 	 */
 	mapcount = PageSlab(page) ? 0 : page_mapcount(page);
 
-	pr_emerg("page:%px count:%d mapcount:%d mapping:%px index:%#lx",
+	pr_warn("page:%px count:%d mapcount:%d mapping:%px index:%#lx",
 		  page, page_ref_count(page), mapcount,
 		  page->mapping, page_to_pgoff(page));
 	if (PageCompound(page))
 		pr_cont(" compound_mapcount: %d", compound_mapcount(page));
 	pr_cont("\n");
 	if (PageAnon(page))
-		pr_emerg("anon ");
+		pr_warn("anon ");
 	else if (PageKsm(page))
-		pr_emerg("ksm ");
+		pr_warn("ksm ");
 	else if (mapping) {
-		pr_emerg("%ps ", mapping->a_ops);
+		pr_warn("%ps ", mapping->a_ops);
 		if (mapping->host->i_dentry.first) {
 			struct dentry *dentry;
 			dentry = container_of(mapping->host->i_dentry.first, struct dentry, d_u.d_alias);
-			pr_emerg("name:\"%*s\" ", dentry->d_name.len, dentry->d_name.name);
+			pr_warn("name:\"%*s\" ", dentry->d_name.len, dentry->d_name.name);
 		}
 	}
 	BUILD_BUG_ON(ARRAY_SIZE(pageflag_names) != __NR_PAGEFLAGS + 1);
 
-	pr_emerg("flags: %#lx(%pGp)\n", page->flags, &page->flags);
+	pr_warn("flags: %#lx(%pGp)\n", page->flags, &page->flags);
 
 hex_only:
 	print_hex_dump(KERN_ALERT, "raw: ", DUMP_PREFIX_NONE, 32,
@@ -93,11 +93,11 @@ void __dump_page(struct page *page, const char *reason)
 			sizeof(struct page), false);
 
 	if (reason)
-		pr_alert("page dumped because: %s\n", reason);
+		pr_warn("page dumped because: %s\n", reason);
 
 #ifdef CONFIG_MEMCG
 	if (!page_poisoned && page->mem_cgroup)
-		pr_alert("page->mem_cgroup:%px\n", page->mem_cgroup);
+		pr_warn("page->mem_cgroup:%px\n", page->mem_cgroup);
 #endif
 }
 
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/5] mm, memory_hotplug: drop pointless block alignment checks from __offline_pages
  2018-11-16  8:30 ` Michal Hocko
@ 2018-11-16  8:30   ` Michal Hocko
  -1 siblings, 0 replies; 25+ messages in thread
From: Michal Hocko @ 2018-11-16  8:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oscar Salvador, Baoquan He, Anshuman Khandual, linux-mm, LKML,
	Michal Hocko

From: Michal Hocko <mhocko@suse.com>

This function is never called from a context which would provide
misaligned pfn range so drop the pointless check.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/memory_hotplug.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 2b2b3ccbbfb5..a92b1b8f6218 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1554,12 +1554,6 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	struct zone *zone;
 	struct memory_notify arg;
 
-	/* at least, alignment against pageblock is necessary */
-	if (!IS_ALIGNED(start_pfn, pageblock_nr_pages))
-		return -EINVAL;
-	if (!IS_ALIGNED(end_pfn, pageblock_nr_pages))
-		return -EINVAL;
-
 	mem_hotplug_begin();
 
 	/* This makes hotplug much easier...and readable.
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/5] mm, memory_hotplug: drop pointless block alignment checks from __offline_pages
@ 2018-11-16  8:30   ` Michal Hocko
  0 siblings, 0 replies; 25+ messages in thread
From: Michal Hocko @ 2018-11-16  8:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oscar Salvador, Baoquan He, Anshuman Khandual, linux-mm, LKML,
	Michal Hocko

From: Michal Hocko <mhocko@suse.com>

This function is never called from a context which would provide
misaligned pfn range so drop the pointless check.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/memory_hotplug.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 2b2b3ccbbfb5..a92b1b8f6218 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1554,12 +1554,6 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	struct zone *zone;
 	struct memory_notify arg;
 
-	/* at least, alignment against pageblock is necessary */
-	if (!IS_ALIGNED(start_pfn, pageblock_nr_pages))
-		return -EINVAL;
-	if (!IS_ALIGNED(end_pfn, pageblock_nr_pages))
-		return -EINVAL;
-
 	mem_hotplug_begin();
 
 	/* This makes hotplug much easier...and readable.
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/5] mm, memory_hotplug: print reason for the offlining failure
  2018-11-16  8:30 ` Michal Hocko
@ 2018-11-16  8:30   ` Michal Hocko
  -1 siblings, 0 replies; 25+ messages in thread
From: Michal Hocko @ 2018-11-16  8:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oscar Salvador, Baoquan He, Anshuman Khandual, linux-mm, LKML,
	Michal Hocko

From: Michal Hocko <mhocko@suse.com>

The memory offlining failure reporting is inconsistent and insufficient.
Some error paths simply do not report the failure to the log at all.
When we do report there are no details about the reason of the failure
and there are several of them which makes memory offlining failures
hard to debug.

Make sure that the
	memory offlining [mem %#010llx-%#010llx] failed
message is printed for all failures and also provide a short textual
reason for the failure e.g.

[ 1984.506184] rac1 kernel: memory offlining [mem 0x82600000000-0x8267fffffff] failed due to signal backoff

this tells us that the offlining has failed because of a signal pending
aka user intervention.

[akpm: tweak messages a bit]
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/memory_hotplug.c | 34 +++++++++++++++++++++++-----------
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a92b1b8f6218..88d50e74e3fe 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1553,6 +1553,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	unsigned long valid_start, valid_end;
 	struct zone *zone;
 	struct memory_notify arg;
+	char *reason;
 
 	mem_hotplug_begin();
 
@@ -1561,7 +1562,9 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start,
 				  &valid_end)) {
 		mem_hotplug_done();
-		return -EINVAL;
+		ret = -EINVAL;
+		reason = "multizone range";
+		goto failed_removal;
 	}
 
 	zone = page_zone(pfn_to_page(valid_start));
@@ -1573,7 +1576,8 @@ static int __ref __offline_pages(unsigned long start_pfn,
 				       MIGRATE_MOVABLE, true);
 	if (ret) {
 		mem_hotplug_done();
-		return ret;
+		reason = "failure to isolate range";
+		goto failed_removal;
 	}
 
 	arg.start_pfn = start_pfn;
@@ -1582,15 +1586,19 @@ static int __ref __offline_pages(unsigned long start_pfn,
 
 	ret = memory_notify(MEM_GOING_OFFLINE, &arg);
 	ret = notifier_to_errno(ret);
-	if (ret)
-		goto failed_removal;
+	if (ret) {
+		reason = "notifier failure";
+		goto failed_removal_isolated;
+	}
 
 	pfn = start_pfn;
 repeat:
 	/* start memory hot removal */
 	ret = -EINTR;
-	if (signal_pending(current))
-		goto failed_removal;
+	if (signal_pending(current)) {
+		reason = "signal backoff";
+		goto failed_removal_isolated;
+	}
 
 	cond_resched();
 	lru_add_drain_all();
@@ -1607,8 +1615,10 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	 * actually in order to make hugetlbfs's object counting consistent.
 	 */
 	ret = dissolve_free_huge_pages(start_pfn, end_pfn);
-	if (ret)
-		goto failed_removal;
+	if (ret) {
+		reason = "failure to dissolve huge pages";
+		goto failed_removal_isolated;
+	}
 	/* check again */
 	offlined_pages = check_pages_isolated(start_pfn, end_pfn);
 	if (offlined_pages < 0)
@@ -1648,13 +1658,15 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	mem_hotplug_done();
 	return 0;
 
+failed_removal_isolated:
+	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
 failed_removal:
-	pr_debug("memory offlining [mem %#010llx-%#010llx] failed\n",
+	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",
 		 (unsigned long long) start_pfn << PAGE_SHIFT,
-		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1);
+		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1,
+		 reason);
 	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 	/* pushback to free area */
-	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
 	mem_hotplug_done();
 	return ret;
 }
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/5] mm, memory_hotplug: print reason for the offlining failure
@ 2018-11-16  8:30   ` Michal Hocko
  0 siblings, 0 replies; 25+ messages in thread
From: Michal Hocko @ 2018-11-16  8:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oscar Salvador, Baoquan He, Anshuman Khandual, linux-mm, LKML,
	Michal Hocko

From: Michal Hocko <mhocko@suse.com>

The memory offlining failure reporting is inconsistent and insufficient.
Some error paths simply do not report the failure to the log at all.
When we do report there are no details about the reason of the failure
and there are several of them which makes memory offlining failures
hard to debug.

Make sure that the
	memory offlining [mem %#010llx-%#010llx] failed
message is printed for all failures and also provide a short textual
reason for the failure e.g.

[ 1984.506184] rac1 kernel: memory offlining [mem 0x82600000000-0x8267fffffff] failed due to signal backoff

this tells us that the offlining has failed because of a signal pending
aka user intervention.

[akpm: tweak messages a bit]
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/memory_hotplug.c | 34 +++++++++++++++++++++++-----------
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a92b1b8f6218..88d50e74e3fe 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1553,6 +1553,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	unsigned long valid_start, valid_end;
 	struct zone *zone;
 	struct memory_notify arg;
+	char *reason;
 
 	mem_hotplug_begin();
 
@@ -1561,7 +1562,9 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start,
 				  &valid_end)) {
 		mem_hotplug_done();
-		return -EINVAL;
+		ret = -EINVAL;
+		reason = "multizone range";
+		goto failed_removal;
 	}
 
 	zone = page_zone(pfn_to_page(valid_start));
@@ -1573,7 +1576,8 @@ static int __ref __offline_pages(unsigned long start_pfn,
 				       MIGRATE_MOVABLE, true);
 	if (ret) {
 		mem_hotplug_done();
-		return ret;
+		reason = "failure to isolate range";
+		goto failed_removal;
 	}
 
 	arg.start_pfn = start_pfn;
@@ -1582,15 +1586,19 @@ static int __ref __offline_pages(unsigned long start_pfn,
 
 	ret = memory_notify(MEM_GOING_OFFLINE, &arg);
 	ret = notifier_to_errno(ret);
-	if (ret)
-		goto failed_removal;
+	if (ret) {
+		reason = "notifier failure";
+		goto failed_removal_isolated;
+	}
 
 	pfn = start_pfn;
 repeat:
 	/* start memory hot removal */
 	ret = -EINTR;
-	if (signal_pending(current))
-		goto failed_removal;
+	if (signal_pending(current)) {
+		reason = "signal backoff";
+		goto failed_removal_isolated;
+	}
 
 	cond_resched();
 	lru_add_drain_all();
@@ -1607,8 +1615,10 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	 * actually in order to make hugetlbfs's object counting consistent.
 	 */
 	ret = dissolve_free_huge_pages(start_pfn, end_pfn);
-	if (ret)
-		goto failed_removal;
+	if (ret) {
+		reason = "failure to dissolve huge pages";
+		goto failed_removal_isolated;
+	}
 	/* check again */
 	offlined_pages = check_pages_isolated(start_pfn, end_pfn);
 	if (offlined_pages < 0)
@@ -1648,13 +1658,15 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	mem_hotplug_done();
 	return 0;
 
+failed_removal_isolated:
+	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
 failed_removal:
-	pr_debug("memory offlining [mem %#010llx-%#010llx] failed\n",
+	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",
 		 (unsigned long long) start_pfn << PAGE_SHIFT,
-		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1);
+		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1,
+		 reason);
 	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 	/* pushback to free area */
-	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
 	mem_hotplug_done();
 	return ret;
 }
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 5/5] mm, memory_hotplug: be more verbose for memory offline failures
  2018-11-16  8:30 ` Michal Hocko
@ 2018-11-16  8:30   ` Michal Hocko
  -1 siblings, 0 replies; 25+ messages in thread
From: Michal Hocko @ 2018-11-16  8:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oscar Salvador, Baoquan He, Anshuman Khandual, linux-mm, LKML,
	Michal Hocko

From: Michal Hocko <mhocko@suse.com>

There is only very limited information printed when the memory offlining
fails:
[ 1984.506184] rac1 kernel: memory offlining [mem 0x82600000000-0x8267fffffff] failed due to signal backoff

This tells us that the failure is triggered by the userspace
intervention but it doesn't tell us much more about the underlying
reason. It might be that the page migration failes repeatedly and the
userspace timeout expires and send a signal or it might be some of the
earlier steps (isolation, memory notifier) takes too long.

If the migration failes then it would be really helpful to see which
page that and its state. The same applies to the isolation phase. If we
fail to isolate a page from the allocator then knowing the state of the
page would be helpful as well.

Dump the page state that fails to get isolated or migrated. This will
tell us more about the failure and what to focus on during debugging.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/memory_hotplug.c | 12 ++++++++----
 mm/page_alloc.c     |  1 +
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 88d50e74e3fe..c82193db4be6 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1388,10 +1388,8 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 						    page_is_file_cache(page));
 
 		} else {
-#ifdef CONFIG_DEBUG_VM
-			pr_alert("failed to isolate pfn %lx\n", pfn);
+			pr_warn("failed to isolate pfn %lx\n", pfn);
 			dump_page(page, "isolation failed");
-#endif
 			put_page(page);
 			/* Because we don't have big zone->lock. we should
 			   check this again here. */
@@ -1411,8 +1409,14 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 		/* Allocate a new page from the nearest neighbor node */
 		ret = migrate_pages(&source, new_node_page, NULL, 0,
 					MIGRATE_SYNC, MR_MEMORY_HOTPLUG);
-		if (ret)
+		if (ret) {
+			list_for_each_entry(page, &source, lru) {
+				pr_warn("migrating pfn %lx failed ret:%d ",
+				       page_to_pfn(page), ret);
+				dump_page(page, "migration failure");
+			}
 			putback_movable_pages(&source);
+		}
 	}
 out:
 	return ret;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a919ba5cb3c8..ec2c7916dc2d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7845,6 +7845,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 	return false;
 unmovable:
 	WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE);
+	dump_page(pfn_to_page(pfn+iter), "unmovable page");
 	return true;
 }
 
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 5/5] mm, memory_hotplug: be more verbose for memory offline failures
@ 2018-11-16  8:30   ` Michal Hocko
  0 siblings, 0 replies; 25+ messages in thread
From: Michal Hocko @ 2018-11-16  8:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oscar Salvador, Baoquan He, Anshuman Khandual, linux-mm, LKML,
	Michal Hocko

From: Michal Hocko <mhocko@suse.com>

There is only very limited information printed when the memory offlining
fails:
[ 1984.506184] rac1 kernel: memory offlining [mem 0x82600000000-0x8267fffffff] failed due to signal backoff

This tells us that the failure is triggered by the userspace
intervention but it doesn't tell us much more about the underlying
reason. It might be that the page migration failes repeatedly and the
userspace timeout expires and send a signal or it might be some of the
earlier steps (isolation, memory notifier) takes too long.

If the migration failes then it would be really helpful to see which
page that and its state. The same applies to the isolation phase. If we
fail to isolate a page from the allocator then knowing the state of the
page would be helpful as well.

Dump the page state that fails to get isolated or migrated. This will
tell us more about the failure and what to focus on during debugging.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/memory_hotplug.c | 12 ++++++++----
 mm/page_alloc.c     |  1 +
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 88d50e74e3fe..c82193db4be6 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1388,10 +1388,8 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 						    page_is_file_cache(page));
 
 		} else {
-#ifdef CONFIG_DEBUG_VM
-			pr_alert("failed to isolate pfn %lx\n", pfn);
+			pr_warn("failed to isolate pfn %lx\n", pfn);
 			dump_page(page, "isolation failed");
-#endif
 			put_page(page);
 			/* Because we don't have big zone->lock. we should
 			   check this again here. */
@@ -1411,8 +1409,14 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 		/* Allocate a new page from the nearest neighbor node */
 		ret = migrate_pages(&source, new_node_page, NULL, 0,
 					MIGRATE_SYNC, MR_MEMORY_HOTPLUG);
-		if (ret)
+		if (ret) {
+			list_for_each_entry(page, &source, lru) {
+				pr_warn("migrating pfn %lx failed ret:%d ",
+				       page_to_pfn(page), ret);
+				dump_page(page, "migration failure");
+			}
 			putback_movable_pages(&source);
+		}
 	}
 out:
 	return ret;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a919ba5cb3c8..ec2c7916dc2d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7845,6 +7845,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 	return false;
 unmovable:
 	WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE);
+	dump_page(pfn_to_page(pfn+iter), "unmovable page");
 	return true;
 }
 
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] mm, memory_hotplug: drop pointless block alignment checks from __offline_pages
  2018-11-16  8:30   ` Michal Hocko
  (?)
@ 2018-11-16 10:34   ` osalvador
  2018-11-16 11:19     ` Michal Hocko
  -1 siblings, 1 reply; 25+ messages in thread
From: osalvador @ 2018-11-16 10:34 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: Baoquan He, Anshuman Khandual, linux-mm, LKML, Michal Hocko

On Fri, 2018-11-16 at 09:30 +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> This function is never called from a context which would provide
> misaligned pfn range so drop the pointless check.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>

I vaguely remember that someone reported a problem about misaligned
range on powerpc.
Not sure at which stage was (online/offline).
Although I am not sure if that was valid at all.

Reviewed-by: Oscar Salvador <osalvador@suse.de>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] mm, memory_hotplug: print reason for the offlining failure
  2018-11-16  8:30   ` Michal Hocko
  (?)
@ 2018-11-16 10:38   ` osalvador
  -1 siblings, 0 replies; 25+ messages in thread
From: osalvador @ 2018-11-16 10:38 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: Baoquan He, Anshuman Khandual, linux-mm, LKML, Michal Hocko

On Fri, 2018-11-16 at 09:30 +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> The memory offlining failure reporting is inconsistent and
> insufficient.
> Some error paths simply do not report the failure to the log at all.
> When we do report there are no details about the reason of the
> failure
> and there are several of them which makes memory offlining failures
> hard to debug.
> 
> Make sure that the
> 	memory offlining [mem %#010llx-%#010llx] failed
> message is printed for all failures and also provide a short textual
> reason for the failure e.g.
> 
> [ 1984.506184] rac1 kernel: memory offlining [mem 0x82600000000-
> 0x8267fffffff] failed due to signal backoff
> 
> this tells us that the offlining has failed because of a signal
> pending
> aka user intervention.
> 
> [akpm: tweak messages a bit]
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/5] mm, memory_hotplug: be more verbose for memory offline failures
  2018-11-16  8:30   ` Michal Hocko
  (?)
@ 2018-11-16 10:47   ` osalvador
  2018-11-16 11:22     ` Michal Hocko
  -1 siblings, 1 reply; 25+ messages in thread
From: osalvador @ 2018-11-16 10:47 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: Baoquan He, Anshuman Khandual, linux-mm, LKML, Michal Hocko

On Fri, 2018-11-16 at 09:30 +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a919ba5cb3c8..ec2c7916dc2d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7845,6 +7845,7 @@ bool has_unmovable_pages(struct zone *zone,
> struct page *page, int count,
>  	return false;
>  unmovable:
>  	WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE);
> +	dump_page(pfn_to_page(pfn+iter), "unmovable page");

Would not be enough to just do:

dump_page(page, "unmovable page".

Unless I am missing something, page should already have the
right pfn?

<---
unsigned long check = pfn + iter;
page = pfn_to_page(check);
--->

The rest looks good to me

Reviewed-by: Oscar Salvador <osalvador@suse.de>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] mm, memory_hotplug: drop pointless block alignment checks from __offline_pages
  2018-11-16 10:34   ` osalvador
@ 2018-11-16 11:19     ` Michal Hocko
  0 siblings, 0 replies; 25+ messages in thread
From: Michal Hocko @ 2018-11-16 11:19 UTC (permalink / raw)
  To: osalvador; +Cc: Andrew Morton, Baoquan He, Anshuman Khandual, linux-mm, LKML

On Fri 16-11-18 11:34:03, Oscar Salvador wrote:
> On Fri, 2018-11-16 at 09:30 +0100, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > This function is never called from a context which would provide
> > misaligned pfn range so drop the pointless check.
> > 
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> 
> I vaguely remember that someone reported a problem about misaligned
> range on powerpc.
> Not sure at which stage was (online/offline).
> Although I am not sure if that was valid at all.

If we are talking about the same thing then this was about partial
memblock initialized (aka struct pages were not initialized).

> Reviewed-by: Oscar Salvador <osalvador@suse.de>

Thanks!

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/5] mm, memory_hotplug: be more verbose for memory offline failures
  2018-11-16 10:47   ` osalvador
@ 2018-11-16 11:22     ` Michal Hocko
  2018-11-16 12:29       ` osalvador
  0 siblings, 1 reply; 25+ messages in thread
From: Michal Hocko @ 2018-11-16 11:22 UTC (permalink / raw)
  To: osalvador; +Cc: Andrew Morton, Baoquan He, Anshuman Khandual, linux-mm, LKML

On Fri 16-11-18 11:47:01, osalvador wrote:
> On Fri, 2018-11-16 at 09:30 +0100, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index a919ba5cb3c8..ec2c7916dc2d 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -7845,6 +7845,7 @@ bool has_unmovable_pages(struct zone *zone,
> > struct page *page, int count,
> >  	return false;
> >  unmovable:
> >  	WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE);
> > +	dump_page(pfn_to_page(pfn+iter), "unmovable page");
> 
> Would not be enough to just do:
> 
> dump_page(page, "unmovable page".
> 
> Unless I am missing something, page should already have the
> right pfn?

What if pfn_valid_within fails? You could have a pointer to the previous
page.

> 
> <---
> unsigned long check = pfn + iter;
> page = pfn_to_page(check);
> --->
> 
> The rest looks good to me
> 
> Reviewed-by: Oscar Salvador <osalvador@suse.de>

Thanks!

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/5] mm, memory_hotplug: improve memory offlining failures debugging
  2018-11-16  8:30 ` Michal Hocko
                   ` (5 preceding siblings ...)
  (?)
@ 2018-11-16 11:55 ` Anshuman Khandual
  -1 siblings, 0 replies; 25+ messages in thread
From: Anshuman Khandual @ 2018-11-16 11:55 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton; +Cc: Oscar Salvador, Baoquan He, linux-mm, LKML


On 11/16/2018 02:00 PM, Michal Hocko wrote:
> Hi,
> this has been posted as an RFC [1]. I have screwed during rebasing so
> there were few compilation issues in the previous version. I have also
> integrated review feedback from Andrew and Anshuman.
> 
> I have been promissing to improve memory offlining failures debugging
> for quite some time. As things stand now we get only very limited
> information in the kernel log when the offlining fails. It is usually
> only
> [ 1984.506184] rac1 kernel: memory offlining [mem 0x82600000000-0x8267fffffff] failed
> without no further details. We do not know what exactly fails and for
> what reason. Whenever I was forced to debug such a failure I've always
> had to do a debugging patch to tell me more. We can enable some
> tracepoints but it would be much better to get a better picture without
> using them.
> 
> This patch series does 2 things. The first one is to make dump_page
> more usable by printing more information about the mapping patch 1.
> Then it reduces the log level from emerg to warning so that this
> function is usable from less critical context patch 2. Then I have
> added more detailed information about the offlining failure patch 4
> and finally add dump_page to isolation and offlining migration paths.
> Patch 3 is a trivial cleanup.
> 
> Does this look go to you?
> 
> [1] http://lkml.kernel.org/r/20181107101830.17405-1-mhocko@kernel.org
> 

Agreed. It has been always difficult to debug memory hot plug problems
without a debug patch particularly to understand the unmovable pages
and their isolation failures in the range to be removed. This series
is definitely going to help improve the situation.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/5] mm: print more information about mapping in __dump_page
  2018-11-16  8:30   ` Michal Hocko
  (?)
@ 2018-11-16 11:55   ` Anshuman Khandual
  -1 siblings, 0 replies; 25+ messages in thread
From: Anshuman Khandual @ 2018-11-16 11:55 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: Oscar Salvador, Baoquan He, linux-mm, LKML, Michal Hocko



On 11/16/2018 02:00 PM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> __dump_page prints the mapping pointer but that is quite unhelpful
> for many reports because the pointer itself only helps to distinguish
> anon/ksm mappings from other ones (because of lowest bits
> set). Sometimes it would be much more helpful to know what kind of
> mapping that is actually and if we know this is a file mapping then also
> try to resolve the dentry name.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  mm/debug.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/mm/debug.c b/mm/debug.c
> index cdacba12e09a..a33177bfc856 100644
> --- a/mm/debug.c
> +++ b/mm/debug.c
> @@ -44,6 +44,7 @@ const struct trace_print_flags vmaflag_names[] = {
>  
>  void __dump_page(struct page *page, const char *reason)
>  {
> +	struct address_space *mapping = page_mapping(page);
>  	bool page_poisoned = PagePoisoned(page);
>  	int mapcount;
>  
> @@ -70,6 +71,18 @@ void __dump_page(struct page *page, const char *reason)
>  	if (PageCompound(page))
>  		pr_cont(" compound_mapcount: %d", compound_mapcount(page));
>  	pr_cont("\n");
> +	if (PageAnon(page))
> +		pr_emerg("anon ");
> +	else if (PageKsm(page))
> +		pr_emerg("ksm ");
> +	else if (mapping) {
> +		pr_emerg("%ps ", mapping->a_ops);
> +		if (mapping->host->i_dentry.first) {
> +			struct dentry *dentry;
> +			dentry = container_of(mapping->host->i_dentry.first, struct dentry, d_u.d_alias);
> +			pr_emerg("name:\"%*s\" ", dentry->d_name.len, dentry->d_name.name);
> +		}
> +	}
>  	BUILD_BUG_ON(ARRAY_SIZE(pageflag_names) != __NR_PAGEFLAGS + 1);
>  
>  	pr_emerg("flags: %#lx(%pGp)\n", page->flags, &page->flags);
> 

Differentiating between anon, ksm mapping and going till dentry information
for file mappings is surely an improvement. 

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] mm, memory_hotplug: drop pointless block alignment checks from __offline_pages
  2018-11-16  8:30   ` Michal Hocko
  (?)
  (?)
@ 2018-11-16 11:56   ` Anshuman Khandual
  -1 siblings, 0 replies; 25+ messages in thread
From: Anshuman Khandual @ 2018-11-16 11:56 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: Oscar Salvador, Baoquan He, linux-mm, LKML, Michal Hocko



On 11/16/2018 02:00 PM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> This function is never called from a context which would provide
> misaligned pfn range so drop the pointless check.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  mm/memory_hotplug.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 2b2b3ccbbfb5..a92b1b8f6218 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1554,12 +1554,6 @@ static int __ref __offline_pages(unsigned long start_pfn,
>  	struct zone *zone;
>  	struct memory_notify arg;
>  
> -	/* at least, alignment against pageblock is necessary */
> -	if (!IS_ALIGNED(start_pfn, pageblock_nr_pages))
> -		return -EINVAL;
> -	if (!IS_ALIGNED(end_pfn, pageblock_nr_pages))
> -		return -EINVAL;
> -
>  	mem_hotplug_begin();
>  
>  	/* This makes hotplug much easier...and readable.
> 

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/5] mm: lower the printk loglevel for __dump_page messages
  2018-11-16  8:30   ` Michal Hocko
  (?)
@ 2018-11-16 11:57   ` Anshuman Khandual
  -1 siblings, 0 replies; 25+ messages in thread
From: Anshuman Khandual @ 2018-11-16 11:57 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: Oscar Salvador, Baoquan He, linux-mm, LKML, Michal Hocko



On 11/16/2018 02:00 PM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> __dump_page messages use KERN_EMERG resp. KERN_ALERT loglevel (this is
> the case since 2004). Most callers of this function are really detecting
> a critical page state and BUG right after. On the other hand the
> function is called also from contexts which just want to inform about
> the page state and those would rather not disrupt logs that much (e.g.
> some systems route these messages to the normal console).
> 
> Reduce the loglevel to KERN_WARNING to make dump_page easier to reuse
> for other contexts while those messages will still make it to the kernel
> log in most setups. Even if the loglevel setup filters warnings away
> those paths that are really critical already print the more targeted
> error or panic and that should make it to the kernel log.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] mm, memory_hotplug: print reason for the offlining failure
  2018-11-16  8:30   ` Michal Hocko
  (?)
  (?)
@ 2018-11-16 12:04   ` Anshuman Khandual
  -1 siblings, 0 replies; 25+ messages in thread
From: Anshuman Khandual @ 2018-11-16 12:04 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: Oscar Salvador, Baoquan He, linux-mm, LKML, Michal Hocko



On 11/16/2018 02:00 PM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> The memory offlining failure reporting is inconsistent and insufficient.
> Some error paths simply do not report the failure to the log at all.
> When we do report there are no details about the reason of the failure
> and there are several of them which makes memory offlining failures
> hard to debug.
> 
> Make sure that the
> 	memory offlining [mem %#010llx-%#010llx] failed
> message is printed for all failures and also provide a short textual
> reason for the failure e.g.
> 
> [ 1984.506184] rac1 kernel: memory offlining [mem 0x82600000000-0x8267fffffff] failed due to signal backoff
> 
> this tells us that the offlining has failed because of a signal pending
> aka user intervention.
> 
> [akpm: tweak messages a bit]
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  mm/memory_hotplug.c | 34 +++++++++++++++++++++++-----------
>  1 file changed, 23 insertions(+), 11 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index a92b1b8f6218..88d50e74e3fe 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1553,6 +1553,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
>  	unsigned long valid_start, valid_end;
>  	struct zone *zone;
>  	struct memory_notify arg;
> +	char *reason;
>  
>  	mem_hotplug_begin();
>  
> @@ -1561,7 +1562,9 @@ static int __ref __offline_pages(unsigned long start_pfn,
>  	if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start,
>  				  &valid_end)) {
>  		mem_hotplug_done();
> -		return -EINVAL;
> +		ret = -EINVAL;
> +		reason = "multizone range";
> +		goto failed_removal;
>  	}
>  
>  	zone = page_zone(pfn_to_page(valid_start));
> @@ -1573,7 +1576,8 @@ static int __ref __offline_pages(unsigned long start_pfn,
>  				       MIGRATE_MOVABLE, true);
>  	if (ret) {
>  		mem_hotplug_done();
> -		return ret;
> +		reason = "failure to isolate range";
> +		goto failed_removal;
>  	}
>  
>  	arg.start_pfn = start_pfn;
> @@ -1582,15 +1586,19 @@ static int __ref __offline_pages(unsigned long start_pfn,
>  
>  	ret = memory_notify(MEM_GOING_OFFLINE, &arg);
>  	ret = notifier_to_errno(ret);
> -	if (ret)
> -		goto failed_removal;
> +	if (ret) {
> +		reason = "notifier failure";
> +		goto failed_removal_isolated;
> +	}
>  
>  	pfn = start_pfn;
>  repeat:
>  	/* start memory hot removal */
>  	ret = -EINTR;
> -	if (signal_pending(current))
> -		goto failed_removal;
> +	if (signal_pending(current)) {
> +		reason = "signal backoff";
> +		goto failed_removal_isolated;
> +	}
>  
>  	cond_resched();
>  	lru_add_drain_all();
> @@ -1607,8 +1615,10 @@ static int __ref __offline_pages(unsigned long start_pfn,
>  	 * actually in order to make hugetlbfs's object counting consistent.
>  	 */
>  	ret = dissolve_free_huge_pages(start_pfn, end_pfn);
> -	if (ret)
> -		goto failed_removal;
> +	if (ret) {
> +		reason = "failure to dissolve huge pages";
> +		goto failed_removal_isolated;
> +	}
>  	/* check again */
>  	offlined_pages = check_pages_isolated(start_pfn, end_pfn);
>  	if (offlined_pages < 0)
> @@ -1648,13 +1658,15 @@ static int __ref __offline_pages(unsigned long start_pfn,
>  	mem_hotplug_done();
>  	return 0;
>  
> +failed_removal_isolated:
> +	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
>  failed_removal:
> -	pr_debug("memory offlining [mem %#010llx-%#010llx] failed\n",
> +	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",
>  		 (unsigned long long) start_pfn << PAGE_SHIFT,
> -		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1);
> +		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1,
> +		 reason);
>  	memory_notify(MEM_CANCEL_OFFLINE, &arg);
>  	/* pushback to free area */
> -	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
>  	mem_hotplug_done();
>  	return ret;
>  }
> 

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/5] mm, memory_hotplug: be more verbose for memory offline failures
  2018-11-16  8:30   ` Michal Hocko
  (?)
  (?)
@ 2018-11-16 12:07   ` Anshuman Khandual
  -1 siblings, 0 replies; 25+ messages in thread
From: Anshuman Khandual @ 2018-11-16 12:07 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: Oscar Salvador, Baoquan He, linux-mm, LKML, Michal Hocko



On 11/16/2018 02:00 PM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> There is only very limited information printed when the memory offlining
> fails:
> [ 1984.506184] rac1 kernel: memory offlining [mem 0x82600000000-0x8267fffffff] failed due to signal backoff
> 
> This tells us that the failure is triggered by the userspace
> intervention but it doesn't tell us much more about the underlying
> reason. It might be that the page migration failes repeatedly and the
> userspace timeout expires and send a signal or it might be some of the
> earlier steps (isolation, memory notifier) takes too long.
> 
> If the migration failes then it would be really helpful to see which
> page that and its state. The same applies to the isolation phase. If we
> fail to isolate a page from the allocator then knowing the state of the
> page would be helpful as well.
> 
> Dump the page state that fails to get isolated or migrated. This will
> tell us more about the failure and what to focus on during debugging.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  mm/memory_hotplug.c | 12 ++++++++----
>  mm/page_alloc.c     |  1 +
>  2 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 88d50e74e3fe..c82193db4be6 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1388,10 +1388,8 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  						    page_is_file_cache(page));
>  
>  		} else {
> -#ifdef CONFIG_DEBUG_VM
> -			pr_alert("failed to isolate pfn %lx\n", pfn);
> +			pr_warn("failed to isolate pfn %lx\n", pfn);
>  			dump_page(page, "isolation failed");
> -#endif
>  			put_page(page);
>  			/* Because we don't have big zone->lock. we should
>  			   check this again here. */
> @@ -1411,8 +1409,14 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  		/* Allocate a new page from the nearest neighbor node */
>  		ret = migrate_pages(&source, new_node_page, NULL, 0,
>  					MIGRATE_SYNC, MR_MEMORY_HOTPLUG);
> -		if (ret)
> +		if (ret) {
> +			list_for_each_entry(page, &source, lru) {
> +				pr_warn("migrating pfn %lx failed ret:%d ",
> +				       page_to_pfn(page), ret);
> +				dump_page(page, "migration failure");
> +			}
>  			putback_movable_pages(&source);
> +		}
>  	}
>  out:
>  	return ret;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a919ba5cb3c8..ec2c7916dc2d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7845,6 +7845,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
>  	return false;
>  unmovable:
>  	WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE);
> +	dump_page(pfn_to_page(pfn+iter), "unmovable page");
>  	return true;
>  }

This seems to have fixed the previous build problem because of the migrate_pages()
return code. Otherwise looks good.

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/5] mm, memory_hotplug: be more verbose for memory offline failures
  2018-11-16 11:22     ` Michal Hocko
@ 2018-11-16 12:29       ` osalvador
  0 siblings, 0 replies; 25+ messages in thread
From: osalvador @ 2018-11-16 12:29 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Andrew Morton, Baoquan He, Anshuman Khandual, linux-mm, LKML

On Fri, 2018-11-16 at 12:22 +0100, Michal Hocko wrote:
> On Fri 16-11-18 11:47:01, osalvador wrote:
> > On Fri, 2018-11-16 at 09:30 +0100, Michal Hocko wrote:
> > > From: Michal Hocko <mhocko@suse.com>
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index a919ba5cb3c8..ec2c7916dc2d 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -7845,6 +7845,7 @@ bool has_unmovable_pages(struct zone *zone,
> > > struct page *page, int count,
> > >  	return false;
> > >  unmovable:
> > >  	WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE);
> > > +	dump_page(pfn_to_page(pfn+iter), "unmovable page");
> > 
> > Would not be enough to just do:
> > 
> > dump_page(page, "unmovable page".
> > 
> > Unless I am missing something, page should already have the
> > right pfn?
> 
> What if pfn_valid_within fails? You could have a pointer to the
> previous
> page.

Sorry, I missed that, you are right.

> > 
> > <---
> > unsigned long check = pfn + iter;
> > page = pfn_to_page(check);
> > --->
> > 
> > The rest looks good to me
> > 
> > Reviewed-by: Oscar Salvador <osalvador@suse.de>
> 
> Thanks!
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/5] mm: print more information about mapping in __dump_page
  2018-11-16  8:30   ` Michal Hocko
  (?)
  (?)
@ 2018-11-16 13:50   ` William Kucharski
  -1 siblings, 0 replies; 25+ messages in thread
From: William Kucharski @ 2018-11-16 13:50 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Oscar Salvador, Baoquan He, Anshuman Khandual,
	linux-mm, LKML, Michal Hocko



> On Nov 16, 2018, at 1:30 AM, Michal Hocko <mhocko@kernel.org> wrote:
> 
> From: Michal Hocko <mhocko@suse.com>
> 
> __dump_page prints the mapping pointer but that is quite unhelpful
> for many reports because the pointer itself only helps to distinguish
> anon/ksm mappings from other ones (because of lowest bits
> set). Sometimes it would be much more helpful to know what kind of
> mapping that is actually and if we know this is a file mapping then also
> try to resolve the dentry name.

I really, really like this - the more information available in the dump
output, the easier it is to know where to start looking for the problem.

Reviewed-by: William Kucharski <william.kucharski@oracle.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2018-11-16 13:51 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-16  8:30 [PATCH 0/5] mm, memory_hotplug: improve memory offlining failures debugging Michal Hocko
2018-11-16  8:30 ` Michal Hocko
2018-11-16  8:30 ` [PATCH 1/5] mm: print more information about mapping in __dump_page Michal Hocko
2018-11-16  8:30   ` Michal Hocko
2018-11-16 11:55   ` Anshuman Khandual
2018-11-16 13:50   ` William Kucharski
2018-11-16  8:30 ` [PATCH 2/5] mm: lower the printk loglevel for __dump_page messages Michal Hocko
2018-11-16  8:30   ` Michal Hocko
2018-11-16 11:57   ` Anshuman Khandual
2018-11-16  8:30 ` [PATCH 3/5] mm, memory_hotplug: drop pointless block alignment checks from __offline_pages Michal Hocko
2018-11-16  8:30   ` Michal Hocko
2018-11-16 10:34   ` osalvador
2018-11-16 11:19     ` Michal Hocko
2018-11-16 11:56   ` Anshuman Khandual
2018-11-16  8:30 ` [PATCH 4/5] mm, memory_hotplug: print reason for the offlining failure Michal Hocko
2018-11-16  8:30   ` Michal Hocko
2018-11-16 10:38   ` osalvador
2018-11-16 12:04   ` Anshuman Khandual
2018-11-16  8:30 ` [PATCH 5/5] mm, memory_hotplug: be more verbose for memory offline failures Michal Hocko
2018-11-16  8:30   ` Michal Hocko
2018-11-16 10:47   ` osalvador
2018-11-16 11:22     ` Michal Hocko
2018-11-16 12:29       ` osalvador
2018-11-16 12:07   ` Anshuman Khandual
2018-11-16 11:55 ` [PATCH 0/5] mm, memory_hotplug: improve memory offlining failures debugging Anshuman Khandual

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.