From: Andrew Morton <akpm@linux-foundation.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: <linux-mm@kvack.org>, Oscar Salvador <OSalvador@suse.com>,
	Baoquan He <bhe@redhat.com>, LKML <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@suse.com>
Subject: Re: [RFC PATCH 5/5] mm, memory_hotplug: be more verbose for memory offline failures
Date: Thu, 15 Nov 2018 16:07:16 -0800
Message-ID: <20181115160716.18b9956ee64932abe9428ef1@linux-foundation.org>
In-Reply-To: <20181107101830.17405-6-mhocko@kernel.org>

On Wed,  7 Nov 2018 11:18:30 +0100 Michal Hocko <mhocko@kernel.org> wrote:

> From: Michal Hocko <mhocko@suse.com>
> 
> There is only very limited information printed when the memory offlining
> fails:
> [ 1984.506184] rac1 kernel: memory offlining [mem 0x82600000000-0x8267fffffff] failed due to signal backoff
> 
> This tells us that the failure is triggered by the userspace
> intervention but it doesn't tell us much more about the underlying
> reason. It might be that the page migration fails repeatedly and the
> userspace timeout expires and sends a signal, or it might be that some
> of the earlier steps (isolation, memory notifier) take too long.
> 
> If the migration fails then it would be really helpful to see which
> page that is and its state. The same applies to the isolation phase. If we
> fail to isolate a page from the allocator then knowing the state of the
> page would be helpful as well.
> 
> Dump the page state that fails to get isolated or migrated. This will
> tell us more about the failure and what to focus on during debugging.
> 
> ...
>
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1388,10 +1388,8 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  						    page_is_file_cache(page));
>  
>  		} else {
> -#ifdef CONFIG_DEBUG_VM
> -			pr_alert("failed to isolate pfn %lx\n", pfn);
> +			pr_warn("failed to isolate pfn %lx\n", pfn);
>  			dump_page(page, "isolation failed");
> -#endif
>  			put_page(page);
>  			/* Because we don't have big zone->lock. we should
>  			   check this again here. */
> @@ -1411,8 +1409,14 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  		/* Allocate a new page from the nearest neighbor node */
>  		ret = migrate_pages(&source, new_node_page, NULL, 0,
>  					MIGRATE_SYNC, MR_MEMORY_HOTPLUG);
> -		if (ret)
> +		if (ret) {
> +			list_for_each_entry(page, &source, lru) {
> +				pr_warn("migrating pfn %lx failed ",
> +				       page_to_pfn(page), ret);
> +				dump_page(page, NULL);
> +			}

./include/linux/kern_levels.h:5:18: warning: too many arguments for format [-Wformat-extra-args]
 #define KERN_SOH "\001"  /* ASCII Start Of Header */
                  ^
./include/linux/kern_levels.h:12:22: note: in expansion of macro ‘KERN_SOH’
 #define KERN_WARNING KERN_SOH "4" /* warning conditions */
                      ^~~~~~~~
./include/linux/printk.h:310:9: note: in expansion of macro ‘KERN_WARNING’
  printk(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
         ^~~~~~~~~~~~
./include/linux/printk.h:311:17: note: in expansion of macro ‘pr_warning’
 #define pr_warn pr_warning
                 ^~~~~~~~~~
mm/memory_hotplug.c:1414:5: note: in expansion of macro ‘pr_warn’
     pr_warn("migrating pfn %lx failed ",
     ^~~~~~~

--- a/mm/memory_hotplug.c~mm-memory_hotplug-be-more-verbose-for-memory-offline-failures-fix
+++ a/mm/memory_hotplug.c
@@ -1411,7 +1411,7 @@ do_migrate_range(unsigned long start_pfn
 					MIGRATE_SYNC, MR_MEMORY_HOTPLUG);
 		if (ret) {
 			list_for_each_entry(page, &source, lru) {
-				pr_warn("migrating pfn %lx failed ",
+				pr_warn("migrating pfn %lx failed: %d",
 				       page_to_pfn(page), ret);
 				dump_page(page, NULL);
 			}
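
For anyone following along outside the kernel tree, here is a minimal
userspace sketch of the same format/argument mismatch. It assumes plain
snprintf() as a stand-in for pr_warn(); format_migrate_failure() is a
hypothetical helper that exists only for this illustration and is not part
of mm/memory_hotplug.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel function): builds the message that the
 * fixed pr_warn() call would print, using snprintf() in place of printk().
 */
int format_migrate_failure(char *buf, size_t len, unsigned long pfn, int ret)
{
	assert(buf != NULL && len > 0);

	/*
	 * Two conversions (%lx and %d) consume the two arguments (pfn, ret).
	 * With the original string "migrating pfn %lx failed " the compiler
	 * sees one conversion but two arguments and reports
	 * -Wformat-extra-args; at run time the extra argument is ignored,
	 * so the return code never reaches the log.
	 */
	return snprintf(buf, len, "migrating pfn %lx failed: %d", pfn, ret);
}
```

Called with a pfn of 0x82600 and a return value of -16, the helper produces
"migrating pfn 82600 failed: -16", so the error code that was previously
dropped now appears next to the page frame number.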

