linux-kernel.vger.kernel.org archive mirror
* [-next] lots of messages due to "mm, memory_hotplug: be more verbose for memory offline failures"
@ 2018-12-17 15:59 Heiko Carstens
  2018-12-17 16:03 ` Michal Hocko
  0 siblings, 1 reply; 5+ messages in thread
From: Heiko Carstens @ 2018-12-17 15:59 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Oscar Salvador, Anshuman Khandual,
	Stephen Rothwell, linux-mm, linux-kernel, linux-next, linux-s390

Hi Michal,

with linux-next as of today on s390 I see tons of messages like

[   20.536664] page dumped because: has_unmovable_pages
[   20.536792] page:000003d081ff4080 count:1 mapcount:0 mapping:000000008ff88600 index:0x0 compound_mapcount: 0
[   20.536794] flags: 0x3fffe0000010200(slab|head)
[   20.536795] raw: 03fffe0000010200 0000000000000100 0000000000000200 000000008ff88600
[   20.536796] raw: 0000000000000000 0020004100000000 ffffffff00000001 0000000000000000
[   20.536797] page dumped because: has_unmovable_pages
[   20.536814] page:000003d0823b0000 count:1 mapcount:0 mapping:0000000000000000 index:0x0
[   20.536815] flags: 0x7fffe0000000000()
[   20.536817] raw: 07fffe0000000000 0000000000000100 0000000000000200 0000000000000000
[   20.536818] raw: 0000000000000000 0000000000000000 ffffffff00000001 0000000000000000

bisect points to b323c049a999 ("mm, memory_hotplug: be more verbose for memory offline failures")
which is the first commit with which the messages appear.

Note: there is _no_ memory hotplug involved when these messages appear.

I don't know if it helps, but this is the contents of /proc/zoneinfo:

Node 0, zone      DMA
  per-node stats
      nr_inactive_anon 8
      nr_active_anon 8389
      nr_inactive_file 43418
      nr_active_file 22655
      nr_unevictable 0
      nr_slab_reclaimable 8192
      nr_slab_unreclaimable 11368
      nr_isolated_anon 0
      nr_isolated_file 0
      workingset_nodes 0
      workingset_refault 0
      workingset_activate 0
      workingset_restore 0
      workingset_nodereclaim 0
      nr_anon_pages 7088
      nr_mapped    16328
      nr_file_pages 66132
      nr_dirty     0
      nr_writeback 0
      nr_writeback_temp 0
      nr_shmem     55
      nr_shmem_hugepages 0
      nr_shmem_pmdmapped 0
      nr_anon_transparent_hugepages 4
      nr_unstable  0
      nr_vmscan_write 0
      nr_vmscan_immediate_reclaim 0
      nr_dirtied   20723
      nr_written   18227
      nr_kernel_misc_reclaimable 0
  pages free     519834
        min      1899
        low      2419
        high     2939
        spanned  524288
        present  524288
        managed  520562
        protection: (0, 3988, 3988)
      nr_free_pages 519834
      nr_zone_inactive_anon 0
      nr_zone_active_anon 0
      nr_zone_inactive_file 0
      nr_zone_active_file 0
      nr_zone_unevictable 0
      nr_zone_write_pending 0
      nr_mlock     0
      nr_page_table_pages 0
      nr_kernel_stack 0
      nr_bounce    0
      nr_zspages   0
      nr_free_cma  0
      numa_hit     40
      numa_miss    0
      numa_foreign 0
      numa_interleave 12
      numa_local   40
      numa_other   0
  pagesets
    cpu: 0
              count: 336
              high:  378
              batch: 63
  vm stats threshold: 40
    cpu: 1
              count: 60
              high:  378
              batch: 63
  vm stats threshold: 40
    cpu: 2
              count: 60
              high:  378
              batch: 63
  vm stats threshold: 40
    cpu: 3
              count: 0
              high:  378
              batch: 63
  vm stats threshold: 40
    cpu: 4
              count: 62
              high:  378
              batch: 63
  vm stats threshold: 40
    cpu: 5
              count: 0
              high:  378
              batch: 63
  vm stats threshold: 40
    cpu: 6
              count: 59
              high:  378
              batch: 63
  vm stats threshold: 40
    cpu: 7
              count: 0
              high:  378
              batch: 63
  vm stats threshold: 40
  node_unreclaimable:  0
  start_pfn:           0
Node 0, zone   Normal
  pages free     912587
        min      3732
        low      4754
        high     5776
        spanned  1048576
        present  1048576
        managed  1022150
        protection: (0, 0, 0)
      nr_free_pages 912587
      nr_zone_inactive_anon 8
      nr_zone_active_anon 8389
      nr_zone_inactive_file 43418
      nr_zone_active_file 22655
      nr_zone_unevictable 0
      nr_zone_write_pending 0
      nr_mlock     0
      nr_page_table_pages 548
      nr_kernel_stack 3072
      nr_bounce    0
      nr_zspages   0
      nr_free_cma  1024
      numa_hit     3115288
      numa_miss    0
      numa_foreign 0
      numa_interleave 6865
      numa_local   3115288
      numa_other   0
  pagesets
    cpu: 0
              count: 86
              high:  90
              batch: 15
  vm stats threshold: 48
    cpu: 1
              count: 80
              high:  90
              batch: 15
  vm stats threshold: 48
    cpu: 2
              count: 76
              high:  90
              batch: 15
  vm stats threshold: 48
    cpu: 3
              count: 53
              high:  90
              batch: 15
  vm stats threshold: 48
    cpu: 4
              count: 81
              high:  90
              batch: 15
  vm stats threshold: 48
    cpu: 5
              count: 18
              high:  90
              batch: 15
  vm stats threshold: 48
    cpu: 6
              count: 73
              high:  90
              batch: 15
  vm stats threshold: 48
    cpu: 7
              count: 63
              high:  90
              batch: 15
  vm stats threshold: 48
  node_unreclaimable:  0
  start_pfn:           524288
Node 0, zone  Movable
  pages free     0
        min      0
        low      0
        high     0
        spanned  0
        present  0
        managed  0
        protection: (0, 0, 0)



* Re: [-next] lots of messages due to "mm, memory_hotplug: be more verbose for memory offline failures"
  2018-12-17 15:59 [-next] lots of messages due to "mm, memory_hotplug: be more verbose for memory offline failures" Heiko Carstens
@ 2018-12-17 16:03 ` Michal Hocko
  2018-12-17 16:39   ` Michal Hocko
  0 siblings, 1 reply; 5+ messages in thread
From: Michal Hocko @ 2018-12-17 16:03 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Andrew Morton, Oscar Salvador, Anshuman Khandual,
	Stephen Rothwell, linux-mm, linux-kernel, linux-next, linux-s390

On Mon 17-12-18 16:59:22, Heiko Carstens wrote:
> Hi Michal,
> 
> with linux-next as of today on s390 I see tons of messages like
> 
> [   20.536664] page dumped because: has_unmovable_pages
> [   20.536792] page:000003d081ff4080 count:1 mapcount:0 mapping:000000008ff88600 index:0x0 compound_mapcount: 0
> [   20.536794] flags: 0x3fffe0000010200(slab|head)
> [   20.536795] raw: 03fffe0000010200 0000000000000100 0000000000000200 000000008ff88600
> [   20.536796] raw: 0000000000000000 0020004100000000 ffffffff00000001 0000000000000000
> [   20.536797] page dumped because: has_unmovable_pages
> [   20.536814] page:000003d0823b0000 count:1 mapcount:0 mapping:0000000000000000 index:0x0
> [   20.536815] flags: 0x7fffe0000000000()
> [   20.536817] raw: 07fffe0000000000 0000000000000100 0000000000000200 0000000000000000
> [   20.536818] raw: 0000000000000000 0000000000000000 ffffffff00000001 0000000000000000
> 
> bisect points to b323c049a999 ("mm, memory_hotplug: be more verbose for memory offline failures")
> which is the first commit with which the messages appear.

I would bet this is CMA allocator. How much is tons? Maybe we want a
rate limit or the other user is not really interested in them at all?
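
To make the rate-limit option concrete, here is a minimal userspace sketch of a window-based limiter. It is not the kernel's ratelimit implementation: the struct, its fields, and toy_ratelimit_ok() are hypothetical names used only for illustration, and time is passed in explicitly so the behaviour is deterministic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; names are illustrative, not kernel API. */
struct toy_ratelimit {
	long interval;	/* window length in arbitrary time units */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages emitted in the current window */
	int missed;	/* messages suppressed in the current window */
};

/* Returns true if a message may be emitted at time 'now'. */
static bool toy_ratelimit_ok(struct toy_ratelimit *rs, long now)
{
	assert(rs->interval > 0 && rs->burst >= 0);

	/* Start a fresh window once the old one has expired. */
	if (now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}
```

With a burst of 10, several hundred back-to-back dumps would collapse to 10 per window. That reduces the noise but still prints output nobody asked for, which is the trade-off against simply not reporting for callers that are not interested.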
-- 
Michal Hocko
SUSE Labs


* Re: [-next] lots of messages due to "mm, memory_hotplug: be more verbose for memory offline failures"
  2018-12-17 16:03 ` Michal Hocko
@ 2018-12-17 16:39   ` Michal Hocko
  2018-12-18  7:55     ` Heiko Carstens
  0 siblings, 1 reply; 5+ messages in thread
From: Michal Hocko @ 2018-12-17 16:39 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Andrew Morton, Oscar Salvador, Anshuman Khandual,
	Stephen Rothwell, linux-mm, linux-kernel, linux-next, linux-s390

On Mon 17-12-18 17:03:50, Michal Hocko wrote:
> On Mon 17-12-18 16:59:22, Heiko Carstens wrote:
> > Hi Michal,
> > 
> > with linux-next as of today on s390 I see tons of messages like
> > 
> > [   20.536664] page dumped because: has_unmovable_pages
> > [   20.536792] page:000003d081ff4080 count:1 mapcount:0 mapping:000000008ff88600 index:0x0 compound_mapcount: 0
> > [   20.536794] flags: 0x3fffe0000010200(slab|head)
> > [   20.536795] raw: 03fffe0000010200 0000000000000100 0000000000000200 000000008ff88600
> > [   20.536796] raw: 0000000000000000 0020004100000000 ffffffff00000001 0000000000000000
> > [   20.536797] page dumped because: has_unmovable_pages
> > [   20.536814] page:000003d0823b0000 count:1 mapcount:0 mapping:0000000000000000 index:0x0
> > [   20.536815] flags: 0x7fffe0000000000()
> > [   20.536817] raw: 07fffe0000000000 0000000000000100 0000000000000200 0000000000000000
> > [   20.536818] raw: 0000000000000000 0000000000000000 ffffffff00000001 0000000000000000
> > 
> > bisect points to b323c049a999 ("mm, memory_hotplug: be more verbose for memory offline failures")
> > which is the first commit with which the messages appear.
> 
> I would bet this is CMA allocator. How much is tons? Maybe we want a
> rate limit or the other user is not really interested in them at all?

In other words, this should silence those messages.

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 4ae347cbc36d..4eb26d278046 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -30,8 +30,11 @@ static inline bool is_migrate_isolate(int migratetype)
 }
 #endif
 
+#define SKIP_HWPOISON	0x1
+#define REPORT_FAILURE	0x2
+
 bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
-			 int migratetype, bool skip_hwpoisoned_pages);
+			 int migratetype, int flags);
 void set_pageblock_migratetype(struct page *page, int migratetype);
 int move_freepages_block(struct zone *zone, struct page *page,
 				int migratetype, int *num_movable);
@@ -44,10 +47,14 @@ int move_freepages_block(struct zone *zone, struct page *page,
  * For isolating all pages in the range finally, the caller have to
  * free all pages in the range. test_page_isolated() can be used for
  * test it.
+ *
+ * The following flags are allowed (they can be combined in a bit mask)
+ * SKIP_HWPOISON - ignore hwpoison pages
+ * REPORT_FAILURE - report details about the failure to isolate the range
  */
 int
 start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-			 unsigned migratetype, bool skip_hwpoisoned_pages);
+			 unsigned migratetype, int flags);
 
 /*
  * Changes MIGRATE_ISOLATE to MIGRATE_MOVABLE.
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c82193db4be6..8537429d33a6 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1226,7 +1226,7 @@ static bool is_pageblock_removable_nolock(struct page *page)
 	if (!zone_spans_pfn(zone, pfn))
 		return false;
 
-	return !has_unmovable_pages(zone, page, 0, MIGRATE_MOVABLE, true);
+	return !has_unmovable_pages(zone, page, 0, MIGRATE_MOVABLE, SKIP_HWPOISON);
 }
 
 /* Checks if this range of memory is likely to be hot-removable. */
@@ -1577,7 +1577,8 @@ static int __ref __offline_pages(unsigned long start_pfn,
 
 	/* set above range as isolated */
 	ret = start_isolate_page_range(start_pfn, end_pfn,
-				       MIGRATE_MOVABLE, true);
+				       MIGRATE_MOVABLE,
+				       SKIP_HWPOISON | REPORT_FAILURE);
 	if (ret) {
 		mem_hotplug_done();
 		reason = "failure to isolate range";
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ec2c7916dc2d..ee4043419791 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7754,8 +7754,7 @@ void *__init alloc_large_system_hash(const char *tablename,
  * race condition. So you can't expect this function should be exact.
  */
 bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
-			 int migratetype,
-			 bool skip_hwpoisoned_pages)
+			 int migratetype, int flags)
 {
 	unsigned long pfn, iter, found;
 
@@ -7818,7 +7817,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 		 * The HWPoisoned page may be not in buddy system, and
 		 * page_count() is not 0.
 		 */
-		if (skip_hwpoisoned_pages && PageHWPoison(page))
+		if ((flags & SKIP_HWPOISON) && PageHWPoison(page))
 			continue;
 
 		if (__PageMovable(page))
@@ -7845,7 +7844,8 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 	return false;
 unmovable:
 	WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE);
-	dump_page(pfn_to_page(pfn+iter), "unmovable page");
+	if (flags & REPORT_FAILURE)
+		dump_page(pfn_to_page(pfn+iter), "unmovable page");
 	return true;
 }
 
@@ -7972,8 +7972,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	 */
 
 	ret = start_isolate_page_range(pfn_max_align_down(start),
-				       pfn_max_align_up(end), migratetype,
-				       false);
+				       pfn_max_align_up(end), migratetype, 0);
 	if (ret)
 		return ret;
 
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 43e085608846..ce323e56b34d 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -15,8 +15,7 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/page_isolation.h>
 
-static int set_migratetype_isolate(struct page *page, int migratetype,
-				bool skip_hwpoisoned_pages)
+static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags)
 {
 	struct zone *zone;
 	unsigned long flags, pfn;
@@ -60,8 +59,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype,
 	 * FIXME: Now, memory hotplug doesn't call shrink_slab() by itself.
 	 * We just check MOVABLE pages.
 	 */
-	if (!has_unmovable_pages(zone, page, arg.pages_found, migratetype,
-				 skip_hwpoisoned_pages))
+	if (!has_unmovable_pages(zone, page, arg.pages_found, migratetype, isol_flags))
 		ret = 0;
 
 	/*
@@ -185,7 +183,7 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
  * prevents two threads from simultaneously working on overlapping ranges.
  */
 int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-			     unsigned migratetype, bool skip_hwpoisoned_pages)
+			     unsigned migratetype, int flags)
 {
 	unsigned long pfn;
 	unsigned long undo_pfn;
@@ -199,7 +197,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 	     pfn += pageblock_nr_pages) {
 		page = __first_valid_page(pfn, pageblock_nr_pages);
 		if (page &&
-		    set_migratetype_isolate(page, migratetype, skip_hwpoisoned_pages)) {
+		    set_migratetype_isolate(page, migratetype, flags)) {
 			undo_pfn = pfn;
 			goto undo;
 		}
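
The effect of the flag conversion can be sketched in userspace as follows. This is a simplified model, not the kernel code: struct toy_page, toy_has_unmovable_pages() and the dumps_emitted counter are hypothetical, and the real has_unmovable_pages() performs many more checks. Only the two flag values are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Flag values as defined in the patch. */
#define SKIP_HWPOISON	0x1
#define REPORT_FAILURE	0x2

/* Hypothetical stand-in for struct page; not the kernel's definition. */
struct toy_page {
	bool hwpoison;
	bool movable;
};

static int dumps_emitted;	/* counts simulated dump_page() calls */

/* Simplified model: returns true if an unmovable page is found. */
static bool toy_has_unmovable_pages(const struct toy_page *pages, int nr,
				    int flags)
{
	assert(pages != NULL || nr == 0);

	for (int i = 0; i < nr; i++) {
		/* Callers that pass SKIP_HWPOISON ignore poisoned pages. */
		if ((flags & SKIP_HWPOISON) && pages[i].hwpoison)
			continue;
		if (pages[i].movable)
			continue;

		/* Only callers that opt in get the verbose report. */
		if (flags & REPORT_FAILURE) {
			dumps_emitted++;
			printf("toy dump: page %d is unmovable\n", i);
		}
		return true;
	}
	return false;
}
```

In this model a CMA-style caller passes 0 and gets a silent failure, while an offlining-style caller passes SKIP_HWPOISON | REPORT_FAILURE and gets the dump.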
-- 
Michal Hocko
SUSE Labs


* Re: [-next] lots of messages due to "mm, memory_hotplug: be more verbose for memory offline failures"
  2018-12-17 16:39   ` Michal Hocko
@ 2018-12-18  7:55     ` Heiko Carstens
  2018-12-18  9:09       ` Michal Hocko
  0 siblings, 1 reply; 5+ messages in thread
From: Heiko Carstens @ 2018-12-18  7:55 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Oscar Salvador, Anshuman Khandual,
	Stephen Rothwell, linux-mm, linux-kernel, linux-next, linux-s390

On Mon, Dec 17, 2018 at 05:39:49PM +0100, Michal Hocko wrote:
> On Mon 17-12-18 17:03:50, Michal Hocko wrote:
> > On Mon 17-12-18 16:59:22, Heiko Carstens wrote:
> > > Hi Michal,
> > > 
> > > with linux-next as of today on s390 I see tons of messages like
> > > 
> > > [   20.536664] page dumped because: has_unmovable_pages
> > > [   20.536792] page:000003d081ff4080 count:1 mapcount:0 mapping:000000008ff88600 index:0x0 compound_mapcount: 0
> > > [   20.536794] flags: 0x3fffe0000010200(slab|head)
> > > [   20.536795] raw: 03fffe0000010200 0000000000000100 0000000000000200 000000008ff88600
> > > [   20.536796] raw: 0000000000000000 0020004100000000 ffffffff00000001 0000000000000000
> > > [   20.536797] page dumped because: has_unmovable_pages
> > > [   20.536814] page:000003d0823b0000 count:1 mapcount:0 mapping:0000000000000000 index:0x0
> > > [   20.536815] flags: 0x7fffe0000000000()
> > > [   20.536817] raw: 07fffe0000000000 0000000000000100 0000000000000200 0000000000000000
> > > [   20.536818] raw: 0000000000000000 0000000000000000 ffffffff00000001 0000000000000000
> > > 
> > > bisect points to b323c049a999 ("mm, memory_hotplug: be more verbose for memory offline failures")
> > > which is the first commit with which the messages appear.
> > 
> > I would bet this is CMA allocator. How much is tons? Maybe we want a
> > rate limit or the other user is not really interested in them at all?

Yes, the system in question has a 4MB CMA area. "tons" translates to several hundred.
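
As a cross-check against the /proc/zoneinfo output earlier in the thread, which shows nr_free_cma 1024 in zone Normal, the page count can be converted to a size. This is a small sketch that assumes 4 KiB base pages; the thread does not state the page size, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages; the page size is not stated in the thread. */
#define ASSUMED_PAGE_SIZE 4096ULL

/* Convert a page count, as printed in /proc/zoneinfo, to bytes. */
static uint64_t pages_to_bytes(uint64_t nr_pages)
{
	/* Guard against overflow of the multiplication. */
	assert(nr_pages <= UINT64_MAX / ASSUMED_PAGE_SIZE);
	return nr_pages * ASSUMED_PAGE_SIZE;
}

/* Convert bytes to whole MiB, rounding down. */
static uint64_t bytes_to_mib(uint64_t bytes)
{
	return bytes >> 20;
}
```

Under that assumption, 1024 free CMA pages correspond to 4 MiB.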

> In other words, this should silence those messages.

Yes, with the patch below applied the messages don't appear anymore.

> diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
> index 4ae347cbc36d..4eb26d278046 100644
> --- a/include/linux/page-isolation.h
> +++ b/include/linux/page-isolation.h
> @@ -30,8 +30,11 @@ static inline bool is_migrate_isolate(int migratetype)
>  }
>  #endif
> 
> +#define SKIP_HWPOISON	0x1
> +#define REPORT_FAILURE	0x2
> +
>  bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
> -			 int migratetype, bool skip_hwpoisoned_pages);
> +			 int migratetype, int flags);
>  void set_pageblock_migratetype(struct page *page, int migratetype);
>  int move_freepages_block(struct zone *zone, struct page *page,
>  				int migratetype, int *num_movable);
> @@ -44,10 +47,14 @@ int move_freepages_block(struct zone *zone, struct page *page,
>   * For isolating all pages in the range finally, the caller have to
>   * free all pages in the range. test_page_isolated() can be used for
>   * test it.
> + *
> + * The following flags are allowed (they can be combined in a bit mask)
> + * SKIP_HWPOISON - ignore hwpoison pages
> + * REPORT_FAILURE - report details about the failure to isolate the range
>   */
>  int
>  start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
> -			 unsigned migratetype, bool skip_hwpoisoned_pages);
> +			 unsigned migratetype, int flags);
> 
>  /*
>   * Changes MIGRATE_ISOLATE to MIGRATE_MOVABLE.
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index c82193db4be6..8537429d33a6 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1226,7 +1226,7 @@ static bool is_pageblock_removable_nolock(struct page *page)
>  	if (!zone_spans_pfn(zone, pfn))
>  		return false;
> 
> -	return !has_unmovable_pages(zone, page, 0, MIGRATE_MOVABLE, true);
> +	return !has_unmovable_pages(zone, page, 0, MIGRATE_MOVABLE, SKIP_HWPOISON);
>  }
> 
>  /* Checks if this range of memory is likely to be hot-removable. */
> @@ -1577,7 +1577,8 @@ static int __ref __offline_pages(unsigned long start_pfn,
> 
>  	/* set above range as isolated */
>  	ret = start_isolate_page_range(start_pfn, end_pfn,
> -				       MIGRATE_MOVABLE, true);
> +				       MIGRATE_MOVABLE,
> +				       SKIP_HWPOISON | REPORT_FAILURE);
>  	if (ret) {
>  		mem_hotplug_done();
>  		reason = "failure to isolate range";
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index ec2c7916dc2d..ee4043419791 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7754,8 +7754,7 @@ void *__init alloc_large_system_hash(const char *tablename,
>   * race condition. So you can't expect this function should be exact.
>   */
>  bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
> -			 int migratetype,
> -			 bool skip_hwpoisoned_pages)
> +			 int migratetype, int flags)
>  {
>  	unsigned long pfn, iter, found;
> 
> @@ -7818,7 +7817,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
>  		 * The HWPoisoned page may be not in buddy system, and
>  		 * page_count() is not 0.
>  		 */
> -		if (skip_hwpoisoned_pages && PageHWPoison(page))
> +		if ((flags & SKIP_HWPOISON) && PageHWPoison(page))
>  			continue;
> 
>  		if (__PageMovable(page))
> @@ -7845,7 +7844,8 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
>  	return false;
>  unmovable:
>  	WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE);
> -	dump_page(pfn_to_page(pfn+iter), "unmovable page");
> +	if (flags & REPORT_FAILURE)
> +		dump_page(pfn_to_page(pfn+iter), "unmovable page");
>  	return true;
>  }
> 
> @@ -7972,8 +7972,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
>  	 */
> 
>  	ret = start_isolate_page_range(pfn_max_align_down(start),
> -				       pfn_max_align_up(end), migratetype,
> -				       false);
> +				       pfn_max_align_up(end), migratetype, 0);
>  	if (ret)
>  		return ret;
> 
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 43e085608846..ce323e56b34d 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -15,8 +15,7 @@
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/page_isolation.h>
> 
> -static int set_migratetype_isolate(struct page *page, int migratetype,
> -				bool skip_hwpoisoned_pages)
> +static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags)
>  {
>  	struct zone *zone;
>  	unsigned long flags, pfn;
> @@ -60,8 +59,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype,
>  	 * FIXME: Now, memory hotplug doesn't call shrink_slab() by itself.
>  	 * We just check MOVABLE pages.
>  	 */
> -	if (!has_unmovable_pages(zone, page, arg.pages_found, migratetype,
> -				 skip_hwpoisoned_pages))
> +	if (!has_unmovable_pages(zone, page, arg.pages_found, migratetype, isol_flags))
>  		ret = 0;
> 
>  	/*
> @@ -185,7 +183,7 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
>   * prevents two threads from simultaneously working on overlapping ranges.
>   */
>  int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
> -			     unsigned migratetype, bool skip_hwpoisoned_pages)
> +			     unsigned migratetype, int flags)
>  {
>  	unsigned long pfn;
>  	unsigned long undo_pfn;
> @@ -199,7 +197,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>  	     pfn += pageblock_nr_pages) {
>  		page = __first_valid_page(pfn, pageblock_nr_pages);
>  		if (page &&
> -		    set_migratetype_isolate(page, migratetype, skip_hwpoisoned_pages)) {
> +		    set_migratetype_isolate(page, migratetype, flags)) {
>  			undo_pfn = pfn;
>  			goto undo;
>  		}
> -- 
> Michal Hocko
> SUSE Labs
> 



* Re: [-next] lots of messages due to "mm, memory_hotplug: be more verbose for memory offline failures"
  2018-12-18  7:55     ` Heiko Carstens
@ 2018-12-18  9:09       ` Michal Hocko
  0 siblings, 0 replies; 5+ messages in thread
From: Michal Hocko @ 2018-12-18  9:09 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Andrew Morton, Oscar Salvador, Anshuman Khandual,
	Stephen Rothwell, linux-mm, linux-kernel, linux-next, linux-s390

On Tue 18-12-18 08:55:38, Heiko Carstens wrote:
> On Mon, Dec 17, 2018 at 05:39:49PM +0100, Michal Hocko wrote:
> > On Mon 17-12-18 17:03:50, Michal Hocko wrote:
> > > On Mon 17-12-18 16:59:22, Heiko Carstens wrote:
> > > > Hi Michal,
> > > > 
> > > > with linux-next as of today on s390 I see tons of messages like
> > > > 
> > > > [   20.536664] page dumped because: has_unmovable_pages
> > > > [   20.536792] page:000003d081ff4080 count:1 mapcount:0 mapping:000000008ff88600 index:0x0 compound_mapcount: 0
> > > > [   20.536794] flags: 0x3fffe0000010200(slab|head)
> > > > [   20.536795] raw: 03fffe0000010200 0000000000000100 0000000000000200 000000008ff88600
> > > > [   20.536796] raw: 0000000000000000 0020004100000000 ffffffff00000001 0000000000000000
> > > > [   20.536797] page dumped because: has_unmovable_pages
> > > > [   20.536814] page:000003d0823b0000 count:1 mapcount:0 mapping:0000000000000000 index:0x0
> > > > [   20.536815] flags: 0x7fffe0000000000()
> > > > [   20.536817] raw: 07fffe0000000000 0000000000000100 0000000000000200 0000000000000000
> > > > [   20.536818] raw: 0000000000000000 0000000000000000 ffffffff00000001 0000000000000000
> > > > 
> > > > bisect points to b323c049a999 ("mm, memory_hotplug: be more verbose for memory offline failures")
> > > > which is the first commit with which the messages appear.
> > > 
> > > I would bet this is CMA allocator. How much is tons? Maybe we want a
> > > rate limit or the other user is not really interested in them at all?
> 
> Yes, the system in question has a 4MB CMA area. "tons" translates to several hundred.

OK, I guess these messages on their own, without wider context, are not
that helpful. It is still surprising to see slab pages or non-movable
pages in the CMA area. The latter might be a CMA allocation, I guess, but
slab pages shouldn't be there at all AFAIU.
 
> > In other words, this should silence those messages.
> 
> Yes, with the patch below applied the messages don't appear anymore.

OK, I will post an official patch. Even if the CMA allocator later decides
to report failures, it can simply add the flag.

Thanks!

> > diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
> > index 4ae347cbc36d..4eb26d278046 100644
> > --- a/include/linux/page-isolation.h
> > +++ b/include/linux/page-isolation.h
> > @@ -30,8 +30,11 @@ static inline bool is_migrate_isolate(int migratetype)
> >  }
> >  #endif
> > 
> > +#define SKIP_HWPOISON	0x1
> > +#define REPORT_FAILURE	0x2
> > +
> >  bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
> > -			 int migratetype, bool skip_hwpoisoned_pages);
> > +			 int migratetype, int flags);
> >  void set_pageblock_migratetype(struct page *page, int migratetype);
> >  int move_freepages_block(struct zone *zone, struct page *page,
> >  				int migratetype, int *num_movable);
> > @@ -44,10 +47,14 @@ int move_freepages_block(struct zone *zone, struct page *page,
> >   * For isolating all pages in the range finally, the caller have to
> >   * free all pages in the range. test_page_isolated() can be used for
> >   * test it.
> > + *
> > + * The following flags are allowed (they can be combined in a bit mask)
> > + * SKIP_HWPOISON - ignore hwpoison pages
> > + * REPORT_FAILURE - report details about the failure to isolate the range
> >   */
> >  int
> >  start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
> > -			 unsigned migratetype, bool skip_hwpoisoned_pages);
> > +			 unsigned migratetype, int flags);
> > 
> >  /*
> >   * Changes MIGRATE_ISOLATE to MIGRATE_MOVABLE.
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index c82193db4be6..8537429d33a6 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -1226,7 +1226,7 @@ static bool is_pageblock_removable_nolock(struct page *page)
> >  	if (!zone_spans_pfn(zone, pfn))
> >  		return false;
> > 
> > -	return !has_unmovable_pages(zone, page, 0, MIGRATE_MOVABLE, true);
> > +	return !has_unmovable_pages(zone, page, 0, MIGRATE_MOVABLE, SKIP_HWPOISON);
> >  }
> > 
> >  /* Checks if this range of memory is likely to be hot-removable. */
> > @@ -1577,7 +1577,8 @@ static int __ref __offline_pages(unsigned long start_pfn,
> > 
> >  	/* set above range as isolated */
> >  	ret = start_isolate_page_range(start_pfn, end_pfn,
> > -				       MIGRATE_MOVABLE, true);
> > +				       MIGRATE_MOVABLE,
> > +				       SKIP_HWPOISON | REPORT_FAILURE);
> >  	if (ret) {
> >  		mem_hotplug_done();
> >  		reason = "failure to isolate range";
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index ec2c7916dc2d..ee4043419791 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -7754,8 +7754,7 @@ void *__init alloc_large_system_hash(const char *tablename,
> >   * race condition. So you can't expect this function should be exact.
> >   */
> >  bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
> > -			 int migratetype,
> > -			 bool skip_hwpoisoned_pages)
> > +			 int migratetype, int flags)
> >  {
> >  	unsigned long pfn, iter, found;
> > 
> > @@ -7818,7 +7817,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
> >  		 * The HWPoisoned page may be not in buddy system, and
> >  		 * page_count() is not 0.
> >  		 */
> > -		if (skip_hwpoisoned_pages && PageHWPoison(page))
> > +		if ((flags & SKIP_HWPOISON) && PageHWPoison(page))
> >  			continue;
> > 
> >  		if (__PageMovable(page))
> > @@ -7845,7 +7844,8 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
> >  	return false;
> >  unmovable:
> >  	WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE);
> > -	dump_page(pfn_to_page(pfn+iter), "unmovable page");
> > +	if (flags & REPORT_FAILURE)
> > +		dump_page(pfn_to_page(pfn+iter), "unmovable page");
> >  	return true;
> >  }
> > 
> > @@ -7972,8 +7972,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
> >  	 */
> > 
> >  	ret = start_isolate_page_range(pfn_max_align_down(start),
> > -				       pfn_max_align_up(end), migratetype,
> > -				       false);
> > +				       pfn_max_align_up(end), migratetype, 0);
> >  	if (ret)
> >  		return ret;
> > 
> > diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> > index 43e085608846..ce323e56b34d 100644
> > --- a/mm/page_isolation.c
> > +++ b/mm/page_isolation.c
> > @@ -15,8 +15,7 @@
> >  #define CREATE_TRACE_POINTS
> >  #include <trace/events/page_isolation.h>
> > 
> > -static int set_migratetype_isolate(struct page *page, int migratetype,
> > -				bool skip_hwpoisoned_pages)
> > +static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags)
> >  {
> >  	struct zone *zone;
> >  	unsigned long flags, pfn;
> > @@ -60,8 +59,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype,
> >  	 * FIXME: Now, memory hotplug doesn't call shrink_slab() by itself.
> >  	 * We just check MOVABLE pages.
> >  	 */
> > -	if (!has_unmovable_pages(zone, page, arg.pages_found, migratetype,
> > -				 skip_hwpoisoned_pages))
> > +	if (!has_unmovable_pages(zone, page, arg.pages_found, migratetype, isol_flags))
> >  		ret = 0;
> > 
> >  	/*
> > @@ -185,7 +183,7 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
> >   * prevents two threads from simultaneously working on overlapping ranges.
> >   */
> >  int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
> > -			     unsigned migratetype, bool skip_hwpoisoned_pages)
> > +			     unsigned migratetype, int flags)
> >  {
> >  	unsigned long pfn;
> >  	unsigned long undo_pfn;
> > @@ -199,7 +197,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
> >  	     pfn += pageblock_nr_pages) {
> >  		page = __first_valid_page(pfn, pageblock_nr_pages);
> >  		if (page &&
> > -		    set_migratetype_isolate(page, migratetype, skip_hwpoisoned_pages)) {
> > +		    set_migratetype_isolate(page, migratetype, flags)) {
> >  			undo_pfn = pfn;
> >  			goto undo;
> >  		}
> > -- 
> > Michal Hocko
> > SUSE Labs
> > 

-- 
Michal Hocko
SUSE Labs

