* [PATCH] mm/page_alloc: fix memmap_init_zone pageblock alignment
@ 2018-03-01 12:47 ` Daniel Vacek
  0 siblings, 0 replies; 38+ messages in thread
From: Daniel Vacek @ 2018-03-01 12:47 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Mel Gorman,
	Pavel Tatashin, Paul Burton, Daniel Vacek, stable

In move_freepages() a BUG_ON() can be triggered on uninitialized page structures
due to pageblock alignment. Aligning the skipped pfns in memmap_init_zone() the
same way as in move_freepages_block() simply fixes those crashes.

Fixes: b92df1de5d28 ("[mm] page_alloc: skip over regions of invalid pfns where possible")
Signed-off-by: Daniel Vacek <neelx@redhat.com>
Cc: stable@vger.kernel.org
---
 mm/page_alloc.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cb416723538f..9edee36e6a74 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5359,9 +5359,14 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 			/*
 			 * Skip to the pfn preceding the next valid one (or
 			 * end_pfn), such that we hit a valid pfn (or end_pfn)
-			 * on our next iteration of the loop.
+			 * on our next iteration of the loop. Note that it needs
+			 * to be pageblock aligned even when the region itself
+			 * is not as move_freepages_block() can shift ahead of
+			 * the valid region but still depends on correct page
+			 * metadata.
 			 */
-			pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
+			pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
+						~(pageblock_nr_pages-1)) - 1;
 #endif
 			continue;
 		}
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 38+ messages in thread

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

* Re: [PATCH] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-01 12:47 ` Daniel Vacek
@ 2018-03-01 13:10   ` Michal Hocko
  -1 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2018-03-01 13:10 UTC (permalink / raw)
  To: Daniel Vacek
  Cc: linux-kernel, linux-mm, Andrew Morton, Vlastimil Babka,
	Mel Gorman, Pavel Tatashin, Paul Burton, stable

On Thu 01-03-18 13:47:45, Daniel Vacek wrote:
> In move_freepages() a BUG_ON() can be triggered on uninitialized page structures
> due to pageblock alignment. Aligning the skipped pfns in memmap_init_zone() the
> same way as in move_freepages_block() simply fixes those crashes.

This changelog doesn't describe how the fix works. Why doesn't
memblock_next_valid_pfn return the first valid pfn as one would expect?

It would also be good to put the panic info in the changelog.

> Fixes: b92df1de5d28 ("[mm] page_alloc: skip over regions of invalid pfns where possible")
> Signed-off-by: Daniel Vacek <neelx@redhat.com>
> Cc: stable@vger.kernel.org
> ---
>  mm/page_alloc.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index cb416723538f..9edee36e6a74 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5359,9 +5359,14 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>  			/*
>  			 * Skip to the pfn preceding the next valid one (or
>  			 * end_pfn), such that we hit a valid pfn (or end_pfn)
> -			 * on our next iteration of the loop.
> +			 * on our next iteration of the loop. Note that it needs
> +			 * to be pageblock aligned even when the region itself
> +			 * is not as move_freepages_block() can shift ahead of
> +			 * the valid region but still depends on correct page
> +			 * metadata.
>  			 */
> -			pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
> +			pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
> +						~(pageblock_nr_pages-1)) - 1;
>  #endif
>  			continue;
>  		}
> -- 
> 2.16.2
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] mm/page_alloc: fix memmap_init_zone pageblock alignment
@ 2018-03-01 13:10   ` Michal Hocko
  0 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2018-03-01 13:10 UTC (permalink / raw)
  To: Daniel Vacek
  Cc: linux-kernel, linux-mm, Andrew Morton, Vlastimil Babka,
	Mel Gorman, Pavel Tatashin, Paul Burton, stable

On Thu 01-03-18 13:47:45, Daniel Vacek wrote:
> In move_freepages() a BUG_ON() can be triggered on uninitialized page structures
> due to pageblock alignment. Aligning the skipped pfns in memmap_init_zone() the
> same way as in move_freepages_block() simply fixes those crashes.

This changelog doesn't describe how the fix works. Why doesn't
memblock_next_valid_pfn return the first valid pfn as one would expect?

It would be also good put the panic info in the changelog.

> Fixes: b92df1de5d28 ("[mm] page_alloc: skip over regions of invalid pfns where possible")
> Signed-off-by: Daniel Vacek <neelx@redhat.com>
> Cc: stable@vger.kernel.org
> ---
>  mm/page_alloc.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index cb416723538f..9edee36e6a74 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5359,9 +5359,14 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>  			/*
>  			 * Skip to the pfn preceding the next valid one (or
>  			 * end_pfn), such that we hit a valid pfn (or end_pfn)
> -			 * on our next iteration of the loop.
> +			 * on our next iteration of the loop. Note that it needs
> +			 * to be pageblock aligned even when the region itself
> +			 * is not as move_freepages_block() can shift ahead of
> +			 * the valid region but still depends on correct page
> +			 * metadata.
>  			 */
> -			pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
> +			pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
> +						~(pageblock_nr_pages-1)) - 1;
>  #endif
>  			continue;
>  		}
> -- 
> 2.16.2
> 

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-01 13:10   ` Michal Hocko
@ 2018-03-01 15:09     ` Daniel Vacek
  -1 siblings, 0 replies; 38+ messages in thread
From: Daniel Vacek @ 2018-03-01 15:09 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, Vlastimil Babka,
	Mel Gorman, Pavel Tatashin, Paul Burton, stable

On Thu, Mar 1, 2018 at 2:10 PM, Michal Hocko <mhocko@kernel.org> wrote:
> On Thu 01-03-18 13:47:45, Daniel Vacek wrote:
>> In move_freepages() a BUG_ON() can be triggered on uninitialized page structures
>> due to pageblock alignment. Aligning the skipped pfns in memmap_init_zone() the
>> same way as in move_freepages_block() simply fixes those crashes.
>
> This changelog doesn't describe how the fix works. Why doesn't
> memblock_next_valid_pfn return the first valid pfn as one would expect?

Actually it does. The point is that it is not guaranteed to be
pageblock aligned. And we actually want to initialize even those page
structures which are outside of the range. Hence the alignment here.

For example from reproducer machine, memory map from e820/BIOS:

$ grep 7b7ff000 /proc/iomem
7b7ff000-7b7fffff : System RAM

Page structures before commit b92df1de5d28:

crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
7b800000 7ffff000 80000000
      PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
fffff73941e00000  78000000                0        0  1 1fffff00000000
fffff73941ed7fc0  7b5ff000                0        0  1 1fffff00000000
fffff73941ed8000  7b600000                0        0  1 1fffff00000000
fffff73941edff80  7b7fe000                0        0  1 1fffff00000000
fffff73941edffc0  7b7ff000 ffff8e67e04d3ae0     ad84  1 1fffff00020068
uptodate,lru,active,mappedtodisk    <<<< start of the range here
fffff73941ee0000  7b800000                0        0  1 1fffff00000000
fffff73941ffffc0  7ffff000                0        0  1 1fffff00000000

So far so good.

After commit b92df1de5d28 machine eventually crashes with:

BUG at mm/page_alloc.c:1913

>         VM_BUG_ON(page_zone(start_page) != page_zone(end_page));

From registers and stack I dug out that start_page points to
ffffe31d01ed8000 (note that this is page ffffe31d01edffc0 aligned to
pageblock) and I can see this in the memory dump:

crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
7b800000 7ffff000 80000000
      PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffe31d01e00000  78000000                0        0  0 0
ffffe31d01ed7fc0  7b5ff000                0        0  0 0
ffffe31d01ed8000  7b600000                0        0  0 0    <<<< note
that nodeid and zonenr are encoded in top bits of page flags which are
not initialized here, hence the crash :-(
ffffe31d01edff80  7b7fe000                0        0  0 0
ffffe31d01edffc0  7b7ff000                0        0  1 1fffff00000000
ffffe31d01ee0000  7b800000                0        0  1 1fffff00000000
ffffe31d01ffffc0  7ffff000                0        0  1 1fffff00000000

With my fix applied:

crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
7b800000 7ffff000 80000000
      PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffea0001e00000  78000000                0        0  0 0
ffffea0001e00000  7b5ff000                0        0  0 0
ffffea0001ed8000  7b600000                0        0  1 1fffff00000000
   <<<< vital data filled in here this time \o/
ffffea0001edff80  7b7fe000                0        0  1 1fffff00000000
ffffea0001edffc0  7b7ff000 ffff88017fb13720        8  2 1fffff00020068
uptodate,lru,active,mappedtodisk
ffffea0001ee0000  7b800000                0        0  1 1fffff00000000
ffffea0001ffffc0  7ffff000                0        0  1 1fffff00000000

We are not interested in the beginning of the whole section. Just the
pages in the first populated block where the range begins are
important (actually just the first one really, but...).


> It would be also good put the panic info in the changelog.

Of course I forgot to link the related bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=196443

Though it is not very well explained there either. I hope my notes
above make it clear.


>> Fixes: b92df1de5d28 ("[mm] page_alloc: skip over regions of invalid pfns where possible")
>> Signed-off-by: Daniel Vacek <neelx@redhat.com>
>> Cc: stable@vger.kernel.org
>> ---
>>  mm/page_alloc.c | 9 +++++++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index cb416723538f..9edee36e6a74 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -5359,9 +5359,14 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>>                       /*
>>                        * Skip to the pfn preceding the next valid one (or
>>                        * end_pfn), such that we hit a valid pfn (or end_pfn)
>> -                      * on our next iteration of the loop.
>> +                      * on our next iteration of the loop. Note that it needs
>> +                      * to be pageblock aligned even when the region itself
>> +                      * is not as move_freepages_block() can shift ahead of
>> +                      * the valid region but still depends on correct page
>> +                      * metadata.
>>                        */
>> -                     pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
>> +                     pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
>> +                                             ~(pageblock_nr_pages-1)) - 1;
>>  #endif
>>                       continue;
>>               }
>> --
>> 2.16.2
>>
>
> --
> Michal Hocko
> SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread


* Re: [PATCH] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-01 15:09     ` Daniel Vacek
@ 2018-03-01 15:27       ` Michal Hocko
  -1 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2018-03-01 15:27 UTC (permalink / raw)
  To: Daniel Vacek
  Cc: linux-kernel, linux-mm, Andrew Morton, Vlastimil Babka,
	Mel Gorman, Pavel Tatashin, Paul Burton, stable

On Thu 01-03-18 16:09:35, Daniel Vacek wrote:
[...]
> $ grep 7b7ff000 /proc/iomem
> 7b7ff000-7b7fffff : System RAM
[...]
> After commit b92df1de5d28 machine eventually crashes with:
> 
> BUG at mm/page_alloc.c:1913
> 
> >         VM_BUG_ON(page_zone(start_page) != page_zone(end_page));

This is an important information that should be in the changelog.

> From registers and stack I dug start_page points to
> ffffe31d01ed8000 (note that this is
> page ffffe31d01edffc0 aligned to pageblock) and I can see this in memory dump:
> 
> crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
> 7b800000 7ffff000 80000000
>       PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
> ffffe31d01e00000  78000000                0        0  0 0
> ffffe31d01ed7fc0  7b5ff000                0        0  0 0
> ffffe31d01ed8000  7b600000                0        0  0 0    <<<< note

Are those ranges covered by the System RAM as well?

> that nodeid and zonenr are encoded in top bits of page flags which are
> not initialized here, hence the crash :-(
> ffffe31d01edff80  7b7fe000                0        0  0 0
> ffffe31d01edffc0  7b7ff000                0        0  1 1fffff00000000
> ffffe31d01ee0000  7b800000                0        0  1 1fffff00000000
> ffffe31d01ffffc0  7ffff000                0        0  1 1fffff00000000

It is still not clear why not to do the alignment in
memblock_next_valid_pfn rather than its caller.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread


* Re: [PATCH] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-01 15:27       ` Michal Hocko
@ 2018-03-01 16:20         ` Daniel Vacek
  -1 siblings, 0 replies; 38+ messages in thread
From: Daniel Vacek @ 2018-03-01 16:20 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, Vlastimil Babka,
	Mel Gorman, Pavel Tatashin, Paul Burton, stable

On Thu, Mar 1, 2018 at 4:27 PM, Michal Hocko <mhocko@kernel.org> wrote:
> On Thu 01-03-18 16:09:35, Daniel Vacek wrote:
> [...]
>> $ grep 7b7ff000 /proc/iomem
>> 7b7ff000-7b7fffff : System RAM
> [...]
>> After commit b92df1de5d28 machine eventually crashes with:
>>
>> BUG at mm/page_alloc.c:1913
>>
>> >         VM_BUG_ON(page_zone(start_page) != page_zone(end_page));
>
> This is an important information that should be in the changelog.

And that's exactly what the very first seven words of my changelog
tried to express in human-readable form instead of mechanically
pasting the source code. I guess that's a matter of preference,
though I can see that grepping for it later could be an issue here.

>> From registers and stack I dug start_page points to
>> ffffe31d01ed8000 (note that this is
>> page ffffe31d01edffc0 aligned to pageblock) and I can see this in memory dump:
>>
>> crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
>> 7b800000 7ffff000 80000000
>>       PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
>> ffffe31d01e00000  78000000                0        0  0 0
>> ffffe31d01ed7fc0  7b5ff000                0        0  0 0
>> ffffe31d01ed8000  7b600000                0        0  0 0    <<<< note
>
> Are those ranges covered by the System RAM as well?
>
>> that nodeid and zonenr are encoded in top bits of page flags which are
>> not initialized here, hence the crash :-(
>> ffffe31d01edff80  7b7fe000                0        0  0 0
>> ffffe31d01edffc0  7b7ff000                0        0  1 1fffff00000000
>> ffffe31d01ee0000  7b800000                0        0  1 1fffff00000000
>> ffffe31d01ffffc0  7ffff000                0        0  1 1fffff00000000
>
> It is still not clear why not to do the alignment in
> memblock_next_valid_pfn rather than its caller.

Because it's the mem init which needs it to be aligned. Other callers
may not, possibly? Not that there are any other callers at the moment,
so it really does not matter where the alignment is placed. The only
difference would be the end of the loop: end_pfn vs. an aligned
end_pfn. And it looks like the plain (unaligned) end_pfn would be
preferred here. Want me to send a v2?

> --
> Michal Hocko
> SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread


* Re: [PATCH] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-01 15:27       ` Michal Hocko
@ 2018-03-01 17:24         ` Daniel Vacek
  -1 siblings, 0 replies; 38+ messages in thread
From: Daniel Vacek @ 2018-03-01 17:24 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, Vlastimil Babka,
	Mel Gorman, Pavel Tatashin, Paul Burton, stable

On Thu, Mar 1, 2018 at 4:27 PM, Michal Hocko <mhocko@kernel.org> wrote:
> On Thu 01-03-18 16:09:35, Daniel Vacek wrote:
>> From registers and stack I dug start_page points to
>> ffffe31d01ed8000 (note that this is
>> page ffffe31d01edffc0 aligned to pageblock) and I can see this in memory dump:
>>
>> crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
>> 7b800000 7ffff000 80000000
>>       PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
>> ffffe31d01e00000  78000000                0        0  0 0
>> ffffe31d01ed7fc0  7b5ff000                0        0  0 0
>> ffffe31d01ed8000  7b600000                0        0  0 0    <<<< note
>
> Are those ranges covered by the System RAM as well?

Sorry, I forgot to answer this. If they were, the loop wouldn't be
skipping them, right? But it really does not matter here; the kernel
needs (some) page structures initialized anyway. And I do not feel
comfortable with removing the VM_BUG_ON(). The initialization is what
changed with commit b92df1de5d28, hence fixing this.

--nX

>> that nodeid and zonenr are encoded in top bits of page flags which are
>> not initialized here, hence the crash :-(
>> ffffe31d01edff80  7b7fe000                0        0  0 0
>> ffffe31d01edffc0  7b7ff000                0        0  1 1fffff00000000
>> ffffe31d01ee0000  7b800000                0        0  1 1fffff00000000
>> ffffe31d01ffffc0  7ffff000                0        0  1 1fffff00000000
>
> It is still not clear why not to do the alignment in
> memblock_next_valid_pfn rather than its caller.
> --
> Michal Hocko
> SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread


* Re: [PATCH] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-01 16:20         ` Daniel Vacek
@ 2018-03-01 23:21           ` Andrew Morton
  -1 siblings, 0 replies; 38+ messages in thread
From: Andrew Morton @ 2018-03-01 23:21 UTC (permalink / raw)
  To: Daniel Vacek
  Cc: Michal Hocko, linux-kernel, linux-mm, Vlastimil Babka,
	Mel Gorman, Pavel Tatashin, Paul Burton, stable

On Thu, 1 Mar 2018 17:20:04 +0100 Daniel Vacek <neelx@redhat.com> wrote:

> Wanna me send a v2?

Yes please ;)

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-01 16:20         ` Daniel Vacek
@ 2018-03-02 10:54           ` Daniel Vacek
  -1 siblings, 0 replies; 38+ messages in thread
From: Daniel Vacek @ 2018-03-02 10:54 UTC (permalink / raw)
  To: Michal Hocko, Paul Burton
  Cc: linux-kernel, linux-mm, Andrew Morton, Vlastimil Babka,
	Mel Gorman, Pavel Tatashin, stable

On Thu, Mar 1, 2018 at 5:20 PM, Daniel Vacek <neelx@redhat.com> wrote:
> On Thu, Mar 1, 2018 at 4:27 PM, Michal Hocko <mhocko@kernel.org> wrote:
>> It is still not clear why not to do the alignment in
>> memblock_next_valid_pfn rather than its caller.
>
> As it's the mem init which needs it to be aligned; other callers may
> not, possibly?
> Not that there are any other callers at the moment, so it really does
> not matter where it is placed. The only difference would be ending the
> loop with end_pfn vs. the aligned end_pfn, and it looks like the plain
> (unaligned) end_pfn would be preferred here. Wanna me send a v2?

Thinking about it again, memblock has nothing to do with pageblock. And
the function name suggests one shall get the next valid pfn, not
something totally unrelated to memblock. So that's what it returns.
It's the mem init which needs to align this, and hence mem init aligns
it for its own purposes. I'd call this the correct design.

To deal with the end_pfn special case I'd actually get rid of it
completely and hardcode -1UL as the max pfn instead (rather than 0).
The caller should handle the max pfn as an error, or as the end of the
loop, as in this case.

I'll send a v2 with this implemented.

Paul> Why is it based on memblock actually? Wouldn't a generic
mem_section solution work satisfactorily for you? That would be
naturally aligned with a whole section (doing a bit more work as a
result in the end) and also independent of
CONFIG_HAVE_MEMBLOCK_NODE_MAP availability.

>> --
>> Michal Hocko
>> SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-01 12:47 ` Daniel Vacek
@ 2018-03-02 11:01   ` Daniel Vacek
  -1 siblings, 0 replies; 38+ messages in thread
From: Daniel Vacek @ 2018-03-02 11:01 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Mel Gorman,
	Pavel Tatashin, Paul Burton, Daniel Vacek, stable

BUG at mm/page_alloc.c:1913

>	VM_BUG_ON(page_zone(start_page) != page_zone(end_page));

Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") introduced a bug where move_freepages() triggers a
VM_BUG_ON() on an uninitialized page structure due to pageblock
alignment. To fix this, simply align the skipped pfns in
memmap_init_zone() the same way as in move_freepages_block().

Fixes: b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible")
Signed-off-by: Daniel Vacek <neelx@redhat.com>
Cc: stable@vger.kernel.org
---
 mm/memblock.c   | 13 ++++++-------
 mm/page_alloc.c |  9 +++++++--
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index 5a9ca2a1751b..2a5facd236bb 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1101,13 +1101,12 @@ void __init_memblock __next_mem_pfn_range(int *idx, int nid,
 		*out_nid = r->nid;
 }
 
-unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
-						      unsigned long max_pfn)
+unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
 {
 	struct memblock_type *type = &memblock.memory;
 	unsigned int right = type->cnt;
 	unsigned int mid, left = 0;
-	phys_addr_t addr = PFN_PHYS(pfn + 1);
+	phys_addr_t addr = PFN_PHYS(++pfn);
 
 	do {
 		mid = (right + left) / 2;
@@ -1118,15 +1117,15 @@ unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
 				  type->regions[mid].size))
 			left = mid + 1;
 		else {
-			/* addr is within the region, so pfn + 1 is valid */
-			return min(pfn + 1, max_pfn);
+			/* addr is within the region, so pfn is valid */
+			return pfn;
 		}
 	} while (left < right);
 
 	if (right == type->cnt)
-		return max_pfn;
+		return -1UL;
 	else
-		return min(PHYS_PFN(type->regions[right].base), max_pfn);
+		return PHYS_PFN(type->regions[right].base);
 }
 
 /**
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cb416723538f..eb27ccb50928 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5359,9 +5359,14 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 			/*
 			 * Skip to the pfn preceding the next valid one (or
 			 * end_pfn), such that we hit a valid pfn (or end_pfn)
-			 * on our next iteration of the loop.
+			 * on our next iteration of the loop. Note that it needs
+			 * to be pageblock aligned even when the region itself
+			 * is not as move_freepages_block() can shift ahead of
+			 * the valid region but still depends on correct page
+			 * metadata.
 			 */
-			pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
+			pfn = (memblock_next_valid_pfn(pfn) &
+					~(pageblock_nr_pages-1)) - 1;
 #endif
 			continue;
 		}
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-01 16:20         ` Daniel Vacek
@ 2018-03-02 13:01           ` Michal Hocko
  -1 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2018-03-02 13:01 UTC (permalink / raw)
  To: Daniel Vacek
  Cc: linux-kernel, linux-mm, Andrew Morton, Vlastimil Babka,
	Mel Gorman, Pavel Tatashin, Paul Burton, stable

On Thu 01-03-18 17:20:04, Daniel Vacek wrote:
> On Thu, Mar 1, 2018 at 4:27 PM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Thu 01-03-18 16:09:35, Daniel Vacek wrote:
> > [...]
> >> $ grep 7b7ff000 /proc/iomem
> >> 7b7ff000-7b7fffff : System RAM
> > [...]
> >> After commit b92df1de5d28 machine eventually crashes with:
> >>
> >> BUG at mm/page_alloc.c:1913
> >>
> >> >         VM_BUG_ON(page_zone(start_page) != page_zone(end_page));
> >
> > This is an important information that should be in the changelog.
> 
And that's exactly what my first seven words tried to express in
human-readable form, instead of mechanically pasting the source code. I
guess that's a matter of preference, though I see grepping for it later
can be an issue here.

Do not get me wrong, I do not want to nag just for the fun of it. The
changelog should be really clear about the problem. What might be clear
to you based on the debugging might not be so clear to others. And the
struct page initialization code is far from trivial, especially when we
have different alignment requirements from the memory model and the
page allocator.

Therefore, being as clear as possible is really valuable. So I would
really love to see the changelog contain:
- what is going on - the VM_BUG_ON in move_freepages along with the
  crash report
- the memory ranges exported by the BIOS/FW
- an explanation of why the pageblock alignment is the proper one, and
  how the range looks from the memory section POV (with SPARSEMEM)
- what about those unaligned pages which are not backed by any memory?
  Are they reserved so that they will never get used?

And just to be clear, I am not saying your patch is wrong. It just
raises more questions than answers and I suspect it just papers over
some more fundamental problem. I might be clearly wrong and I cannot
devote more time to this for the next week because I will be offline,
but I would _really_ appreciate it if this all got explained.

Thanks!
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-02 13:01           ` Michal Hocko
@ 2018-03-02 15:27             ` Daniel Vacek
  -1 siblings, 0 replies; 38+ messages in thread
From: Daniel Vacek @ 2018-03-02 15:27 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, Vlastimil Babka,
	Mel Gorman, Pavel Tatashin, Paul Burton, stable

On Fri, Mar 2, 2018 at 2:01 PM, Michal Hocko <mhocko@kernel.org> wrote:
> On Thu 01-03-18 17:20:04, Daniel Vacek wrote:
>> On Thu, Mar 1, 2018 at 4:27 PM, Michal Hocko <mhocko@kernel.org> wrote:
>> > On Thu 01-03-18 16:09:35, Daniel Vacek wrote:
>> > [...]
>> >> $ grep 7b7ff000 /proc/iomem
>> >> 7b7ff000-7b7fffff : System RAM
>> > [...]
>> >> After commit b92df1de5d28 machine eventually crashes with:
>> >>
>> >> BUG at mm/page_alloc.c:1913
>> >>
>> >> >         VM_BUG_ON(page_zone(start_page) != page_zone(end_page));
>> >
>> > This is an important information that should be in the changelog.
>>
>> And that's exactly what my seven very first words tried to express in
>> human readable form instead of mechanically pasting the source code. I
>> guess that's a matter of preference. Though I see grepping later can
>> be an issue here.
>
> Do not get me wrong I do not want to nag just for fun of it. The
> changelog should be really clear about the problem. What might be clear
> to you based on the debugging might not be so clear to others. And the
> struct page initialization code is far from trivial especially when we
> have different alignment requirements by the memory model and the page
> allocator.

I get it. I didn't mean to be rude or anything. I just thought I had
covered all the relevant details.

> Therefore being as clear as possible is really valuable. So I would
> really love to see the changelog to contain.
> - What is going on - VM_BUG_ON in move_freepages along with the crash
>   report

I'll put more details there.

> - memory ranges exported by BIOS/FW

They were not mentioned as they are not really relevant. Any e820 map
can have issues. So far I have only seen reports on a few selected
machines, mostly LENOVO System x3650 M5, some FUJITSU, some Cisco
blades. But the map is always fairly normal. IIUC, the bug only happens
if a range which is not pageblock aligned happens to be the first one
in a zone or follows a non-populated section.

Again, none of that is really relevant. What is relevant is that commit
b92df1de5d28 changed the way page structures are initialized, so that
with some perfectly fine maps from the BIOS the kernel can now crash as
a result. And my fix tries to keep at least the bare minimum of the
original behavior needed to keep the kernel stable.

> - explain why is the pageblock alignment the proper one. How does the
>   range look from the memory section POV (with SPARSEMEM).

The commit message explains that: "the same way as in
move_freepages_block()", to quote myself. The alignment in this
function is the one causing the crash, as the VM_BUG_ON() assert in
the subsequent move_freepages() is checking the (now) uninitialized
structure. If we follow this alignment, the initialization will not get
skipped for that structure. Again, this is partially restoring the
original behavior rather than rewriting move_freepages{,_block} to not
crash with some data it was not designed for.

I'll try to explain this more transparently in the commit message.

Alternatively, you can just revert b92df1de5d28. That will fix the
crashes as well.

> - What about those unaligned pages which are not backed by any memory?
>   Are they reserved so that they will never get used?

They are handled the same way as they used to be before b92df1de5d28.
This patch does not change or touch anything in this regard. Or am
I wrong?

> And just to be clear. I am not saying your patch is wrong. It just

You better not. My patch is totally correct :p
(I hope)

> raises more questions than answers and I suspect it just papers over
> some more fundamental problem. I might be clearly wrong and I cannot

I see. Thank you for looking into it. It's appreciated. I would not
call it a fundamental problem, rather a design decision in
move_freepages{,_block}, which I'd vote to keep for now. Hopefully
I explained it above.

> devote more time to this for the next week because I will be offline

Enjoy your time off.

> but I would _really_ appreciate if this all got explained.

I'll do my best.

> Thanks!
> --
> Michal Hocko
> SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v3 0/2] mm/page_alloc: fix kernel BUG at mm/page_alloc.c:1913! crash in move_freepages()
  2018-03-01 12:47 ` Daniel Vacek
@ 2018-03-03  0:12   ` Daniel Vacek
  -1 siblings, 0 replies; 38+ messages in thread
From: Daniel Vacek @ 2018-03-03  0:12 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Mel Gorman,
	Pavel Tatashin, Paul Burton, Daniel Vacek, stable

The kernel can crash on a failed VM_BUG_ON assertion in move_freepages()
with some rare physical memory layouts (huge range(s) of memory reserved
by the BIOS followed by usable memory not aligned to a pageblock).

crash> page_init_bug -v | grep resource | sed '/RAM .3/,/RAM .4/!d'
<struct resource 0xffff88067fffd480>      4bfac000 -     646b1fff	System RAM (391.02 MiB = 400408.00 KiB)
<struct resource 0xffff88067fffd4b8>      646b2000 -     793fefff	reserved (333.30 MiB = 341300.00 KiB)
<struct resource 0xffff88067fffd4f0>      793ff000 -     7b3fefff	ACPI Non-volatile Storage ( 32.00 MiB)
<struct resource 0xffff88067fffd528>      7b3ff000 -     7b787fff	ACPI Tables (  3.54 MiB = 3620.00 KiB)
<struct resource 0xffff88067fffd560>      7b788000 -     7b7fffff	System RAM (480.00 KiB)

More details in second patch.

v2: Use the -1 constant for max_pfn and remove the parameter. That's
    mostly just cosmetics.
v3: Split into a two-patch series to make clear what is the actual fix
    and what is just a cleanup. No code changes compared to v2, and the
    second patch is identical to the original v1.

Cc: stable@vger.kernel.org

Daniel Vacek (2):
  mm/memblock: hardcode the max_pfn being -1
  mm/page_alloc: fix memmap_init_zone pageblock alignment

 mm/memblock.c   | 13 ++++++-------
 mm/page_alloc.c |  9 +++++++--
 2 files changed, 13 insertions(+), 9 deletions(-)

-- 
2.16.2

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v3 1/2] mm/memblock: hardcode the end_pfn being -1
  2018-03-03  0:12   ` Daniel Vacek
@ 2018-03-03  0:12     ` Daniel Vacek
  -1 siblings, 0 replies; 38+ messages in thread
From: Daniel Vacek @ 2018-03-03  0:12 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Mel Gorman,
	Pavel Tatashin, Paul Burton, Daniel Vacek, stable

This is just a cleanup. It avoids having to handle the special end case
in the next commit.

Signed-off-by: Daniel Vacek <neelx@redhat.com>
Cc: stable@vger.kernel.org
---
 mm/memblock.c   | 13 ++++++-------
 mm/page_alloc.c |  2 +-
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index 5a9ca2a1751b..2a5facd236bb 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1101,13 +1101,12 @@ void __init_memblock __next_mem_pfn_range(int *idx, int nid,
 		*out_nid = r->nid;
 }
 
-unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
-						      unsigned long max_pfn)
+unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
 {
 	struct memblock_type *type = &memblock.memory;
 	unsigned int right = type->cnt;
 	unsigned int mid, left = 0;
-	phys_addr_t addr = PFN_PHYS(pfn + 1);
+	phys_addr_t addr = PFN_PHYS(++pfn);
 
 	do {
 		mid = (right + left) / 2;
@@ -1118,15 +1117,15 @@ unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
 				  type->regions[mid].size))
 			left = mid + 1;
 		else {
-			/* addr is within the region, so pfn + 1 is valid */
-			return min(pfn + 1, max_pfn);
+			/* addr is within the region, so pfn is valid */
+			return pfn;
 		}
 	} while (left < right);
 
 	if (right == type->cnt)
-		return max_pfn;
+		return -1UL;
 	else
-		return min(PHYS_PFN(type->regions[right].base), max_pfn);
+		return PHYS_PFN(type->regions[right].base);
 }
 
 /**
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cb416723538f..f2c57da5bbe5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5361,7 +5361,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 			 * end_pfn), such that we hit a valid pfn (or end_pfn)
 			 * on our next iteration of the loop.
 			 */
-			pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
+			pfn = memblock_next_valid_pfn(pfn) - 1;
 #endif
 			continue;
 		}
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-03  0:12   ` Daniel Vacek
@ 2018-03-03  0:12     ` Daniel Vacek
  -1 siblings, 0 replies; 38+ messages in thread
From: Daniel Vacek @ 2018-03-03  0:12 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Mel Gorman,
	Pavel Tatashin, Paul Burton, Daniel Vacek, stable

Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") introduced a bug where move_freepages() triggers a
VM_BUG_ON() on an uninitialized page structure due to pageblock alignment.
To fix this, simply align the skipped pfns in memmap_init_zone() the same
way as in move_freepages_block().

From one of the RHEL reports:

crash> log | grep -e BUG -e RIP -e Call.Trace -e move_freepages_block -e rmqueue -e freelist -A1
kernel BUG at mm/page_alloc.c:1389!
invalid opcode: 0000 [#1] SMP 
--
RIP: 0010:[<ffffffff8118833e>]  [<ffffffff8118833e>] move_freepages+0x15e/0x160
RSP: 0018:ffff88054d727688  EFLAGS: 00010087
--
Call Trace:
 [<ffffffff811883b3>] move_freepages_block+0x73/0x80
 [<ffffffff81189e63>] __rmqueue+0x263/0x460
 [<ffffffff8118c781>] get_page_from_freelist+0x7e1/0x9e0
 [<ffffffff8118caf6>] __alloc_pages_nodemask+0x176/0x420
--
RIP  [<ffffffff8118833e>] move_freepages+0x15e/0x160
 RSP <ffff88054d727688>

crash> page_init_bug -v | grep RAM
<struct resource 0xffff88067fffd2f8>          1000 -        9bfff	System RAM (620.00 KiB)
<struct resource 0xffff88067fffd3a0>        100000 -     430bffff	System RAM (  1.05 GiB = 1071.75 MiB = 1097472.00 KiB)
<struct resource 0xffff88067fffd410>      4b0c8000 -     4bf9cfff	System RAM ( 14.83 MiB = 15188.00 KiB)
<struct resource 0xffff88067fffd480>      4bfac000 -     646b1fff	System RAM (391.02 MiB = 400408.00 KiB)
<struct resource 0xffff88067fffd560>      7b788000 -     7b7fffff	System RAM (480.00 KiB)
<struct resource 0xffff88067fffd640>     100000000 -    67fffffff	System RAM ( 22.00 GiB)

crash> page_init_bug | head -6
<struct resource 0xffff88067fffd560>      7b788000 -     7b7fffff	System RAM (480.00 KiB)
<struct page 0xffffea0001ede200>   1fffff00000000  0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32          4096    1048575
<struct page 0xffffea0001ede200> 505736 505344 <struct page 0xffffea0001ed8000> 505855 <struct page 0xffffea0001edffc0>
<struct page 0xffffea0001ed8000>                0  0 <struct pglist_data 0xffff88047ffd9000> 0 <struct zone 0xffff88047ffd9000> DMA               1       4095
<struct page 0xffffea0001edffc0>   1fffff00000400  0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32          4096    1048575
BUG, zones differ!

Note that this range follows two unpopulated sections 68000000-77ffffff
in this zone. 7b788000-7b7fffff is the first range after the gap. This
makes memmap_init_zone() skip all the pfns up to the beginning of this
range. But the range is not pageblock (2M) aligned, and in fact no range
has to be.

crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b787000 7b788000
      PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffea0001e00000  78000000                0        0  0 0
ffffea0001ed7fc0  7b5ff000                0        0  0 0
ffffea0001ed8000  7b600000                0        0  0 0	<<<<
ffffea0001ede1c0  7b787000                0        0  0 0
ffffea0001ede200  7b788000                0        0  1 1fffff00000000

The top part of the page flags should contain the nodeid and zonenr, which
is not the case for page ffffea0001ed8000 here (<<<<).

crash> log | grep -o fffea0001ed[^\ ]* | sort -u
fffea0001ed8000
fffea0001eded20
fffea0001edffc0

crash> bt -r | grep -o fffea0001ed[^\ ]* | sort -u
fffea0001ed8000
fffea0001eded00
fffea0001eded20
fffea0001edffc0

Due to commit b92df1de5d28, initialization of the whole beginning of the
section is skipped up to the start of the range. Now any code calling
move_freepages_block() (such as reusing a page from a freelist, as in this
example) with a page from the beginning of the range gets that page rounded
down to start_page ffffea0001ed8000 and passed to move_freepages(), which
crashes on the assertion as it reads a bogus zonenr from the uninitialized
page flags.

>         VM_BUG_ON(page_zone(start_page) != page_zone(end_page));

Note, page_zone() derives the zone from page flags here.


From a similar machine before commit b92df1de5d28:

crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
      PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
fffff73941e00000  78000000                0        0  1 1fffff00000000
fffff73941ed7fc0  7b5ff000                0        0  1 1fffff00000000
fffff73941ed8000  7b600000                0        0  1 1fffff00000000
fffff73941edff80  7b7fe000                0        0  1 1fffff00000000
fffff73941edffc0  7b7ff000 ffff8e67e04d3ae0     ad84  1 1fffff00020068 uptodate,lru,active,mappedtodisk

All the pages from the beginning of the section are initialized, so
move_freepages() is not going to blow up.

The same machine with this fix applied:

crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
      PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffea0001e00000  78000000                0        0  0 0
ffffea0001e00000  7b5ff000                0        0  0 0
ffffea0001ed8000  7b600000                0        0  1 1fffff00000000
ffffea0001edff80  7b7fe000                0        0  1 1fffff00000000
ffffea0001edffc0  7b7ff000 ffff88017fb13720        8  2 1fffff00020068 uptodate,lru,active,mappedtodisk

At least the bare minimum of pages is initialized, which prevents the
crash as well.


Fixes: b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible")
Signed-off-by: Daniel Vacek <neelx@redhat.com>
Cc: stable@vger.kernel.org
---
 mm/page_alloc.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f2c57da5bbe5..eb27ccb50928 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5359,9 +5359,14 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 			/*
 			 * Skip to the pfn preceding the next valid one (or
 			 * end_pfn), such that we hit a valid pfn (or end_pfn)
-			 * on our next iteration of the loop.
+			 * on our next iteration of the loop. Note that it needs
+			 * to be pageblock aligned even when the region itself
+			 * is not. move_freepages_block() can shift ahead of
+			 * the valid region but still depends on correct page
+			 * metadata.
 			 */
-			pfn = memblock_next_valid_pfn(pfn) - 1;
+			pfn = (memblock_next_valid_pfn(pfn) &
+					~(pageblock_nr_pages-1)) - 1;
 #endif
 			continue;
 		}
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-03  0:12     ` Daniel Vacek
@ 2018-03-03  0:40       ` Andrew Morton
  -1 siblings, 0 replies; 38+ messages in thread
From: Andrew Morton @ 2018-03-03  0:40 UTC (permalink / raw)
  To: Daniel Vacek
  Cc: linux-kernel, linux-mm, Michal Hocko, Vlastimil Babka,
	Mel Gorman, Pavel Tatashin, Paul Burton, stable

On Sat,  3 Mar 2018 01:12:26 +0100 Daniel Vacek <neelx@redhat.com> wrote:

> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> where possible") introduced a bug where move_freepages() triggers a
> VM_BUG_ON() on uninitialized page structure due to pageblock alignment.

b92df1de5d28 was merged a year ago.  Can you suggest why this hasn't
been reported before now?

This makes me wonder whether a -stable backport is really needed...

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-03  0:40       ` Andrew Morton
@ 2018-03-03  1:08         ` Daniel Vacek
  -1 siblings, 0 replies; 38+ messages in thread
From: Daniel Vacek @ 2018-03-03  1:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Michal Hocko, Vlastimil Babka,
	Mel Gorman, Pavel Tatashin, Paul Burton, stable

On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Sat,  3 Mar 2018 01:12:26 +0100 Daniel Vacek <neelx@redhat.com> wrote:
>
>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>> where possible") introduced a bug where move_freepages() triggers a
>> VM_BUG_ON() on uninitialized page structure due to pageblock alignment.
>
> b92df1de5d28 was merged a year ago.  Can you suggest why this hasn't
> been reported before now?

Yeah. I was surprised myself I couldn't find a fix to backport to
RHEL. But actually customers started to report this as soon as 7.4
(where b92df1de5d28 was merged in RHEL) was released. I remember
reports from September/October-ish times. It's not easily reproduced
and happens on a handful of machines only. I guess that's why. But
that does not make it less serious, I think.

Though there actually is a report here:
https://bugzilla.kernel.org/show_bug.cgi?id=196443

And there are reports for Fedora from July:
https://bugzilla.redhat.com/show_bug.cgi?id=1473242
and CentOS: https://bugs.centos.org/view.php?id=13964
and we internally track several dozens reports for RHEL bug
https://bugzilla.redhat.com/show_bug.cgi?id=1525121

Enough? ;-)

> This makes me wonder whether a -stable backport is really needed...

For some machines it definitely is. Won't hurt either, IMHO.

--nX

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-03  1:08         ` Daniel Vacek
  (?)
@ 2018-03-12 12:26         ` Sudeep Holla
  2018-03-12 14:49           ` Naresh Kamboju
  -1 siblings, 1 reply; 38+ messages in thread
From: Sudeep Holla @ 2018-03-12 12:26 UTC (permalink / raw)
  To: Daniel Vacek
  Cc: Andrew Morton, open list, linux-mm, Michal Hocko,
	Vlastimil Babka, Mel Gorman, Pavel Tatashin, Paul Burton, stable,
	Sudeep Holla

Hi,

I couldn't find the exact mail corresponding to the patch merged in v4.16-rc5
but commit 864b75f9d6b01 "mm/page_alloc: fix memmap_init_zone
pageblock alignment"
cause boot hang on my ARM64 platform.

Log:
[    0.000000] NUMA: No NUMA configuration found
[    0.000000] NUMA: Faking a node at [mem
0x0000000000000000-0x00000009ffffffff]
[    0.000000] NUMA: NODE_DATA [mem 0x9fffcb480-0x9fffccf7f]
[    0.000000] Zone ranges:
[    0.000000]   DMA32    [mem 0x0000000080000000-0x00000000ffffffff]
[    0.000000]   Normal   [mem 0x0000000100000000-0x00000009ffffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000080000000-0x00000000f8f9afff]
[    0.000000]   node   0: [mem 0x00000000f8f9b000-0x00000000f908ffff]
[    0.000000]   node   0: [mem 0x00000000f9090000-0x00000000f914ffff]
[    0.000000]   node   0: [mem 0x00000000f9150000-0x00000000f920ffff]
[    0.000000]   node   0: [mem 0x00000000f9210000-0x00000000f922ffff]
[    0.000000]   node   0: [mem 0x00000000f9230000-0x00000000f95bffff]
[    0.000000]   node   0: [mem 0x00000000f95c0000-0x00000000fe58ffff]
[    0.000000]   node   0: [mem 0x00000000fe590000-0x00000000fe5cffff]
[    0.000000]   node   0: [mem 0x00000000fe5d0000-0x00000000fe5dffff]
[    0.000000]   node   0: [mem 0x00000000fe5e0000-0x00000000fe62ffff]
[    0.000000]   node   0: [mem 0x00000000fe630000-0x00000000feffffff]
[    0.000000]   node   0: [mem 0x0000000880000000-0x00000009ffffffff]
[    0.000000]  Initmem setup node 0 [mem 0x0000000080000000-0x00000009ffffffff]

On Sat, Mar 3, 2018 at 1:08 AM, Daniel Vacek <neelx@redhat.com> wrote:
> On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
>> On Sat,  3 Mar 2018 01:12:26 +0100 Daniel Vacek <neelx@redhat.com> wrote:
>>
>>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>>> where possible") introduced a bug where move_freepages() triggers a
>>> VM_BUG_ON() on uninitialized page structure due to pageblock alignment.
>>
>> b92df1de5d28 was merged a year ago.  Can you suggest why this hasn't
>> been reported before now?
>
> Yeah. I was surprised myself I couldn't find a fix to backport to
> RHEL. But actually customers started to report this as soon as 7.4
> (where b92df1de5d28 was merged in RHEL) was released. I remember
> reports from September/October-ish times. It's not easily reproduced
> and happens on a handful of machines only. I guess that's why. But
> that does not make it less serious, I think.
>
> Though there actually is a report here:
> https://bugzilla.kernel.org/show_bug.cgi?id=196443
>
> And there are reports for Fedora from July:
> https://bugzilla.redhat.com/show_bug.cgi?id=1473242
> and CentOS: https://bugs.centos.org/view.php?id=13964
> and we internally track several dozens reports for RHEL bug
> https://bugzilla.redhat.com/show_bug.cgi?id=1525121
>
> Enough? ;-)
>
>> This makes me wonder whether a -stable backport is really needed...
>
> For some machines it definitely is. Won't hurt either, IMHO.
>
> --nX

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-12 12:26         ` Sudeep Holla
@ 2018-03-12 14:49           ` Naresh Kamboju
  2018-03-12 16:51             ` Daniel Vacek
  0 siblings, 1 reply; 38+ messages in thread
From: Naresh Kamboju @ 2018-03-12 14:49 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Daniel Vacek, Andrew Morton, open list, linux-mm, Michal Hocko,
	Vlastimil Babka, Mel Gorman, Pavel Tatashin, Paul Burton,
	linux- stable

On 12 March 2018 at 17:56, Sudeep Holla <sudeep.holla@arm.com> wrote:
> Hi,
>
> I couldn't find the exact mail corresponding to the patch merged in v4.16-rc5
> but commit 864b75f9d6b01 "mm/page_alloc: fix memmap_init_zone
> pageblock alignment"
> cause boot hang on my ARM64 platform.

I have also noticed this problem on hi6220 Hikey - arm64.

LKFT: linux-next: Hikey boot failed linux-next-20180308
https://bugs.linaro.org/show_bug.cgi?id=3676

- Naresh

>
> Log:
> [    0.000000] NUMA: No NUMA configuration found
> [    0.000000] NUMA: Faking a node at [mem
> 0x0000000000000000-0x00000009ffffffff]
> [    0.000000] NUMA: NODE_DATA [mem 0x9fffcb480-0x9fffccf7f]
> [    0.000000] Zone ranges:
> [    0.000000]   DMA32    [mem 0x0000000080000000-0x00000000ffffffff]
> [    0.000000]   Normal   [mem 0x0000000100000000-0x00000009ffffffff]
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000080000000-0x00000000f8f9afff]
> [    0.000000]   node   0: [mem 0x00000000f8f9b000-0x00000000f908ffff]
> [    0.000000]   node   0: [mem 0x00000000f9090000-0x00000000f914ffff]
> [    0.000000]   node   0: [mem 0x00000000f9150000-0x00000000f920ffff]
> [    0.000000]   node   0: [mem 0x00000000f9210000-0x00000000f922ffff]
> [    0.000000]   node   0: [mem 0x00000000f9230000-0x00000000f95bffff]
> [    0.000000]   node   0: [mem 0x00000000f95c0000-0x00000000fe58ffff]
> [    0.000000]   node   0: [mem 0x00000000fe590000-0x00000000fe5cffff]
> [    0.000000]   node   0: [mem 0x00000000fe5d0000-0x00000000fe5dffff]
> [    0.000000]   node   0: [mem 0x00000000fe5e0000-0x00000000fe62ffff]
> [    0.000000]   node   0: [mem 0x00000000fe630000-0x00000000feffffff]
> [    0.000000]   node   0: [mem 0x0000000880000000-0x00000009ffffffff]
> [    0.000000]  Initmem setup node 0 [mem 0x0000000080000000-0x00000009ffffffff]
>
> On Sat, Mar 3, 2018 at 1:08 AM, Daniel Vacek <neelx@redhat.com> wrote:
>> On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
>>> On Sat,  3 Mar 2018 01:12:26 +0100 Daniel Vacek <neelx@redhat.com> wrote:
>>>
>>>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>>>> where possible") introduced a bug where move_freepages() triggers a
>>>> VM_BUG_ON() on uninitialized page structure due to pageblock alignment.
>>>
>>> b92df1de5d28 was merged a year ago.  Can you suggest why this hasn't
>>> been reported before now?
>>
>> Yeah. I was surprised myself I couldn't find a fix to backport to
>> RHEL. But actually customers started to report this as soon as 7.4
>> (where b92df1de5d28 was merged in RHEL) was released. I remember
>> reports from September/October-ish times. It's not easily reproduced
>> and happens on a handful of machines only. I guess that's why. But
>> that does not make it less serious, I think.
>>
>> Though there actually is a report here:
>> https://bugzilla.kernel.org/show_bug.cgi?id=196443
>>
>> And there are reports for Fedora from July:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1473242
>> and CentOS: https://bugs.centos.org/view.php?id=13964
>> and we internally track several dozens reports for RHEL bug
>> https://bugzilla.redhat.com/show_bug.cgi?id=1525121
>>
>> Enough? ;-)
>>
>>> This makes me wonder whether a -stable backport is really needed...
>>
>> For some machines it definitely is. Won't hurt either, IMHO.
>>
>> --nX

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-12 14:49           ` Naresh Kamboju
@ 2018-03-12 16:51             ` Daniel Vacek
  2018-03-12 17:11               ` Sudeep Holla
  2018-03-13  6:34               ` Naresh Kamboju
  0 siblings, 2 replies; 38+ messages in thread
From: Daniel Vacek @ 2018-03-12 16:51 UTC (permalink / raw)
  To: Sudeep Holla, Naresh Kamboju
  Cc: Andrew Morton, open list, linux-mm, Michal Hocko,
	Vlastimil Babka, Mel Gorman, Pavel Tatashin, Paul Burton,
	linux-stable

On Mon, Mar 12, 2018 at 3:49 PM, Naresh Kamboju
<naresh.kamboju@linaro.org> wrote:
> On 12 March 2018 at 17:56, Sudeep Holla <sudeep.holla@arm.com> wrote:
>> Hi,
>>
>> I couldn't find the exact mail corresponding to the patch merged in v4.16-rc5
>> but commit 864b75f9d6b01 "mm/page_alloc: fix memmap_init_zone
>> pageblock alignment"
>> cause boot hang on my ARM64 platform.
>
> I have also noticed this problem on hi6220 Hikey - arm64.
>
> LKFT: linux-next: Hikey boot failed linux-next-20180308
> https://bugs.linaro.org/show_bug.cgi?id=3676
>
> - Naresh
>
>>
>> Log:
>> [    0.000000] NUMA: No NUMA configuration found
>> [    0.000000] NUMA: Faking a node at [mem
>> 0x0000000000000000-0x00000009ffffffff]
>> [    0.000000] NUMA: NODE_DATA [mem 0x9fffcb480-0x9fffccf7f]
>> [    0.000000] Zone ranges:
>> [    0.000000]   DMA32    [mem 0x0000000080000000-0x00000000ffffffff]
>> [    0.000000]   Normal   [mem 0x0000000100000000-0x00000009ffffffff]
>> [    0.000000] Movable zone start for each node
>> [    0.000000] Early memory node ranges
>> [    0.000000]   node   0: [mem 0x0000000080000000-0x00000000f8f9afff]
>> [    0.000000]   node   0: [mem 0x00000000f8f9b000-0x00000000f908ffff]
>> [    0.000000]   node   0: [mem 0x00000000f9090000-0x00000000f914ffff]
>> [    0.000000]   node   0: [mem 0x00000000f9150000-0x00000000f920ffff]
>> [    0.000000]   node   0: [mem 0x00000000f9210000-0x00000000f922ffff]
>> [    0.000000]   node   0: [mem 0x00000000f9230000-0x00000000f95bffff]
>> [    0.000000]   node   0: [mem 0x00000000f95c0000-0x00000000fe58ffff]
>> [    0.000000]   node   0: [mem 0x00000000fe590000-0x00000000fe5cffff]
>> [    0.000000]   node   0: [mem 0x00000000fe5d0000-0x00000000fe5dffff]
>> [    0.000000]   node   0: [mem 0x00000000fe5e0000-0x00000000fe62ffff]
>> [    0.000000]   node   0: [mem 0x00000000fe630000-0x00000000feffffff]
>> [    0.000000]   node   0: [mem 0x0000000880000000-0x00000009ffffffff]
>> [    0.000000]  Initmem setup node 0 [mem 0x0000000080000000-0x00000009ffffffff]
>>
>> On Sat, Mar 3, 2018 at 1:08 AM, Daniel Vacek <neelx@redhat.com> wrote:
>>> On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
>>>>
>>>> This makes me wonder whether a -stable backport is really needed...
>>>
>>> For some machines it definitely is. Won't hurt either, IMHO.
>>>
>>> --nX

Hmm, does it step back perhaps?

Can you check if below cures the boot hang?

--nX

~~~~
neelx@metal:~/nX/src/linux$ git diff
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3d974cb2a1a1..415571120bbd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5365,8 +5365,10 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
                         * the valid region but still depends on correct page
                         * metadata.
                         */
-                       pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
+                       unsigned long next_pfn;
+                       next_pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
                                        ~(pageblock_nr_pages-1)) - 1;
+                       pfn = max(next_pfn, pfn);
 #endif
                        continue;
                }
~~~~


* Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-12 16:51             ` Daniel Vacek
@ 2018-03-12 17:11               ` Sudeep Holla
  2018-03-13  6:34               ` Naresh Kamboju
  1 sibling, 0 replies; 38+ messages in thread
From: Sudeep Holla @ 2018-03-12 17:11 UTC (permalink / raw)
  To: Daniel Vacek, Naresh Kamboju
  Cc: Sudeep Holla, Andrew Morton, open list, linux-mm, Michal Hocko,
	Vlastimil Babka, Mel Gorman, Pavel Tatashin, Paul Burton,
	linux-stable



On 12/03/18 16:51, Daniel Vacek wrote:
[...]

> 
> Hmm, does it step back perhaps?
> 
> Can you check if below cures the boot hang?
> 

Yes it does fix the boot hang.

> --nX
> 
> ~~~~
> neelx@metal:~/nX/src/linux$ git diff
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3d974cb2a1a1..415571120bbd 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5365,8 +5365,10 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>                          * the valid region but still depends on correct page
>                          * metadata.
>                          */
> -                       pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
> +                       unsigned long next_pfn;
> +                       next_pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
>                                         ~(pageblock_nr_pages-1)) - 1;
> +                       pfn = max(next_pfn, pfn);
>  #endif
>                         continue;
>                 }
> ~~~~
> 

-- 
Regards,
Sudeep


* Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-12 16:51             ` Daniel Vacek
  2018-03-12 17:11               ` Sudeep Holla
@ 2018-03-13  6:34               ` Naresh Kamboju
  2018-03-13 22:47                 ` Daniel Vacek
  1 sibling, 1 reply; 38+ messages in thread
From: Naresh Kamboju @ 2018-03-13  6:34 UTC (permalink / raw)
  To: Daniel Vacek
  Cc: Sudeep Holla, Andrew Morton, open list, linux-mm, Michal Hocko,
	Vlastimil Babka, Mel Gorman, Pavel Tatashin, Paul Burton,
	linux-stable

On 12 March 2018 at 22:21, Daniel Vacek <neelx@redhat.com> wrote:
> On Mon, Mar 12, 2018 at 3:49 PM, Naresh Kamboju
> <naresh.kamboju@linaro.org> wrote:
>> On 12 March 2018 at 17:56, Sudeep Holla <sudeep.holla@arm.com> wrote:
>>> Hi,
>>>
>>> I couldn't find the exact mail corresponding to the patch merged in v4.16-rc5
>>> but commit 864b75f9d6b01 "mm/page_alloc: fix memmap_init_zone
>>> pageblock alignment"
>>> cause boot hang on my ARM64 platform.
>>
>> I have also noticed this problem on hi6220 Hikey - arm64.
>>
>> LKFT: linux-next: Hikey boot failed linux-next-20180308
>> https://bugs.linaro.org/show_bug.cgi?id=3676
>>
>> - Naresh
>>
>>>
>>> Log: [...]
>>>
>>> On Sat, Mar 3, 2018 at 1:08 AM, Daniel Vacek <neelx@redhat.com> wrote:
>>>> On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
>>>>>
>>>>> This makes me wonder whether a -stable backport is really needed...
>>>>
>>>> For some machines it definitely is. Won't hurt either, IMHO.
>>>>
>>>> --nX
>
> Hmm, does it step back perhaps?
>
> Can you check if below cures the boot hang?
>
> --nX
>
> ~~~~
> neelx@metal:~/nX/src/linux$ git diff
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3d974cb2a1a1..415571120bbd 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5365,8 +5365,10 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>                          * the valid region but still depends on correct page
>                          * metadata.
>                          */
> -                       pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
> +                       unsigned long next_pfn;
> +                       next_pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
>                                         ~(pageblock_nr_pages-1)) - 1;
> +                       pfn = max(next_pfn, pfn);
>  #endif
>                         continue;
>                 }

After applying this patch on linux-next the boot hang problem resolved.
Now the hi6220-hikey is booting successfully.
Thank you.

- Naresh

> ~~~~


* Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
  2018-03-13  6:34               ` Naresh Kamboju
@ 2018-03-13 22:47                 ` Daniel Vacek
  0 siblings, 0 replies; 38+ messages in thread
From: Daniel Vacek @ 2018-03-13 22:47 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: Sudeep Holla, Andrew Morton, open list, linux-mm, Michal Hocko,
	Vlastimil Babka, Mel Gorman, Pavel Tatashin, Paul Burton,
	linux-stable

On Tue, Mar 13, 2018 at 7:34 AM, Naresh Kamboju
<naresh.kamboju@linaro.org> wrote:
> On 12 March 2018 at 22:21, Daniel Vacek <neelx@redhat.com> wrote:
>> On Mon, Mar 12, 2018 at 3:49 PM, Naresh Kamboju
>> <naresh.kamboju@linaro.org> wrote:
>>> On 12 March 2018 at 17:56, Sudeep Holla <sudeep.holla@arm.com> wrote:
>>>> Hi,
>>>>
>>>> I couldn't find the exact mail corresponding to the patch merged in v4.16-rc5
>>>> but commit 864b75f9d6b01 "mm/page_alloc: fix memmap_init_zone
>>>> pageblock alignment"
>>>> cause boot hang on my ARM64 platform.
>>>
>>> I have also noticed this problem on hi6220 Hikey - arm64.
>>>
>>> LKFT: linux-next: Hikey boot failed linux-next-20180308
>>> https://bugs.linaro.org/show_bug.cgi?id=3676
>>>
>>> - Naresh
>>>
>>>>
>>>> Log: [...]
>>>>
>>>> On Sat, Mar 3, 2018 at 1:08 AM, Daniel Vacek <neelx@redhat.com> wrote:
>>>>> On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
>>>>>>
>>>>>> This makes me wonder whether a -stable backport is really needed...
>>>>>
>>>>> For some machines it definitely is. Won't hurt either, IMHO.
>>>>>
>>>>> --nX
>>
>> Hmm, does it step back perhaps?
>>
>> Can you check if below cures the boot hang?
>>
>> --nX
>>
>> ~~~~
>> [...]
>
> After applying this patch on linux-next the boot hang problem resolved.
> Now the hi6220-hikey is booting successfully.
> Thank you.

Thank you and Sudeep for testing. I've just sent Andrew a formal patch.

>
> - Naresh
>
>> ~~~~


end of thread, other threads:[~2018-03-13 22:47 UTC | newest]

Thread overview: 38+ messages
-- links below jump to the message on this page --
2018-03-01 12:47 [PATCH] mm/page_alloc: fix memmap_init_zone pageblock alignment Daniel Vacek
2018-03-01 12:47 ` Daniel Vacek
2018-03-01 13:10 ` Michal Hocko
2018-03-01 13:10   ` Michal Hocko
2018-03-01 15:09   ` Daniel Vacek
2018-03-01 15:09     ` Daniel Vacek
2018-03-01 15:27     ` Michal Hocko
2018-03-01 15:27       ` Michal Hocko
2018-03-01 16:20       ` Daniel Vacek
2018-03-01 16:20         ` Daniel Vacek
2018-03-01 23:21         ` Andrew Morton
2018-03-01 23:21           ` Andrew Morton
2018-03-02 10:54         ` Daniel Vacek
2018-03-02 10:54           ` Daniel Vacek
2018-03-02 13:01         ` Michal Hocko
2018-03-02 13:01           ` Michal Hocko
2018-03-02 15:27           ` Daniel Vacek
2018-03-02 15:27             ` Daniel Vacek
2018-03-01 17:24       ` Daniel Vacek
2018-03-01 17:24         ` Daniel Vacek
2018-03-02 11:01 ` [PATCH v2] " Daniel Vacek
2018-03-02 11:01   ` Daniel Vacek
2018-03-03  0:12 ` [PATCH v3 0/2] mm/page_alloc: fix kernel BUG at mm/page_alloc.c:1913! crash in move_freepages() Daniel Vacek
2018-03-03  0:12   ` Daniel Vacek
2018-03-03  0:12   ` [PATCH v3 1/2] mm/memblock: hardcode the end_pfn being -1 Daniel Vacek
2018-03-03  0:12     ` Daniel Vacek
2018-03-03  0:12   ` [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment Daniel Vacek
2018-03-03  0:12     ` Daniel Vacek
2018-03-03  0:40     ` Andrew Morton
2018-03-03  0:40       ` Andrew Morton
2018-03-03  1:08       ` Daniel Vacek
2018-03-03  1:08         ` Daniel Vacek
2018-03-12 12:26         ` Sudeep Holla
2018-03-12 14:49           ` Naresh Kamboju
2018-03-12 16:51             ` Daniel Vacek
2018-03-12 17:11               ` Sudeep Holla
2018-03-13  6:34               ` Naresh Kamboju
2018-03-13 22:47                 ` Daniel Vacek
