linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm, compaction: fix wrong pfn handling in __reset_isolation_pfn()
@ 2019-10-08 15:29 Vlastimil Babka
  2019-10-08 15:51 ` Mel Gorman
  0 siblings, 1 reply; 2+ messages in thread
From: Vlastimil Babka @ 2019-10-08 15:29 UTC (permalink / raw)
  To: Andrew Morton, Mel Gorman, Florian Weimer, Dave Chinner
  Cc: linux-mm, linux-kernel, linux-xfs, linux-fsdevel,
	Vlastimil Babka, stable

Florian and Dave reported [1] a NULL pointer dereference in
__reset_isolation_pfn(). While the exact cause is unclear, staring at the code
revealed two bugs, which might be related.

One bug is that if zone starts in the middle of pageblock, block_page might
correspond to different pfn than block_pfn, and then the pfn_valid_within()
checks will check different pfn's than those accessed via struct page. This
might result in acessing an unitialized page in CONFIG_HOLES_IN_ZONE configs.

The other bug is that end_page refers to the first page of next pageblock and
not last page of current pageblock. The online and valid check is then wrong
and with sections, the while (page < end_page) loop might wander off actual
struct page arrays.

[1] https://lore.kernel.org/linux-xfs/87o8z1fvqu.fsf@mid.deneb.enyo.de/

Reported-by: Florian Weimer <fw@deneb.enyo.de>
Reported-by: Dave Chinner <david@fromorbit.com>
Fixes: 6b0868c820ff ("mm/compaction.c: correct zone boundary handling when resetting pageblock skip hints")
Cc: <stable@vger.kernel.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/compaction.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index ce08b39d85d4..672d3c78c6ab 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -270,14 +270,15 @@ __reset_isolation_pfn(struct zone *zone, unsigned long pfn, bool check_source,
 
 	/* Ensure the start of the pageblock or zone is online and valid */
 	block_pfn = pageblock_start_pfn(pfn);
-	block_page = pfn_to_online_page(max(block_pfn, zone->zone_start_pfn));
+	block_pfn = max(block_pfn, zone->zone_start_pfn);
+	block_page = pfn_to_online_page(block_pfn);
 	if (block_page) {
 		page = block_page;
 		pfn = block_pfn;
 	}
 
 	/* Ensure the end of the pageblock or zone is online and valid */
-	block_pfn += pageblock_nr_pages;
+	block_pfn = pageblock_end_pfn(pfn) - 1;
 	block_pfn = min(block_pfn, zone_end_pfn(zone) - 1);
 	end_page = pfn_to_online_page(block_pfn);
 	if (!end_page)
@@ -303,7 +304,7 @@ __reset_isolation_pfn(struct zone *zone, unsigned long pfn, bool check_source,
 
 		page += (1 << PAGE_ALLOC_COSTLY_ORDER);
 		pfn += (1 << PAGE_ALLOC_COSTLY_ORDER);
-	} while (page < end_page);
+	} while (page <= end_page);
 
 	return false;
 }
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] mm, compaction: fix wrong pfn handling in __reset_isolation_pfn()
  2019-10-08 15:29 [PATCH] mm, compaction: fix wrong pfn handling in __reset_isolation_pfn() Vlastimil Babka
@ 2019-10-08 15:51 ` Mel Gorman
  0 siblings, 0 replies; 2+ messages in thread
From: Mel Gorman @ 2019-10-08 15:51 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Florian Weimer, Dave Chinner, linux-mm,
	linux-kernel, linux-xfs, linux-fsdevel, stable

On Tue, Oct 08, 2019 at 05:29:15PM +0200, Vlastimil Babka wrote:
> Florian and Dave reported [1] a NULL pointer dereference in
> __reset_isolation_pfn(). While the exact cause is unclear, staring at the code
> revealed two bugs, which might be related.
> 

I think the fix is a good fit. Even if the problem still occurs, it
eliminates an important possibility.

> One bug is that if zone starts in the middle of pageblock, block_page might
> correspond to different pfn than block_pfn, and then the pfn_valid_within()
> checks will check different pfn's than those accessed via struct page. This
> might result in acessing an unitialized page in CONFIG_HOLES_IN_ZONE configs.
> 

s/acessing/accessing/

Aside from HOLES_IN_ZONE, the patch addresses an issue if the start
of the zone is not pageblock-aligned. While this is common, it's not
guaranteed. I don't think this needs to be clarified in the changelog as
your example is valid. I'm commenting in case someone decides not to try
the patch because they feel HOLES_IN_ZONE is required.

> The other bug is that end_page refers to the first page of next pageblock and
> not last page of current pageblock. The online and valid check is then wrong
> and with sections, the while (page < end_page) loop might wander off actual
> struct page arrays.
> 
> [1] https://lore.kernel.org/linux-xfs/87o8z1fvqu.fsf@mid.deneb.enyo.de/
> 
> Reported-by: Florian Weimer <fw@deneb.enyo.de>
> Reported-by: Dave Chinner <david@fromorbit.com>
> Fixes: 6b0868c820ff ("mm/compaction.c: correct zone boundary handling when resetting pageblock skip hints")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Acked-by: Mel Gorman <mgorman@techsingularity.net>

Just one minor irrelevant note below.

> ---
>  mm/compaction.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index ce08b39d85d4..672d3c78c6ab 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -270,14 +270,15 @@ __reset_isolation_pfn(struct zone *zone, unsigned long pfn, bool check_source,
>  
>  	/* Ensure the start of the pageblock or zone is online and valid */
>  	block_pfn = pageblock_start_pfn(pfn);
> -	block_page = pfn_to_online_page(max(block_pfn, zone->zone_start_pfn));
> +	block_pfn = max(block_pfn, zone->zone_start_pfn);
> +	block_page = pfn_to_online_page(block_pfn);
>  	if (block_page) {
>  		page = block_page;
>  		pfn = block_pfn;
>  	}
>  
>  	/* Ensure the end of the pageblock or zone is online and valid */
> -	block_pfn += pageblock_nr_pages;
> +	block_pfn = pageblock_end_pfn(pfn) - 1;
>  	block_pfn = min(block_pfn, zone_end_pfn(zone) - 1);
>  	end_page = pfn_to_online_page(block_pfn);
>  	if (!end_page)

This is fine and is definetly fixing a potential issue.

> @@ -303,7 +304,7 @@ __reset_isolation_pfn(struct zone *zone, unsigned long pfn, bool check_source,
>  
>  		page += (1 << PAGE_ALLOC_COSTLY_ORDER);
>  		pfn += (1 << PAGE_ALLOC_COSTLY_ORDER);
> -	} while (page < end_page);
> +	} while (page <= end_page);
>  
>  	return false;
>  }

I think this is also ok as it's appropriate for PFN walkers in general of
this style. However, I think it's unlikely to fix anything given that we
are walking in steps of (1 << PAGE_ALLOC_COSTLY_ORDER) and the final page
is not necessarily aligned on that boundary. Still, it's an improvement.

Thanks

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2019-10-08 15:57 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-08 15:29 [PATCH] mm, compaction: fix wrong pfn handling in __reset_isolation_pfn() Vlastimil Babka
2019-10-08 15:51 ` Mel Gorman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).