Date: Tue, 8 Oct 2019 16:51:56 +0100
From: Mel Gorman
To: Vlastimil Babka
Cc: Andrew Morton, Florian Weimer, Dave Chinner, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH] mm, compaction: fix wrong pfn handling in __reset_isolation_pfn()
Message-ID: <20191008155156.GD3321@techsingularity.net>
References: <20191008152915.24704-1-vbabka@suse.cz>
In-Reply-To: <20191008152915.24704-1-vbabka@suse.cz>

On Tue, Oct 08, 2019 at 05:29:15PM +0200, Vlastimil Babka wrote:
> Florian and Dave reported [1] a NULL pointer dereference in
> __reset_isolation_pfn(). While the exact cause is unclear, staring at the code
> revealed two bugs, which might be related.
>

I think the fix is a good fit. Even if the problem still occurs, it
eliminates an important possibility.

> One bug is that if zone starts in the middle of pageblock, block_page might
> correspond to different pfn than block_pfn, and then the pfn_valid_within()
> checks will check different pfn's than those accessed via struct page. This
> might result in acessing an unitialized page in CONFIG_HOLES_IN_ZONE configs.
>

s/acessing/accessing/

Aside from HOLES_IN_ZONE, the patch addresses an issue if the start of
the zone is not pageblock-aligned. While pageblock-aligned zone starts
are common, they are not guaranteed. I don't think this needs to be
clarified in the changelog as your example is valid. I'm only commenting
in case someone decides not to try the patch because they believe
HOLES_IN_ZONE is required to hit the problem.

> The other bug is that end_page refers to the first page of next pageblock and
> not last page of current pageblock. The online and valid check is then wrong
> and with sections, the while (page < end_page) loop might wander off actual
> struct page arrays.
>
> [1] https://lore.kernel.org/linux-xfs/87o8z1fvqu.fsf@mid.deneb.enyo.de/
>
> Reported-by: Florian Weimer
> Reported-by: Dave Chinner
> Fixes: 6b0868c820ff ("mm/compaction.c: correct zone boundary handling when resetting pageblock skip hints")
> Cc:
> Signed-off-by: Vlastimil Babka

Acked-by: Mel Gorman

Just one minor irrelevant note below.

> ---
>  mm/compaction.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index ce08b39d85d4..672d3c78c6ab 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -270,14 +270,15 @@ __reset_isolation_pfn(struct zone *zone, unsigned long pfn, bool check_source,
>
>  	/* Ensure the start of the pageblock or zone is online and valid */
>  	block_pfn = pageblock_start_pfn(pfn);
> -	block_page = pfn_to_online_page(max(block_pfn, zone->zone_start_pfn));
> +	block_pfn = max(block_pfn, zone->zone_start_pfn);
> +	block_page = pfn_to_online_page(block_pfn);
>  	if (block_page) {
>  		page = block_page;
>  		pfn = block_pfn;
>  	}
>
>  	/* Ensure the end of the pageblock or zone is online and valid */
> -	block_pfn += pageblock_nr_pages;
> +	block_pfn = pageblock_end_pfn(pfn) - 1;
>  	block_pfn = min(block_pfn, zone_end_pfn(zone) - 1);
>  	end_page = pfn_to_online_page(block_pfn);
>  	if (!end_page)

This is fine and is definitely fixing a potential issue.

> @@ -303,7 +304,7 @@ __reset_isolation_pfn(struct zone *zone, unsigned long pfn, bool check_source,
>
>  		page += (1 << PAGE_ALLOC_COSTLY_ORDER);
>  		pfn += (1 << PAGE_ALLOC_COSTLY_ORDER);
> -	} while (page < end_page);
> +	} while (page <= end_page);
>
>  	return false;
>  }

I think this is also ok as it's appropriate for PFN walkers of this
style in general. However, I think it's unlikely to fix anything given
that we are walking in steps of (1 << PAGE_ALLOC_COSTLY_ORDER) and the
final page is not necessarily aligned on that boundary. Still, it's an
improvement.

Thanks

-- 
Mel Gorman
SUSE Labs
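
To make the step-size remark concrete, here is a minimal user-space sketch
of the loop shape, not the kernel code itself. walk_visits() and STEP are
hypothetical names introduced only for illustration, plain PFN integers
stand in for the struct page pointers, and the 512-page pageblock used in
the note after the code is an assumed size. It counts how many PFNs a
do/while walk touches with the old (<) and new (<=) termination checks.

```c
#include <assert.h>
#include <stdbool.h>

/* The kernel defines PAGE_ALLOC_COSTLY_ORDER as 3, so each step is 8 pages. */
#define PAGE_ALLOC_COSTLY_ORDER 3
#define STEP (1UL << PAGE_ALLOC_COSTLY_ORDER)

/*
 * Hypothetical model of the walk at the end of __reset_isolation_pfn():
 * visit start_pfn, advance by STEP, and keep going while
 * pfn < end_pfn (inclusive == false) or pfn <= end_pfn (inclusive == true).
 * Returns the number of PFNs visited.
 */
static unsigned long walk_visits(unsigned long start_pfn,
				 unsigned long end_pfn, bool inclusive)
{
	unsigned long pfn = start_pfn;
	unsigned long visits = 0;

	assert(start_pfn <= end_pfn);

	do {
		visits++;	/* the real loop inspects the page here */
		pfn += STEP;
	} while (inclusive ? pfn <= end_pfn : pfn < end_pfn);

	return visits;
}
```

With an aligned start, walk_visits(0, 511, false) and walk_visits(0, 511, true)
both return 64, because the last page of the block (511) is never a multiple
of the step away from the start. The two checks only differ when the walk
lands exactly on the end, for example with a start offset of 7:
walk_visits(7, 511, false) returns 63 while walk_visits(7, 511, true)
returns 64.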