From: Vlastimil Babka
To: Andrea Arcangeli, Mel Gorman, Andrew Morton, linux-mm@kvack.org, Qian Cai
Cc: Michal Hocko, David Hildenbrand, linux-kernel@vger.kernel.org, Mike Rapoport, Baoquan He
Subject: Re: [PATCH 1/1] mm: compaction: avoid fast_isolate_around() to set pageblock_skip on reserved pages
Date: Mon, 23 Nov 2020 14:01:16 +0100
In-Reply-To: <20201121194506.13464-2-aarcange@redhat.com>
References: <8C537EB7-85EE-4DCF-943E-3CC0ED0DF56D@lca.pw> <20201121194506.13464-1-aarcange@redhat.com> <20201121194506.13464-2-aarcange@redhat.com>

On 11/21/20 8:45 PM, Andrea Arcangeli wrote:
> A corollary issue was fixed in
> e577c8b64d58fe307ea4d5149d31615df2d90861.
> A second issue remained in
> v5.7:
>
> https://lkml.kernel.org/r/8C537EB7-85EE-4DCF-943E-3CC0ED0DF56D@lca.pw
>
> ==
> page:ffffea0000aa0000 refcount:1 mapcount:0 mapping:000000002243743b index:0x0
> flags: 0x1fffe000001000(reserved)
> ==
>
> 73a6e474cb376921a311786652782155eac2fdf0 was applied to supposedly the
> second issue, but I still reproduced it twice with v5.9 on two
> different systems:
>
> ==
> page:0000000062b3e92f refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x39800
> flags: 0x1000(reserved)
> ==
> page:000000002a7114f8 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7a200
> flags: 0x1fff000000001000(reserved)
> ==
>
> I actually never reproduced it until v5.9, but it's still the same bug
> as it was reported first for v5.7.
>
> See the page is "reserved" in all 3 cases. In the last two crashes
> with the pfn:
>
> pfn 0x39800 -> 0x39800000 min_pfn hit non-RAM:
>
> 39639000-39814fff : Unknown E820 type
>
> pfn 0x7a200 -> 0x7a200000 min_pfn hit non-RAM:
>
> 7a17b000-7a216fff : Unknown E820 type

It would be nice to also provide a /proc/zoneinfo and how exactly the
"zone_spans_pfn" was violated. I assume we end up below zone's
start_pfn, but is it true?

> This actually seems a false positive bugcheck, the page structures are
> valid and the zones are correct, just it's non-RAM but setting
> pageblockskip should do no harm. However it's possible to solve the
> crash without lifting the bugcheck, by enforcing the invariant that
> the free_pfn cursor doesn't point to reserved pages (which would be
> otherwise implicitly achieved through the PageBuddy check, except in
> the new fast_isolate_around() path).
>
> Fixes: 5a811889de10 ("mm, compaction: use free lists to quickly locate a migration target")
> Signed-off-by: Andrea Arcangeli
> ---
>  mm/compaction.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 13cb7a961b31..d17e69549d34 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1433,7 +1433,10 @@ fast_isolate_freepages(struct compact_control *cc)
>  				page = pageblock_pfn_to_page(min_pfn,
>  						pageblock_end_pfn(min_pfn),
>  						cc->zone);
> -				cc->free_pfn = min_pfn;
> +				if (likely(!PageReserved(page)))

The PageReserved check seems a rather awkward solution to me. Wouldn't it
be more obvious if we made sure we don't end up below the zone's start_pfn
(if my assumption is correct) in the first place?

When I check the code:

	unsigned long distance;
	distance = (cc->free_pfn - cc->migrate_pfn);
	low_pfn = pageblock_start_pfn(cc->free_pfn - (distance >> 2));
	min_pfn = pageblock_start_pfn(cc->free_pfn - (distance >> 1));

I think what can happen is that cc->free_pfn <= cc->migrate_pfn after
the very last isolate_migratepages(). Then compact_finished() detects
that in compact_zone(), but only after migrate_pages(), and thus
fast_isolate_freepages(), has been called.

That would mean distance can be negative, or rather a large unsigned
number, so that low_pfn and min_pfn end up far away from the zone?

Or maybe the above doesn't happen, but cc->free_pfn gets so close to
the start of the zone that the calculations above make min_pfn go below
start_pfn?

In any case I would rather make sure we stay within the expected zone
boundaries than play tricks with PageReserved. Mel?

> +					cc->free_pfn = min_pfn;
> +				else
> +					page = NULL;
>  			}
>  		}
>  	}
>
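
To make the two hypotheses above concrete, here is a minimal userspace
sketch of the min_pfn arithmetic. It is not kernel code: the sketch_*
names are made up for illustration, the pageblock size of 512 pages is
an assumption (pageblock_order 9, as is common on x86-64), and the zone
bounds in the comments are hypothetical values rather than anything
taken from the reported crashes.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 512 pages per pageblock (pageblock_order = 9). */
#define SKETCH_PAGEBLOCK_NR_PAGES 512UL

/* Round a pfn down to the start of its pageblock. */
unsigned long sketch_pageblock_start_pfn(unsigned long pfn)
{
	return pfn & ~(SKETCH_PAGEBLOCK_NR_PAGES - 1);
}

/*
 * Same arithmetic as the min_pfn computation quoted above.
 * If free_pfn < migrate_pfn, the unsigned subtraction wraps and
 * distance becomes a huge value instead of a negative one.
 */
unsigned long sketch_min_pfn(unsigned long free_pfn,
			     unsigned long migrate_pfn)
{
	unsigned long distance = free_pfn - migrate_pfn;

	return sketch_pageblock_start_pfn(free_pfn - (distance >> 1));
}

/* Hypothetical stand-in for a zone span check: [zone_start, zone_end). */
bool sketch_zone_spans_pfn(unsigned long zone_start,
			   unsigned long zone_end,
			   unsigned long pfn)
{
	return zone_start <= pfn && pfn < zone_end;
}

/*
 * Cases worth trying, with a hypothetical zone [0x1000, 0x100000):
 *
 * 1) free_pfn = 0x80000, migrate_pfn = 0x40000:
 *    distance = 0x40000, so min_pfn = 0x60000, inside the zone.
 *
 * 2) free_pfn = 0x40000, migrate_pfn = 0x40200 (scanners crossed):
 *    distance wraps, and min_pfn lands far outside the zone.
 *
 * 3) zone start 0x1100 (not pageblock aligned), free_pfn = 0x1180,
 *    migrate_pfn = 0x1100: the midpoint 0x1140 is rounded down to
 *    0x1000, which is below the zone start.
 */
```

Calling sketch_min_pfn() with those inputs from a small main() and
passing the result to sketch_zone_spans_pfn() shows which cases leave
the zone. Whether either state is actually reachable in
fast_isolate_freepages() is exactly the open question above, so the
sketch only shows what the arithmetic would do if it were.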