From: Kefeng Wang
Date: Thu, 6 May 2021 20:47:52 +0800
Subject: Re: arm32: panic in move_freepages (Was [PATCH v2 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid())
To: Mike Rapoport, David Hildenbrand
CC: Andrew Morton, Anshuman Khandual, Ard Biesheuvel, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon
Message-ID: <82cfbb7f-dd4f-12d8-dc76-847f06172200@huawei.com>
References: <259d14df-a713-72e7-4ccb-c06a8ee31e13@huawei.com> <6ad2956c-70ae-c423-ed7d-88e94c88060f@huawei.com> <0cb013e4-1157-f2fa-96ec-e69e60833f72@huawei.com> <24b37c01-fc75-d459-6e61-d67e8f0cf043@redhat.com>
On 2021/5/3 16:44, Mike Rapoport wrote:
> On Mon, May 03, 2021 at 10:07:01AM +0200, David Hildenbrand wrote:
>> On 03.05.21 08:26, Mike Rapoport wrote:
>>> On Fri, Apr 30, 2021 at 07:24:37PM +0800, Kefeng Wang wrote:
>>>>
>>>> On 2021/4/30 17:51, Mike Rapoport wrote:
>>>>> On Thu, Apr 29, 2021 at 06:22:55PM +0800, Kefeng Wang wrote:
>>>>>>
>>>>>> On 2021/4/29 14:57, Mike Rapoport wrote:
>>>>>>
>>>>>>>>> Do you use SPARSMEM? If yes, what is your section size?
>>>>>>>>> What is the value if CONFIG_FORCE_MAX_ZONEORDER in your configuration?
>>>>>>>> Yes,
>>>>>>>>
>>>>>>>> CONFIG_SPARSEMEM=y
>>>>>>>>
>>>>>>>> CONFIG_SPARSEMEM_STATIC=y
>>>>>>>>
>>>>>>>> CONFIG_FORCE_MAX_ZONEORDER = 11
>>>>>>>>
>>>>>>>> CONFIG_PAGE_OFFSET=0xC0000000
>>>>>>>> CONFIG_HAVE_ARCH_PFN_VALID=y
>>>>>>>> CONFIG_HIGHMEM=y
>>>>>>>> #define SECTION_SIZE_BITS 26
>>>>>>>> #define MAX_PHYSADDR_BITS 32
>>>>>>>> #define MAX_PHYSMEM_BITS 32
>>>>>>
>>>>>> With the patch, the addr is aligned, but the panic still occurred,
>>>>>
>>>>> Is this the same panic at move_freepages() for range [de600, de7ff]?
>>>>>
>>>>> Do you enable CONFIG_ARM_LPAE?
>>>>
>>>> no, the CONFIG_ARM_LPAE is not set, and yes with same panic at
>>>> move_freepages at
>>>>
>>>> start_pfn/end_pfn [de600, de7ff], [de600000, de7ff000] : pfn =de600, page
>>>> =ef3cc000, page-flags = ffffffff, pfn2phy = de600000
>>>>
>>>>>> __free_memory_core, range: 0xb0200000 - 0xc0000000, pfn: b0200 - b0200
>>>>>> __free_memory_core, range: 0xcc000000 - 0xdca00000, pfn: cc000 - b0200
>>>>>> __free_memory_core, range: 0xde700000 - 0xdea00000, pfn: de700 - b0200
>>>
>>> Hmm, [de600, de7ff] is not added to the free lists which is correct. But
>>> then it's unclear how the page for de600 gets to move_freepages()...
>>>
>>> Can't say I have any bright ideas to try here...
>>
>> Are we missing some checks (e.g., PageReserved()) that pfn_valid_within()
>> would have "caught" before?
>
> Unless I'm missing something the crash happens in __rmqueue_fallback():
>
> do_steal:
> 	page = get_page_from_free_area(area, fallback_mt);
>
> 	steal_suitable_fallback(zone, page, alloc_flags, start_migratetype,
> 				can_steal);
> 		-> move_freepages()
> 			-> BUG()
>
> So a page from free area should be sane as the freed range was never added
> it to the free lists.

Sorry for the late response, I was on vacation.

The pfns in range [de600, de7ff] won't be added to the free lists via
__free_memory_core(), but they can be added to the free lists via
free_highmem_page().

I added some debug[1] to add_to_free_list(), and we can see the call trace:

free_highpages, range_pfn [b0200, c0000], range_addr [b0200000, c0000000]
free_highpages, range_pfn [cc000, dca00], range_addr [cc000000, dca00000]
free_highpages, range_pfn [de700, dea00], range_addr [de700000, dea00000]
add_to_free_list, ===> pfn = de700
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at mm/page_alloc.c:900 add_to_free_list+0x8c/0xec
pfn = de700
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0+ #48
Hardware name: Hisilicon A9
[] (show_stack) from [] (dump_stack+0x9c/0xc0)
[] (dump_stack) from [] (__warn+0xc0/0xec)
[] (__warn) from [] (warn_slowpath_fmt+0x74/0xa4)
[] (warn_slowpath_fmt) from [] (add_to_free_list+0x8c/0xec)
[] (add_to_free_list) from [] (free_pcppages_bulk+0x200/0x278)
[] (free_pcppages_bulk) from [] (free_unref_page+0x58/0x68)
[] (free_unref_page) from [] (free_highmem_page+0xc/0x50)
[] (free_highmem_page) from [] (mem_init+0x21c/0x254)
[] (mem_init) from [] (start_kernel+0x258/0x5c0)
[] (start_kernel) from [<00000000>] (0x0)

So, any ideas?
[1] debug

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 1ba9f9f9dbd8..ee3619c04f93 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -286,7 +286,7 @@ static void __init free_highpages(void)
 		/* Truncate partial highmem entries */
 		if (start < max_low)
 			start = max_low;
-
+		pr_info("%s, range_pfn [%lx, %lx], range_addr [%x, %x]\n", __func__, start, end, range_start, range_end);
 		for (; start < end; start++)
 			free_highmem_page(pfn_to_page(start));
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 592479f43c74..920f041f0c6f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -892,7 +892,14 @@ compaction_capture(struct capture_control *capc, struct page *page,
 static inline void add_to_free_list(struct page *page, struct zone *zone,
 				    unsigned int order, int migratetype)
 {
+	unsigned long pfn;
 	struct free_area *area = &zone->free_area[order];
+	pfn = page_to_pfn(page);
+	if (pfn >= 0xde600 && pfn < 0xde7ff) {
+		pr_info("%s, ===> pfn = %lx", __func__, pfn);
+		WARN_ONCE(pfn == 0xde700, "pfn = %lx", pfn);
+	}

>
> And honestly, with the memory layout reported elsewhere in the stack I'd
> say that the bootloader/fdt beg for fixes...
>
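The pfn arithmetic behind the logs above can be checked in user space. What
follows is only a rough sketch, not kernel code: PAGE_SHIFT = 12 is taken from
the pfn/address pairs in the log (pfn de600 <-> de600000), and the block order
of 9 is an assumption inferred from the 512-pfn width of the reported range
[de600, de7ff]. The names pfn_range, block_of() and ranges_overlap() are
hypothetical helpers made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, matching pfn de600 <-> address de600000 in the log. */
#define PAGE_SHIFT 12

/* Hypothetical helper type: a pfn range with an inclusive end. */
struct pfn_range {
	uint32_t start;	/* first pfn in the range */
	uint32_t end;	/* last pfn in the range (inclusive) */
};

/* Convert a pfn to a physical address, like the "pfn2phy" debug value. */
static uint64_t pfn_to_phys(uint32_t pfn)
{
	return (uint64_t)pfn << PAGE_SHIFT;
}

/* Return the naturally aligned block of 2^order pages that contains pfn. */
static struct pfn_range block_of(uint32_t pfn, unsigned int order)
{
	assert(order < 20);	/* keep the shift well inside 32 bits */

	uint32_t nr_pages = 1u << order;
	uint32_t first = pfn & ~(nr_pages - 1);
	struct pfn_range r = { first, first + nr_pages - 1 };

	return r;
}

/* True if two inclusive pfn ranges share at least one pfn. */
static bool ranges_overlap(struct pfn_range a, struct pfn_range b)
{
	return a.start <= b.end && b.start <= a.end;
}
```

Under these assumptions, block_of(0xde700, 9) gives [de600, de7ff], the range
from the panic. The last free_highpages range, [de700, dea00) (the loop stops
before "end"), overlaps that block but does not cover pfn de600, which is
consistent with pfn de700 reaching add_to_free_list() while de600 is never
freed.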