From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3321CC76190 for ; Tue, 23 Jul 2019 05:54:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0B22A2238E for ; Tue, 23 Jul 2019 05:54:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732837AbfGWFyC (ORCPT ); Tue, 23 Jul 2019 01:54:02 -0400 Received: from szxga04-in.huawei.com ([45.249.212.190]:2701 "EHLO huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725788AbfGWFxw (ORCPT ); Tue, 23 Jul 2019 01:53:52 -0400 Received: from DGGEMS414-HUB.china.huawei.com (unknown [172.30.72.58]) by Forcepoint Email with ESMTP id E1F76BCD23BD65A58604; Tue, 23 Jul 2019 13:53:49 +0800 (CST) Received: from linux-ibm.site (10.175.102.37) by DGGEMS414-HUB.china.huawei.com (10.3.19.214) with Microsoft SMTP Server id 14.3.439.0; Tue, 23 Jul 2019 13:53:43 +0800 From: Hanjun Guo To: Ard Biesheuvel , Andrew Morton , Catalin Marinas , "Jia He" , Mike Rapoport , Will Deacon CC: , , , Hanjun Guo Subject: [PATCH v12 1/2] mm: page_alloc: introduce memblock_next_valid_pfn() (again) for arm64 Date: Tue, 23 Jul 2019 13:51:12 +0800 Message-ID: <1563861073-47071-2-git-send-email-guohanjun@huawei.com> X-Mailer: git-send-email 1.7.12.4 In-Reply-To: <1563861073-47071-1-git-send-email-guohanjun@huawei.com> References: <1563861073-47071-1-git-send-email-guohanjun@huawei.com> MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [10.175.102.37] X-CFilter-Loop: Reflected Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Jia He Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible") optimized the loop in memmap_init_zone(). But it causes possible panic on x86 due to specific memory mapping on x86_64 which will skip valid pfns as well, so Daniel Vacek reverted it later. But as suggested by Daniel Vacek, it is fine to using memblock to skip gaps and finding next valid frame with CONFIG_HAVE_ARCH_PFN_VALID. Daniel said: "On arm and arm64, memblock is used by default. But generic version of pfn_valid() is based on mem sections and memblock_next_valid_pfn() does not always return the next valid one but skips more resulting in some valid frames to be skipped (as if they were invalid). And that's why kernel was eventually crashing on some !arm machines." Introduce a new config option CONFIG_HAVE_MEMBLOCK_PFN_VALID and only selected for arm64, using the new config option to guard the memblock_next_valid_pfn(). This was tested on a HiSilicon Kunpeng920 based ARM64 server, the speedup is pretty impressive for bootmem_init() at boot: with 384G memory, before: 13310ms after: 1415ms with 1T memory, before: 20s after: 2s Suggested-by: Daniel Vacek Signed-off-by: Jia He Signed-off-by: Hanjun Guo --- arch/arm64/Kconfig | 1 + include/linux/mmzone.h | 9 +++++++++ mm/Kconfig | 3 +++ mm/memblock.c | 31 +++++++++++++++++++++++++++++++ mm/page_alloc.c | 4 +++- 5 files changed, 47 insertions(+), 1 deletion(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 697ea0510729..058eb26579be 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -893,6 +893,7 @@ config ARCH_FLATMEM_ENABLE config HAVE_ARCH_PFN_VALID def_bool y + select HAVE_MEMBLOCK_PFN_VALID config HW_PERF_EVENTS def_bool y diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 70394cabaf4e..24cb6bdb1759 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -1325,6 +1325,10 @@ static inline int pfn_present(unsigned long pfn) #endif #define early_pfn_valid(pfn) pfn_valid(pfn) +#ifdef CONFIG_HAVE_MEMBLOCK_PFN_VALID +extern unsigned long memblock_next_valid_pfn(unsigned long pfn); +#define next_valid_pfn(pfn) memblock_next_valid_pfn(pfn) +#endif void sparse_init(void); #else #define sparse_init() do {} while (0) @@ -1347,6 +1351,11 @@ struct mminit_pfnnid_cache { #define early_pfn_valid(pfn) (1) #endif +/* fallback to default definitions */ +#ifndef next_valid_pfn +#define next_valid_pfn(pfn) (pfn + 1) +#endif + void memory_present(int nid, unsigned long start, unsigned long end); /* diff --git a/mm/Kconfig b/mm/Kconfig index f0c76ba47695..c578374b6413 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -132,6 +132,9 @@ config HAVE_MEMBLOCK_NODE_MAP config HAVE_MEMBLOCK_PHYS_MAP bool +config HAVE_MEMBLOCK_PFN_VALID + bool + config HAVE_GENERIC_GUP bool diff --git a/mm/memblock.c b/mm/memblock.c index 7d4f61ae666a..d57ba51bb9cd 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -1251,6 +1251,37 @@ int __init_memblock memblock_set_node(phys_addr_t base, phys_addr_t size, return 0; } #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */ + +#ifdef CONFIG_HAVE_MEMBLOCK_PFN_VALID +unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn) +{ + struct memblock_type *type = &memblock.memory; + unsigned int right = type->cnt; + unsigned int mid, left = 0; + phys_addr_t addr = PFN_PHYS(++pfn); + + do { + mid = (right + left) / 2; + + if (addr < type->regions[mid].base) + right = mid; + else if (addr >= (type->regions[mid].base + + type->regions[mid].size)) + left = mid + 1; + else { + /* addr is within the region, so pfn is valid */ + return pfn; + } + } while (left < right); + + if (right == type->cnt) + return -1UL; + else + return PHYS_PFN(type->regions[right].base); +} +EXPORT_SYMBOL(memblock_next_valid_pfn); +#endif /* CONFIG_HAVE_MEMBLOCK_PFN_VALID */ + #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT /** * __next_mem_pfn_range_in_zone - iterator for for_each_*_range_in_zone() diff --git a/mm/page_alloc.c b/mm/page_alloc.c index d66bc8abe0af..70933c40380a 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5811,8 +5811,10 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone, * function. They do not exist on hotplugged memory. */ if (context == MEMMAP_EARLY) { - if (!early_pfn_valid(pfn)) + if (!early_pfn_valid(pfn)) { + pfn = next_valid_pfn(pfn) - 1; continue; + } if (!early_pfn_in_nid(pfn, nid)) continue; if (overlap_memmap_init(zone, &pfn)) -- 2.19.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9E56AC76188 for ; Tue, 23 Jul 2019 05:54:37 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 637FD2238E for ; Tue, 23 Jul 2019 05:54:37 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="AL+AkQcN" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 637FD2238E Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=huawei.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+infradead-linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:Cc:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=ZsI7cw80m9oGS+BGmGYo5W331N2hA0zQiCkWsaIWRiY=; b=AL+AkQcNrrjhTL bwHYgvIvdBcCSkDRx5Ij0JNDk/+c0KzQWDu+7YxgalDuoMf9npj8PcCC60Kk+NEJ9MxgmaNKHeIxG UaYTDhKG4fMqSWnojiYD3d2y58DT9JmTUweC92ee0zxa/peQ3DSx9Yt+kIi/C0rnTv+PlxqI5Bx7l 87JHdNNh418AncEcGnjZ0u3S19J1Yw05WjNSDH3SjuaFGF2PzlmGtq11rM6fVawcowzkv8udJJz7n KDep01tPjv0S6rrYBKvtIQJJaukY+HZ7myoBPO44L824s6o1Rwykx3NDW/ya0jtWF2ses69ayKoJc Jk8cOTxfG5I5NR2kFueQ==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.92 #3 (Red Hat Linux)) id 1hpnky-00049f-2n; Tue, 23 Jul 2019 05:54:36 +0000 Received: from szxga04-in.huawei.com ([45.249.212.190] helo=huawei.com) by bombadil.infradead.org with esmtps (Exim 4.92 #3 (Red Hat Linux)) id 1hpnkJ-0003ed-Bt for linux-arm-kernel@lists.infradead.org; Tue, 23 Jul 2019 05:53:58 +0000 Received: from DGGEMS414-HUB.china.huawei.com (unknown [172.30.72.58]) by Forcepoint Email with ESMTP id E1F76BCD23BD65A58604; Tue, 23 Jul 2019 13:53:49 +0800 (CST) Received: from linux-ibm.site (10.175.102.37) by DGGEMS414-HUB.china.huawei.com (10.3.19.214) with Microsoft SMTP Server id 14.3.439.0; Tue, 23 Jul 2019 13:53:43 +0800 From: Hanjun Guo To: Ard Biesheuvel , Andrew Morton , Catalin Marinas , "Jia He" , Mike Rapoport , Will Deacon Subject: [PATCH v12 1/2] mm: page_alloc: introduce memblock_next_valid_pfn() (again) for arm64 Date: Tue, 23 Jul 2019 13:51:12 +0800 Message-ID: <1563861073-47071-2-git-send-email-guohanjun@huawei.com> X-Mailer: git-send-email 1.7.12.4 In-Reply-To: <1563861073-47071-1-git-send-email-guohanjun@huawei.com> References: <1563861073-47071-1-git-send-email-guohanjun@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.102.37] X-CFilter-Loop: Reflected X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20190722_225355_578978_C3D5E9C5 X-CRM114-Status: GOOD ( 20.72 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Hanjun Guo Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+infradead-linux-arm-kernel=archiver.kernel.org@lists.infradead.org From: Jia He Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible") optimized the loop in memmap_init_zone(). But it causes possible panic on x86 due to specific memory mapping on x86_64 which will skip valid pfns as well, so Daniel Vacek reverted it later. But as suggested by Daniel Vacek, it is fine to using memblock to skip gaps and finding next valid frame with CONFIG_HAVE_ARCH_PFN_VALID. Daniel said: "On arm and arm64, memblock is used by default. But generic version of pfn_valid() is based on mem sections and memblock_next_valid_pfn() does not always return the next valid one but skips more resulting in some valid frames to be skipped (as if they were invalid). And that's why kernel was eventually crashing on some !arm machines." Introduce a new config option CONFIG_HAVE_MEMBLOCK_PFN_VALID and only selected for arm64, using the new config option to guard the memblock_next_valid_pfn(). This was tested on a HiSilicon Kunpeng920 based ARM64 server, the speedup is pretty impressive for bootmem_init() at boot: with 384G memory, before: 13310ms after: 1415ms with 1T memory, before: 20s after: 2s Suggested-by: Daniel Vacek Signed-off-by: Jia He Signed-off-by: Hanjun Guo --- arch/arm64/Kconfig | 1 + include/linux/mmzone.h | 9 +++++++++ mm/Kconfig | 3 +++ mm/memblock.c | 31 +++++++++++++++++++++++++++++++ mm/page_alloc.c | 4 +++- 5 files changed, 47 insertions(+), 1 deletion(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 697ea0510729..058eb26579be 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -893,6 +893,7 @@ config ARCH_FLATMEM_ENABLE config HAVE_ARCH_PFN_VALID def_bool y + select HAVE_MEMBLOCK_PFN_VALID config HW_PERF_EVENTS def_bool y diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 70394cabaf4e..24cb6bdb1759 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -1325,6 +1325,10 @@ static inline int pfn_present(unsigned long pfn) #endif #define early_pfn_valid(pfn) pfn_valid(pfn) +#ifdef CONFIG_HAVE_MEMBLOCK_PFN_VALID +extern unsigned long memblock_next_valid_pfn(unsigned long pfn); +#define next_valid_pfn(pfn) memblock_next_valid_pfn(pfn) +#endif void sparse_init(void); #else #define sparse_init() do {} while (0) @@ -1347,6 +1351,11 @@ struct mminit_pfnnid_cache { #define early_pfn_valid(pfn) (1) #endif +/* fallback to default definitions */ +#ifndef next_valid_pfn +#define next_valid_pfn(pfn) (pfn + 1) +#endif + void memory_present(int nid, unsigned long start, unsigned long end); /* diff --git a/mm/Kconfig b/mm/Kconfig index f0c76ba47695..c578374b6413 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -132,6 +132,9 @@ config HAVE_MEMBLOCK_NODE_MAP config HAVE_MEMBLOCK_PHYS_MAP bool +config HAVE_MEMBLOCK_PFN_VALID + bool + config HAVE_GENERIC_GUP bool diff --git a/mm/memblock.c b/mm/memblock.c index 7d4f61ae666a..d57ba51bb9cd 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -1251,6 +1251,37 @@ int __init_memblock memblock_set_node(phys_addr_t base, phys_addr_t size, return 0; } #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */ + +#ifdef CONFIG_HAVE_MEMBLOCK_PFN_VALID +unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn) +{ + struct memblock_type *type = &memblock.memory; + unsigned int right = type->cnt; + unsigned int mid, left = 0; + phys_addr_t addr = PFN_PHYS(++pfn); + + do { + mid = (right + left) / 2; + + if (addr < type->regions[mid].base) + right = mid; + else if (addr >= (type->regions[mid].base + + type->regions[mid].size)) + left = mid + 1; + else { + /* addr is within the region, so pfn is valid */ + return pfn; + } + } while (left < right); + + if (right == type->cnt) + return -1UL; + else + return PHYS_PFN(type->regions[right].base); +} +EXPORT_SYMBOL(memblock_next_valid_pfn); +#endif /* CONFIG_HAVE_MEMBLOCK_PFN_VALID */ + #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT /** * __next_mem_pfn_range_in_zone - iterator for for_each_*_range_in_zone() diff --git a/mm/page_alloc.c b/mm/page_alloc.c index d66bc8abe0af..70933c40380a 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5811,8 +5811,10 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone, * function. They do not exist on hotplugged memory. */ if (context == MEMMAP_EARLY) { - if (!early_pfn_valid(pfn)) + if (!early_pfn_valid(pfn)) { + pfn = next_valid_pfn(pfn) - 1; continue; + } if (!early_pfn_in_nid(pfn, nid)) continue; if (overlap_memmap_init(zone, &pfn)) -- 2.19.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel