From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EF907C3DA7A for ; Thu, 5 Jan 2023 04:18:06 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229900AbjAEESF (ORCPT ); Wed, 4 Jan 2023 23:18:05 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40670 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229924AbjAEERe (ORCPT ); Wed, 4 Jan 2023 23:17:34 -0500
Received: from a48-122.smtp-out.amazonses.com (a48-122.smtp-out.amazonses.com [54.240.48.122]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D0B622F78A; Wed, 4 Jan 2023 20:17:32 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/simple; s=zp2ap7btoiiow65hultmctjebh3tse7g; d=aaront.org; t=1672892252; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References:MIME-Version:Content-Transfer-Encoding; bh=7x0zcYTnDW/0V3YX6tNEC/SbTk6ln4b3Dz3YfT/7uzw=; b=AiYjSxo4S6yHni5Hq7iJXcrI5XfwhbOZlmmJk4b3KK08GolCaWK7nCoU0/e1ceuo e78R6JhUMRT//OAXyXCQwF8TVUa+JzLH6mzRM1M5GkFWu7EMnplW5sMAMxtn7LXRNZk +iIxc/IKOvGxJxllOw9dooLGAg9DT+RMR58bC6ww=
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/simple; s=6gbrjpgwjskckoa6a5zn6fwqkn67xbtw; d=amazonses.com; t=1672892252; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References:MIME-Version:Content-Transfer-Encoding:Feedback-ID; bh=7x0zcYTnDW/0V3YX6tNEC/SbTk6ln4b3Dz3YfT/7uzw=; b=b9UFfezr/R0asCAnF0VSnGmlOUkAMKNqwaIZTTuaUSSBuIRecMn6B3yxLvCpr0CO 1lYQAlygiBJZKneRg3V5fnlpqQfhWor6otDp4qqA6WsAaSCwJbmIjBYafPzQxxsgwG4 Hp51zYedEnZYoY5NFmMGi+J60sFditLaVibBQ0DA=
From: Aaron Thompson
To: Mike Rapoport, linux-mm@kvack.org
Cc: "H. Peter Anvin", Alexander Potapenko, Andrew Morton, Andy Shevchenko, Ard Biesheuvel, Borislav Petkov, Darren Hart, Dave Hansen, David Rientjes, Dmitry Vyukov, Ingo Molnar, Marco Elver, Thomas Gleixner, kasan-dev@googlegroups.com, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, platform-driver-x86@vger.kernel.org, x86@kernel.org, Aaron Thompson
Subject: [PATCH v2 1/1] mm: Always release pages to the buddy allocator in memblock_free_late().
Date: Thu, 5 Jan 2023 04:17:31 +0000
Message-ID: <010001858025fc22-e619988e-c0a5-4545-bd93-783890b9ad14-000000@email.amazonses.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20230105041650.1485-1-dev@aaront.org>
References: <010101857bbc3a41-173240b3-9064-42ef-93f3-482081126ec2-000000@us-west-2.amazonses.com> <20230105041650.1485-1-dev@aaront.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Feedback-ID: 1.us-east-1.8/56jQl+KfkRukJqWjlnf+MtEL0x/NchId1fC0q616g=:AmazonSES
X-SES-Outgoing: 2023.01.05-54.240.48.122
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages()
only releases pages to the buddy allocator if they are not in the
deferred range. This is correct for free pages (as defined by
for_each_free_mem_pfn_range_in_zone()) because free pages in the
deferred range will be initialized and released as part of the deferred
init process. memblock_free_pages() is called by memblock_free_late(),
which is used to free reserved ranges after memblock_free_all() has
run. All pages in reserved ranges have been initialized at that point,
and accordingly, those pages are not touched by the deferred init
process. This means that currently, if the pages that
memblock_free_late() intends to release are in the deferred range, they
will never be released to the buddy allocator. They will forever be
reserved.

In addition, memblock_free_pages() calls kmsan_memblock_free_pages(),
which is also correct for free pages but is not correct for reserved
pages. KMSAN metadata for reserved pages is initialized by
kmsan_init_shadow(), which runs shortly before memblock_free_all().

For both of these reasons, memblock_free_pages() should only be called
for free pages, and memblock_free_late() should call __free_pages_core()
directly instead.

One case where this issue can occur in the wild is EFI boot on
x86_64. The x86 EFI code reserves all EFI boot services memory ranges
via memblock_reserve() and frees them later via memblock_free_late()
(efi_reserve_boot_services() and efi_free_boot_services(),
respectively). If any of those ranges happen to fall within the deferred
init range, the pages will not be released and that memory will be
unavailable.

For example, on an Amazon EC2 t3.micro VM (1 GB) booting via EFI:

v6.2-rc2:
  # grep -E 'Node|spanned|present|managed' /proc/zoneinfo
  Node 0, zone      DMA
          spanned  4095
          present  3999
          managed  3840
  Node 0, zone    DMA32
          spanned  246652
          present  245868
          managed  178867

v6.2-rc2 + patch:
  # grep -E 'Node|spanned|present|managed' /proc/zoneinfo
  Node 0, zone      DMA
          spanned  4095
          present  3999
          managed  3840
  Node 0, zone    DMA32
          spanned  246652
          present  245868
          managed  222816

Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Signed-off-by: Aaron Thompson
---
 mm/memblock.c                     | 8 +++++++-
 tools/testing/memblock/internal.h | 4 ++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index 511d4783dcf1..fc3d8fbd2060 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1640,7 +1640,13 @@ void __init memblock_free_late(phys_addr_t base, phys_addr_t size)
 	end = PFN_DOWN(base + size);
 
 	for (; cursor < end; cursor++) {
-		memblock_free_pages(pfn_to_page(cursor), cursor, 0);
+		/*
+		 * Reserved pages are always initialized by the end of
+		 * memblock_free_all() (by memmap_init() and, if deferred
+		 * initialization is enabled, memmap_init_reserved_pages()), so
+		 * these pages can be released directly to the buddy allocator.
+		 */
+		__free_pages_core(pfn_to_page(cursor), 0);
 		totalram_pages_inc();
 	}
 }
diff --git a/tools/testing/memblock/internal.h b/tools/testing/memblock/internal.h
index fdb7f5db7308..85973e55489e 100644
--- a/tools/testing/memblock/internal.h
+++ b/tools/testing/memblock/internal.h
@@ -15,6 +15,10 @@ bool mirrored_kernelcore = false;
 
 struct page {};
 
+void __free_pages_core(struct page *page, unsigned int order)
+{
+}
+
 void memblock_free_pages(struct page *page, unsigned long pfn,
 			 unsigned int order)
 {
-- 
2.30.2

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 473DCC54EAA for ; Mon, 30 Jan 2023 07:40:49 +0000 (UTC)
Received: by kanga.kvack.org (Postfix) id 5E2B38E0001; Mon, 30 Jan 2023 02:40:48 -0500 (EST)
Received: by kanga.kvack.org (Postfix, from userid 40) id 5939D6B0073; Mon, 30 Jan 2023 02:40:48 -0500 (EST)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id 45A648E0001; Mon, 30 Jan 2023 02:40:48 -0500 (EST)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 3732C6B0072 for ; Mon, 30 Jan 2023 02:40:48 -0500 (EST)
Received: from smtpin11.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 3ACAD140656 for ; Mon, 30 Jan 2023 07:40:46 +0000 (UTC)
X-FDA: 80410668492.11.B6CDAD7
Received: from out30-45.freemail.mail.aliyun.com (out30-45.freemail.mail.aliyun.com [115.124.30.45]) by imf12.hostedemail.com (Postfix) with ESMTP id 3053D40011 for ; Mon, 30 Jan 2023 07:40:41 +0000 (UTC)
Authentication-Results:
imf12.hostedemail.com; dkim=fail ("body hash did not verify") header.d=aaront.org header.s=zp2ap7btoiiow65hultmctjebh3tse7g header.b=AiYjSxo4; dkim=fail ("body hash did not verify") header.d=amazonses.com header.s=6gbrjpgwjskckoa6a5zn6fwqkn67xbtw header.b=b9UFfezr; arc=reject ("signature check failed: fail, {[1] = sig:hostedemail.com:reject}"); spf=pass (imf12.hostedemail.com: domain of xuyu@linux.alibaba.com designates 115.124.30.45 as permitted sender) smtp.mailfrom=xuyu@linux.alibaba.com; dmarc=pass (policy=none) header.from=alibaba.com ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1675064443; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:list-id:dkim-signature; bh=ZL7UMK1se7Q7PwgmTMLVK9I6eog3Ex1nmensjrwv4AY=; b=IZwxnDi2w+NHmTnHHS6giHpSLCNaRvcRZzfMTuziJHw/j+IRvDC7Al5TjQc1Hfy7OwxkAB UInEBP22IARD35AsdAkgMya0Au3czm4/zMm4D607MutSJSPaC7/ZaHeUF7xyzqtT41LLkI qesBBqQTbEydvHIY+/d9OmnecHt26h4= ARC-Authentication-Results: i=2; imf12.hostedemail.com; dkim=fail ("body hash did not verify") header.d=aaront.org header.s=zp2ap7btoiiow65hultmctjebh3tse7g header.b=AiYjSxo4; dkim=fail ("body hash did not verify") header.d=amazonses.com header.s=6gbrjpgwjskckoa6a5zn6fwqkn67xbtw header.b=b9UFfezr; arc=reject ("signature check failed: fail, {[1] = sig:hostedemail.com:reject}"); spf=pass (imf12.hostedemail.com: domain of xuyu@linux.alibaba.com designates 115.124.30.45 as permitted sender) smtp.mailfrom=xuyu@linux.alibaba.com; dmarc=pass (policy=none) header.from=alibaba.com ARC-Seal: i=2; s=arc-20220608; d=hostedemail.com; t=1675064443; a=rsa-sha256; cv=fail; b=N4OkxZg5w7p+S/aACP7vrDGG+d4F3zPGPc3gD0ZOdS3rbx1vdqShbbaDGgO4wU0fqxrBEi qn9K8kXpIjB0J/7mrqsru/08cFj7HChd+FN/zzMLN1QeNmxotLNZPOyGv/SK+kEhbYjPJk KG++JCySvwQxblQ0cEOtP4PIDvYKALM= 
X-Alimail-AntiSpam:AC=PASS;BC=-1|-1;BR=01201311R131e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046050;MF=xuyu@linux.alibaba.com;NM=1;PH=DS;RN=4;SR=0;TI=SMTPD_---0VaQTAGJ_1675064437; Received: from localhost(mailfrom:xuyu@linux.alibaba.com fp:SMTPD_---0VaQTAGJ_1675064437) by smtp.aliyun-inc.com; Mon, 30 Jan 2023 15:40:38 +0800 From: Xu Yu To: baolin.wang@linux.alibaba.com, Mike Rapoport , linux-mm@kvack.org Cc: alikernel-developer@linux.alibaba.com Subject: [PATCH v2 1/1] mm: Always release pages to the buddy allocator in memblock_free_late(). Date: Mon, 30 Jan 2023 15:40:34 +0800 Message-ID: <010001858025fc22-e619988e-c0a5-4545-bd93-783890b9ad14-000000@email.amazonses.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20230105041650.1485-1-dev@aaront.org> References: <010101857bbc3a41-173240b3-9064-42ef-93f3-482081126ec2-000000@us-west-2.amazonses.com> <20230105041650.1485-1-dev@aaront.org> X-Mozilla-Status: 0001 Received: from kanga.kvack.org(mailfrom:owner-linux-mm@kvack.org ip:205.233.56.17) by mx1.aliyun-inc.com; Thu, 05 Jan 2023 12:17:36 +0800 Received: by kanga.kvack.org (Postfix) id D84158E0003; Wed, 4 Jan 2023 23:17:34 -0500 (EST) X-Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id D0D088E0001; Wed, 4 Jan 2023 23:17:34 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B86998E0003; Wed, 4 Jan 2023 23:17:34 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id A46738E0001 for ; Wed, 4 Jan 2023 23:17:34 -0500 (EST) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 7D46C12035A for ; Thu, 5 Jan 2023 04:17:34 +0000 (UTC) X-FDA: 80319436428.22.D9DB8B5 Received: from a8-22.smtp-out.amazonses.com (a8-22.smtp-out.amazonses.com 
[54.240.8.22]) by imf19.hostedemail.com (Postfix) with ESMTP id 068751A000E for ; Thu, 5 Jan 2023 04:17:32 +0000 (UTC) ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1672892253; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=yakfCQe+8Ykoijus7tecq8mv0JnqLNzwgKukFStPbcw=; b=C67rEVSVg3MQ7MYfY1kjXSqrqu8miTv+cOH5JKRn69V+BwTUT+IaZHZmovWOB8sBj/gi2i TR/HwaHpYYgmT9vhyKivj9eZkECuboktB8H0FGAJjLXs4ry7AufHfi3B/36ZWqRMwZaaeR 3yeJDHCMWyeZTiud3Xv7ci3x3CSBino= ARC-Authentication-Results: i=1; imf19.hostedemail.com; dkim=pass header.d=aaront.org header.s=zp2ap7btoiiow65hultmctjebh3tse7g header.b=AiYjSxo4; dkim=pass header.d=amazonses.com header.s=6gbrjpgwjskckoa6a5zn6fwqkn67xbtw header.b=b9UFfezr; spf=pass (imf19.hostedemail.com: domain of 010001858025fc22-e619988e-c0a5-4545-bd93-783890b9ad14-000000@ses-us-east-1.bounces.aaront.org designates 54.240.8.22 as permitted sender) smtp.mailfrom=010001858025fc22-e619988e-c0a5-4545-bd93-783890b9ad14-000000@ses-us-east-1.bounces.aaront.org; dmarc=pass (policy=quarantine) header.from=aaront.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1672892253; a=rsa-sha256; cv=none; b=oNe6bei7442sCT75ox4aXcOjc01yYyC6tECghWLvu4W25Qbk4qk6aL0vyIHdtHvGmED2i/ jmeuGjrkq55uzAJnCuMSH1SYNGtifDB26jBvl8SVYXj94tmUgRf16CzonYbJ2PhvP02f4S 5WYW/RGRr7e/SQnH3TjsQXZu638UmGQ= DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/simple; s=zp2ap7btoiiow65hultmctjebh3tse7g; d=aaront.org; t=1672892252; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References:MIME-Version:Content-Transfer-Encoding; bh=7x0zcYTnDW/0V3YX6tNEC/SbTk6ln4b3Dz3YfT/7uzw=; b=AiYjSxo4S6yHni5Hq7iJXcrI5XfwhbOZlmmJk4b3KK08GolCaWK7nCoU0/e1ceuo e78R6JhUMRT//OAXyXCQwF8TVUa+JzLH6mzRM1M5GkFWu7EMnplW5sMAMxtn7LXRNZk 
+iIxc/IKOvGxJxllOw9dooLGAg9DT+RMR58bC6ww= DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/simple; s=6gbrjpgwjskckoa6a5zn6fwqkn67xbtw; d=amazonses.com; t=1672892252; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References:MIME-Version:Content-Transfer-Encoding:Feedback-ID; bh=7x0zcYTnDW/0V3YX6tNEC/SbTk6ln4b3Dz3YfT/7uzw=; b=b9UFfezr/R0asCAnF0VSnGmlOUkAMKNqwaIZTTuaUSSBuIRecMn6B3yxLvCpr0CO 1lYQAlygiBJZKneRg3V5fnlpqQfhWor6otDp4qqA6WsAaSCwJbmIjBYafPzQxxsgwG4 Hp51zYedEnZYoY5NFmMGi+J60sFditLaVibBQ0DA= X-Mailer: git-send-email 2.30.2 MIME-Version: 1.0 Feedback-ID: 1.us-east-1.8/56jQl+KfkRukJqWjlnf+MtEL0x/NchId1fC0q616g=:AmazonSES X-SES-Outgoing: 2023.01.05-54.240.8.22 X-HE-Meta: U2FsdGVkX19xOJXW2vGlW9w5Me6vDtP4NkkAfXkOeRA1C8gzCq2Fx40RWNpbElQ42hYFhGI5eIgQBSsyDI7i+SYxS3dmewC/HnCeono+yFHsvpz2nB0MduqBzxVHOSxCtmfuZiS2y8K54sy4UFXqJtZVhpbdhC/5M9vLSwOTzyqXcW3YrhXe2SZ9cchYZuFLCSbzPiPpFLimTztglx+WBYWegKcbXpw4LzRrBajcB41q+1GOMs+eqSruSxOU3QMJPXYjzlUyLIXQzi1iSwL6kjgZDqkFeFAsGJL6wYQ+oM5O7AjqGfLzLJp7gq4Yq/IJWpV67VDaE6TrggCeSjNbYNwwjhCAQwpfAWTbH2h+aOoghTEdZFwbOdM/qzgBjcfEycucAFdTijlGWcSqp3tATmI1G5HKUfTlJsZRfaz5lL0nYcnlErH7ExR4bZGPC7mpGXntVSBz8xO3pZkKkfBEPiKZg8TAY/eHdAi1RftjzRMnWDZPjs0kl2nv2U1jC2Va9AU5MO6wFXnbsu15ZKPKP5mKeZAvKRFn3LW/dz67Eeug+n6T+3XX2g2T4C4oF/kd+q91+DerLIs6EI8TrCYoG0eL5glaSZECzy600w5eRRFX13lEX8fU9YIPEGKr4/X4J9Derw7gb6M7GR3iY6loq+IBSdMOxQkHGG12zEouxaltmt+0b6ej4ec1hm29VppPdE8WgKg85p9oTwpA4a5xj7rh55BGbQZbEs/fNomemHizO9Xhh9Vl8jnXhXYmEa/J1iPEtpBS2mI3HU0ezhd50A7dNwknyoZkLFQKDzGxtbg0fDonkyE0tuWHTo5B1bl3CKWGii2zK8J85pZUj6e6e8sWL5WoP+FxYRdzEvKzyPz2nmxOcw6Lwk4pUaZml0ey4vV0qcBH8qGacsjzizQVRupZfx72YPHBhlUMF+trOfrx0f3koLsrw/i8/tO8/GKnaFn/0YY9Ckkc5fHcCed 4oPY8o 0e 
EqOUqw/b9Hj6bHX3YPlo/zXYfDYuHT63i8ZHk+Da755datc/Hvvjm50lcnECbyq1C8aMBaI/zTnktfizesJD4khxs/XGOyIZPT5YZPro1QPpjLq239PVbaee0kPfBH7jvhOqiJd97+IJfgsVQgnaahHkQRxeWXuLCYfWzZFyvfCewSCumO9SBmWsfDIU2136ea62h5XNoJBI/EenK1YYcJSE25ijCrUUkRJSf/QFFjDm+GJoM8bUEOAko9arszCL54XH2TyOl/imJttGkYyn+lQwOJbkS5lGPickeUTO3Sd3f0xxA4Wh2njosZ131uybLw6uOOkYSBMfVRWATxBMamS0iWWtpkH/ukLZKRQbMWoxGInhctyUgnv9M28M//89R7Y4b5dBzk9r10jW+y8sicSrEael5jPoGAbAN+AWDClR+ks1l7angAdl7PQ== X-Loop: owner-majordomo@kvack.org List-ID: Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 3053D40011 X-Rspam-User: X-Stat-Signature: gqwqnq9xspxkse86s39u7pkb71y1z7kd X-HE-Tag-Orig: 1672892252-885784 X-HE-Bulk: true X-HE-Tag: 1675064441-356682 X-HE-Meta: U2FsdGVkX1+OTp1IJcqrWz7+CPJgFVFACKftEWVB4iee2XAu8uOv+cHmCeMIL2zMaRog3e5PFwzyVWZNKMwE2lYeLiL01SZe8uwdwOFHZYzD2Mw9ztkmbiDK/pWryNvhEDXxfnUvjFQN1KwasVnP1zAmWwJBunXgEdjEsPaxqJzyKLHttD3xn4jZLaZahRLnj+0JMM1ThwbYEUOjb5xFk4+uQ6hgRU8Yd4lqRNUk3fr6Fe4WJ5NKjtYwWIw4XzinmRhd/KTMRJZwAtcOOlbvIyie5U+3nL3txTSLF+qvUnUDcoPtLxRBEmMm2ooGibt41mu9BHnc0DXBqtBHWH/vDZco69uCZwmJvcd7L2eo93KE8LrrI6NcnD17ovYNJaqYEE7mtHsz4xjrPjIIBkaG9E7AWZCm6uSIkVEuwSGYuz9tJjM/T2VmIhnjWThhZpcyvpy1hi/iAoGUU2y3fKy0VddISLZ1tZCpaPW7Vyu7/TL1/t+MiutxxungSRRE2oXLLaCNzFz1ntZWb2BzUltxOX1b4pJpJrYYm/aQ2nkiVMHbJtqSyqcPey65ob+XmJicpcDAbSqTof4fFK1lKJDYuafz+QWfpxqlpt68NNNLOl0tfmt9lrXt+pntFkRnfa0lBmfhrZqtWY00e9px9Sks/rf9gFAnXiMjvfA0EEuIN7onu2Bc1TQGuMAtEL1fsmzY42U7QBHqh2X2QMJQXdB3hkIVDzGWEelAMiT4LRGI58gzFjhGn4mO/6T5wV+Oge/xTlmM4c9oFUesVpxlmXjSPHuZdC/h6nuc+J/P41PbqccyIqmo0aPwpYLDCd7EGLaK+ndtXO8W93DReajHQkbYrjMD9AQjKWc289pdtBdeivcTX5tWyUsM9HGltNzkDqK/lt/WpSCvq8mNxFjz3wrqqVB+Fj9K7zgrxRaKY2Qp67KKhWn/1M6Dudksj6HzkoqawditVdutgeJAVTTxzNC mkHAHCQH 
XyPYErejP6Kgl7xdKpy85hC04R3FTa7ujrtKQUdahdKKsZzQJGJVXCP89YohCv1UChrruIQMvSy0d2dsiYkuzu8m3UumhqfdgYpsdKGxgDNZr72P/E+D1X4lhtyNMMzPLYZs67fNEqxKtsOl4tz18UT2/PcPAz/p2BIkZtBpOzc5ivva8Or6hQXaS8+YjeVJPneyfIG0j0YLeSSV0Xd0Mfobg17/WZbBHWQmp/xeGoZoGtd9p+ZgPdxVEmmdEaaycE83Cya7JL+Vyngeg1tyStkmAOAKtb2FgPXFQIMNmRTs5sVFAn6YI91MUmxtp+RzBYWuedoyOCNOZrPB2lQuetCQGGmXl49FkgbhwBcd21nqfZ21tFz3BNAip7PbHDifOSZbwsLZ/odmAdhIg2sOEihhkkZJvFcirsQotAJ2p7D2fONGCiPsTo0BcUO76eBwXl50qTHuL1x/Iz+pfhsAd3UYb2ntSA5N1yamM2HDqONMxvCx5KAFzXYiQaUdMk7n/e+EgB8RXKkBfAOGx1nGJqK1klxUDCo7l6hwpX5+y+fHQnfdKfX3+Smj/8V7GJDkI9BRMC6txwzUW268f/tyrSu1JoMmq3OEZskCznAMUYyEQz5+stbEKKQYulRXwvZSauNy/FocDUhjHOG3Q636GWcqlvpqBvlJw052+jbHeuQo/e8GHIAyJxx3pdBuX0OvJilOWxySAQokOcVc=
X-Bogosity: Ham, tests=bogofilter, spamicity=0.012647, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:
Message-ID: <20230130074034.GDnq9cas7PY7qhjtdrMp395Ke1rnZ-LzbFesjgULCJs@z>

From: Aaron Thompson

If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages()
only releases pages to the buddy allocator if they are not in the
deferred range. This is correct for free pages (as defined by
for_each_free_mem_pfn_range_in_zone()) because free pages in the
deferred range will be initialized and released as part of the deferred
init process. memblock_free_pages() is called by memblock_free_late(),
which is used to free reserved ranges after memblock_free_all() has
run. All pages in reserved ranges have been initialized at that point,
and accordingly, those pages are not touched by the deferred init
process. This means that currently, if the pages that
memblock_free_late() intends to release are in the deferred range, they
will never be released to the buddy allocator. They will forever be
reserved.

In addition, memblock_free_pages() calls kmsan_memblock_free_pages(),
which is also correct for free pages but is not correct for reserved
pages. KMSAN metadata for reserved pages is initialized by
kmsan_init_shadow(), which runs shortly before memblock_free_all().

For both of these reasons, memblock_free_pages() should only be called
for free pages, and memblock_free_late() should call __free_pages_core()
directly instead.

One case where this issue can occur in the wild is EFI boot on
x86_64. The x86 EFI code reserves all EFI boot services memory ranges
via memblock_reserve() and frees them later via memblock_free_late()
(efi_reserve_boot_services() and efi_free_boot_services(),
respectively). If any of those ranges happen to fall within the deferred
init range, the pages will not be released and that memory will be
unavailable.

For example, on an Amazon EC2 t3.micro VM (1 GB) booting via EFI:

v6.2-rc2:
  # grep -E 'Node|spanned|present|managed' /proc/zoneinfo
  Node 0, zone      DMA
          spanned  4095
          present  3999
          managed  3840
  Node 0, zone    DMA32
          spanned  246652
          present  245868
          managed  178867

v6.2-rc2 + patch:
  # grep -E 'Node|spanned|present|managed' /proc/zoneinfo
  Node 0, zone      DMA
          spanned  4095
          present  3999
          managed  3840
  Node 0, zone    DMA32
          spanned  246652
          present  245868
          managed  222816

Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Signed-off-by: Aaron Thompson
---
 mm/memblock.c | 8 +++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index 511d4783dcf1..fc3d8fbd2060 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1640,7 +1640,13 @@ void __init memblock_free_late(phys_addr_t base, phys_addr_t size)
 	end = PFN_DOWN(base + size);
 
 	for (; cursor < end; cursor++) {
-		memblock_free_pages(pfn_to_page(cursor), cursor, 0);
+		/*
+		 * Reserved pages are always initialized by the end of
+		 * memblock_free_all() (by memmap_init() and, if deferred
+		 * initialization is enabled, memmap_init_reserved_pages()), so
+		 * these pages can be released directly to the buddy allocator.
+		 */
+		__free_pages_core(pfn_to_page(cursor), 0);
 		totalram_pages_inc();
 	}
 }
-- 
2.30.2