All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Roman Gushchin <guro@fb.com>,
	Mike Rapoport <rppt@linux.ibm.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Michal Hocko <mhocko@kernel.org>, Rik van Riel <riel@surriel.com>,
	Wonhyuk Yang <vvghjk1234@gmail.com>,
	Thiago Jung Bauermann <bauerman@linux.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.4 22/65] memblock: do not start bottom-up allocations with kernel_end
Date: Mon,  8 Feb 2021 16:00:54 +0100	[thread overview]
Message-ID: <20210208145811.096767670@linuxfoundation.org> (raw)
In-Reply-To: <20210208145810.230485165@linuxfoundation.org>

From: Roman Gushchin <guro@fb.com>

[ Upstream commit 2dcb3964544177c51853a210b6ad400de78ef17d ]

With kaslr the kernel image is placed at a random place, so starting the
bottom-up allocation with the kernel_end can result in an allocation
failure and a warning like this one:

  hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
  ------------[ cut here ]------------
  memblock: bottom-up allocation failed, memory hotremove may be affected
  WARNING: CPU: 0 PID: 0 at mm/memblock.c:332 memblock_find_in_range_node+0x178/0x25a
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0+ #1169
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
  RIP: 0010:memblock_find_in_range_node+0x178/0x25a
  Code: e9 6d ff ff ff 48 85 c0 0f 85 da 00 00 00 80 3d 9b 35 df 00 00 75 15 48 c7 c7 c0 75 59 88 c6 05 8b 35 df 00 01 e8 25 8a fa ff <0f> 0b 48 c7 44 24 20 ff ff ff ff 44 89 e6 44 89 ea 48 c7 c1 70 5c
  RSP: 0000:ffffffff88803d18 EFLAGS: 00010086 ORIG_RAX: 0000000000000000
  RAX: 0000000000000000 RBX: 0000000240000000 RCX: 00000000ffffdfff
  RDX: 00000000ffffdfff RSI: 00000000ffffffea RDI: 0000000000000046
  RBP: 0000000100000000 R08: ffffffff88922788 R09: 0000000000009ffb
  R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000
  R13: 0000000000000000 R14: 0000000080000000 R15: 00000001fb42c000
  FS:  0000000000000000(0000) GS:ffffffff88f71000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffffa080fb401000 CR3: 00000001fa80a000 CR4: 00000000000406b0
  Call Trace:
    memblock_alloc_range_nid+0x8d/0x11e
    cma_declare_contiguous_nid+0x2c4/0x38c
    hugetlb_cma_reserve+0xdc/0x128
    flush_tlb_one_kernel+0xc/0x20
    native_set_fixmap+0x82/0xd0
    flat_get_apic_id+0x5/0x10
    register_lapic_address+0x8e/0x97
    setup_arch+0x8a5/0xc3f
    start_kernel+0x66/0x547
    load_ucode_bsp+0x4c/0xcd
    secondary_startup_64_no_verify+0xb0/0xbb
  random: get_random_bytes called from __warn+0xab/0x110 with crng_init=0
  ---[ end trace f151227d0b39be70 ]---

At the same time, the kernel image is protected with memblock_reserve(),
so we can just start searching at PAGE_SIZE.  In this case the bottom-up
allocation has the same chances to success as a top-down allocation, so
there is no reason to fallback in the case of a failure.  All together it
simplifies the logic.

Link: https://lkml.kernel.org/r/20201217201214.3414100-2-guro@fb.com
Fixes: 8fabc623238e ("powerpc: Ensure that swiotlb buffer is allocated from low memory")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Wonhyuk Yang <vvghjk1234@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/memblock.c | 49 ++++++-------------------------------------------
 1 file changed, 6 insertions(+), 43 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index c4b16cae2bc9b..11f6ae37d6699 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -257,14 +257,6 @@ __memblock_find_range_top_down(phys_addr_t start, phys_addr_t end,
  *
  * Find @size free area aligned to @align in the specified range and node.
  *
- * When allocation direction is bottom-up, the @start should be greater
- * than the end of the kernel image. Otherwise, it will be trimmed. The
- * reason is that we want the bottom-up allocation just near the kernel
- * image so it is highly likely that the allocated memory and the kernel
- * will reside in the same node.
- *
- * If bottom-up allocation failed, will try to allocate memory top-down.
- *
  * Return:
  * Found address on success, 0 on failure.
  */
@@ -273,8 +265,6 @@ static phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size,
 					phys_addr_t end, int nid,
 					enum memblock_flags flags)
 {
-	phys_addr_t kernel_end, ret;
-
 	/* pump up @end */
 	if (end == MEMBLOCK_ALLOC_ACCESSIBLE ||
 	    end == MEMBLOCK_ALLOC_KASAN)
@@ -283,40 +273,13 @@ static phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size,
 	/* avoid allocating the first page */
 	start = max_t(phys_addr_t, start, PAGE_SIZE);
 	end = max(start, end);
-	kernel_end = __pa_symbol(_end);
-
-	/*
-	 * try bottom-up allocation only when bottom-up mode
-	 * is set and @end is above the kernel image.
-	 */
-	if (memblock_bottom_up() && end > kernel_end) {
-		phys_addr_t bottom_up_start;
-
-		/* make sure we will allocate above the kernel */
-		bottom_up_start = max(start, kernel_end);
 
-		/* ok, try bottom-up allocation first */
-		ret = __memblock_find_range_bottom_up(bottom_up_start, end,
-						      size, align, nid, flags);
-		if (ret)
-			return ret;
-
-		/*
-		 * we always limit bottom-up allocation above the kernel,
-		 * but top-down allocation doesn't have the limit, so
-		 * retrying top-down allocation may succeed when bottom-up
-		 * allocation failed.
-		 *
-		 * bottom-up allocation is expected to be fail very rarely,
-		 * so we use WARN_ONCE() here to see the stack trace if
-		 * fail happens.
-		 */
-		WARN_ONCE(IS_ENABLED(CONFIG_MEMORY_HOTREMOVE),
-			  "memblock: bottom-up allocation failed, memory hotremove may be affected\n");
-	}
-
-	return __memblock_find_range_top_down(start, end, size, align, nid,
-					      flags);
+	if (memblock_bottom_up())
+		return __memblock_find_range_bottom_up(start, end, size, align,
+						       nid, flags);
+	else
+		return __memblock_find_range_top_down(start, end, size, align,
+						      nid, flags);
 }
 
 /**
-- 
2.27.0




  parent reply	other threads:[~2021-02-08 16:33 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-08 15:00 [PATCH 5.4 00/65] 5.4.97-rc1 review Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 01/65] USB: serial: cp210x: add pid/vid for WSDA-200-USB Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 02/65] USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000 Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 03/65] USB: serial: option: Adding support for Cinterion MV31 Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 04/65] arm64: dts: qcom: c630: keep both touchpad devices enabled Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 05/65] Input: i8042 - unbreak Pegatron C15B Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 06/65] arm64: dts: amlogic: meson-g12: Set FL-adj property value Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 07/65] arm64: dts: rockchip: fix vopl iommu irq on px30 Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 08/65] bpf, cgroup: Fix optlen WARN_ON_ONCE toctou Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 09/65] bpf, cgroup: Fix problematic bounds check Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 10/65] um: virtio: free vu_dev only with the contained struct device Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 11/65] rxrpc: Fix deadlock around release of dst cached on udp tunnel Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 12/65] arm64: dts: ls1046a: fix dcfg address range Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 13/65] igc: set the default return value to -IGC_ERR_NVM in igc_write_nvm_srwr Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 14/65] igc: check return value of ret_val in igc_config_fc_after_link_up Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 15/65] i40e: Revert "i40e: dont report link up for a VF who hasnt enabled queues" Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 16/65] net/mlx5: Fix leak upon failure of rule creation Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 17/65] net: lapb: Copy the skb before sending a packet Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 18/65] net: mvpp2: TCAM entry enable should be written after SRAM data Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 19/65] r8169: fix WoL on shutdown if CONFIG_DEBUG_SHIRQ is set Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 20/65] ARM: dts: sun7i: a20: bananapro: Fix ethernet phy-mode Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 21/65] nvmet-tcp: fix out-of-bounds access when receiving multiple h2cdata PDUs Greg Kroah-Hartman
2021-02-08 15:00 ` Greg Kroah-Hartman [this message]
2021-02-08 15:00 ` [PATCH 5.4 23/65] USB: gadget: legacy: fix an error code in eth_bind() Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 24/65] USB: usblp: dont call usb_set_interface if theres a single alt Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 25/65] usb: renesas_usbhs: Clear pipe running flag in usbhs_pkt_pop() Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 26/65] usb: dwc2: Fix endpoint direction check in ep_from_windex Greg Kroah-Hartman
2021-02-08 15:00 ` [PATCH 5.4 27/65] usb: dwc3: fix clock issue during resume in OTG mode Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 28/65] usb: xhci-mtk: fix unreleased bandwidth data Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 29/65] usb: xhci-mtk: skip dropping bandwidth of unchecked endpoints Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 30/65] usb: xhci-mtk: break loop when find the endpoint to drop Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 31/65] usb: host: xhci-plat: add priv quirk for skip PHY initialization Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 32/65] ovl: fix dentry leak in ovl_get_redirect Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 33/65] mac80211: fix station rate table updates on assoc Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 34/65] fgraph: Initialize tracing_graph_pause at task creation Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 35/65] kretprobe: Avoid re-registration of the same kretprobe earlier Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 36/65] libnvdimm/dimm: Avoid race between probe and available_slots_show() Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 37/65] genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 38/65] xhci: fix bounce buffer usage for non-sg list case Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 39/65] cifs: report error instead of invalid when revalidating a dentry fails Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 40/65] smb3: Fix out-of-bounds bug in SMB2_negotiate() Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 41/65] smb3: fix crediting for compounding when only one request in flight Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 42/65] mmc: core: Limit retries when analyse of SDIO tuples fails Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 43/65] drm/amd/display: Revert "Fix EDID parsing after resume from suspend" Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 44/65] nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 45/65] KVM: SVM: Treat SVM as unsupported when running as an SEV guest Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 46/65] KVM: x86: Update emulator context mode if SYSENTER xfers to 64-bit mode Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 47/65] ARM: footbridge: fix dc21285 PCI configuration accessors Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 48/65] mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 49/65] mm: hugetlb: fix a race between freeing and dissolving the page Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 50/65] mm: hugetlb: fix a race between isolating and freeing page Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 51/65] mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 52/65] mm, compaction: move high_pfn to the for loop scope Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 53/65] mm: thp: fix MADV_REMOVE deadlock on shmem THP Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 54/65] x86/build: Disable CET instrumentation in the kernel Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 55/65] x86/apic: Add extra serialization for non-serializing MSRs Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 56/65] iwlwifi: mvm: dont send RFH_QUEUE_CONFIG_CMD with no queues Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 57/65] Input: xpad - sync supported devices with fork on GitHub Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 58/65] iommu/vt-d: Do not use flush-queue when caching-mode is on Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 59/65] md: Set prev_flush_start and flush_bio in an atomic way Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 60/65] igc: Report speed and duplex as unknown when device is runtime suspended Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 61/65] neighbour: Prevent a dead entry from updating gc_list Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 62/65] net: ip_tunnel: fix mtu calculation Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 63/65] net: dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 64/65] net: sched: replaced invalid qdisc tree flush helper in qdisc_replace Greg Kroah-Hartman
2021-02-08 15:01 ` [PATCH 5.4 65/65] usb: host: xhci: mvebu: make USB 3.0 PHY optional for Armada 3720 Greg Kroah-Hartman
2021-02-08 17:39 ` [PATCH 5.4 00/65] 5.4.97-rc1 review Florian Fainelli
2021-02-10  8:28   ` Greg Kroah-Hartman
2021-02-08 20:42 ` Shuah Khan
2021-02-09  5:50 ` Naresh Kamboju
2021-02-09 11:00 ` Igor Torrente
2021-02-10  8:29   ` Greg Kroah-Hartman
2021-02-09 18:14 ` Guenter Roeck
2021-02-10  1:23 ` Ross Schmidt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210208145811.096767670@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=bauerman@linux.ibm.com \
    --cc=guro@fb.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@kernel.org \
    --cc=riel@surriel.com \
    --cc=rppt@linux.ibm.com \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vvghjk1234@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.