From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 31 Dec 2018 10:40:19 +0200
From: Mike Rapoport
To: Pingfan Liu
Cc: linux-acpi@vger.kernel.org, linux-mm@kvack.org,
	kexec@lists.infradead.org, Tang Chen, "Rafael J. Wysocki",
	Len Brown, Andrew Morton, Mike Rapoport, Michal Hocko,
	Jonathan Corbet, Yaowei Bai, Pavel Tatashin, Nicholas Piggin,
	Naoya Horiguchi, Daniel Vacek, Mathieu Malaterre, Stefan Agner,
	Dave Young, Baoquan He, yinghai@kernel.org, vgoyal@redhat.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCHv3 1/2] mm/memblock: extend the limit inferior of bottom-up after parsing hotplug attr
References: <1545966002-3075-1-git-send-email-kernelfans@gmail.com>
	<1545966002-3075-2-git-send-email-kernelfans@gmail.com>
In-Reply-To: <1545966002-3075-2-git-send-email-kernelfans@gmail.com>
Message-Id: <20181231084018.GA28478@rapoport-lnx>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Dec 28, 2018 at 11:00:01AM +0800, Pingfan Liu wrote:
> The bottom-up allocation style is introduced to cope with movable_node,
> where the limit inferior of allocation starts from kernel's end, due to
> lack of knowledge of memory hotplug info at this early time. But if later,
> hotplug info has been got, the limit inferior can be extend to 0.
> 'kexec -c' prefers to reuse this style to alloc mem at lower address,
> since if the reserved region is beyond 4G, then it requires extra mem
> (default is 16M) for swiotlb.

I fail to understand why the availability of memory hotplug information
would allow extending the lower limit of bottom-up memblock allocations
below the kernel. The memory in the physical range [0, kernel_start) can be
allocated as soon as the kernel memory is reserved.

The extents of the memory node hosting the kernel image can be used to
limit memblock allocations from that particular node, even in top-down
mode.

> Signed-off-by: Pingfan Liu
> Cc: Tang Chen
> Cc: "Rafael J. Wysocki"
> Cc: Len Brown
> Cc: Andrew Morton
> Cc: Mike Rapoport
> Cc: Michal Hocko
> Cc: Jonathan Corbet
> Cc: Yaowei Bai
> Cc: Pavel Tatashin
> Cc: Nicholas Piggin
> Cc: Naoya Horiguchi
> Cc: Daniel Vacek
> Cc: Mathieu Malaterre
> Cc: Stefan Agner
> Cc: Dave Young
> Cc: Baoquan He
> Cc: yinghai@kernel.org,
> Cc: vgoyal@redhat.com
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/acpi/numa.c      |  4 ++++
>  include/linux/memblock.h |  1 +
>  mm/memblock.c            | 58 +++++++++++++++++++++++++++++-------------------
>  3 files changed, 40 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
> index 2746994..3eea4e4 100644
> --- a/drivers/acpi/numa.c
> +++ b/drivers/acpi/numa.c
> @@ -462,6 +462,10 @@ int __init acpi_numa_init(void)
>
>  		cnt = acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
>  					    acpi_parse_memory_affinity, 0);
> +
> +#if defined(CONFIG_X86) || defined(CONFIG_ARM64)
> +		mark_mem_hotplug_parsed();
> +#endif
>  	}
>
>  	/* SLIT: System Locality Information Table */
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index aee299a..d89ed9e 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -125,6 +125,7 @@ int memblock_reserve(phys_addr_t base, phys_addr_t size);
>  void memblock_trim_memory(phys_addr_t align);
>  bool memblock_overlaps_region(struct memblock_type *type,
>  			      phys_addr_t base, phys_addr_t size);
> +void mark_mem_hotplug_parsed(void);
>  int memblock_mark_hotplug(phys_addr_t base, phys_addr_t size);
>  int memblock_clear_hotplug(phys_addr_t base, phys_addr_t size);
>  int memblock_mark_mirror(phys_addr_t base, phys_addr_t size);
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 81ae63c..a3f5e46 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -231,6 +231,12 @@ __memblock_find_range_top_down(phys_addr_t start, phys_addr_t end,
>  	return 0;
>  }
>
> +static bool mem_hotmovable_parsed __initdata_memblock;
> +void __init_memblock mark_mem_hotplug_parsed(void)
> +{
> +	mem_hotmovable_parsed = true;
> +}
> +
>  /**
>   * memblock_find_in_range_node - find free area in given range and node
>   * @size: size of free area to find
> @@ -259,7 +265,7 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size,
>  					phys_addr_t end, int nid,
>  					enum memblock_flags flags)
>  {
> -	phys_addr_t kernel_end, ret;
> +	phys_addr_t kernel_end, ret = 0;
>
>  	/* pump up @end */
>  	if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
> @@ -270,34 +276,40 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size,
>  	end = max(start, end);
>  	kernel_end = __pa_symbol(_end);
>
> -	/*
> -	 * try bottom-up allocation only when bottom-up mode
> -	 * is set and @end is above the kernel image.
> -	 */
> -	if (memblock_bottom_up() && end > kernel_end) {
> -		phys_addr_t bottom_up_start;
> +	if (memblock_bottom_up()) {
> +		phys_addr_t bottom_up_start = start;
>
> -		/* make sure we will allocate above the kernel */
> -		bottom_up_start = max(start, kernel_end);
> -
> -		/* ok, try bottom-up allocation first */
> -		ret = __memblock_find_range_bottom_up(bottom_up_start, end,
> -						      size, align, nid, flags);
> -		if (ret)
> +		if (mem_hotmovable_parsed) {
> +			ret = __memblock_find_range_bottom_up(
> +				bottom_up_start, end, size, align, nid,
> +				flags);
>  			return ret;
>
>  		/*
> -		 * we always limit bottom-up allocation above the kernel,
> -		 * but top-down allocation doesn't have the limit, so
> -		 * retrying top-down allocation may succeed when bottom-up
> -		 * allocation failed.
> -		 *
> -		 * bottom-up allocation is expected to be fail very rarely,
> -		 * so we use WARN_ONCE() here to see the stack trace if
> -		 * fail happens.
> +		 * if mem hotplug info is not parsed yet, try bottom-up
> +		 * allocation with @end above the kernel image.
>  		 */
> -		WARN_ONCE(IS_ENABLED(CONFIG_MEMORY_HOTREMOVE),
> +		} else if (!mem_hotmovable_parsed && end > kernel_end) {
> +			/* make sure we will allocate above the kernel */
> +			bottom_up_start = max(start, kernel_end);
> +			ret = __memblock_find_range_bottom_up(
> +				bottom_up_start, end, size, align, nid,
> +				flags);
> +			if (ret)
> +				return ret;
> +			/*
> +			 * we always limit bottom-up allocation above the
> +			 * kernel, but top-down allocation doesn't have
> +			 * the limit, so retrying top-down allocation may
> +			 * succeed when bottom-up allocation failed.
> +			 *
> +			 * bottom-up allocation is expected to be fail
> +			 * very rarely, so we use WARN_ONCE() here to see
> +			 * the stack trace if fail happens.
> +			 */
> +			WARN_ONCE(IS_ENABLED(CONFIG_MEMORY_HOTREMOVE),
>  			"memblock: bottom-up allocation failed, memory hotremove may be affected\n");
> +		}
>  	}
>
>  	return __memblock_find_range_top_down(start, end, size, align, nid,
> --
> 2.7.4
>

--
Sincerely yours,
Mike.