From: Mel Gorman <mgorman@techsingularity.net>
To: Andrew Morton
Cc: Hillf Danton, Dave Hansen, Vlastimil Babka, Michal Hocko, LKML,
	Linux-MM, Mel Gorman
Subject: [PATCH 3/6] mm/page_alloc: Adjust pcp->high after CPU hotplug events
Date: Tue, 25 May 2021 09:01:16 +0100
Message-Id: <20210525080119.5455-4-mgorman@techsingularity.net>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20210525080119.5455-1-mgorman@techsingularity.net>
References: <20210525080119.5455-1-mgorman@techsingularity.net>

The PCP high watermark is based on the number of online CPUs so the
watermarks must be adjusted during CPU hotplug. At the time of
hot-remove, the number of online CPUs is already adjusted, but during
hot-add a delta needs to be applied to bring pcp->high to the correct
value. After this patch is applied, the high watermarks are adjusted
correctly.

  # grep high: /proc/zoneinfo | tail -1
        high:  649
  # echo 0 > /sys/devices/system/cpu/cpu4/online
  # grep high: /proc/zoneinfo | tail -1
        high:  664
  # echo 1 > /sys/devices/system/cpu/cpu4/online
  # grep high: /proc/zoneinfo | tail -1
        high:  649

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 include/linux/cpuhotplug.h |  2 +-
 mm/internal.h              |  2 +-
 mm/memory_hotplug.c        |  4 ++--
 mm/page_alloc.c            | 38 +++++++++++++++++++++++++++-----------
 4 files changed, 31 insertions(+), 15 deletions(-)
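A note on the arithmetic: a CPU that is part-way through onlining is not
yet visible in its node's cpumask, so zone_highsize() is told about it
explicitly via the cpu_online delta. As a rough standalone illustration
of the effect, the calculation can be modelled in plain C; the function
name, watermark value and CPU counts below are invented for the example
and are not kernel code:

  #include <stdio.h>

  /*
   * Userspace model of the patched zone_highsize() arithmetic.
   * "low_wmark" stands in for low_wmark_pages(zone); "local_cpus" for
   * cpumask_weight(cpumask_of_node(zone_to_nid(zone))).
   */
  static int model_zone_highsize(int low_wmark, int local_cpus, int cpu_online)
  {
          /*
           * Clamp the divisor to at least one CPU (early in boot the
           * mask may be empty), then add the delta for a CPU that is
           * onlining but not yet present in the mask.
           */
          int nr_local_cpus = (local_cpus < 1 ? 1 : local_cpus) + cpu_online;

          return low_wmark / nr_local_cpus;
  }

  int main(void)
  {
          /* Steady state: 4 CPUs local to the zone. */
          printf("4 online:           high = %d\n",
                 model_zone_highsize(2596, 4, 0));
          /* Mid-onlining without the delta: only 3 CPUs are visible. */
          printf("onlining, no delta: high = %d\n",
                 model_zone_highsize(2596, 3, 0));
          /* Mid-onlining with the delta this patch applies. */
          printf("onlining, delta=1:  high = %d\n",
                 model_zone_highsize(2596, 3, 1));
          return 0;
  }

With the delta, the value computed while the fourth CPU is onlining
(649 here) already matches the steady-state result; without it, the
pagesets would briefly be sized as if the zone were shared by one CPU
fewer (865 here).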
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 4a62b3980642..47e13582d9fc 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -54,7 +54,7 @@ enum cpuhp_state {
 	CPUHP_MM_MEMCQ_DEAD,
 	CPUHP_PERCPU_CNT_DEAD,
 	CPUHP_RADIX_DEAD,
-	CPUHP_PAGE_ALLOC_DEAD,
+	CPUHP_PAGE_ALLOC,
 	CPUHP_NET_DEV_DEAD,
 	CPUHP_PCI_XGENE_DEAD,
 	CPUHP_IOMMU_IOVA_DEAD,
diff --git a/mm/internal.h b/mm/internal.h
index 54bd0dc2c23c..651250e59ef5 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -221,7 +221,7 @@ extern int user_min_free_kbytes;
 extern void free_unref_page(struct page *page);
 extern void free_unref_page_list(struct list_head *list);
 
-extern void zone_pcp_update(struct zone *zone);
+extern void zone_pcp_update(struct zone *zone, int cpu_online);
 extern void zone_pcp_reset(struct zone *zone);
 extern void zone_pcp_disable(struct zone *zone);
 extern void zone_pcp_enable(struct zone *zone);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 70620d0dd923..bebb3cead810 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -961,7 +961,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, struct zone *z
 	node_states_set_node(nid, &arg);
 	if (need_zonelists_rebuild)
 		build_all_zonelists(NULL);
-	zone_pcp_update(zone);
+	zone_pcp_update(zone, 0);
 
 	/* Basic onlining is complete, allow allocation of onlined pages. */
 	undo_isolate_page_range(pfn, pfn + nr_pages, MIGRATE_MOVABLE);
@@ -1835,7 +1835,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 		zone_pcp_reset(zone);
 		build_all_zonelists(NULL);
 	} else
-		zone_pcp_update(zone);
+		zone_pcp_update(zone, 0);
 
 	node_states_clear_node(node, &arg);
 	if (arg.status_change_nid >= 0) {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c0536e5d088a..dc4ac309bc21 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6628,7 +6628,7 @@ static int zone_batchsize(struct zone *zone)
 #endif
 }
 
-static int zone_highsize(struct zone *zone, int batch)
+static int zone_highsize(struct zone *zone, int batch, int cpu_online)
 {
 #ifdef CONFIG_MMU
 	int high;
@@ -6639,9 +6639,10 @@ static int zone_highsize(struct zone *zone, int batch)
 	 * so that if they are full then background reclaim will not be
 	 * started prematurely. The value is split across all online CPUs
 	 * local to the zone. Note that early in boot CPUs may not be
-	 * online yet.
+	 * online yet and that during CPU hotplug the cpumask is not
+	 * yet updated when a CPU is being onlined.
 	 */
-	nr_local_cpus = max(1U, cpumask_weight(cpumask_of_node(zone_to_nid(zone))));
+	nr_local_cpus = max(1U, cpumask_weight(cpumask_of_node(zone_to_nid(zone)))) + cpu_online;
 	high = low_wmark_pages(zone) / nr_local_cpus;
 
 	/*
@@ -6715,12 +6716,12 @@ static void __zone_set_pageset_high_and_batch(struct zone *zone, unsigned long h
  * Calculate and set new high and batch values for all per-cpu pagesets of a
  * zone based on the zone's size.
  */
-static void zone_set_pageset_high_and_batch(struct zone *zone)
+static void zone_set_pageset_high_and_batch(struct zone *zone, int cpu_online)
 {
 	int new_high, new_batch;
 
 	new_batch = max(1, zone_batchsize(zone));
-	new_high = zone_highsize(zone, new_batch);
+	new_high = zone_highsize(zone, new_batch, cpu_online);
 
 	if (zone->pageset_high == new_high &&
 	    zone->pageset_batch == new_batch)
@@ -6750,7 +6751,7 @@ void __meminit setup_zone_pageset(struct zone *zone)
 		per_cpu_pages_init(pcp, pzstats);
 	}
 
-	zone_set_pageset_high_and_batch(zone);
+	zone_set_pageset_high_and_batch(zone, 0);
 }
 
 /*
@@ -8008,6 +8009,7 @@ void __init set_dma_reserve(unsigned long new_dma_reserve)
 
 static int page_alloc_cpu_dead(unsigned int cpu)
 {
+	struct zone *zone;
 
 	lru_add_drain_cpu(cpu);
 	drain_pages(cpu);
@@ -8028,6 +8030,19 @@ static int page_alloc_cpu_dead(unsigned int cpu)
 	 * race with what we are doing.
 	 */
 	cpu_vm_stats_fold(cpu);
+
+	for_each_populated_zone(zone)
+		zone_pcp_update(zone, 0);
+
+	return 0;
+}
+
+static int page_alloc_cpu_online(unsigned int cpu)
+{
+	struct zone *zone;
+
+	for_each_populated_zone(zone)
+		zone_pcp_update(zone, 1);
 	return 0;
 }
 
@@ -8053,8 +8068,9 @@ void __init page_alloc_init(void)
 		hashdist = 0;
 #endif
 
-	ret = cpuhp_setup_state_nocalls(CPUHP_PAGE_ALLOC_DEAD,
-					"mm/page_alloc:dead", NULL,
+	ret = cpuhp_setup_state_nocalls(CPUHP_PAGE_ALLOC,
+					"mm/page_alloc:pcp",
+					page_alloc_cpu_online,
 					page_alloc_cpu_dead);
 	WARN_ON(ret < 0);
 }
@@ -8192,7 +8208,7 @@ static void __setup_per_zone_wmarks(void)
 	 * The watermark size has changed so update the pcpu batch
 	 * and high limits or the limits may be inappropriate.
 	 */
-	zone_set_pageset_high_and_batch(zone);
+	zone_set_pageset_high_and_batch(zone, 0);
 
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
@@ -9014,10 +9030,10 @@ EXPORT_SYMBOL(free_contig_range);
  * The zone indicated has a new number of managed_pages; batch sizes and percpu
  * page high values need to be recalculated.
  */
-void __meminit zone_pcp_update(struct zone *zone)
+void zone_pcp_update(struct zone *zone, int cpu_online)
 {
 	mutex_lock(&pcp_batch_high_lock);
-	zone_set_pageset_high_and_batch(zone);
+	zone_set_pageset_high_and_batch(zone, cpu_online);
 	mutex_unlock(&pcp_batch_high_lock);
 }
 
-- 
2.26.2
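The registration pattern the patch switches to, a single hotplug state
with paired online and dead callbacks, is the same one any subsystem can
use. A minimal sketch of that pattern follows, with the assumptions
flagged: it takes a dynamic CPUHP_AP_ONLINE_DYN slot rather than the
fixed CPUHP_PAGE_ALLOC state reserved above (so the callbacks run at a
different point in the hotplug sequence), and the module and callback
names are invented:

  #include <linux/cpuhotplug.h>
  #include <linux/init.h>
  #include <linux/module.h>

  static int demo_hp_state;

  /* Called as each CPU is brought up. */
  static int demo_cpu_online(unsigned int cpu)
  {
          pr_info("demo: cpu%u online\n", cpu);
          return 0;
  }

  /* Called as each CPU is taken down. */
  static int demo_cpu_dead(unsigned int cpu)
  {
          pr_info("demo: cpu%u going away\n", cpu);
          return 0;
  }

  static int __init demo_init(void)
  {
          /*
           * The _nocalls variant registers the callbacks without
           * invoking the online handler for CPUs that are already up.
           * With CPUHP_AP_ONLINE_DYN the allocated state is returned.
           */
          demo_hp_state = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
                                                    "demo:online",
                                                    demo_cpu_online,
                                                    demo_cpu_dead);
          return demo_hp_state < 0 ? demo_hp_state : 0;
  }

  static void __exit demo_exit(void)
  {
          cpuhp_remove_state_nocalls(demo_hp_state);
  }

  module_init(demo_init);
  module_exit(demo_exit);
  MODULE_LICENSE("GPL");

CPUHP_PAGE_ALLOC itself sits in the PREPARE section of enum cpuhp_state,
so page_alloc_cpu_online() runs before the onlining CPU is visible in
the node cpumask, which is exactly why zone_highsize() needs the
explicit cpu_online delta.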