From: Mel Gorman
To: Linux-MM, Linux-RT-Users
Cc: LKML, Chuck Lever, Jesper Dangaard Brouer, Matthew Wilcox, Thomas Gleixner, Peter Zijlstra, Ingo Molnar, Michal Hocko, Oscar Salvador, Mel Gorman
Subject: [PATCH 02/11] mm/page_alloc: Convert per-cpu list protection to local_lock
Date: Wed, 7 Apr 2021 21:24:14 +0100
Message-Id: <20210407202423.16022-3-mgorman@techsingularity.net>
In-Reply-To: <20210407202423.16022-1-mgorman@techsingularity.net>
References: <20210407202423.16022-1-mgorman@techsingularity.net>

There is a lack of clarity about what exactly
local_irq_save/local_irq_restore protects in page_alloc.c. It conflates
the protection of per-cpu page allocation structures with per-cpu vmstat
deltas.

This patch protects the PCP structure using local_lock, which for most
configurations is identical to IRQ enabling/disabling. The scope of the
lock is still wider than it should be, but this is reduced later in the
series.
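
For illustration only, and not part of the patch: the conversion pattern
can be modelled in userspace roughly as below. The macros are hypothetical
stand-ins that only mimic the !PREEMPT_RT behaviour of the kernel's
local_lock_irqsave()/local_unlock_irqrestore() (save and disable IRQs,
then record ownership). irqs_disabled_model, drain_model() and the single
global pagesets instance are assumptions made for the sketch; the real
lock is per-CPU and the real drain path frees pages back to the zone.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; none of this is kernel code. */
static bool irqs_disabled_model;	/* models this CPU's IRQ-off state */

typedef struct { bool held; } local_lock_t;	/* stand-in lock type */

struct pagesets { local_lock_t lock; };
static struct pagesets pagesets;	/* one "CPU" only in this model */

struct per_cpu_pages { int count; int batch; };

/* Model of !PREEMPT_RT: save + disable IRQs, then note ownership. */
#define local_lock_irqsave(l, flags) do {	\
	(flags) = irqs_disabled_model;		\
	irqs_disabled_model = true;		\
	assert(!(l)->held);			\
	(l)->held = true;			\
} while (0)

/* Drop ownership, then restore the saved IRQ state. */
#define local_unlock_irqrestore(l, flags) do {	\
	assert((l)->held);			\
	(l)->held = false;			\
	irqs_disabled_model = (flags);		\
} while (0)

/* Same shape as drain_zone_pages() after the conversion. */
static int drain_model(struct per_cpu_pages *pcp)
{
	unsigned long flags;
	int to_drain;

	local_lock_irqsave(&pagesets.lock, flags);
	/* The PCP fields are only touched while the lock is held. */
	to_drain = pcp->count < pcp->batch ? pcp->count : pcp->batch;
	pcp->count -= to_drain;
	local_unlock_irqrestore(&pagesets.lock, flags);

	return to_drain;
}
```

The point of the named lock in this model is that the protected data
(struct per_cpu_pages) is tied to a specific lock rather than to an
anonymous IRQ-disabled section. On PREEMPT_RT the real primitive does not
disable IRQs at all, which this sketch does not attempt to model.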

[lkp@intel.com: Make pagesets static]
Signed-off-by: Mel Gorman
---
 include/linux/mmzone.h |  2 ++
 mm/page_alloc.c        | 50 +++++++++++++++++++++++++++++-------------
 2 files changed, 37 insertions(+), 15 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index a4393ac27336..106da8fbc72a 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 #include
 
 /* Free memory management - zoned buddy allocator. */
@@ -337,6 +338,7 @@ enum zone_watermarks {
 #define high_wmark_pages(z) (z->_watermark[WMARK_HIGH] + z->watermark_boost)
 #define wmark_pages(z, i) (z->_watermark[i] + z->watermark_boost)
 
+/* Fields and list protected by pagesets local_lock in page_alloc.c */
 struct per_cpu_pages {
 	int count;		/* number of pages in the list */
 	int high;		/* high watermark, emptying needed */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a68bacddcae0..e9e60d1a85d4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -112,6 +112,13 @@ typedef int __bitwise fpi_t;
 static DEFINE_MUTEX(pcp_batch_high_lock);
 #define MIN_PERCPU_PAGELIST_FRACTION (8)
 
+struct pagesets {
+	local_lock_t lock;
+};
+static DEFINE_PER_CPU(struct pagesets, pagesets) = {
+	.lock = INIT_LOCAL_LOCK(lock),
+};
+
 #ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
 DEFINE_PER_CPU(int, numa_node);
 EXPORT_PER_CPU_SYMBOL(numa_node);
@@ -1421,6 +1428,10 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 		} while (--count && --batch_free && !list_empty(list));
 	}
 
+	/*
+	 * local_lock_irq held so equivalent to spin_lock_irqsave for
+	 * both PREEMPT_RT and non-PREEMPT_RT configurations.
+	 */
 	spin_lock(&zone->lock);
 	isolated_pageblocks = has_isolate_pageblock(zone);
 
@@ -1541,6 +1552,11 @@ static void __free_pages_ok(struct page *page, unsigned int order,
 		return;
 
 	migratetype = get_pfnblock_migratetype(page, pfn);
+
+	/*
+	 * TODO FIX: Disable IRQs before acquiring IRQ-safe zone->lock
+	 * and protect vmstat updates.
+	 */
 	local_irq_save(flags);
 	__count_vm_events(PGFREE, 1 << order);
 	free_one_page(page_zone(page), page, pfn, order, migratetype,
@@ -2910,6 +2926,10 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
 {
 	int i, allocated = 0;
 
+	/*
+	 * local_lock_irq held so equivalent to spin_lock_irqsave for
+	 * both PREEMPT_RT and non-PREEMPT_RT configurations.
+	 */
 	spin_lock(&zone->lock);
 	for (i = 0; i < count; ++i) {
 		struct page *page = __rmqueue(zone, order, migratetype,
@@ -2962,12 +2982,12 @@ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp)
 	unsigned long flags;
 	int to_drain, batch;
 
-	local_irq_save(flags);
+	local_lock_irqsave(&pagesets.lock, flags);
 	batch = READ_ONCE(pcp->batch);
 	to_drain = min(pcp->count, batch);
 	if (to_drain > 0)
 		free_pcppages_bulk(zone, to_drain, pcp);
-	local_irq_restore(flags);
+	local_unlock_irqrestore(&pagesets.lock, flags);
 }
 #endif
 
@@ -2983,13 +3003,13 @@ static void drain_pages_zone(unsigned int cpu, struct zone *zone)
 	unsigned long flags;
 	struct per_cpu_pages *pcp;
 
-	local_irq_save(flags);
+	local_lock_irqsave(&pagesets.lock, flags);
 
 	pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
 	if (pcp->count)
 		free_pcppages_bulk(zone, pcp->count, pcp);
 
-	local_irq_restore(flags);
+	local_unlock_irqrestore(&pagesets.lock, flags);
 }
 
 /*
@@ -3252,9 +3272,9 @@ void free_unref_page(struct page *page)
 	if (!free_unref_page_prepare(page, pfn))
 		return;
 
-	local_irq_save(flags);
+	local_lock_irqsave(&pagesets.lock, flags);
 	free_unref_page_commit(page, pfn);
-	local_irq_restore(flags);
+	local_unlock_irqrestore(&pagesets.lock, flags);
 }
 
 /*
@@ -3274,7 +3294,7 @@ void free_unref_page_list(struct list_head *list)
 		set_page_private(page, pfn);
 	}
 
-	local_irq_save(flags);
+	local_lock_irqsave(&pagesets.lock, flags);
 	list_for_each_entry_safe(page, next, list, lru) {
 		unsigned long pfn = page_private(page);
 
@@ -3287,12 +3307,12 @@ void free_unref_page_list(struct list_head *list)
 		 * a large list of pages to free.
 		 */
 		if (++batch_count == SWAP_CLUSTER_MAX) {
-			local_irq_restore(flags);
+			local_unlock_irqrestore(&pagesets.lock, flags);
 			batch_count = 0;
-			local_irq_save(flags);
+			local_lock_irqsave(&pagesets.lock, flags);
 		}
 	}
-	local_irq_restore(flags);
+	local_unlock_irqrestore(&pagesets.lock, flags);
 }
 
 /*
@@ -3449,7 +3469,7 @@ static struct page *rmqueue_pcplist(struct zone *preferred_zone,
 	struct page *page;
 	unsigned long flags;
 
-	local_irq_save(flags);
+	local_lock_irqsave(&pagesets.lock, flags);
 	pcp = this_cpu_ptr(zone->per_cpu_pageset);
 	list = &pcp->lists[migratetype];
 	page = __rmqueue_pcplist(zone, migratetype, alloc_flags, pcp, list);
@@ -3457,7 +3477,7 @@ static struct page *rmqueue_pcplist(struct zone *preferred_zone,
 		__count_zid_vm_events(PGALLOC, page_zonenum(page), 1);
 		zone_statistics(preferred_zone, zone);
 	}
-	local_irq_restore(flags);
+	local_unlock_irqrestore(&pagesets.lock, flags);
 	return page;
 }
 
@@ -5052,7 +5072,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 		goto failed;
 
 	/* Attempt the batch allocation */
-	local_irq_save(flags);
+	local_lock_irqsave(&pagesets.lock, flags);
 	pcp = this_cpu_ptr(zone->per_cpu_pageset);
 	pcp_list = &pcp->lists[ac.migratetype];
 
@@ -5090,12 +5110,12 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 		nr_populated++;
 	}
 
-	local_irq_restore(flags);
+	local_unlock_irqrestore(&pagesets.lock, flags);
 
 	return nr_populated;
 
 failed_irq:
-	local_irq_restore(flags);
+	local_unlock_irqrestore(&pagesets.lock, flags);
 
 failed:
 	page = __alloc_pages(gfp, 0, preferred_nid, nodemask);
-- 
2.26.2