Date: Mon, 11 Sep 2023 14:11:08 -0400
From: Johannes Weiner
To: Vlastimil Babka
Cc: Mel Gorman, Lecopzer Chen, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, nsaenzju@redhat.com,
	yj.chiang@mediatek.com, Mark-pk Tsai, Joe Liu
Subject: Re: [PATCH] mm: page_alloc: fix cma pageblock was stolen in rmqueue fallback
Message-ID: <20230911181108.GA104295@cmpxchg.org>
References: <20230830111332.7599-1-lecopzer.chen@mediatek.com>
 <20230905090922.zy7srh33rg5c3zao@techsingularity.net>
 <20230911155727.GA102237@cmpxchg.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:

On Mon, Sep 11, 2023 at 06:13:59PM +0200, Vlastimil Babka wrote:
> On 9/11/23 17:57, Johannes Weiner wrote:
> > On Tue, Sep 05, 2023 at 10:09:22AM +0100, Mel Gorman wrote:
> >> mm: page_alloc: Free pages to correct buddy list after PCP lock contention
> >>
> >> Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> >> returns pages to the buddy list on PCP lock contention. However, for
> >> migratetypes that are not MIGRATE_PCPTYPES, the migratetype may have
> >> been clobbered already for pages that are not being isolated. In
> >> practice, this means that CMA pages may be returned to the wrong
> >> buddy list. While this might be harmless in some cases as it is
> >> MIGRATE_MOVABLE, the pageblock could be reassigned in rmqueue_fallback
> >> and prevent a future CMA allocation. Look up the PCP migratetype
> >> again unconditionally if the PCP lock is contended.
> >>
> >> [lecopzer.chen@mediatek.com: CMA-specific fix]
> >> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> >> Reported-by: Joe Liu
> >> Signed-off-by: Mel Gorman
> >> ---
> >>  mm/page_alloc.c | 8 +++++++-
> >>  1 file changed, 7 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index 452459836b71..4053c377fee8 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -2428,7 +2428,13 @@ void free_unref_page(struct page *page, unsigned int order)
> >>  		free_unref_page_commit(zone, pcp, page, migratetype, order);
> >>  		pcp_spin_unlock(pcp);
> >>  	} else {
> >> -		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
> >> +		/*
> >> +		 * The page migratetype may have been clobbered for types
> >> +		 * (type >= MIGRATE_PCPTYPES && !is_migrate_isolate) so
> >> +		 * must be rechecked.
> >> +		 */
> >> +		free_one_page(zone, page, pfn, order,
> >> +			      get_pcppage_migratetype(page), FPI_NONE);
> >>  	}
> >>  	pcp_trylock_finish(UP_flags);
> >>  }
> >>
> >
> > I had sent a (similar) fix for this here:
> >
> > https://lore.kernel.org/lkml/20230821183733.106619-4-hannes@cmpxchg.org/
> >
> > The context wasn't CMA, but HIGHATOMIC pages going to the movable
> > freelist. But the class of bug is the same: the migratetype tweaking
> > really only applies to the pcplist, not the buddy slowpath; I added a
> > local pcpmigratetype to make it more clear, and hopefully prevent bugs
> > of this nature down the line.
>
> Seems to be the cleanest solution to me, indeed.
>
> > I'm just preparing v2 of the above series. Do you want me to break
> > this change out and send it separately?
>
> Works for me, if you combine it with the information about what commit
> that fixes, the CMA implications reported, and Cc stable.

How about this? Based on v6.6-rc1.

---
From 84e4490095ed3d1f2991e7f0e58e2968e56cc7c0 Mon Sep 17 00:00:00 2001
From: Johannes Weiner
Date: Fri, 28 Jul 2023 14:29:41 -0400
Subject: [PATCH] mm: page_alloc: fix CMA and HIGHATOMIC landing on the wrong buddy list

Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
bypasses the pcplist on lock contention and returns the page directly to
the buddy list of the page's migratetype.

For pages that don't have their own pcplist, such as CMA and HIGHATOMIC,
the migratetype is temporarily updated such that the page can hitch a
ride on the MOVABLE pcplist. Their true type is later reassessed when
flushing in free_pcppages_bulk(). However, when lock contention is
detected after the type was already overridden, the bypass will then put
the page on the wrong buddy list.

Once on the MOVABLE buddy list, the page becomes eligible for fallbacks
and even stealing. In the case of HIGHATOMIC, otherwise ineligible
allocations can dip into the highatomic reserves. In the case of CMA,
the page can be lost from the CMA region permanently.

Use a separate pcpmigratetype variable for the pcplist override. Use the
original migratetype when going directly to the buddy. This fixes the
bug and should make the intentions more obvious in the code.

Originally sent here to address the HIGHATOMIC case:
https://lore.kernel.org/lkml/20230821183733.106619-4-hannes@cmpxchg.org/

Changelog updated in response to the CMA-specific bug report.

[mgorman@techsingularity.net: updated changelog]
Reported-by: Joe Liu
Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
Cc: stable@vger.kernel.org
Signed-off-by: Johannes Weiner
---
 mm/page_alloc.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0c5be12f9336..95546f376302 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2400,7 +2400,7 @@ void free_unref_page(struct page *page, unsigned int order)
 	struct per_cpu_pages *pcp;
 	struct zone *zone;
 	unsigned long pfn = page_to_pfn(page);
-	int migratetype;
+	int migratetype, pcpmigratetype;
 
 	if (!free_unref_page_prepare(page, pfn, order))
 		return;
@@ -2408,24 +2408,24 @@ void free_unref_page(struct page *page, unsigned int order)
 	/*
 	 * We only track unmovable, reclaimable and movable on pcp lists.
 	 * Place ISOLATE pages on the isolated list because they are being
-	 * offlined but treat HIGHATOMIC as movable pages so we can get those
-	 * areas back if necessary. Otherwise, we may have to free
+	 * offlined but treat HIGHATOMIC and CMA as movable pages so we can
+	 * get those areas back if necessary. Otherwise, we may have to free
 	 * excessively into the page allocator
 	 */
-	migratetype = get_pcppage_migratetype(page);
+	migratetype = pcpmigratetype = get_pcppage_migratetype(page);
 	if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
 		if (unlikely(is_migrate_isolate(migratetype))) {
 			free_one_page(page_zone(page), page, pfn, order, migratetype, FPI_NONE);
 			return;
 		}
-		migratetype = MIGRATE_MOVABLE;
+		pcpmigratetype = MIGRATE_MOVABLE;
 	}
 
 	zone = page_zone(page);
 	pcp_trylock_prepare(UP_flags);
 	pcp = pcp_spin_trylock(zone->per_cpu_pageset);
 	if (pcp) {
-		free_unref_page_commit(zone, pcp, page, migratetype, order);
+		free_unref_page_commit(zone, pcp, page, pcpmigratetype, order);
 		pcp_spin_unlock(pcp);
 	} else {
 		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
-- 
2.42.0
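[Editor's illustration] For readers outside the kernel tree, here is a minimal
user-space sketch of the failure mode the patch above addresses. It is not
kernel code: the enum values and the helpers (pcplist_add, buddy_free,
free_page_buggy, free_page_fixed) are hypothetical stand-ins, written only to
model why keeping the original migratetype for the buddy-bypass path, while a
separate pcpmigratetype carries the MOVABLE override for the pcplist, keeps a
CMA or HIGHATOMIC page off the MOVABLE buddy list.

/* Build with: cc -Wall -o mtype_sketch mtype_sketch.c && ./mtype_sketch */
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins mirroring the ordering MIGRATE_* >= MIGRATE_PCPTYPES. */
enum mtype { UNMOVABLE, RECLAIMABLE, MOVABLE, PCPTYPES, HIGHATOMIC, CMA };

static const char *name(enum mtype t)
{
	static const char *n[] = {
		"UNMOVABLE", "RECLAIMABLE", "MOVABLE", "PCPTYPES",
		"HIGHATOMIC", "CMA",
	};
	return n[t];
}

/* Stand-ins for the per-cpu fast path and the buddy slow path. */
static void pcplist_add(enum mtype t) { printf("  pcplist   <- %s\n", name(t)); }
static void buddy_free(enum mtype t)  { printf("  buddylist <- %s\n", name(t)); }

/* Pre-fix logic: one variable, so the MOVABLE override intended only for
 * the pcplist leaks into the buddy-bypass path on lock contention. */
static void free_page_buggy(enum mtype page_type, bool pcp_lock_contended)
{
	enum mtype migratetype = page_type;

	if (migratetype >= PCPTYPES)
		migratetype = MOVABLE;		/* ride the MOVABLE pcplist */

	if (!pcp_lock_contended)
		pcplist_add(migratetype);
	else
		buddy_free(migratetype);	/* bug: real CMA/HIGHATOMIC type lost */
}

/* Fixed logic: the override lives in a separate pcpmigratetype, and the
 * buddy-bypass path keeps the page's original migratetype. */
static void free_page_fixed(enum mtype page_type, bool pcp_lock_contended)
{
	enum mtype migratetype = page_type;
	enum mtype pcpmigratetype = page_type;

	if (pcpmigratetype >= PCPTYPES)
		pcpmigratetype = MOVABLE;	/* override only affects the pcplist */

	if (!pcp_lock_contended)
		pcplist_add(pcpmigratetype);
	else
		buddy_free(migratetype);	/* bypass uses the real type */
}

int main(void)
{
	printf("CMA page, pcp lock contended, buggy (lands on MOVABLE buddy list):\n");
	free_page_buggy(CMA, true);

	printf("CMA page, pcp lock contended, fixed (returns to CMA buddy list):\n");
	free_page_fixed(CMA, true);

	printf("CMA page, pcp lock free (override is fine, rides MOVABLE pcplist):\n");
	free_page_fixed(CMA, false);
	return 0;
}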