From: Ryan Roberts
Date: Thu, 7 Mar 2024 15:24:47 +0000
Subject: Re:
To: Matthew Wilcox, "Yin, Fengwei"
Cc: Zi Yan, Andrew Morton, linux-mm@kvack.org, Yang Shi, Huang Ying
References: <20240227174254.710559-11-willy@infradead.org> <367a14f7-340e-4b29-90ae-bc3fcefdd5f4@arm.com> <85cc26ed-6386-4d6b-b680-1e5fba07843f@arm.com> <36bdda72-2731-440e-ad15-39b845401f50@arm.com> <03CE3A00-917C-48CC-8E1C-6A98713C817C@nvidia.com> <0f5bdbf3-725b-49c7-ba66-973b7cfc93be@intel.com>

On 07/03/2024 14:05, Matthew Wilcox wrote:
> On Thu, Mar 07, 2024 at 09:50:09PM +0800, Yin, Fengwei wrote:
>>
>>
>> On 3/7/2024 4:56 PM, wrote:
>>> I just want to make sure I've understood correctly: CPU1's folio_put()
>>> is not the last reference, and it keeps iterating through the local
>>> list. Then CPU2 does the final folio_put() which causes list_del_init()
>>> to modify the local list concurrently with CPU1's iteration, so CPU1
>>> probably goes into the weeds?
>>
>> My understanding is this can not corrupt the folio->deferred_list as
>> this folio was iterated already.
>
> I am not convinced about that at all. It's possible this isn't the only
> problem, but deleting something from a list without holding (the correct)
> lock is something you have to think incredibly hard about to get right.
> I didn't bother going any deeper into the analysis once I spotted the
> locking problem, but the proof is very much on you that this is not a bug!
>
>> But I did see other strange thing:
>> [ 76.269942] page: refcount:0 mapcount:1 mapping:0000000000000000
>> index:0xffffbd0a0 pfn:0x2554a0
>> [ 76.270483] note: kcompactd0[62] exited with preempt_count 1
>> [ 76.271344] head: order:0 entire_mapcount:1 nr_pages_mapped:0 pincount:0
>>
>> This large folio has order 0? Maybe folio->_flags_1 was screwed?
>>
>> In free_unref_folios(), there is code like following:
>>         if (order > 0 && folio_test_large_rmappable(folio))
>>                 folio_undo_large_rmappable(folio);
>>
>> But with destroy_large_folio():
>>         if (folio_test_large_rmappable(folio))
>>
>>                 folio_undo_large_rmappable(folio);
>>
>> Can it connect to the folio has zero refcount still in deferred list
>> with Matthew's patch?
>>
>>
>> Looks like folio order was cleared unexpected somewhere.

I think there could be something to this...

I have a setup where, when running with Matthew's deferred split fix AND with
commit 31b2ff82aefb "mm: handle large folios in free_unref_folios()" REVERTED,
everything works as expected. And at the end, I have the expected amount of
memory free (seen in meminfo and buddyinfo).

But if I run only with the deferred split fix and DO NOT revert the other
change, everything grinds to a halt when swapping 2M pages. Sometimes with RCU
stalls where I can't even interact on the serial port. Sometimes (more usually)
everything just gets stuck trying to reclaim and allocate memory. And when I
kill the jobs, I still have barely any memory in the system - about 10% of what
I would expect.

So is it possible that after commit 31b2ff82aefb "mm: handle large folios in
free_unref_folios()", when freeing a 2M folio back to the buddy, we are actually
only telling it about the first 4K page? So we end up leaking the rest?

>
> No, we intentionally clear it:
>
> free_unref_folios -> free_unref_page_prepare -> free_pages_prepare ->
> page[1].flags &= ~PAGE_FLAGS_SECOND;
>
> PAGE_FLAGS_SECOND includes the order, which is why we have to save it
> away in folio->private so that we know what it is in the second loop.
> So it's always been cleared by the time we call free_page_is_bad().