From: Qi Zheng <qi.zheng@linux.dev>
To: akpm@linux-foundation.org, tkhai@ya.ru, roman.gushchin@linux.dev,
    vbabka@suse.cz, viro@zeniv.linux.org.uk, brauner@kernel.org,
    djwong@kernel.org, hughd@google.com, paulmck@kernel.org,
    muchun.song@linux.dev
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org, Qi Zheng
Subject: [PATCH 0/8] make unregistration of super_block shrinker more faster
Date: Wed, 31 May 2023 09:57:34 +0000
Message-Id: <20230531095742.2480623-1-qi.zheng@linux.dev>

From: Qi Zheng

Hi all,

This patch series aims to make unregistration of the super_block shrinker
faster.

1. Background
=============

The kernel test robot noticed a -88.8% regression of
stress-ng.ramfs.ops_per_sec on commit f95bdb700bc6 ("mm: vmscan: make global
slab shrink lockless"). More details can be seen from the link[1] below.

[1].
https://lore.kernel.org/lkml/202305230837.db2c233f-yujie.liu@intel.com/

We can just use the following command to reproduce the result:

stress-ng --timeout 60 --times --verify --metrics-brief --ramfs 9 &

1) before commit f95bdb700bc6b:

stress-ng: info:  [11023] dispatching hogs: 9 ramfs
stress-ng: info:  [11023] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
stress-ng: info:  [11023]                           (secs)    (secs)    (secs)   (real time) (usr+sys time)
stress-ng: info:  [11023] ramfs            774966     60.00     10.18    169.45     12915.89        4314.26
stress-ng: info:  [11023] for a 60.00s run time:
stress-ng: info:  [11023]    1920.11s available CPU time
stress-ng: info:  [11023]      10.18s user time   (  0.53%)
stress-ng: info:  [11023]     169.44s system time (  8.82%)
stress-ng: info:  [11023]     179.62s total time  (  9.35%)
stress-ng: info:  [11023] load average: 8.99 2.69 0.93
stress-ng: info:  [11023] successful run completed in 60.00s (1 min, 0.00 secs)

2) after commit f95bdb700bc6b:

stress-ng: info:  [37676] dispatching hogs: 9 ramfs
stress-ng: info:  [37676] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
stress-ng: info:  [37676]                           (secs)    (secs)    (secs)   (real time) (usr+sys time)
stress-ng: info:  [37676] ramfs            168673     60.00      1.61     39.66      2811.08        4087.47
stress-ng: info:  [37676] for a 60.10s run time:
stress-ng: info:  [37676]    1923.36s available CPU time
stress-ng: info:  [37676]       1.60s user time   (  0.08%)
stress-ng: info:  [37676]      39.66s system time (  2.06%)
stress-ng: info:  [37676]      41.26s total time  (  2.15%)
stress-ng: info:  [37676] load average: 7.69 3.63 2.36
stress-ng: info:  [37676] successful run completed in 60.10s (1 min, 0.10 secs)

The root cause is that SRCU has to be careful not to check too frequently
for SRCU read-side critical section exits. Paul E.
McKenney gave a detailed explanation:

```
In practice, the act of checking to see if there is anyone in an SRCU
read-side critical section is a heavy-weight operation, involving at
least one cache miss per CPU along with a number of full memory barriers.
```

Therefore, even if no one is currently in an SRCU read-side critical
section, synchronize_srcu() cannot return quickly. That's why
unregister_shrinker() has become slower.

2. Idea
=======

2.1 use synchronize_srcu_expedited() ?
--------------------------------------

synchronize_srcu_expedited() makes SRCU much more aggressive. If we use it
to replace synchronize_srcu() in unregister_shrinker(), the ops/s returns
to the previous level:

stress-ng: info:  [13159] dispatching hogs: 9 ramfs
stress-ng: info:  [13159] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
stress-ng: info:  [13159]                           (secs)    (secs)    (secs)   (real time) (usr+sys time)
stress-ng: info:  [13159] ramfs            710062     60.00      9.63    157.26     11834.18        4254.75
stress-ng: info:  [13159] for a 60.00s run time:
stress-ng: info:  [13159]    1920.14s available CPU time
stress-ng: info:  [13159]       9.62s user time   (  0.50%)
stress-ng: info:  [13159]     157.26s system time (  8.19%)
stress-ng: info:  [13159]     166.88s total time  (  8.69%)
stress-ng: info:  [13159] load average: 9.49 4.02 1.65
stress-ng: info:  [13159] successful run completed in 60.00s (1 min, 0.00 secs)

However, because SRCU (Sleepable RCU) is used here, readers are allowed to
sleep in the read-side critical section, so synchronize_srcu_expedited()
may consume a lot of CPU. It is therefore not a good choice.

2.2 move synchronize_srcu() to the asynchronous delayed work
------------------------------------------------------------

Kirill Tkhai proposed a better idea[2] in 2018: move synchronize_srcu() to
asynchronous delayed work, so that it does not affect the user-visible
unregistration speed.

[2].
https://lore.kernel.org/lkml/153365636747.19074.12610817307548583381.stgit@localhost.localdomain/

After applying his patches ([PATCH RFC 04/10]~[PATCH RFC 10/10], with a few
conflicts), the ops/s is of course back to the previous level:

stress-ng: info:  [11506] setting to a 60 second run per stressor
stress-ng: info:  [11506] dispatching hogs: 9 ramfs
stress-ng: info:  [11506] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
stress-ng: info:  [11506]                           (secs)    (secs)    (secs)   (real time) (usr+sys time)
stress-ng: info:  [11506] ramfs            829462     60.00     10.81    174.25     13824.14        4482.08
stress-ng: info:  [11506] for a 60.00s run time:
stress-ng: info:  [11506]    1920.12s available CPU time
stress-ng: info:  [11506]      10.81s user time   (  0.56%)
stress-ng: info:  [11506]     174.25s system time (  9.07%)
stress-ng: info:  [11506]     185.06s total time  (  9.64%)
stress-ng: info:  [11506] load average: 8.96 2.60 0.89
stress-ng: info:  [11506] successful run completed in 60.00s (1 min, 0.00 secs)

In order to continue to advance this patch set, I rebased these patches
onto next-20230525. Any comments and suggestions are welcome.

Note: This patch series only covers the super_block shrinker. Other places
where unregistration is time-critical can be converted in the same way.
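
To illustrate the idea in 2.2, here is a minimal userspace sketch of a
two-phase unregistration. It is not the code from this series: every
model_* name is hypothetical, the counter only stands in for the cost of
synchronize_srcu(), and the array is a single-threaded stand-in for
delayed work. It only shows that the caller-visible step stays cheap while
the expensive wait is pushed to a later, deferred step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model only. Every model_* name is hypothetical and is not a
 * kernel API; the slow wait merely stands in for synchronize_srcu().
 */
struct model_shrinker {
	bool registered;	/* still visible to the reclaim path */
	bool finalized;		/* grace period waited for, teardown done */
};

/* Number of simulated grace-period waits performed so far. */
static int slow_waits;

/* Stand-in for the expensive grace-period wait. */
static void model_grace_period_wait(void)
{
	slow_waits++;
}

/* Phase 1: cheap, runs in the caller's context (the user-visible path). */
static void model_unregister_initiate(struct model_shrinker *s)
{
	s->registered = false;	/* new reclaim passes no longer find it */
}

/* Phase 2: expensive, meant to run later from deferred work. */
static void model_unregister_finalize(struct model_shrinker *s)
{
	assert(!s->registered);	/* initiate must have run first */
	model_grace_period_wait();
	s->finalized = true;
}

/* A tiny single-threaded stand-in for a work queue. */
#define MODEL_MAX_WORK 16
static struct model_shrinker *pending[MODEL_MAX_WORK];
static size_t npending;

/* Queue phase 2 for later; returns false if the queue is full. */
static bool model_queue_finalize(struct model_shrinker *s)
{
	if (npending == MODEL_MAX_WORK)
		return false;
	pending[npending++] = s;
	return true;
}

/* Run every queued finalize step, as a worker would do asynchronously. */
static void model_run_deferred_work(void)
{
	for (size_t i = 0; i < npending; i++)
		model_unregister_finalize(pending[i]);
	npending = 0;
}
```

In this model a caller would run model_unregister_initiate() and
model_queue_finalize() and return at once; the simulated wait only happens
when model_run_deferred_work() runs later.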

Thanks,
Qi

Kirill Tkhai (7):
  mm: vmscan: split unregister_shrinker()
  fs: move list_lru_destroy() to destroy_super_work()
  fs: shrink only (SB_ACTIVE|SB_BORN) superblocks in super_cache_scan()
  fs: introduce struct super_operations::destroy_super() callback
  xfs: introduce xfs_fs_destroy_super()
  shmem: implement shmem_destroy_super()
  fs: use unregister_shrinker_delayed_{initiate, finalize} for
    super_block shrinker

Qi Zheng (1):
  mm: vmscan: move shrinker_debugfs_remove() before synchronize_srcu()

 fs/super.c               | 32 ++++++++++++++++++--------------
 fs/xfs/xfs_super.c       | 25 ++++++++++++++++++++++---
 include/linux/fs.h       |  6 ++++++
 include/linux/shrinker.h |  2 ++
 mm/shmem.c               |  8 ++++++++
 mm/vmscan.c              | 26 ++++++++++++++++++++------
 6 files changed, 76 insertions(+), 23 deletions(-)

-- 
2.30.2