From: Waiman Long <longman@redhat.com>
To: Alexander Viro <viro@zeniv.linux.org.uk>, Jonathan Corbet <corbet@lwn.net>, "Luis R. Rodriguez" <mcgrof@kernel.org>, Kees Cook <keescook@chromium.org>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-doc@vger.kernel.org, Linus Torvalds <torvalds@linux-foundation.org>, Jan Kara <jack@suse.cz>, "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>, Andrew Morton <akpm@linux-foundation.org>, Ingo Molnar <mingo@kernel.org>, Miklos Szeredi <mszeredi@redhat.com>, Matthew Wilcox <willy@infradead.org>, Larry Woodman <lwoodman@redhat.com>, James Bottomley <James.Bottomley@HansenPartnership.com>, "Wangkai (Kevin C)" <wangkai86@huawei.com>, Waiman Long <longman@redhat.com>
Subject: [PATCH v6 0/7] fs/dcache: Track & limit # of negative dentries
Date: Fri, 6 Jul 2018 15:32:45 -0400
Message-ID: <1530905572-817-1-git-send-email-longman@redhat.com>

 v5->v6:
  - Drop the neg_dentry_pc boot command line option, but add a
    "neg-dentry-pc" sysctl parameter instead.
  - Change the "enforce-neg-dentry-limit" sysctl parameter to
    "neg-dentry-enforce".
  - Add a patch to add negative dentries to the head of the LRU
    initially so that they will be the first to be removed if they
    are not accessed again.
  - Ran some additional performance tests.

 v4->v5:
  - Rebased to the latest 4.18 kernel and modified the code
    accordingly. Patch 1 "Relocate dentry_kill() after lock_parent()"
    is no longer necessary.
  - Make tracking and limiting of negative dentries a user
    configurable option (CONFIG_DCACHE_TRACK_NEG_ENTRY) so that users
    can decide if they want to include this capability in the kernel.
  - Make killing excess negative dentries an optional feature that can
    be enabled via a boot command line option or a sysctl parameter.
  - Spread negative dentry pruning across multiple CPUs.

 v4: https://lkml.org/lkml/2017/9/18/739
 v5: https://lkml.org/lkml/2018/7/2/21

A rogue application can potentially create a large number of negative
dentries in the system, consuming most of the available memory, if it
is not under the direct control of a memory controller that enforces a
kernel memory limit.

This patchset introduces changes to the dcache subsystem to track and
optionally limit the number of negative dentries, either by pruning
excess negative dentries in the background or by killing them right
after use. This capability will help to limit the amount of memory
that can be consumed by negative dentries.

Patch 1 tracks the number of negative dentries present in the LRU
lists and reports it in /proc/sys/fs/dentry-state.

Patch 2 adds a "neg-dentry-pc" sysctl parameter that can be used to
specify a soft limit on the number of negative dentries allowed as a
percentage of total system memory. This parameter is 0 by default,
which means no negative dentry limiting will be performed.

Patch 3 enables automatic pruning of least recently used negative
dentries when the total number is close to the preset limit.

Patch 4 spreads the negative dentry pruning effort across multiple
CPUs to make it more fair.

Patch 5 moves the negative dentries to the head of the LRU after they
are initially created. They will be moved to the tail like the
positive dentries the second time they are accessed. This will make
sure that all those accessed-once negative dentries will be removed
first when a shrinker is running.

Patch 6 adds a "neg-dentry-enforce" sysctl parameter which can be
dynamically enabled at run time to enforce the negative dentry limit
by killing excess negative dentries right after use, if necessary.

Patch 7 makes the negative dentry tracking and limiting code a user
configurable option so that it can be configured out, if desired.
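
The LRU placement described for patch 5 can be illustrated with a small
userspace toy model. This is only a sketch of the behaviour as worded in
this cover letter, not the kernel implementation: the class name
ToyDentryLRU and its methods are hypothetical, and the model assumes
that index 0 of the deque is the "head", i.e. the end a shrinker
reclaims from first.

```python
from collections import deque


class ToyDentryLRU:
    """Hypothetical toy model of the patch 5 description (not kernel code).

    Index 0 of the deque is the head: the end reclaimed first.
    """

    def __init__(self):
        self.lru = deque()

    def add(self, name, negative):
        if negative:
            # A newly created negative dentry starts at the head,
            # so it is the first candidate for reclaim.
            self.lru.appendleft(name)
        else:
            # Positive dentries are added at the tail.
            self.lru.append(name)

    def access(self, name):
        # A dentry that is accessed again moves to the tail,
        # like a positive dentry.
        self.lru.remove(name)
        self.lru.append(name)

    def shrink(self, count):
        # Reclaim up to `count` entries starting from the head.
        n = min(count, len(self.lru))
        return [self.lru.popleft() for _ in range(n)]


lru = ToyDentryLRU()
lru.add("a", negative=False)
lru.add("missing1", negative=True)
lru.add("b", negative=False)
lru.add("missing2", negative=True)
lru.access("missing1")          # second access: move to the tail
reclaimed = lru.shrink(1)       # only the accessed-once negative dentry goes
print(reclaimed, list(lru.lru))
```

In this model the negative entry that was looked up a second time
survives the shrink pass, while the one that was only created and never
touched again is reclaimed first.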

With a 4.18 based kernel, the positive & negative dentry lookup rates
(lookups per second) after initial boot on a 2-socket 24-core
48-thread 64GB memory system with and without the patch were as
follows:

  Metric                     w/o patch  neg_dentry_pc=0  neg_dentry_pc=1
  ------                     ---------  ---------------  ---------------
  Positive dentry lookup        584299           586749           582670
  Negative dentry lookup       1422204          1439994          1438440
  Negative dentry creation      643535           652194           641841

For the lookup rate, there isn't any significant difference with or
without the patch or with a zero or non-zero value of neg_dentry_pc.

The negative dentry creation test created 10 million unique negative
dentries. When neg_dentry_pc=1, the number of negative dentries
exceeded the limit and hence the shrinker was activated.

  dcache: Negative dentry: percpu limit = 54871, free pool = 658461

As the shrinker was running on the CPU doing the negative dentry
creation, there was a slight decrease of performance of about 1.5%,
which was not that significant.

Running the AIM7 high-systime workload, the system had a jobs/min
rate of 300,868. By reserving 48G of memory so that the system had
effectively 16G of memory, a negative dentry generator was used to
deplete free memory with neg_dentry_pc=0. The MemFree value dropped to
as low as 130M before bouncing back up once the memory shrinker was
activated. The negative dentry count went up to about 75M. The AIM7
job rate dropped to as low as 167,562 when the memory shrinker was
working. Even shutting down the system could take a while because of
the need to free up all the allocated dentries first.

By setting both neg_dentry_pc and neg_dentry_enforce to 1, for
example, the negative dentry count never went higher than 800k when
the negative dentry generator was running. The AIM7 job rate was
297,994. There was a bit of performance drop, but nothing significant.
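
As a cross-check of the figures quoted above, the relative changes can
be recomputed from the reported rates. The helper pct_change() is a
hypothetical name used only for this sketch, and the choice of
neg_dentry_pc=0 as the baseline for the creation test is an assumption
about which comparison the "about 1.5%" refers to.

```python
def pct_change(baseline, value):
    """Relative change of value versus baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# Negative dentry creation rates from the table (per second).
creation_pc0 = 652194   # neg_dentry_pc=0
creation_pc1 = 641841   # neg_dentry_pc=1, shrinker active

# AIM7 high-systime job rates (jobs/min) from the text.
aim7_baseline = 300868  # no negative dentry generator
aim7_unlimited = 167562 # generator running, neg_dentry_pc=0, lowest rate
aim7_enforced = 297994  # generator running, limit and enforcement set to 1

creation_drop = pct_change(creation_pc0, creation_pc1)
aim7_unlimited_drop = pct_change(aim7_baseline, aim7_unlimited)
aim7_enforced_drop = pct_change(aim7_baseline, aim7_enforced)

print(f"creation, pc=1 vs pc=0:    {creation_drop:.2f}%")
print(f"AIM7, unlimited generator: {aim7_unlimited_drop:.2f}%")
print(f"AIM7, limit enforced:      {aim7_enforced_drop:.2f}%")
```

Under these assumptions the creation rate falls by roughly 1.6%, in
line with the "about 1.5%" above. The AIM7 rate falls by roughly 44%
at its lowest point with an unlimited generator, and by about 1% with
the limit enforced.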

Waiman Long (7):
  fs/dcache: Track & report number of negative dentries
  fs/dcache: Add sysctl parameter neg-dentry-pc as a soft limit on
    negative dentries
  fs/dcache: Enable automatic pruning of negative dentries
  fs/dcache: Spread negative dentry pruning across multiple CPUs
  fs/dcache: Add negative dentries to LRU head initially
  fs/dcache: Allow optional enforcement of negative dentry limit
  fs/dcache: Allow deconfiguration of negative dentry code to reduce
    kernel size

 Documentation/sysctl/fs.txt |  38 +++-
 fs/Kconfig                  |  10 +
 fs/dcache.c                 | 469 +++++++++++++++++++++++++++++++++++++++++++-
 include/linux/dcache.h      |  17 +-
 include/linux/list_lru.h    |  18 ++
 kernel/sysctl.c             |  23 +++
 mm/list_lru.c               |  23 ++-
 7 files changed, 583 insertions(+), 15 deletions(-)

-- 
1.8.3.1