From: Michel Lespinasse <walken@google.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Andrew Morton <akpm@linux-foundation.org>,
    KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
    Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
    Rik van Riel <riel@redhat.com>,
    Johannes Weiner <jweiner@redhat.com>,
    KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
    Hugh Dickins <hughd@google.com>,
    Peter Zijlstra <a.p.zijlstra@chello.nl>,
    Michael Wolf <mjwolf@us.ibm.com>
Subject: [PATCH 0/8] idle page tracking / working set estimation
Date: Fri, 16 Sep 2011 20:39:05 -0700
Message-ID: <1316230753-8693-1-git-send-email-walken@google.com>

Please comment on the following patches (which are against the v3.0
kernel). We are using these to collect memory utilization statistics for
each cgroup across many machines, and optimize job placement accordingly.

The statistics are intended to be compared across many machines - we
don't just want to know which cgroup to reclaim from on an individual
machine, we also need to know which machine is best to target a job onto
within a large cluster. Also, we try to have a low impact on the normal
MM algorithms - we think they already do a fine job balancing resources
on individual machines, so we are not trying to interfere with that here.

Patch 1 introduces no functionality; it modifies the page_referenced API
so that it can be more easily extended in patch 3.

Patch 2 documents the proposed features, and adds a configuration option
for these. When the features are compiled in, they are still disabled
until the administrator sets up the desired scanning interval; however
the configuration option seems necessary as the features make use of
3 extra page flags - there is plenty of space for these in 64-bit builds,
but less so in 32-bit builds...

Patch 3 introduces page_referenced_kstaled(), which is similar to
page_referenced() but is used for idle page tracking rather than
for memory reclamation. Since both functions clear the pte_young bits
and we don't want them to interfere with each other, two new page flags
are introduced that track when young pte references have been cleared by
each of the page_referenced variants. The page_referenced functions are
also extended to return the dirty status of any pte references
encountered.

Patch 4 introduces the 'kstaled' thread that handles idle page tracking.
The thread starts disabled; one enables it by setting a scanning interval
in /sys/kernel/mm/kstaled/scan_seconds. It then scans all physical memory
pages, looking for idle pages - pages that have not been touched since
the previous scan interval. These pages are further classified into
idle_clean (which are immediately reclaimable), idle_dirty_swap (which
are reclaimable if swap is enabled on the system), and idle_dirty_file
(which are reclaimable after writeback occurs). These statistics are
published for each cgroup in a new /dev/cgroup/*/memory.idle_page_stats
file. We did not use the memory.stat file there because we thought these
stats are different - first, they are meaningless until one sets the
scan_seconds value, and then they are only updated once per scan interval,
whereas the memory.stat values are continually updated.

Patch 5 is a small optimization skipping over memory holes.

Patch 6 rate limits the idle page scanning so that it occurs in small
chunks over the length of the scan interval, rather than all at once.

Patch 7 adds extra functionality to track how long a given page has been
idle, so that memory.idle_page_stats can report pages that have been idle
for 1, 2, 5, 15, 30, 60, 120 or 240 consecutive scan intervals.

Patch 8 adds extra functionality in the form of an incremental update
feature.
Here we only report immediately reclaimable idle pages; however
we don't want to wait for the end of a scan interval to update this number
if the system experiences a rapid increase in memory pressure.

Michel Lespinasse (8):
  page_referenced: replace vm_flags parameter with struct pr_info
  kstaled: documentation and config option.
  kstaled: page_referenced_kstaled() and supporting infrastructure.
  kstaled: minimalistic implementation.
  kstaled: skip non-RAM regions.
  kstaled: rate limit pages scanned per second.
  kstaled: add histogram sampling functionality
  kstaled: add incrementally updating stale page count

 Documentation/cgroups/memory.txt  |  103 ++++++++-
 arch/x86/include/asm/page_types.h |    8 +
 arch/x86/kernel/e820.c            |   45 ++++
 include/linux/ksm.h               |    9 +-
 include/linux/mmzone.h            |   11 +
 include/linux/page-flags.h        |   50 ++++
 include/linux/pagemap.h           |   11 +-
 include/linux/rmap.h              |   82 ++++++-
 mm/Kconfig                        |   10 +
 mm/internal.h                     |    1 +
 mm/ksm.c                          |   15 +-
 mm/memcontrol.c                   |  492 +++++++++++++++++++++++++++++++++++++
 mm/memory_hotplug.c               |    6 +
 mm/mlock.c                        |    1 +
 mm/rmap.c                         |  136 ++++++-----
 mm/swap.c                         |    1 +
 mm/vmscan.c                       |   20 +-
 17 files changed, 904 insertions(+), 97 deletions(-)

--
1.7.3.1
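
As a reading aid, the idle-page categories described for patch 4 and the
idle-age thresholds listed for patch 7 can be modeled in a few lines of
userspace Python. This is only a sketch of the classification rules as
worded in the cover letter, not the kernel implementation: PageState, its
fields, and both function names are hypothetical, the mapping of
"swap-backed" to idle_dirty_swap is an assumption drawn from the wording
above, and the thresholds are assumed to be cumulative (a page idle for 7
scans counts toward the 1, 2 and 5 thresholds).

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional, Tuple

# Idle-age thresholds, in consecutive scan intervals, as listed for patch 7.
IDLE_THRESHOLDS = (1, 2, 5, 15, 30, 60, 120, 240)


@dataclass
class PageState:
    """Hypothetical per-page summary for illustration; not a kernel structure."""
    idle_scans: int     # consecutive scans in which the page was not touched
    dirty: bool         # contents must be written out before reclaim
    swap_backed: bool   # assumption: backed by swap rather than by a file


def classify(page: PageState) -> Optional[str]:
    """Return the idle category named in patch 4, or None if the page is not idle."""
    if page.idle_scans < 1:
        return None
    if not page.dirty:
        return "idle_clean"        # immediately reclaimable
    if page.swap_backed:
        return "idle_dirty_swap"   # reclaimable if swap is enabled
    return "idle_dirty_file"       # reclaimable after writeback


def thresholds_reached(idle_scans: int) -> Tuple[int, ...]:
    """Return every threshold that the given idle age has reached."""
    return IDLE_THRESHOLDS[:bisect_right(IDLE_THRESHOLDS, idle_scans)]
```

For example, a dirty file-backed page untouched for 7 scans would be
classified as idle_dirty_file and would have reached the 1, 2 and 5
thresholds under these assumptions.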