From: Alex Shi <alex.shi@linux.alibaba.com>
To: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com, richard.weiyang@gmail.com
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Subject: [PATCH v10 00/15] per memcg lru lock
Date: Mon, 27 Apr 2020 15:02:49 +0800
Message-ID: <1587970985-21629-1-git-send-email-alex.shi@linux.alibaba.com>

This version is based on Johannes' new patchset "mm: memcontrol: charge
swapin pages on instantiation":
https://lkml.org/lkml/2020/4/21/266

Johannes Weiner suggested:
"So here is a crazy idea that may be worth exploring:
Right now, pgdat->lru_lock protects both PageLRU *and* the lruvec's
linked list.
Can we make PageLRU atomic and use it to stabilize the lru_lock
instead, and then use the lru_lock only serialize list operations? ..."

With the cleaned-up memcg charge path and this suggestion, we can
isolate LRU pages for exclusive access in compaction, page migration,
reclaim, memcg move_account, huge page split and similar scenarios,
while keeping each page's memcg stable. That makes it possible to
replace per-node lru locking with per-memcg lru locking. As for the
pagevec_lru_move_fn functions, it is safe to leave the pages on the
lru list; the lru lock still guards list integrity there.

This version safely passes Hugh Dickins's swapping kernel-build test
case -- thanks for the great test case! I am sending it out a bit early
for more testing and review while people's memory is still hot with
Johannes' new memcg charge patches. :) I will keep testing in the
meantime.

The patchset consists of 3 parts:
1. some code cleanup and minor optimization as preparation,
2. use TestClearPageLRU as the precondition for page isolation (see
   the sketch below),
3. replace the per-node lru_lock with a per-memcg, per-node lru_lock.

The 3rd part moves the per-node lru_lock into the lruvec, giving each
memcg its own lru_lock on each node. On a large machine, memcgs no
longer have to contend on the per-node pgdat->lru_lock; each can
proceed under its own lru_lock.

Following Daniel Jordan's suggestion, I ran 208 'dd' tasks in 104
containers on a 2-socket * 26-core * HT box with a modified case:
https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice
With this patchset, readtwice performance increased by about 80% with
concurrent containers.

Thanks to Hugh Dickins and Konstantin Khlebnikov, who both proposed
this idea 8 years ago, and to everyone else who gave comments: Daniel
Jordan, Mel Gorman, Shakeel Butt, Matthew Wilcox, etc. Thanks also for
the testing support from Intel 0day and Rong Chen, Fengguang Wu, and
Yun Wang.
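To make the scheme concrete, below is a minimal sketch of the isolation
pattern that parts 2 and 3 build on. This is illustrative kernel-style
C, not the exact code from the patches: the hypothetical helper name
isolate_lru_page_sketch() is mine, and the
lock_page_lruvec_irqsave()/unlock_page_lruvec_irqrestore() names are
assumed from the direction of the series.

/*
 * Illustrative sketch only -- not the exact patch code.
 * TestClearPageLRU() atomically claims the page for isolation, so
 * the lruvec's lru_lock only has to serialize the list operation.
 */
static bool isolate_lru_page_sketch(struct page *page)
{
	struct lruvec *lruvec;
	unsigned long flags;

	/* Whoever clears PageLRU owns the isolation of this page. */
	if (!TestClearPageLRU(page))
		return false;	/* lost the race; already isolated */

	/*
	 * With PageLRU cleared, the page can no longer be moved to
	 * another memcg, so its lruvec -- and the per-memcg, per-node
	 * lru_lock inside it -- stays stable across the locking below.
	 */
	lruvec = lock_page_lruvec_irqsave(page, &flags);
	del_page_from_lru_list(page, lruvec, page_lru(page));
	unlock_page_lruvec_irqrestore(lruvec, flags);

	return true;
}

Because the atomic flag, not the lock, resolves isolation races, two
memcgs reclaiming on the same node never contend on a shared
pgdat->lru_lock; each only takes its own lruvec->lru_lock.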
Alex Shi (13):
  mm/swap: use vmf clean up swapin funcs parameters
  mm/vmscan: remove unnecessary lruvec adding
  mm/page_idle: no unlikely double check for idle page counting
  mm/thp: move lru_add_page_tail func to huge_memory.c
  mm/thp: clean up lru_add_page_tail
  mm/thp: narrow lru locking
  mm/memcg: add debug checking in lock_page_memcg
  mm/lru: introduce TestClearPageLRU
  mm/compaction: do page isolation first in compaction
  mm/mlock: ClearPageLRU before get lru lock in munlock page isolation
  mm/lru: replace pgdat lru_lock with lruvec lock
  mm/lru: introduce the relock_page_lruvec function
  mm/pgdat: remove pgdat lru_lock

Hugh Dickins (2):
  mm/vmscan: use relock for move_pages_to_lru
  mm/lru: revise the comments of lru_lock

 Documentation/admin-guide/cgroup-v1/memcg_test.rst |  15 +-
 Documentation/admin-guide/cgroup-v1/memory.rst     |   8 +-
 Documentation/trace/events-kmem.rst                |   2 +-
 Documentation/vm/unevictable-lru.rst               |  22 +--
 include/linux/memcontrol.h                         |  92 +++++++++++
 include/linux/mm_types.h                           |   2 +-
 include/linux/mmzone.h                             |   5 +-
 include/linux/page-flags.h                         |   1 +
 include/linux/swap.h                               |  12 +-
 mm/compaction.c                                    |  85 +++++++----
 mm/filemap.c                                       |   4 +-
 mm/huge_memory.c                                   |  55 +++++--
 mm/madvise.c                                       |  11 +-
 mm/memcontrol.c                                    |  87 ++++++++++-
 mm/mlock.c                                         |  93 ++++++------
 mm/mmzone.c                                        |   1 +
 mm/page_alloc.c                                    |   1 -
 mm/page_idle.c                                     |   8 -
 mm/rmap.c                                          |   2 +-
 mm/swap.c                                          | 119 ++++-----------
 mm/swap_state.c                                    |  23 ++-
 mm/swapfile.c                                      |   8 +-
 mm/vmscan.c                                        | 168 +++++++++++----------
 mm/zswap.c                                         |   3 +-
 24 files changed, 497 insertions(+), 330 deletions(-)

-- 
1.8.3.1