linux-kernel.vger.kernel.org archive mirror
* [PATCH v6 00/10] per lruvec lru_lock for memcg
@ 2019-12-16  9:26 Alex Shi
  2019-12-16  9:26 ` [PATCH v6 01/10] mm/vmscan: remove unnecessary lruvec adding Alex Shi
                   ` (9 more replies)
  0 siblings, 10 replies; 15+ messages in thread
From: Alex Shi @ 2019-12-16  9:26 UTC
  To: cgroups, linux-kernel, linux-mm, akpm, mgorman, tj, hughd,
	khlebnikov, daniel.m.jordan, yang.shi, willy, shakeelb, hannes
  Cc: Alex Shi

This patchset moves lru_lock into lruvec, giving each lruvec its own
lru_lock, and thus one lru_lock per memcg per node.

We introduce the function lock_page_lruvec, which locks the page's
memcg and then the memcg's lruvec->lru_lock, to replace the old
pgdat->lru_lock. (Thanks to Johannes Weiner, Hugh Dickins and
Konstantin Khlebnikov for the suggestions/reminders.)
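
As a rough userspace illustration of the idea (not the kernel code in
this series), the sketch below models one lru_lock per memcg per node.
All toy_* names, MAX_NODES and the atomic-based spinlock are
assumptions made up for the example; the real series uses spinlock_t
and also takes the page's memcg lock first, which is left out here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_NODES 2	/* hypothetical node count for this model */

/* Toy spinlock standing in for the kernel's spinlock_t. */
struct toy_spinlock { atomic_bool locked; };

static void toy_spin_init(struct toy_spinlock *l)
{
	atomic_init(&l->locked, false);
}

/* Returns true if the lock was free and is now held by the caller. */
static bool toy_spin_trylock(struct toy_spinlock *l)
{
	return !atomic_exchange_explicit(&l->locked, true,
					 memory_order_acquire);
}

static void toy_spin_lock(struct toy_spinlock *l)
{
	while (!toy_spin_trylock(l))
		;	/* busy-wait until the holder releases it */
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
	atomic_store_explicit(&l->locked, false, memory_order_release);
}

/* The lock lives in the lruvec, not in a per-node structure. */
struct toy_lruvec { struct toy_spinlock lru_lock; };
struct toy_memcg { struct toy_lruvec lruvec[MAX_NODES]; };
struct toy_page { struct toy_memcg *memcg; int nid; };

static void toy_memcg_init(struct toy_memcg *memcg)
{
	for (int i = 0; i < MAX_NODES; i++)
		toy_spin_init(&memcg->lruvec[i].lru_lock);
}

/* Find the lruvec of the page's memcg on the page's node and lock it. */
static struct toy_lruvec *toy_lock_page_lruvec(struct toy_page *page)
{
	assert(page->nid >= 0 && page->nid < MAX_NODES);
	struct toy_lruvec *lruvec = &page->memcg->lruvec[page->nid];

	toy_spin_lock(&lruvec->lru_lock);
	return lruvec;
}

static void toy_unlock_page_lruvec(struct toy_lruvec *lruvec)
{
	toy_spin_unlock(&lruvec->lru_lock);
}
```

In this model, pages of different memcgs on the same node take
different locks, so they no longer contend on a single per-node lock.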

Following Daniel Jordan's suggestion, I ran 208 'dd' tasks in 104
containers on a 2s * 26cores * HT box with a modified case:
  https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice

With this patchset, readtwice performance increased by about 80%
with containers, with no performance drop without containers. The slight
drop suspected earlier is gone on the v5.5-rc version, where we even got
another 5% performance increase. The readtwice case performance also
increased from 5.4 to 5.5-rc.

Another way to guard move_account is by lru_lock instead of move_lock.
Consider the memcg move task path:
   mem_cgroup_move_task:
     mem_cgroup_move_charge:
	lru_add_drain_all();
	atomic_inc(&mc.from->moving_account); //ask lruvec's move_lock
	synchronize_rcu();
	walk_page_range: do charge_walk_ops(mem_cgroup_move_charge_pte_range):
	   isolate_lru_page();
	   mem_cgroup_move_account(page,)
		spin_lock(&from->move_lock) 
		page->mem_cgroup = to;
		spin_unlock(&from->move_lock) 
	   putback_lru_page(page)

Guarding 'page->mem_cgroup = to' with to_vec->lru_lock has a similar effect
to move_lock, so in terms of performance both solutions are the same.
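
A minimal userspace sketch of that alternative, under stated
assumptions: the demo_* names and the atomic-based lock are invented
for illustration, there is a single node, and the on_lru flag merely
stands in for isolate_lru_page()/putback_lru_page(). It only shows the
ownership switch being done under the destination's lru_lock; it is
not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy lock standing in for lruvec->lru_lock (hypothetical model). */
struct demo_lock { atomic_bool locked; };

static void demo_lock_init(struct demo_lock *l)
{
	atomic_init(&l->locked, false);
}

/* Returns true if the lock was free and is now held by the caller. */
static bool demo_trylock(struct demo_lock *l)
{
	return !atomic_exchange_explicit(&l->locked, true,
					 memory_order_acquire);
}

static void demo_lock_acquire(struct demo_lock *l)
{
	while (!demo_trylock(l))
		;	/* spin until the lock is released */
}

static void demo_unlock(struct demo_lock *l)
{
	atomic_store_explicit(&l->locked, false, memory_order_release);
}

/* One node only, so each memcg carries a single lru_lock here. */
struct demo_memcg { struct demo_lock lru_lock; };
struct demo_page { struct demo_memcg *memcg; bool on_lru; };

/* Switch the page's owner while holding the destination's lru_lock. */
static void demo_move_account(struct demo_page *page, struct demo_memcg *to)
{
	assert(!page->on_lru);	/* caller must have isolated the page */

	demo_lock_acquire(&to->lru_lock);
	page->memcg = to;
	demo_unlock(&to->lru_lock);
}

/* Models: isolate_lru_page(); move_account(); putback_lru_page(). */
static void demo_move_page(struct demo_page *page, struct demo_memcg *to)
{
	page->on_lru = false;
	demo_move_account(page, to);
	page->on_lru = true;
}
```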

Thanks to Hugh Dickins and Konstantin Khlebnikov, who both brought up the
same idea 7 years ago.

Thanks for all the comments from Hugh Dickins, Konstantin Khlebnikov, Daniel Jordan,
Johannes Weiner, Mel Gorman, Shakeel Butt, Rong Chen, Fengguang Wu, Yun Wang etc.,
and for some testing support from Intel 0day!

v6:
  a, rebase on v5.5-rc2, and redo the testing.
  b, pick up Johannes' comment changes and a lock_page_lru cleanup.

v5:
  a, lock the page's memcg, according to Johannes Weiner's suggestion
  b, use a macro for non-memcg, according to Johannes' and Matthew's suggestion.

v4:
  a, fix the page->mem_cgroup dereferencing issue, thanks Johannes Weiner
  b, remove the irqsave flags changes, thanks Matthew Wilcox
  c, merge/split patches for better understanding and bisection purpose

v3: rebase on linux-next, and fold the relock fix patch into the introducing patch

v2: bypass a performance regression bug and fix some function issues

v1: initial version, aim testing shows a 5% performance increase



Alex Shi (8):
  mm/vmscan: remove unnecessary lruvec adding
  mm/lru: replace pgdat lru_lock with lruvec lock
  mm/lru: introduce the relock_page_lruvec function
  mm/mlock: optimize munlock_pagevec by relocking
  mm/swap: only change the lru_lock iff page's lruvec is different
  mm/pgdat: remove pgdat lru_lock
  mm/lru: debug checking for page memcg moving and lock_page_memcg
  mm/memcg: fold lock in lock_page_lru

Hugh Dickins (1):
  mm/lru: revise the comments of lru_lock

Johannes Weiner (1):
  mm: revise the comments of mem_cgroup_page_lruvec

 Documentation/admin-guide/cgroup-v1/memcg_test.rst | 15 +---
 Documentation/admin-guide/cgroup-v1/memory.rst     |  6 +-
 Documentation/trace/events-kmem.rst                |  2 +-
 Documentation/vm/unevictable-lru.rst               | 22 ++---
 include/linux/memcontrol.h                         | 63 ++++++++++++++
 include/linux/mm_types.h                           |  2 +-
 include/linux/mmzone.h                             |  5 +-
 mm/compaction.c                                    | 59 ++++++++-----
 mm/filemap.c                                       |  4 +-
 mm/huge_memory.c                                   | 18 ++--
 mm/memcontrol.c                                    | 95 +++++++++++++++++----
 mm/mlock.c                                         | 28 +++----
 mm/mmzone.c                                        |  1 +
 mm/page_alloc.c                                    |  1 -
 mm/page_idle.c                                     |  7 +-
 mm/rmap.c                                          |  2 +-
 mm/swap.c                                          | 75 +++++++----------
 mm/vmscan.c                                        | 97 ++++++++++++----------
 18 files changed, 309 insertions(+), 193 deletions(-)

-- 
1.8.3.1


Thread overview: 15+ messages
2019-12-16  9:26 [PATCH v6 00/10] per lruvec lru_lock for memcg Alex Shi
2019-12-16  9:26 ` [PATCH v6 01/10] mm/vmscan: remove unnecessary lruvec adding Alex Shi
2019-12-16  9:26 ` [PATCH v6 02/10] mm/lru: replace pgdat lru_lock with lruvec lock Alex Shi
2019-12-16 12:14   ` Matthew Wilcox
2019-12-17  1:30     ` Alex Shi
2019-12-17  2:16       ` Matthew Wilcox
2019-12-17 13:11         ` Alex Shi
2019-12-16  9:26 ` [PATCH v6 03/10] mm/lru: introduce the relock_page_lruvec function Alex Shi
2019-12-16  9:26 ` [PATCH v6 04/10] mm/mlock: optimize munlock_pagevec by relocking Alex Shi
2019-12-16  9:26 ` [PATCH v6 05/10] mm/swap: only change the lru_lock iff page's lruvec is different Alex Shi
2019-12-16  9:26 ` [PATCH v6 06/10] mm/pgdat: remove pgdat lru_lock Alex Shi
2019-12-16  9:26 ` [PATCH v6 07/10] mm/lru: revise the comments of lru_lock Alex Shi
2019-12-16  9:26 ` [PATCH v6 08/10] mm/lru: debug checking for page memcg moving and lock_page_memcg Alex Shi
2019-12-16  9:26 ` [PATCH v6 09/10] mm/memcg: fold lock in lock_page_lru Alex Shi
2019-12-16  9:26 ` [PATCH v6 10/10] mm: revise the comments of mem_cgroup_page_lruvec Alex Shi
