All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alex Shi <alex.shi@linux.alibaba.com>
To: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, akpm@linux-foundation.org,
	mgorman@techsingularity.net, tj@kernel.org, hughd@google.com,
	khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com,
	yang.shi@linux.alibaba.com, willy@infradead.org,
	shakeelb@google.com, hannes@cmpxchg.org
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Subject: [PATCH v5 0/8] per lruvec lru_lock for memcg
Date: Tue, 10 Dec 2019 19:46:16 +0800	[thread overview]
Message-ID: <1575978384-222381-1-git-send-email-alex.shi@linux.alibaba.com> (raw)

Hi all,

Sorry for send out later.

This patchset move lru_lock into lruvec, give a lru_lock for each of
lruvec, thus bring a lru_lock for each of memcg per node.

This is the main patch to replace per node lru_lock with per memcg
lruvec lock.

We introduces function lock_page_lruvec, which will lock the page's
memcg and then memcg's lruvec->lru_lock. (Thanks Johannes Weiner,
Hugh Dickins and Konstantin Khlebnikov suggestion/reminder)

According to Daniel Jordan's suggestion, I run 208 'dd' with on 104
containers on a 2s * 26cores * HT box with a modefied case:
  https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice

With this and later patches, the readtwice performance increases about
80% with containers, but w/o memcg the readtwice performance drops
about 5%.(and another 5% drops with the last debug patch). Slighty
better than v4.(about 6% drop w/o memcg)

Considering the memcg move task path:
   mem_cgroup_move_task:
     mem_cgroup_move_charge:
	lru_add_drain_all();
	atomic_inc(&mc.from->moving_account); //ask lruvec's move_lock
	synchronize_rcu();
	walk_parge_range: do charge_walk_ops(mem_cgroup_move_charge_pte_range):
	   isolate_lru_page();
	   mem_cgroup_move_account(page,)
		spin_lock(&from->move_lock) 
		page->mem_cgroup = to;
		spin_unlock(&from->move_lock) 
	   putback_lru_page(page)

to guard 'page->mem_cgroup = to' by to_vec->lru_lock has the similar effect with
move_lock. So for performance reason, both solutions are same.

Thanks Hugh Dickins and Konstantin Khlebnikov, they both brought the same idea
7 years ago.

Thanks all the comments from Hugh Dickins, Konstantin Khlebnikov, Daniel Jordan, 
Johannes Weiner, Mel Gorman, Shakeel Butt, Rong Chen, Fengguang Wu, Yun Wang etc.
and some testing support from Intel 0days!

v5,
  a, locking page's memcg according JohannesW suggestion
  b, using macro for non memcg, according to Johanness and Metthew's suggestion.

v4: 
  a, fix the page->mem_cgroup dereferencing issue, thanks Johannes Weiner
  b, remove the irqsave flags changes, thanks Metthew Wilcox
  c, merge/split patches for better understanding and bisection purpose

v3: rebase on linux-next, and fold the relock fix patch into introduceing patch

v2: bypass a performance regression bug and fix some function issues

v1: initial version, aim testing show 5% performance increase


Alex Shi (7):
  mm/vmscan: remove unnecessary lruvec adding
  mm/lru: replace pgdat lru_lock with lruvec lock
  mm/lru: introduce the relock_page_lruvec function
  mm/mlock: optimize munlock_pagevec by relocking
  mm/swap: only change the lru_lock iff page's lruvec is different
  mm/pgdat: remove pgdat lru_lock
  mm/lru: debug checking for page memcg moving and lock_page_memcg

Hugh Dickins (1):
  mm/lru: revise the comments of lru_lock

 Documentation/admin-guide/cgroup-v1/memcg_test.rst | 15 +---
 Documentation/admin-guide/cgroup-v1/memory.rst     |  6 +-
 Documentation/trace/events-kmem.rst                |  2 +-
 Documentation/vm/unevictable-lru.rst               | 22 ++---
 include/linux/memcontrol.h                         | 63 ++++++++++++++
 include/linux/mm_types.h                           |  2 +-
 include/linux/mmzone.h                             |  5 +-
 mm/compaction.c                                    | 59 ++++++++-----
 mm/filemap.c                                       |  4 +-
 mm/huge_memory.c                                   | 18 ++--
 mm/memcontrol.c                                    | 88 +++++++++++++++-----
 mm/mlock.c                                         | 28 +++----
 mm/mmzone.c                                        |  1 +
 mm/page_alloc.c                                    |  1 -
 mm/page_idle.c                                     |  7 +-
 mm/rmap.c                                          |  2 +-
 mm/swap.c                                          | 75 +++++++----------
 mm/vmscan.c                                        | 97 ++++++++++++----------
 18 files changed, 300 insertions(+), 195 deletions(-)

-- 
1.8.3.1


             reply	other threads:[~2019-12-10 11:48 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-10 11:46 Alex Shi [this message]
2019-12-10 11:46 ` [PATCH v5 1/8] mm/vmscan: remove unnecessary lruvec adding Alex Shi
2019-12-10 11:46 ` [PATCH v5 2/8] mm/lru: replace pgdat lru_lock with lruvec lock Alex Shi
2019-12-10 11:46   ` Alex Shi
2019-12-10 13:41   ` Matthew Wilcox
2019-12-10 13:41     ` Matthew Wilcox
2019-12-11  3:51     ` Alex Shi
2019-12-11  3:51       ` Alex Shi
2019-12-10 11:46 ` [PATCH v5 3/8] mm/lru: introduce the relock_page_lruvec function Alex Shi
2019-12-10 11:46 ` [PATCH v5 4/8] mm/mlock: optimize munlock_pagevec by relocking Alex Shi
2019-12-10 11:46 ` [PATCH v5 5/8] mm/swap: only change the lru_lock iff page's lruvec is different Alex Shi
2019-12-10 11:46 ` [PATCH v5 6/8] mm/pgdat: remove pgdat lru_lock Alex Shi
2019-12-10 11:46 ` [PATCH v5 7/8] mm/lru: revise the comments of lru_lock Alex Shi
2019-12-10 11:46   ` Alex Shi
2019-12-10 11:46 ` [PATCH v5 8/8] mm/lru: debug checking for page memcg moving and lock_page_memcg Alex Shi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1575978384-222381-1-git-send-email-alex.shi@linux.alibaba.com \
    --to=alex.shi@linux.alibaba.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=daniel.m.jordan@oracle.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=khlebnikov@yandex-team.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=shakeelb@google.com \
    --cc=tj@kernel.org \
    --cc=willy@infradead.org \
    --cc=yang.shi@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.