linux-kernel.vger.kernel.org archive mirror
From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
To: Alex Shi <alex.shi@linux.alibaba.com>,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, akpm@linux-foundation.org,
	mgorman@techsingularity.net, tj@kernel.org, hughd@google.com,
	daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com,
	willy@infradead.org, shakeelb@google.com, hannes@cmpxchg.org
Subject: Re: [PATCH v4 0/9] per lruvec lru_lock for memcg
Date: Sun, 24 Nov 2019 18:49:12 +0300
Message-ID: <aa5499cd-7947-39a5-fc17-bd277be25764@yandex-team.ru>
In-Reply-To: <1574166203-151975-1-git-send-email-alex.shi@linux.alibaba.com>

On 19/11/2019 15.23, Alex Shi wrote:
> Hi all,
> 
> This patchset moves lru_lock into lruvec, giving each lruvec its own
> lru_lock, and thus one lru_lock per memcg per node.
> 
> Following Daniel Jordan's suggestion, I ran 64 'dd' tasks on 32
> containers on my 2-socket * 8-core * HT box with the modified case:
>    https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice
> 
> With this change, the lru_lock-sensitive test above improved 17% in the
> multiple-container scenario, with no performance loss w/o mem_cgroup.

Splitting lru_lock isn't the only option for solving this lock contention.
It also doesn't help if all of this happens in one cgroup.

I think better batching could solve more problems with less overhead.

For example, larger per-cpu vectors, or queues for each NUMA node or even for
each lruvec. These would pre-sort and aggregate pages, so the actual
modification under lru_lock would be much cheaper and finer-grained.
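
To make the batching idea concrete, here is a minimal user-space sketch,
not kernel code and not something from this patchset. Every name in it
(toy_lruvec, toy_batch, toy_run, BATCH_SIZE and so on) is hypothetical,
the "lock" is only a counter, and the "LRU list" is only a page count.
It assumes a queue that records the target lruvec of each page, sorts
the queue on flush, and then takes each lruvec lock once per run of
equal entries:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NR_LRUVECS 4	/* hypothetical number of lruvecs */
#define BATCH_SIZE 64	/* hypothetical per-cpu queue length */

/* Toy stand-in for a lruvec: counters instead of a list and a spinlock. */
struct toy_lruvec {
	unsigned long nr_pages;		/* pages "on" this LRU */
	unsigned long lock_acquires;	/* times the "lock" was taken */
};

/* Toy per-cpu queue: remembers only the target lruvec of each page. */
struct toy_batch {
	int nr;
	int lruvec_id[BATCH_SIZE];
};

static struct toy_lruvec lruvecs[NR_LRUVECS];

static int cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}

/* Sort the queue by lruvec, then "lock" each lruvec once per run. */
static void toy_batch_flush(struct toy_batch *b)
{
	int i = 0;

	qsort(b->lruvec_id, b->nr, sizeof(b->lruvec_id[0]), cmp_int);
	while (i < b->nr) {
		int id = b->lruvec_id[i];
		struct toy_lruvec *lv = &lruvecs[id];

		lv->lock_acquires++;	/* a real lock would be taken here */
		while (i < b->nr && b->lruvec_id[i] == id) {
			lv->nr_pages++;	/* a real list insertion would go here */
			i++;
		}
		/* ...and the lock dropped here */
	}
	b->nr = 0;
}

/* Queue one page; flush only when the queue is full. */
static void toy_batch_add(struct toy_batch *b, int lruvec_id)
{
	assert(b->nr < BATCH_SIZE);
	b->lruvec_id[b->nr++] = lruvec_id;
	if (b->nr == BATCH_SIZE)
		toy_batch_flush(b);
}

/*
 * Feed nr_pages pages, interleaved round-robin across the lruvecs,
 * through one queue. Returns the total number of lock acquisitions
 * and stores the total number of pages added in *pages_out.
 */
unsigned long toy_run(int nr_pages, unsigned long *pages_out)
{
	struct toy_batch batch = { 0 };
	unsigned long locks = 0;
	int i;

	memset(lruvecs, 0, sizeof(lruvecs));
	*pages_out = 0;
	for (i = 0; i < nr_pages; i++)
		toy_batch_add(&batch, i % NR_LRUVECS);
	toy_batch_flush(&batch);	/* drain whatever is left */

	for (i = 0; i < NR_LRUVECS; i++) {
		*pages_out += lruvecs[i].nr_pages;
		locks += lruvecs[i].lock_acquires;
	}
	return locks;
}
```

In this toy model, toy_run(256, &pages) takes the lock 16 times (4 full
flushes, 4 lruvecs each) for 256 pages. With the same interleaved input
and no sorting, the lock would have to change hands on every page. Real
contention behaviour depends on much more than this count, so treat it
only as an illustration of the shape of the idea.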

> 
> Thanks to Hugh Dickins and Konstantin Khlebnikov, who both brought up the
> same idea 7 years ago. Considering my testing results and Google's internal
> use of it, I now believe this feature clearly benefits multi-container users.
> 
> So I'd like to introduce it here.
> 
> Thanks for all the comments from Hugh Dickins, Konstantin Khlebnikov, Daniel Jordan,
> Johannes Weiner, Mel Gorman, Shakeel Butt, Rong Chen, Fengguang Wu, Yun Wang etc.
> 
> v4:
>    a, fix the page->mem_cgroup dereferencing issue, thanks Johannes Weiner
>    b, remove the irqsave flags changes, thanks Matthew Wilcox
>    c, merge/split patches for better understanding and bisection purpose
> 
> v3: rebase on linux-next, and fold the relock fix patch into the introducing patch
> 
> v2: bypass a performance regression bug and fix some function issues
> 
> v1: initial version, aim testing shows a 5% performance increase
> 
> 
> Alex Shi (9):
>    mm/swap: fix uninitialized compiler warning
>    mm/huge_memory: fix uninitialized compiler warning
>    mm/lru: replace pgdat lru_lock with lruvec lock
>    mm/mlock: only change the lru_lock iff page's lruvec is different
>    mm/swap: only change the lru_lock iff page's lruvec is different
>    mm/vmscan: only change the lru_lock iff page's lruvec is different
>    mm/pgdat: remove pgdat lru_lock
>    mm/lru: likely enhancement
>    mm/lru: revise the comments of lru_lock
> 
>   Documentation/admin-guide/cgroup-v1/memcg_test.rst | 15 +----
>   Documentation/admin-guide/cgroup-v1/memory.rst     |  6 +-
>   Documentation/trace/events-kmem.rst                |  2 +-
>   Documentation/vm/unevictable-lru.rst               | 22 +++----
>   include/linux/memcontrol.h                         | 68 ++++++++++++++++++++
>   include/linux/mm_types.h                           |  2 +-
>   include/linux/mmzone.h                             |  5 +-
>   mm/compaction.c                                    | 67 +++++++++++++------
>   mm/filemap.c                                       |  4 +-
>   mm/huge_memory.c                                   | 17 ++---
>   mm/memcontrol.c                                    | 75 +++++++++++++++++-----
>   mm/mlock.c                                         | 27 ++++----
>   mm/mmzone.c                                        |  1 +
>   mm/page_alloc.c                                    |  1 -
>   mm/page_idle.c                                     |  5 +-
>   mm/rmap.c                                          |  2 +-
>   mm/swap.c                                          | 74 +++++++++------------
>   mm/vmscan.c                                        | 74 ++++++++++-----------
>   18 files changed, 287 insertions(+), 180 deletions(-)
> 

