From: Alexander Duyck <alexander.duyck@gmail.com>
To: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Mel Gorman <mgorman@techsingularity.net>,
Tejun Heo <tj@kernel.org>, Hugh Dickins <hughd@google.com>,
Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
Daniel Jordan <daniel.m.jordan@oracle.com>,
Yang Shi <yang.shi@linux.alibaba.com>,
Matthew Wilcox <willy@infradead.org>,
Johannes Weiner <hannes@cmpxchg.org>,
kbuild test robot <lkp@intel.com>, linux-mm <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
cgroups@vger.kernel.org, Shakeel Butt <shakeelb@google.com>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Wei Yang <richard.weiyang@gmail.com>,
"Kirill A. Shutemov" <kirill@shutemov.name>
Subject: Re: [PATCH v16 00/22] per memcg lru_lock
Date: Thu, 16 Jul 2020 07:11:12 -0700 [thread overview]
Message-ID: <CAKgT0UcKVyTXQ=tGv_uMV+fSvoH_-cuG9zA_zhE+S8Ou11gt=w@mail.gmail.com> (raw)
In-Reply-To: <1594429136-20002-1-git-send-email-alex.shi@linux.alibaba.com>
On Fri, Jul 10, 2020 at 5:59 PM Alex Shi <alex.shi@linux.alibaba.com> wrote:
>
> The new version which bases on v5.8-rc4. Add 2 more patchs:
> 'mm/thp: remove code path which never got into'
> 'mm/thp: add tail pages into lru anyway in split_huge_page()'
> and modified 'mm/mlock: reorder isolation sequence during munlock'
>
> Current lru_lock is one for each of node, pgdat->lru_lock, that guard for
> lru lists, but now we had moved the lru lists into memcg for long time. Still
> using per node lru_lock is clearly unscalable, pages on each of memcgs have
> to compete each others for a whole lru_lock. This patchset try to use per
> lruvec/memcg lru_lock to repleace per node lru lock to guard lru lists, make
> it scalable for memcgs and get performance gain.
>
> Currently lru_lock still guards both lru list and page's lru bit, that's ok.
> but if we want to use specific lruvec lock on the page, we need to pin down
> the page's lruvec/memcg during locking. Just taking lruvec lock first may be
> undermined by the page's memcg charge/migration. To fix this problem, we could
> take out the page's lru bit clear and use it as pin down action to block the
> memcg changes. That's the reason for new atomic func TestClearPageLRU.
> So now isolating a page need both actions: TestClearPageLRU and hold the
> lru_lock.
>
> The typical usage of this is isolate_migratepages_block() in compaction.c
> we have to take lru bit before lru lock, that serialized the page isolation
> in memcg page charge/migration which will change page's lruvec and new
> lru_lock in it.
>
> The above solution suggested by Johannes Weiner, and based on his new memcg
> charge path, then have this patchset. (Hugh Dickins tested and contributed much
> code from compaction fix to general code polish, thanks a lot!).
>
> The patchset includes 3 parts:
> 1, some code cleanup and minimum optimization as a preparation.
> 2, use TestCleanPageLRU as page isolation's precondition
> 3, replace per node lru_lock with per memcg per node lru_lock
>
> Following Daniel Jordan's suggestion, I have run 208 'dd' with on 104
> containers on a 2s * 26cores * HT box with a modefied case:
> https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice
> With this patchset, the readtwice performance increased about 80%
> in concurrent containers.
>
> Thanks Hugh Dickins and Konstantin Khlebnikov, they both brought this
> idea 8 years ago, and others who give comments as well: Daniel Jordan,
> Mel Gorman, Shakeel Butt, Matthew Wilcox etc.
>
> Thanks for Testing support from Intel 0day and Rong Chen, Fengguang Wu,
> and Yun Wang. Hugh Dickins also shared his kbuild-swap case. Thanks!
Hi Alex,
I think I am seeing a regression with this patch set when I run the
will-it-scale/page_fault3 test. Specifically the processes result is
dropping from 56371083 to 43127382 when I apply these patches.
I haven't had a chance to bisect and figure out what is causing it,
and wanted to let you know in case you are aware of anything specific
that may be causing this.
Thanks.
- Alex
next prev parent reply other threads:[~2020-07-16 14:11 UTC|newest]
Thread overview: 66+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-11 0:58 [PATCH v16 00/22] per memcg lru_lock Alex Shi
2020-07-11 0:58 ` [PATCH v16 01/22] mm/vmscan: remove unnecessary lruvec adding Alex Shi
2020-07-11 0:58 ` [PATCH v16 02/22] mm/page_idle: no unlikely double check for idle page counting Alex Shi
2020-07-11 0:58 ` [PATCH v16 03/22] mm/compaction: correct the comments of compact_defer_shift Alex Shi
2020-07-11 0:58 ` [PATCH v16 04/22] mm/compaction: rename compact_deferred as compact_should_defer Alex Shi
2020-07-11 0:58 ` [PATCH v16 05/22] mm/thp: move lru_add_page_tail func to huge_memory.c Alex Shi
2020-07-16 8:59 ` Alex Shi
2020-07-16 13:17 ` Kirill A. Shutemov
2020-07-17 5:13 ` Alex Shi
2020-07-20 8:37 ` Kirill A. Shutemov
2020-07-11 0:58 ` [PATCH v16 06/22] mm/thp: clean up lru_add_page_tail Alex Shi
2020-07-20 8:43 ` Kirill A. Shutemov
2020-07-11 0:58 ` [PATCH v16 07/22] mm/thp: remove code path which never got into Alex Shi
2020-07-20 8:43 ` Kirill A. Shutemov
2020-07-11 0:58 ` [PATCH v16 08/22] mm/thp: narrow lru locking Alex Shi
2020-07-11 0:58 ` [PATCH v16 09/22] mm/memcg: add debug checking in lock_page_memcg Alex Shi
2020-07-11 0:58 ` [PATCH v16 10/22] mm/swap: fold vm event PGROTATED into pagevec_move_tail_fn Alex Shi
2020-07-11 0:58 ` [PATCH v16 11/22] mm/lru: move lru_lock holding in func lru_note_cost_page Alex Shi
2020-07-11 0:58 ` [PATCH v16 12/22] mm/lru: move lock into lru_note_cost Alex Shi
2020-07-11 0:58 ` [PATCH v16 13/22] mm/lru: introduce TestClearPageLRU Alex Shi
2020-07-16 9:06 ` Alex Shi
2020-07-16 21:12 ` Alexander Duyck
2020-07-17 7:45 ` Alex Shi
2020-07-17 18:26 ` Alexander Duyck
2020-07-19 4:45 ` Alex Shi
2020-07-19 11:24 ` Alex Shi
2020-07-11 0:58 ` [PATCH v16 14/22] mm/thp: add tail pages into lru anyway in split_huge_page() Alex Shi
2020-07-17 9:30 ` Alex Shi
2020-07-20 8:49 ` Kirill A. Shutemov
2020-07-20 9:04 ` Alex Shi
2020-07-11 0:58 ` [PATCH v16 15/22] mm/compaction: do page isolation first in compaction Alex Shi
2020-07-16 21:32 ` Alexander Duyck
2020-07-17 5:09 ` Alex Shi
2020-07-17 16:09 ` Alexander Duyck
2020-07-19 3:59 ` Alex Shi
2020-07-11 0:58 ` [PATCH v16 16/22] mm/mlock: reorder isolation sequence during munlock Alex Shi
2020-07-17 20:30 ` Alexander Duyck
2020-07-19 3:55 ` Alex Shi
2020-07-20 18:51 ` Alexander Duyck
2020-07-21 9:26 ` Alex Shi
2020-07-21 13:51 ` Alex Shi
2020-07-11 0:58 ` [PATCH v16 17/22] mm/swap: serialize memcg changes during pagevec_lru_move_fn Alex Shi
2020-07-11 0:58 ` [PATCH v16 18/22] mm/lru: replace pgdat lru_lock with lruvec lock Alex Shi
2020-07-17 21:38 ` Alexander Duyck
2020-07-18 14:15 ` Alex Shi
2020-07-19 9:12 ` Alex Shi
2020-07-19 15:14 ` Alexander Duyck
2020-07-20 5:47 ` Alex Shi
2020-07-11 0:58 ` [PATCH v16 19/22] mm/lru: introduce the relock_page_lruvec function Alex Shi
2020-07-17 22:03 ` Alexander Duyck
2020-07-18 14:01 ` Alex Shi
2020-07-11 0:58 ` [PATCH v16 20/22] mm/vmscan: use relock for move_pages_to_lru Alex Shi
2020-07-17 21:44 ` Alexander Duyck
2020-07-18 14:15 ` Alex Shi
2020-07-11 0:58 ` [PATCH v16 21/22] mm/pgdat: remove pgdat lru_lock Alex Shi
2020-07-17 21:09 ` Alexander Duyck
2020-07-18 14:17 ` Alex Shi
2020-07-11 0:58 ` [PATCH v16 22/22] mm/lru: revise the comments of lru_lock Alex Shi
2020-07-11 1:02 ` [PATCH v16 00/22] per memcg lru_lock Alex Shi
2020-07-16 8:49 ` Alex Shi
2020-07-16 14:11 ` Alexander Duyck [this message]
2020-07-17 5:24 ` Alex Shi
2020-07-19 15:23 ` Hugh Dickins
2020-07-20 3:01 ` Alex Shi
2020-07-20 4:47 ` Hugh Dickins
2020-07-20 7:30 ` Alex Shi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAKgT0UcKVyTXQ=tGv_uMV+fSvoH_-cuG9zA_zhE+S8Ou11gt=w@mail.gmail.com' \
--to=alexander.duyck@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=alex.shi@linux.alibaba.com \
--cc=cgroups@vger.kernel.org \
--cc=daniel.m.jordan@oracle.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=iamjoonsoo.kim@lge.com \
--cc=khlebnikov@yandex-team.ru \
--cc=kirill@shutemov.name \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lkp@intel.com \
--cc=mgorman@techsingularity.net \
--cc=richard.weiyang@gmail.com \
--cc=shakeelb@google.com \
--cc=tj@kernel.org \
--cc=willy@infradead.org \
--cc=yang.shi@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).