linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Hugh Dickins <hughd@google.com>
To: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>, Michal Hocko <mhocko@kernel.org>,
	 Cgroups <cgroups@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	 Linux MM <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Mel Gorman <mgorman@techsingularity.net>,
	Tejun Heo <tj@kernel.org>,  Shakeel Butt <shakeelb@google.com>,
	Yu Zhao <yuzhao@google.com>,
	 Daniel Jordan <daniel.m.jordan@oracle.com>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	 Vladimir Davydov <vdavydov.dev@gmail.com>,
	 Minchan Kim <minchan@kernel.org>,
	Kirill Tkhai <ktkhai@virtuozzo.com>,
	 Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	 Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH 00/14] per memcg lru_lock
Date: Fri, 23 Aug 2019 18:59:21 -0700 (PDT)	[thread overview]
Message-ID: <alpine.LSU.2.11.1908231736001.16920@eggly.anvils> (raw)
In-Reply-To: <e9ac9c8c-15c1-8365-5c39-285c6d7b07a6@linux.alibaba.com>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3268 bytes --]

On Wed, 21 Aug 2019, Alex Shi wrote:
> 在 2019/8/21 上午2:24, Hugh Dickins 写道:
> > I'll set aside what I'm doing, and switch to rebasing ours to v5.3-rc
> > and/or mmotm.  Then compare with what Alex has, to see if there's any
> > good reason to prefer one to the other: if no good reason to prefer ours,
> > I doubt we shall bother to repost, but just use it as basis for helping
> > to review or improve Alex's.
> 
> For your review, my patchset are pretty straight and simple.
> It just use per lruvec lru_lock to replace necessary pgdat lru_lock.
> just this.  We could talk more after I back to work. :)

Sorry to be bearer of bad news, Alex, but when you said "straight and
simple", I feared that your patchset would turn out to be fundamentally
too simple.

And that is so. I have only to see the
		lruvec = mem_cgroup_page_lruvec(page, pgdat);
line in isolate_migratepages_block() in mm/compaction.c, and check
that mem_cgroup_page_lruvec() is little changed in mm/mempolicy.c.

The central problem with per-memcg lru_lock is that you do not know
for sure what lock to take (which memcg a page belongs to) until you
have taken the expected lock, and then checked whether page->memcg
is still the same - backing out and trying again if not.

Fix that central problem, and you end up with a more complicated
patchset, much like ours.  It's true that when ours was first developed,
the memcg situation was more complicated in several ways, and perhaps
some aspects of our patchset could be simplified now (though I've not
identified any).  Johannes in particular has done a great deal of
simplifying work in memcg over the last few years, but there are still
situations in which a page's memcg can change (move_charge_at_immigrate
and swapin readahead spring to mind - or perhaps the latter is only an
issue when MEMCG_SWAP is not enabled, I forget; and I often wonder if
reparenting will be brought back one day).

I did not review your patchset in detail, and wasn't able to get very
far in testing it.  At first I was put off by set_task_reclaim_state
warnings from mm/vmscan.c, but those turned out to be in v5.3-rc5
itself, not from your patchset or mine (but I've not yet investigated
what's responsible for them).  Within a minute of starting swapping
load, kcompactd compact_lock_irqsave() in isolate_migratepages_block()
would deadlock, and I did not get further.  (Though I did also notice
that booting the CONFIG_MEMCG=y kernel with "cgroup_disable=memory"
froze in booting - tiresomely, one has to keep both the memcg and
no-memcg locking to cope with that case, and I guess you had not.)

Rather than duplicating effort, I would advise you to give our patchset
a try, and if it works for you, help towards getting that one merged:
but of course, it's up to you.

I've attached a tarfile of it rebased to v5.3-rc5: I do not want to
spam the list with patches yet, because I do not have any stats or
argument in support of the series, as Andrew asked for years ago and
Michal asks again now.  But aside from that I consider it ready, and
will let Shakeel take it over from here, while I get back to what I
diverted from (but of course I'll try to answer questions on it).

Thanks,
Hugh

[-- Attachment #2: per-memcg lru_lock on v5.3-rc5 --]
[-- Type: APPLICATION/x-tar, Size: 102400 bytes --]

  reply	other threads:[~2019-08-24  2:00 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-20  9:48 [PATCH 00/14] per memcg lru_lock Alex Shi
2019-08-20  9:48 ` [PATCH 01/14] mm/lru: move pgdat lru_lock into lruvec Alex Shi
2019-08-20 13:40   ` Matthew Wilcox
2019-08-20 14:11     ` Alex Shi
2019-08-20  9:48 ` [PATCH 02/14] lru/memcg: move the lruvec->pgdat sync out lru_lock Alex Shi
2019-08-20  9:48 ` [PATCH 03/14] lru/memcg: using per lruvec lock in un/lock_page_lru Alex Shi
2019-08-26  8:30   ` Konstantin Khlebnikov
2019-08-26 14:16     ` Alex Shi
2019-08-20  9:48 ` [PATCH 04/14] lru/compaction: use per lruvec lock in isolate_migratepages_block Alex Shi
2019-08-20  9:48 ` [PATCH 05/14] lru/huge_page: use per lruvec lock in __split_huge_page Alex Shi
2019-08-20  9:48 ` [PATCH 06/14] lru/mlock: using per lruvec lock in munlock Alex Shi
2019-08-20  9:48 ` [PATCH 07/14] lru/swap: using per lruvec lock in page_cache_release Alex Shi
2019-08-20  9:48 ` [PATCH 08/14] lru/swap: uer lruvec lock in activate_page Alex Shi
2019-08-20  9:48 ` [PATCH 09/14] lru/swap: uer per lruvec lock in pagevec_lru_move_fn Alex Shi
2019-08-20  9:48 ` [PATCH 10/14] lru/swap: use per lruvec lock in release_pages Alex Shi
2019-08-20  9:48 ` [PATCH 11/14] lru/vmscan: using per lruvec lock in lists shrinking Alex Shi
2019-08-20  9:48 ` [PATCH 12/14] lru/vmscan: use pre lruvec lock in check_move_unevictable_pages Alex Shi
2019-08-20  9:48 ` [PATCH 13/14] lru/vmscan: using per lruvec lru_lock in get_scan_count Alex Shi
2019-08-20  9:48 ` [PATCH 14/14] mm/lru: fix the comments of lru_lock Alex Shi
2019-08-20 14:00   ` Matthew Wilcox
2019-08-20 14:21     ` Alex Shi
2019-08-20 10:45 ` [PATCH 00/14] per memcg lru_lock Michal Hocko
2019-08-20 16:48   ` Shakeel Butt
2019-08-20 18:24     ` Hugh Dickins
2019-08-21  1:21       ` Alex Shi
2019-08-21  2:00       ` Alex Shi
2019-08-24  1:59         ` Hugh Dickins [this message]
2019-08-26 14:35           ` Alex Shi
2019-08-21 18:00 ` Daniel Jordan
2019-08-22 11:56   ` Alex Shi
2019-08-22 15:20     ` Daniel Jordan
2019-08-26  8:39       ` Konstantin Khlebnikov
2019-08-26 14:22         ` Alex Shi
2019-08-26 14:25       ` Alex Shi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.11.1908231736001.16920@eggly.anvils \
    --to=hughd@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.shi@linux.alibaba.com \
    --cc=cgroups@vger.kernel.org \
    --cc=daniel.m.jordan@oracle.com \
    --cc=hannes@cmpxchg.org \
    --cc=khlebnikov@yandex-team.ru \
    --cc=ktkhai@virtuozzo.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@kernel.org \
    --cc=minchan@kernel.org \
    --cc=shakeelb@google.com \
    --cc=tj@kernel.org \
    --cc=vbabka@suse.cz \
    --cc=vdavydov.dev@gmail.com \
    --cc=willy@infradead.org \
    --cc=yuzhao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).