linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Alex Shi <alex.shi@linux.alibaba.com>
To: Hugh Dickins <hughd@google.com>
Cc: akpm@linux-foundation.org, mgorman@techsingularity.net,
	tj@kernel.org, khlebnikov@yandex-team.ru,
	daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com,
	willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org, shakeelb@google.com,
	iamjoonsoo.kim@lge.com, richard.weiyang@gmail.com
Subject: Re: [PATCH v11 00/16] per memcg lru lock
Date: Fri, 12 Jun 2020 18:43:39 +0800	[thread overview]
Message-ID: <b6b5b74e-d960-8d2c-3f2f-f4a38eb633c3@linux.alibaba.com> (raw)
In-Reply-To: <alpine.LSU.2.11.2006111409280.10801@eggly.anvils>



在 2020/6/12 上午6:09, Hugh Dickins 写道:
>> Anyway, I will send out new patchset
>> with the first issue fixed. and then let's discussion base on it.
> Sigh. I wish you had waited for me to send you fixes, or waited for an
> identifiable tag like 5.8-rc1.  Andrew has been very hard at work with
> mm patches to Linus, but it looks like there are still "data_race" mods
> to come before -rc1, which may stop your v12 from applying cleanly.

Sorry, I didn't aware you would had another sending... My fault.
And yes, offical 5.8-rc is better base.

> 
>>> In the second, I was using rcu_read_lock() instead of trylock_page()
>>> (like in my own patchset), but could not quite be sure of the case when
>>> PageSwapCache gets set at the wrong moment. Gave up for the night, and
>>> in the morning abandoned that, instead just shifting the call to
>>> __isolate_lru_page_prepare() after the get_page_unless_zero(),
>>> where that trylock_page() becomes safe (no danger of stomping on page
>>> flags while page is being freed or newly allocated to another owner).
>> Sorry, I don't know the problem of trylock_page here? Could you like to
>> describe it as a race?
> Races, yes. Look, I'll send you now patches 1 and 2: at least with those
> in it should be safe for you and others to test compaction (if 5.8-rc1
> turns out well: I think so much has gone in that it might have unrelated
> problems, and often the -rc2 is much more stable).
> 
> But no point in sending 3 and 4 at this point, since ...
> 

I guess some concern may come from next mm bug?

>>> I thought that a very safe change, but best to do some test runs with
>>> it in before finalizing. And was then unpleasantly surprised to hit a
>>> VM_BUG_ON_PAGE(lruvec_memcg(lruvec) != page->mem_cgroup) from
>>> lock_page_lruvec_irqsave < relock_page_lruvec < pagevec_lru_move_fn <
>>> pagevec_move_tail < lru_add_drain_cpu after 6 hours on one machine.
>>> Then similar but < rotate_reclaimable_page after 8 hours on another.
>>>
>>> Only seen once before: that's what drove me to add patch 4 (with 3 to
>>> revert the locking before it): somehow, when adding the lruvec locking
>>> there, I just took it for granted that your patchset would have the
>>> appropriate locking (or TestClearPageLRU magic) at the other end.
>>>
>>> But apparently not. And I'm beginning to think that TestClearPageLRU
>>> was just to distract the audience from the lack of proper locking.
>>>
>>> I have certainly not concluded that yet, but I'm having to think about
>>> an area of the code which I'd imagined you had under control (and I'm
>>> puzzled why my testing has found it so very hard to hit). If we're
>>> lucky, I'll find that pagevec_move_tail is a special case, and
>>> nothing much else needs changing; but I doubt that will be so.
> ... shows that your locking primitives are not yet good enough
> to handle the case when tasks are moved between memcgs with
> move_charge_at_immigrate set.  "bin/cg m" in the tests I sent,
> but today I'm changing its "seconds=60" to "seconds=1" in hope
> of speeding up the reproduction.

Yes, I am using your great cases with 'm' parameter to do migration testing,
but unlockly, no error found in my box.

> 
> Ah, good, two machines crashed in 1.5 hours: but I don't need to
> examine the crashes, now that it's obvious there's no protection -
> please, think about rotate_reclaimable_page() (there will be more
> cases, but in practice that seems easiest to hit, so focus on that)
> and how it is not protected from mem_cgroup_move_account().
> > I'm thinking too. Maybe judicious use of lock_page_memcg() can fix it
> (8 years ago it was unsuitable, but a lot has changed for the better
> since then); otherwise it's back to what I've been doing all along,
> taking the likely lruvec lock, and checking under that lock whether
> we have the right lock (as your lruvec_memcg_debug() does), retrying
> if not. Which may be more efficient than involving lock_page_memcg().
> 
> But I guess still worth sending my first two patches, since most of us
> use move_charge_at_immigrate only for... testing move_charge_at_immigrate.
> Whereas compaction bugs can hit any of us at any time.
> 
>>> There's one other unexplained and unfixed bug I've seen several times
>>> while exercising mem_cgroup_move_account(): refcount_warn_saturate()
>>> from where __mem_cgroup_clear_mc() calls mem_cgroup_id_get_many().
>>> I'll be glad if that goes away when the lruvec locking is fixed,
>>> but don't understand the connection. And it's quite possible that
>>> this refcounting bug has nothing to do with your changes: I have
>>> not succeeded in reproducing it on 5.7 nor on 5.7-rc7-mm1,
>>> but I didn't really try long enough to be sure.
> I got one of those quite quickly too after setting "cg m"'s seconds=1.
> I think the best thing I can do while thinking and researching, is
> give 5.7-rc7-mm1 a run on that machine with the speeded up moving,
> to see whether or not that refcount bug reproduces.
> 

Millions thanks for help on this patchset!

Alex


  reply	other threads:[~2020-06-12 10:43 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-28 11:00 [PATCH v11 00/16] per memcg lru lock Alex Shi
2020-05-28 11:00 ` [PATCH v11 01/16] mm/vmscan: remove unnecessary lruvec adding Alex Shi
2020-05-28 11:00 ` [PATCH v11 02/16] mm/page_idle: no unlikely double check for idle page counting Alex Shi
2020-05-28 11:00 ` [PATCH v11 03/16] mm/compaction: correct the comments of compact_defer_shift Alex Shi
2020-05-28 11:00 ` [PATCH v11 04/16] mm/compaction: rename compact_deferred as compact_should_defer Alex Shi
2020-05-28 11:00 ` [PATCH v11 05/16] mm/thp: move lru_add_page_tail func to huge_memory.c Alex Shi
2020-05-28 11:00 ` [PATCH v11 06/16] mm/thp: clean up lru_add_page_tail Alex Shi
2020-05-28 11:00 ` [PATCH v11 07/16] mm/thp: narrow lru locking Alex Shi
2020-05-28 11:00 ` [PATCH v11 08/16] mm/memcg: add debug checking in lock_page_memcg Alex Shi
2020-05-28 11:00 ` [PATCH v11 09/16] mm/lru: introduce TestClearPageLRU Alex Shi
2020-05-28 11:00 ` [PATCH v11 10/16] mm/compaction: do page isolation first in compaction Alex Shi
2020-05-28 11:00 ` [PATCH v11 11/16] mm/mlock: reorder isolation sequence during munlock Alex Shi
2020-05-28 11:00 ` [PATCH v11 12/16] mm/lru: replace pgdat lru_lock with lruvec lock Alex Shi
2020-05-28 11:00 ` [PATCH v11 13/16] mm/lru: introduce the relock_page_lruvec function Alex Shi
2020-05-28 11:00 ` [PATCH v11 14/16] mm/vmscan: use relock for move_pages_to_lru Alex Shi
2020-05-28 11:00 ` [PATCH v11 15/16] mm/pgdat: remove pgdat lru_lock Alex Shi
2020-05-28 11:00 ` [PATCH v11 16/16] mm/lru: revise the comments of lru_lock Alex Shi
2020-06-08  4:15 ` [PATCH v11 00/16] per memcg lru lock Hugh Dickins
2020-06-08  6:13   ` Alex Shi
2020-06-10  3:22     ` Hugh Dickins
2020-06-11  6:06       ` Alex Shi
2020-06-11 22:09         ` Hugh Dickins
2020-06-12 10:43           ` Alex Shi [this message]
2020-06-16  6:14           ` Alex Shi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b6b5b74e-d960-8d2c-3f2f-f4a38eb633c3@linux.alibaba.com \
    --to=alex.shi@linux.alibaba.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=daniel.m.jordan@oracle.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=khlebnikov@yandex-team.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lkp@intel.com \
    --cc=mgorman@techsingularity.net \
    --cc=richard.weiyang@gmail.com \
    --cc=shakeelb@google.com \
    --cc=tj@kernel.org \
    --cc=willy@infradead.org \
    --cc=yang.shi@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).