All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Jia He <hejianet@gmail.com>, Mel Gorman <mgorman@suse.de>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com
Subject: Re: [PATCH 6/9] mm: don't avoid high-priority reclaim on memcg limit reclaim
Date: Wed, 1 Mar 2017 20:13:59 +0100	[thread overview]
Message-ID: <20170301191359.GB24905@dhcp22.suse.cz> (raw)
In-Reply-To: <20170301173628.GA12664@cmpxchg.org>

On Wed 01-03-17 12:36:28, Johannes Weiner wrote:
> On Wed, Mar 01, 2017 at 04:40:27PM +0100, Michal Hocko wrote:
> > On Tue 28-02-17 16:40:04, Johannes Weiner wrote:
> > > 246e87a93934 ("memcg: fix get_scan_count() for small targets") sought
> > > to avoid high reclaim priorities for memcg by forcing it to scan a
> > > minimum amount of pages when lru_pages >> priority yielded nothing.
> > > This was done at a time when reclaim decisions like dirty throttling
> > > were tied to the priority level.
> > > 
> > > Nowadays, the only meaningful thing still tied to priority dropping
> > > below DEF_PRIORITY - 2 is gating whether laptop_mode=1 is generally
> > > allowed to write. But that is from an era where direct reclaim was
> > > still allowed to call ->writepage, and kswapd nowadays avoids writes
> > > until it's scanned every clean page in the system. Potential changes
> > > to how quick sc->may_writepage could trigger are of little concern.
> > > 
> > > Remove the force_scan stuff, as well as the ugly multi-pass target
> > > calculation that it necessitated.
> > 
> > I _really_ like this, I hated the multi-pass part. One thig that I am
> > worried about and changelog doesn't mention it is what we are going to
> > do about small (<16MB) memcgs. On one hand they were already ignored in
> > the global reclaim so this is nothing really new but maybe we want to
> > preserve the behavior for the memcg reclaim at least which would reduce
> > side effect of this patch which is a great cleanup otherwise. Or at
> > least be explicit about this in the changelog.
> 
> <16MB groups are a legitimate concern during global reclaim, but we
> have done it this way for a long time and it never seemed to have
> mattered in practice.

Yeah, this is not really easy to spot because there are usually other
memcgs which can be reclaimed.

> And for limit reclaim, this should be much less of a concern. It just
> means we no longer scan these groups at DEF_PRIORITY and will have to
> increase the scan window. I don't see a problem with that. And that
> consequence of higher priorities is right in the patch subject.

well the memory pressure spills over to others in the same hierarchy.
But I agree this shouldn't a disaster.

> > Btw. why cannot we simply force scan at least SWAP_CLUSTER_MAX
> > unconditionally?
> > 
> > > +		/*
> > > +		 * If the cgroup's already been deleted, make sure to
> > > +		 * scrape out the remaining cache.
> > 		   Also make sure that small memcgs will not get
> > 		   unnoticed during the memcg reclaim
> > 
> > > +		 */
> > > +		if (!scan && !mem_cgroup_online(memcg))
> > 
> > 		if (!scan && (!mem_cgroup_online(memcg) || !global_reclaim(sc)))
> 
> With this I'd be worried about regressing the setups pointed out in
> 6f04f48dc9c0 ("mm: only force scan in reclaim when none of the LRUs
> are big enough.").
> 
> Granted, that patch is a little dubious. IMO, we should be steering
> the LRU balance through references and, in that case in particular,
> with swappiness. Using the default 60 for zswap is too low.
> 
> Plus, I would expect the refault detection code that was introduced
> around the same time as this patch to counter-act the hot file
> thrashing that is mentioned in that patch's changelog.
> 
> Nevertheless, it seems a bit gratuitous to go against that change so
> directly when global reclaim hasn't historically been a problem with
> groups <16MB. Limit reclaim should be fine too.

As I've already mentioned, I really love this patch I just think this is
a subtle side effect. The above reasoning should be good enough I
believe.

Anyway I forgot to add, I will leave the decision whether to have this
in a separate patch or just added to the changelog to you.
Acked-by: Michal Hocko <mhocko@suse.com>
-- 
Michal Hocko
SUSE Labs

WARNING: multiple messages have this Message-ID (diff)
From: Michal Hocko <mhocko@kernel.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Jia He <hejianet@gmail.com>, Mel Gorman <mgorman@suse.de>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com
Subject: Re: [PATCH 6/9] mm: don't avoid high-priority reclaim on memcg limit reclaim
Date: Wed, 1 Mar 2017 20:13:59 +0100	[thread overview]
Message-ID: <20170301191359.GB24905@dhcp22.suse.cz> (raw)
In-Reply-To: <20170301173628.GA12664@cmpxchg.org>

On Wed 01-03-17 12:36:28, Johannes Weiner wrote:
> On Wed, Mar 01, 2017 at 04:40:27PM +0100, Michal Hocko wrote:
> > On Tue 28-02-17 16:40:04, Johannes Weiner wrote:
> > > 246e87a93934 ("memcg: fix get_scan_count() for small targets") sought
> > > to avoid high reclaim priorities for memcg by forcing it to scan a
> > > minimum amount of pages when lru_pages >> priority yielded nothing.
> > > This was done at a time when reclaim decisions like dirty throttling
> > > were tied to the priority level.
> > > 
> > > Nowadays, the only meaningful thing still tied to priority dropping
> > > below DEF_PRIORITY - 2 is gating whether laptop_mode=1 is generally
> > > allowed to write. But that is from an era where direct reclaim was
> > > still allowed to call ->writepage, and kswapd nowadays avoids writes
> > > until it's scanned every clean page in the system. Potential changes
> > > to how quick sc->may_writepage could trigger are of little concern.
> > > 
> > > Remove the force_scan stuff, as well as the ugly multi-pass target
> > > calculation that it necessitated.
> > 
> > I _really_ like this, I hated the multi-pass part. One thig that I am
> > worried about and changelog doesn't mention it is what we are going to
> > do about small (<16MB) memcgs. On one hand they were already ignored in
> > the global reclaim so this is nothing really new but maybe we want to
> > preserve the behavior for the memcg reclaim at least which would reduce
> > side effect of this patch which is a great cleanup otherwise. Or at
> > least be explicit about this in the changelog.
> 
> <16MB groups are a legitimate concern during global reclaim, but we
> have done it this way for a long time and it never seemed to have
> mattered in practice.

Yeah, this is not really easy to spot because there are usually other
memcgs which can be reclaimed.

> And for limit reclaim, this should be much less of a concern. It just
> means we no longer scan these groups at DEF_PRIORITY and will have to
> increase the scan window. I don't see a problem with that. And that
> consequence of higher priorities is right in the patch subject.

well the memory pressure spills over to others in the same hierarchy.
But I agree this shouldn't a disaster.

> > Btw. why cannot we simply force scan at least SWAP_CLUSTER_MAX
> > unconditionally?
> > 
> > > +		/*
> > > +		 * If the cgroup's already been deleted, make sure to
> > > +		 * scrape out the remaining cache.
> > 		   Also make sure that small memcgs will not get
> > 		   unnoticed during the memcg reclaim
> > 
> > > +		 */
> > > +		if (!scan && !mem_cgroup_online(memcg))
> > 
> > 		if (!scan && (!mem_cgroup_online(memcg) || !global_reclaim(sc)))
> 
> With this I'd be worried about regressing the setups pointed out in
> 6f04f48dc9c0 ("mm: only force scan in reclaim when none of the LRUs
> are big enough.").
> 
> Granted, that patch is a little dubious. IMO, we should be steering
> the LRU balance through references and, in that case in particular,
> with swappiness. Using the default 60 for zswap is too low.
> 
> Plus, I would expect the refault detection code that was introduced
> around the same time as this patch to counter-act the hot file
> thrashing that is mentioned in that patch's changelog.
> 
> Nevertheless, it seems a bit gratuitous to go against that change so
> directly when global reclaim hasn't historically been a problem with
> groups <16MB. Limit reclaim should be fine too.

As I've already mentioned, I really love this patch I just think this is
a subtle side effect. The above reasoning should be good enough I
believe.

Anyway I forgot to add, I will leave the decision whether to have this
in a separate patch or just added to the changelog to you.
Acked-by: Michal Hocko <mhocko@suse.com>
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-03-01 19:15 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-28 21:39 [PATCH 0/9] mm: kswapd spinning on unreclaimable nodes - fixes and cleanups Johannes Weiner
2017-02-28 21:39 ` Johannes Weiner
2017-02-28 21:39 ` [PATCH 1/9] mm: fix 100% CPU kswapd busyloop on unreclaimable nodes Johannes Weiner
2017-02-28 21:39   ` Johannes Weiner
2017-03-02  3:23   ` Hillf Danton
2017-03-02  3:23     ` Hillf Danton
2017-03-02 23:30   ` Shakeel Butt
2017-03-02 23:30     ` Shakeel Butt
2017-03-03  1:26   ` Minchan Kim
2017-03-03  1:26     ` Minchan Kim
2017-03-03  7:59     ` Michal Hocko
2017-03-03  7:59       ` Michal Hocko
2017-03-06  1:37       ` Minchan Kim
2017-03-06  1:37         ` Minchan Kim
2017-03-06 16:24         ` Johannes Weiner
2017-03-06 16:24           ` Johannes Weiner
2017-03-07  0:59           ` Hillf Danton
2017-03-07  0:59             ` Hillf Danton
2017-03-07  7:28           ` Minchan Kim
2017-03-07  7:28             ` Minchan Kim
2017-03-07 10:17           ` Michal Hocko
2017-03-07 10:17             ` Michal Hocko
2017-03-07 16:56             ` Johannes Weiner
2017-03-07 16:56               ` Johannes Weiner
2017-03-09 14:20               ` Mel Gorman
2017-03-09 14:20                 ` Mel Gorman
2017-02-28 21:40 ` [PATCH 2/9] mm: fix check for reclaimable pages in PF_MEMALLOC reclaim throttling Johannes Weiner
2017-02-28 21:40   ` Johannes Weiner
2017-03-01 15:02   ` Michal Hocko
2017-03-01 15:02     ` Michal Hocko
2017-03-02  3:25   ` Hillf Danton
2017-03-02  3:25     ` Hillf Danton
2017-02-28 21:40 ` [PATCH 3/9] mm: remove seemingly spurious reclaimability check from laptop_mode gating Johannes Weiner
2017-02-28 21:40   ` Johannes Weiner
2017-03-01 15:06   ` Michal Hocko
2017-03-01 15:06     ` Michal Hocko
2017-03-01 15:17   ` Mel Gorman
2017-03-01 15:17     ` Mel Gorman
2017-03-02  3:27   ` Hillf Danton
2017-03-02  3:27     ` Hillf Danton
2017-02-28 21:40 ` [PATCH 4/9] mm: remove unnecessary reclaimability check from NUMA balancing target Johannes Weiner
2017-02-28 21:40   ` Johannes Weiner
2017-03-01 15:14   ` Michal Hocko
2017-03-01 15:14     ` Michal Hocko
2017-03-02  3:28   ` Hillf Danton
2017-03-02  3:28     ` Hillf Danton
2017-02-28 21:40 ` [PATCH 5/9] mm: don't avoid high-priority reclaim on unreclaimable nodes Johannes Weiner
2017-02-28 21:40   ` Johannes Weiner
2017-03-01 15:21   ` Michal Hocko
2017-03-01 15:21     ` Michal Hocko
2017-03-02  3:31   ` Hillf Danton
2017-03-02  3:31     ` Hillf Danton
2017-02-28 21:40 ` [PATCH 6/9] mm: don't avoid high-priority reclaim on memcg limit reclaim Johannes Weiner
2017-02-28 21:40   ` Johannes Weiner
2017-03-01 15:40   ` Michal Hocko
2017-03-01 15:40     ` Michal Hocko
2017-03-01 17:36     ` Johannes Weiner
2017-03-01 17:36       ` Johannes Weiner
2017-03-01 19:13       ` Michal Hocko [this message]
2017-03-01 19:13         ` Michal Hocko
2017-03-02  3:32   ` Hillf Danton
2017-03-02  3:32     ` Hillf Danton
2017-02-28 21:40 ` [PATCH 7/9] mm: delete NR_PAGES_SCANNED and pgdat_reclaimable() Johannes Weiner
2017-02-28 21:40   ` Johannes Weiner
2017-03-01 15:41   ` Michal Hocko
2017-03-01 15:41     ` Michal Hocko
2017-03-02  3:34   ` Hillf Danton
2017-03-02  3:34     ` Hillf Danton
2017-02-28 21:40 ` [PATCH 8/9] Revert "mm, vmscan: account for skipped pages as a partial scan" Johannes Weiner
2017-02-28 21:40   ` Johannes Weiner
2017-03-01 15:51   ` Michal Hocko
2017-03-01 15:51     ` Michal Hocko
2017-03-02  3:36   ` Hillf Danton
2017-03-02  3:36     ` Hillf Danton
2017-02-28 21:40 ` [PATCH 9/9] mm: remove unnecessary back-off function when retrying page reclaim Johannes Weiner
2017-02-28 21:40   ` Johannes Weiner
2017-03-01 14:56   ` Michal Hocko
2017-03-01 14:56     ` Michal Hocko
2017-03-02  3:37   ` Hillf Danton
2017-03-02  3:37     ` Hillf Danton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170301191359.GB24905@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=hejianet@gmail.com \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.