linux-kernel.vger.kernel.org archive mirror
From: Michal Hocko <mhocko@kernel.org>
To: hejianet <hejianet@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Vlastimil Babka <vbabka@suse.cz>,
	Minchan Kim <minchan@kernel.org>, Rik van Riel <riel@redhat.com>
Subject: Re: [RFC PATCH] mm/vmscan: fix high cpu usage of kswapd if there
Date: Thu, 23 Feb 2017 08:21:21 +0100
Message-ID: <20170223072120.5herkdrum3t4l223@dhcp22.suse.cz>
In-Reply-To: <28d09cda-e020-8289-1b1f-e19fbd3b3aeb@gmail.com>

On Thu 23-02-17 10:46:01, hejianet wrote:
> sorry, resend it due to a delivery-failure:
> "Wrong MIME labeling on 8-bit character texts"
> I am sorry if anybody received it twice
> ------------
> Hi Johannes
> On 23/02/2017 4:16 AM, Johannes Weiner wrote:
> > On Wed, Feb 22, 2017 at 05:04:48PM +0800, Jia He wrote:
> > > When I try to dynamically allocate the hugepages more than system total
> > > free memory:
> > 
> > > Then kswapd will take 100% CPU for a long time (more than 3 hours, with
> > > no sign of ending)
> > 
> > > The root cause is that kswapd3 tries to reclaim again and again but
> > > makes no progress
> > 
> > > At that time, there are no reclaimable pages in that node:
> > 
> > Yes, this is a problem with the current kswapd code.
> > 
> > A less artificial scenario that I observed recently was machines with
> > two NUMA nodes, after being up for 200+ days, getting into a state
> > where node0 is mostly consumed by anon and some kernel allocations,
> > leaving less than the high watermark free. The machines don't have
> > swap, so the anon isn't reclaimable. But also, anon LRU is never even
> > *scanned*, so the "all unreclaimable" logic doesn't kick in. Kswapd is
> > spinning at 100% CPU calculating scan counts and checking zone states.
> > 
> > One specific problem with your patch, Jia, is that there might be some
> > cache pages that are pinned one way or another. That was the case on
> > our machines, and so reclaimable pages wasn't 0. Even if we check the
> > reclaimable pages, we need a hard cutoff after X attempts. And then it
> > sounds pretty much like what the allocator/direct reclaim already does.
> > 
> > Can we use the *exact* same cutoff conditions for direct reclaim and
> > kswapd, though? I don't think so. For direct reclaim, the goal is the
> > watermark, to make an allocation happen in the caller. While kswapd
> > tries to restore the watermarks too, it might never meet them but
> > still do useful work on behalf of concurrently allocating threads. It
> > should only stop when it tries and fails to free any pages at all.
> > 
> Yes, this is what I thought before this patch, but it seems Michal
> doesn't like this idea :)
> Please see https://lkml.org/lkml/2017/1/24/543

Yeah, I didn't like the hard limit on kswapd retries as you proposed it.
It didn't make much sense to me because the current condition for kswapd
to back off is to have all zones balanced. Without a further criterion,
kswapd would just wake up and go around the same retry loops again with
no progress. I didn't realize that direct reclaim progress might be
that criterion. The proposal from Johannes makes much more sense. I have
to think about it some more, but this looks like a way forward.
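
To make the stopping rule discussed above concrete, here is a rough
userspace sketch in C. It is a toy model, not kernel code and not any
posted patch. All names in it (struct node_state, MAX_FAILED_RUNS,
kswapd_run, direct_reclaim_progress) and the value 16 are made up for
illustration; the thread only talks about a cutoff after "X attempts".
It assumes kswapd gives up after a number of consecutive passes that free
nothing at all, and that progress made by direct reclaim re-arms it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cutoff; the thread only says "X attempts". */
#define MAX_FAILED_RUNS 16

/* Toy model of one NUMA node (not the kernel's pg_data_t). */
struct node_state {
	unsigned long free_pages;
	unsigned long high_wmark;
	unsigned long reclaimable;	/* pages a reclaim pass can really free */
	int failed_runs;		/* consecutive passes that freed nothing */
};

static bool balanced(const struct node_state *n)
{
	return n->free_pages >= n->high_wmark;
}

/* One reclaim pass: frees up to 'batch' pages and returns the count. */
static unsigned long reclaim_pass(struct node_state *n, unsigned long batch)
{
	unsigned long freed = n->reclaimable < batch ? n->reclaimable : batch;

	n->reclaimable -= freed;
	n->free_pages += freed;
	return freed;
}

/*
 * Keep reclaiming until the node is balanced or MAX_FAILED_RUNS
 * consecutive passes have freed nothing. Any freed page resets the
 * failure counter, so partial progress is never cut short.
 * Returns the number of passes executed before going back to sleep.
 */
static int kswapd_run(struct node_state *n, unsigned long batch)
{
	int passes = 0;

	assert(n != NULL);
	while (!balanced(n) && n->failed_runs < MAX_FAILED_RUNS) {
		passes++;
		if (reclaim_pass(n, batch) > 0)
			n->failed_runs = 0;
		else
			n->failed_runs++;
	}
	return passes;
}

/* Direct reclaim made progress, so kswapd is worth waking again. */
static void direct_reclaim_progress(struct node_state *n)
{
	n->failed_runs = 0;
}
```

With nothing reclaimable, kswapd_run() in this model stops after
MAX_FAILED_RUNS passes and later calls return immediately instead of
spinning, until direct_reclaim_progress() clears the counter. With only
a few reclaimable pages it frees them first and only then starts counting
failures, which matches the point that kswapd can do useful work without
ever reaching the watermark.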
-- 
Michal Hocko
SUSE Labs
