linux-kernel.vger.kernel.org archive mirror
From: Michal Hocko <mhocko@kernel.org>
To: Sultan Alsawaf <sultan@kerneltoast.com>
Cc: Dave Hansen <dave.hansen@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Mel Gorman <mgorman@suse.de>,
	Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [PATCH] mm: Stop kswapd early when nothing's waiting for it to free pages
Date: Thu, 20 Feb 2020 09:29:19 +0100
Message-ID: <20200220082919.GC20509@dhcp22.suse.cz>
In-Reply-To: <20200219204220.GA3488@sultan-book.localdomain>

On Wed 19-02-20 12:42:20, Sultan Alsawaf wrote:
> On Wed, Feb 19, 2020 at 09:05:27PM +0100, Michal Hocko wrote:
[...]
> > Again, do you have more details about the workload and what was the
> > cause of responsiveness issues? Because I would expect that the
> > situation would be quite opposite because it is usually the direct
> > reclaim that is a source of stalls visible from userspace. Or is this
> > about a single CPU situation where kswapd saturates the single CPU and
> > all other tasks are just not getting enough CPU cycles?
> 
> The workload was having lots of applications open at once. At a certain point
> when memory ran low, my system became sluggish and kswapd CPU usage skyrocketed.

Could you provide more details please? Is kswapd making forward
progress? Have you checked why other processes are sluggish? Are they
not getting CPU time, or are they blocked on something?

> I added printks into kswapd with this patch, and my premature exit in kswapd
> kicked in quite often.
> 
> > > On systems with more memory I tested (>=4G), kswapd becomes more expensive to
> > > run at its higher scan depths, so stopping kswapd prematurely when there aren't
> > > any memory allocations waiting for it prevents it from reaching the *really*
> > > expensive scan depths and burning through even more resources.
> > > 
> > > Combine a large amount of memory with a slow CPU and the current problematic
> > > behavior of kswapd at high memory pressure shows. My personal test scenario for
> > > this was an arm64 CPU with a variable amount of memory (up to 4G RAM + 2G swap).
> > 
> > But still, somebody has to put the system into balanced state so who is
> > going to do all the work?
> 
> All the work will be done by kswapd of course, but only if it's needed.
> 
> The real problem is that a single memory allocation failure, and free memory
> being some amount below the high watermark, are not good heuristics to predict
> *future* memory allocation needs. They are good for determining how to steer
> kswapd to help satisfy a failed allocation in the present, but anything more is
> pure speculation (which turns out to be wrong speculation, since this behavior
> causes problems).

Well, you might be right that there are better heuristics than the
existing watermark-based one. After all, nobody can predict the future.
The existing heuristic aims to keep min_free_kbytes of memory free as
much of the time as possible, and that tends to work reasonably well
for a large set of workloads.

> If there are outstanding failed allocations that won't go away, then it's
> perfectly reasonable to keep kswapd running until it frees pages up to the high
> watermark. But beyond that is unnecessary, since there's no way to know if or
> when kswapd will need to fire up again. This makes sense considering how kswapd
> is currently invoked: it's fired up due to a failed allocation of some sort, not
> because the amount of free memory dropped below the high watermark.

Very broadly speaking (sorry if I am stating the obvious here), kswapd
is woken up when the allocator hits the low watermark or the requested
high-order pages are depleted. The allocator then enters its slow path.
That means that the background reclaim aims at reclaiming the high-low
watermark gap, or invokes compaction, to keep the balance. That gap
has to be consumed before kswapd is woken again for order-0 (most
common) requests. So it is usually not a single allocation that
triggers the background reclaim, and counting failures on low watermark
attempts, as you suggested, is unlikely to work with the current code.
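
To illustrate the low/high watermark hysteresis described above, here
is a minimal userspace sketch in C. It is not kernel code: the struct
zone_model type and the alloc_page_model() and kswapd_step_model()
helpers are hypothetical names made up for this example, only order-0
allocations are modelled, and compaction and direct reclaim are
ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of one zone; all values are in pages. */
struct zone_model {
	long free_pages;
	long wmark_low;        /* background reclaim is woken below this */
	long wmark_high;       /* background reclaim stops at or above this */
	bool kswapd_running;
	unsigned long wakeups; /* how many times reclaim was woken */
};

/*
 * Model an order-0 allocation. Returns false if no page is available.
 * Background reclaim is woken only when free memory drops below the
 * low watermark and it is not already running.
 */
static bool alloc_page_model(struct zone_model *z)
{
	if (z->free_pages <= 0)
		return false;

	z->free_pages--;
	if (!z->kswapd_running && z->free_pages < z->wmark_low) {
		z->kswapd_running = true;
		z->wakeups++;
	}
	return true;
}

/*
 * Model one step of background reclaim that frees 'batch' pages.
 * Reclaim goes back to sleep once the high watermark is reached.
 */
static void kswapd_step_model(struct zone_model *z, long batch)
{
	assert(z->wmark_low <= z->wmark_high);

	if (!z->kswapd_running)
		return;

	z->free_pages += batch;
	if (z->free_pages >= z->wmark_high)
		z->kswapd_running = false;
}
```

With wmark_low = 100, wmark_high = 150 and 150 free pages, the first
wakeup in this model happens on the 51st allocation. Once reclaim has
refilled the zone past the high watermark, at least the whole high-low
gap has to be consumed again before the next wakeup, so a single
allocation on its own does not restart the background reclaim.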
-- 
Michal Hocko
SUSE Labs

Thread overview: 27+ messages
2020-02-19 18:25 [PATCH] mm: Stop kswapd early when nothing's waiting for it to free pages Sultan Alsawaf
2020-02-19 19:13 ` Dave Hansen
2020-02-19 19:40   ` Sultan Alsawaf
2020-02-19 20:05     ` Michal Hocko
2020-02-19 20:42       ` Sultan Alsawaf
2020-02-19 21:45         ` Mel Gorman
2020-02-19 22:42           ` Sultan Alsawaf
2020-02-20 10:19             ` Mel Gorman
2020-02-21  4:22               ` Sultan Alsawaf
2020-02-21  8:07                 ` Michal Hocko
     [not found]                   ` <20200221210824.GA3605@sultan-book.localdomain>
2020-02-21 21:24                     ` Dave Hansen
2020-02-25  9:09                     ` Michal Hocko
2020-02-25 17:12                       ` Sultan Alsawaf
2020-02-26  9:05                         ` Michal Hocko
2020-02-25 22:30                       ` Shakeel Butt
2020-02-26  9:08                         ` Michal Hocko
2020-02-26 17:00                           ` Shakeel Butt
2020-02-26 17:41                             ` Michal Hocko
     [not found]                       ` <20200226105137.9088-1-hdanton@sina.com>
2020-02-26 17:04                         ` Shakeel Butt
2020-02-21 18:04                 ` Shakeel Butt
2020-02-21 20:06                   ` Sultan Alsawaf
2020-02-20  8:29         ` Michal Hocko [this message]
2020-02-19 19:26 ` Andrew Morton
2020-02-19 22:45   ` Sultan Alsawaf
2020-02-19 19:35 ` Michal Hocko
2020-02-21  4:30 ` [PATCH v2] " Sultan Alsawaf
     [not found]   ` <20200221182201.GB4462@iweiny-DESK2.sc.intel.com>
2020-02-21 20:00     ` Sultan Alsawaf
