From: Michal Hocko <mhocko@kernel.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jakub Kicinski <kuba@kernel.org>,
akpm@linux-foundation.org, linux-mm@kvack.org,
kernel-team@fb.com, tj@kernel.org, chris@chrisdown.name,
cgroups@vger.kernel.org, shakeelb@google.com
Subject: Re: [PATCH mm v2 3/3] mm: automatically penalize tasks with high swap use
Date: Fri, 15 May 2020 09:14:58 +0200
Message-ID: <20200515071458.GE29153@dhcp22.suse.cz>
In-Reply-To: <20200514202130.GA591266@cmpxchg.org>

On Thu 14-05-20 16:21:30, Johannes Weiner wrote:
> On Thu, May 14, 2020 at 09:42:46AM +0200, Michal Hocko wrote:
> > On Wed 13-05-20 11:36:23, Jakub Kicinski wrote:
> > > On Wed, 13 May 2020 10:32:49 +0200 Michal Hocko wrote:
> > > > On Tue 12-05-20 10:55:36, Jakub Kicinski wrote:
> > > > > On Tue, 12 May 2020 09:26:34 +0200 Michal Hocko wrote:
> > > > > > On Mon 11-05-20 15:55:16, Jakub Kicinski wrote:
> > > > > > > Use swap.high when deciding if swap is full.
> > > > > >
> > > > > > Please be more specific why.
> > > > >
> > > > > How about:
> > > > >
> > > > > Use swap.high when deciding if swap is full to influence ongoing
> > > > > swap reclaim in a best effort manner.
> > > >
> > > > This is still way too vague. The crux is why should we treat hard and
> > > > high swap limit the same for mem_cgroup_swap_full purpose. Please
> > > > note that I am not saying this is wrong. I am asking for a more
> > > > detailed explanation mostly because I would bet that somebody
> > > > stumbles over this sooner or later.
> > >
> > > Stumbles in what way?
> >
> > Reading the code and trying to understand why this particular decision
> > has been made. Because it might be surprising that the hard and high
> > limits are treated the same here.
>
> I don't quite understand the controversy.

I do not think there is any controversy. All I am asking for is a
clarification because this is non-intuitive.

> The idea behind "swap full" is that as long as the workload has plenty
> of swap space available and it's not changing its memory contents, it
> makes sense to generously hold on to copies of data in the swap
> device, even after the swapin. A later reclaim cycle can drop the page
> without any IO. Trading disk space for IO.
>
> But the only two ways to reclaim a swap slot are when it is faulted
> in and the references go away, or by scanning the virtual address space
> like swapoff does - which is very expensive (one could argue it's too
> expensive even for swapoff, it's often more practical to just reboot).
>
> So at some point in the fill level, we have to start freeing up swap
> slots on fault/swapin. Otherwise we could eventually run out of swap
> slots while they're filled with copies of data that is also in RAM.
>
> We don't want to OOM a workload because its available swap space is
> filled with redundant cache.

Thanks, this is a useful summary.

> That applies to physical swap limits, swap.max, and naturally also to
> swap.high which is a limit to implement userspace OOM for swap space
> exhaustion.
>
> > > Isn't it expected for the kernel to take reasonable precautions to
> > > avoid hitting limits?
> >
> > Isn't the throttling itself the precaution? How does the swap cache
> > and its control via mem_cgroup_swap_full interact here. See? This is
> > what I am asking to have explained in the changelog.
>
> It sounds like we need better documentation of what vm_swap_full() and
> friends are there for. It should have been obvious why swap.high - a
> limit on available swap space - hooks into it.

Agreed. The primary source of confusion is the naming here, because
vm_swap_full doesn't really tell you that the swap is full. It merely
tells you that swap is getting full and so duplicated data should be
dropped.
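
To make the naming point concrete, here is a minimal userspace sketch of
the "getting full" check discussed above. It is not the kernel code: the
struct, the function names and SWAP_UNLIMITED are hypothetical, and the
"more than half used" threshold is an assumption made for illustration.
What it is meant to show is only that the high limit, the hard limit and
the physical swap size can all feed the same heuristic.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for "no limit configured". */
#define SWAP_UNLIMITED ULONG_MAX

/* Hypothetical userspace model of one cgroup's swap counters, in pages. */
struct swap_counter {
	unsigned long usage;               /* swap pages currently charged */
	unsigned long high;                /* models swap.high */
	unsigned long max;                 /* models swap.max */
	const struct swap_counter *parent; /* NULL at the top of the hierarchy */
};

/*
 * Global check. Despite the "full" in the kernel name, this only reports
 * that swap is getting full: here, that more than half of the slots are
 * in use (assumed threshold).
 */
static bool swap_getting_full(unsigned long free_slots,
			      unsigned long total_slots)
{
	return free_slots * 2 < total_slots;
}

/*
 * Per-cgroup check. Walks up the hierarchy and reports "getting full" as
 * soon as any level has used at least half of its high or hard limit.
 * Both limits are treated the same because either one bounds the swap
 * space the workload can still use.
 */
static bool cgroup_swap_getting_full(const struct swap_counter *cg,
				     unsigned long free_slots,
				     unsigned long total_slots)
{
	if (swap_getting_full(free_slots, total_slots))
		return true;

	for (; cg; cg = cg->parent) {
		if (cg->usage * 2 >= cg->high || cg->usage * 2 >= cg->max)
			return true;
	}
	return false;
}
```

With such a check, a swapin path would keep the swap copy while the
function returns false (trading disk space for IO) and free the slot once
it returns true, well before any of the limits is actually reached.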
--
Michal Hocko
SUSE Labs