From: Michal Hocko <mhocko@suse.com>
To: Alexey Avramov <hakavlad@inbox.lv>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	ValdikSS <iam@valdikss.org.ru>,
	linux-mm@kvack.org, linux-doc@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	corbet@lwn.net, mcgrof@kernel.org, keescook@chromium.org,
	yzaikin@google.com, oleksandr@natalenko.name, kernel@xanmod.org,
	aros@gmx.com, hakavlad@gmail.com, Yu Zhao <yuzhao@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [PATCH] mm/vmscan: add sysctl knobs for protecting the working set
Date: Mon, 6 Dec 2021 10:59:55 +0100
Message-ID: <Ya3fG2rp+860Yb+t@dhcp22.suse.cz>
In-Reply-To: <20211203222710.3f0ba239@mail.inbox.lv>

On Fri 03-12-21 22:27:10, Alexey Avramov wrote:
> >I'd also like to know where that malfunction happens in this case.
> 
> User-space processes need to always access shared libraries to work.
> It can be tens or hundreds of megabytes, depending on the type of workload. 
> This is a hot cache, which is pushed out and then read leads to thrashing. 
> There is no way in the kernel to forbid evicting the minimum file cache. 
> This is the problem that the patch solves. And the malfunction is exactly
> that - the inability of the kernel to hold the minimum amount of the
> hottest cache in memory.

Executable pages are already a protected resource; see page_check_references.
Shared libraries have more page tables pointing to them, so they are more
likely to be referenced and thus kept around. What is the other memory
demand that pushes those away and causes thrashing?

I do agree with Vlastimil that we should be addressing these problems
rather than papering over them with limits nobody will know how to set
up properly, so that we will have to deal with all sorts of misconfigured
systems. I have first-hand experience with that in the form of the page
cache limit that we used to have in older SLES kernels.

[...]
> > The problem with PSI sensing is that it works after the fact (after 
> > the freeze has already occurred). It is not very different from issuing 
> > SysRq-f manually on a frozen system, although it would still be a 
> > handy feature for batched tasks and remote access. 
> 
> but Michal Hocko immediately criticized [7] the proposal unfairly. 
> This patch just implements ndrw's suggestion.

It would be more productive if you were more specific about what you
consider unfair criticism. Thrashing is a real problem and we all recognize
that. We have much better tools in our toolbox these days (refault data
for both page cache and swapped-back memory). The kernel itself is
rather conservative when using that data for OOM situations because
historically users were more concerned about premature OOM killer
invocations, since that is a disruptive action.
For those who prefer a very agile OOM policy, there are userspace tools
which can implement more advanced policies.
I am open to any idea to improve the kernel side of things as well.
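
To illustrate what such a userspace policy might look at, here is a minimal
sketch that parses the PSI memory file. It assumes a kernel built with
CONFIG_PSI, which exposes /proc/pressure/memory. The function names and the
10% threshold are hypothetical and chosen only for illustration; real tools
apply their own, more careful policies.

```python
from pathlib import Path

# PSI memory file; only present on kernels built with CONFIG_PSI.
PSI_MEMORY = Path("/proc/pressure/memory")


def parse_psi(text):
    """Parse PSI text into {"some": {"avg10": ..., ...}, "full": {...}}."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]
        # Each remaining field looks like "avg10=0.00" or "total=12345".
        result[kind] = {k: float(v) for k, v in (p.split("=", 1) for p in pairs)}
    return result


def under_pressure(psi, threshold=10.0):
    """Hypothetical policy: a "full" avg10 above the threshold counts as thrashing."""
    return psi.get("full", {}).get("avg10", 0.0) > threshold


if __name__ == "__main__":
    if PSI_MEMORY.exists():
        psi = parse_psi(PSI_MEMORY.read_text())
        print(psi, under_pressure(psi))
    else:
        print("PSI is not available on this kernel")
```

A daemon would poll this periodically (or use PSI triggers) and decide on an
action itself, which keeps the policy out of the kernel.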

As mentioned above, I am against global knobs to special-case the global
memory reclaim because that leads to inconsistencies with the memcg
reclaim, adds a future maintenance burden and, most importantly,
outsources responsibility to admins who will have a hard time knowing the
proper value for those knobs, effectively pushing them towards all sorts
of cargo cult.

> [0] https://serverfault.com/a/319818
> [1] https://github.com/hakavlad/prelockd
> 
> [2] https://www.youtube.com/watch?v=vykUrP1UvcI
>     On this video: running fast memory hog in a loop on Debian 10 GNOME, 
>     4 GiB MemTotal without swap space. FS is ext4 on *HDD*.
>     - 1. prelockd enabled: about 500 MiB mlocked. Starting 
>         `while true; do tail /dev/zero; done`: no freezes. 
>         The OOM killer comes quickly, the system recovers quickly.
>     - 2. prelockd disabled: system hangs.
> 
> [3] https://www.youtube.com/watch?v=g9GCmp-7WXw
> [4] https://www.youtube.com/watch?v=iU3ikgNgp3M
> [5] Let's talk about the elephant in the room - the Linux kernel's 
>     inability to gracefully handle low memory pressure
>     https://lore.kernel.org/all/d9802b6a-949b-b327-c4a6-3dbca485ec20@gmx.com/
> [6] https://lore.kernel.org/all/806F5696-A8D6-481D-A82F-49DEC1F2B035@redhazel.co.uk/
> [7] https://lore.kernel.org/all/20190808163228.GE18351@dhcp22.suse.cz/

-- 
Michal Hocko
SUSE Labs

Thread overview: 20+ messages
2021-11-30 11:16 [PATCH] mm/vmscan: add sysctl knobs for protecting the working set Alexey Avramov
2021-11-30 15:28 ` Luis Chamberlain
2021-11-30 16:15 ` kernel test robot
2021-11-30 16:15   ` kernel test robot
2021-11-30 17:37 ` kernel test robot
2021-11-30 18:56 ` Oleksandr Natalenko
2021-12-01 15:51   ` Alexey Avramov
2021-12-02 18:05 ` ValdikSS
2021-12-02 21:58   ` Andrew Morton
2021-12-03 11:59     ` Vlastimil Babka
2021-12-03 13:27       ` Alexey Avramov
2021-12-06  9:59         ` Michal Hocko [this message]
2022-01-09 22:59           ` Barry Song
2021-12-03 14:01     ` Oleksandr Natalenko
2021-12-12 20:15     ` Alexey Avramov
2021-12-13  9:06       ` Barry Song
2021-12-13  9:07       ` Michal Hocko
2021-12-13  8:38   ` Barry Song
2022-01-25  8:19     ` ValdikSS
2022-02-12  0:01       ` Barry Song
