From: ndrw <ndrw.xf@redhazel.co.uk>
To: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Suren Baghdasaryan <surenb@google.com>,
Vlastimil Babka <vbabka@suse.cz>,
"Artem S. Tashkinov" <aros@gmx.com>,
Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>
Subject: Re: Let's talk about the elephant in the room - the Linux kernel's inability to gracefully handle low memory pressure
Date: Fri, 9 Aug 2019 11:09:33 +0100
Message-ID: <cdb392ee-e192-c136-41cb-48d9e4e4bf47@redhazel.co.uk>
In-Reply-To: <20190809085748.GN18351@dhcp22.suse.cz>
On 09/08/2019 09:57, Michal Hocko wrote:
> We already do have a reserve (min_free_kbytes). That gives kswapd some
> room to perform reclaim in the background without obvious latencies to
> allocating tasks (well CPU will still be used so there is still some effect).
I tried this option in the past. Unfortunately, it didn't prevent
freezes. My understanding is this option reserves some amount of memory
to not be swapped out but does not prevent the kernel from evicting all
pages from cache when more memory is needed.
> Kswapd tries to keep a balance and free memory low but still with some
> room to satisfy an immediate memory demand. Once kswapd doesn't catch up
> with the memory demand we dive into the direct reclaim and that is where
> people usually see latencies coming from.
Reclaiming memory is fine, of course, but not all the way to 0 caches.
No caches means all executable pages, ro pages (e.g. fonts) are evicted
from memory and have to be constantly reloaded on every user action. All
this while competing with tasks that are using up all memory. This
happens with or without swap, although swap does spread the problem out
over time a bit.
> The main problem here is that it is hard to tell from a single
> allocation latency that we have a bigger problem. As already said, the
> usual thrashing scenario doesn't show problem during the reclaim because
> pages can be freed up very efficiently. The problem is that they are
> refaulted very quickly so we are effectively rotating working set like
> crazy. Compare that to a normal used-once streaming IO workload which is
> generating a lot of page cache that can be recycled in a similar pace
> but a working set doesn't get freed. Free memory figures will look very
> similar in both cases.
Thank you for the explanation. It is indeed a difficult problem - some
cached pages (streaming IO) will likely not be needed again and should
be discarded asap, others (like mmapped executable/ro pages of UI
utilities) will cause thrashing when evicted under high memory pressure.
Another aspect is that PSI is probably not the best measure of detecting
imminent thrashing. However, if it can at least detect a freeze that has
already occurred and force the OOM killer to run, that is still a lot
better than a dead system, which is the current user experience.
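To illustrate what "detect a freeze that has already occurred" could look
like from userspace, here is a minimal sketch that parses the text format of
/proc/pressure/memory (available when the kernel is built with PSI support).
The function names, the sample values and the 50% threshold on the "full"
avg10 figure are assumptions made up for this example, not something
proposed in this thread.

```python
def parse_psi(text):
    """Parse the contents of a /proc/pressure/<resource> file.

    Returns e.g. {"some": {"avg10": 0.0, ..., "total": 0}, "full": {...}}.
    """
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        values = {}
        for field in fields:
            key, value = field.split("=")
            # "total" is a stall time counter in microseconds, the rest are percentages
            values[key] = int(value) if key == "total" else float(value)
        result[kind] = values
    return result


def looks_frozen(psi, full_avg10_limit=50.0):
    """Hypothetical check: all non-idle tasks were stalled on memory
    for at least full_avg10_limit percent of the last 10 seconds."""
    return psi.get("full", {}).get("avg10", 0.0) >= full_avg10_limit


# Made-up sample in the format used by /proc/pressure/memory
SAMPLE = """some avg10=71.23 avg60=30.10 avg300=8.02 total=123456789
full avg10=63.50 avg60=25.00 avg300=6.40 total=98765432
"""

if __name__ == "__main__":
    print(looks_frozen(parse_psi(SAMPLE)))
```

On a live system the input would come from reading /proc/pressure/memory
periodically; what to do once the check fires (e.g. which task to kill) is a
separate policy question.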
> Good that earlyoom works for you.
I am giving it as an example of a heuristic that seems to work very well
for me. Something to look into. And yes, I wouldn't mind having such a
mechanism built into the kernel.
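For readers unfamiliar with it, the kind of heuristic meant here can be
sketched as follows. This is only an approximation of what earlyoom does as
far as I understand it (act when both available memory and free swap fall
below a percentage threshold); the function names, the 10% thresholds and
the treatment of a system without swap are assumptions of this sketch, not
taken from earlyoom's source.

```python
def parse_meminfo(text):
    """Return the fields of /proc/meminfo as a dict of integers (kB for sizes)."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[name.strip()] = int(parts[0])
    return info


def should_kill(info, mem_min_percent=10.0, swap_min_percent=10.0):
    """Hypothetical trigger: both available memory and free swap are low."""
    mem_pct = 100.0 * info["MemAvailable"] / info["MemTotal"]
    swap_total = info.get("SwapTotal", 0)
    # Assumption: a system without swap counts as having 0% swap free
    swap_pct = 100.0 * info.get("SwapFree", 0) / swap_total if swap_total else 0.0
    return mem_pct <= mem_min_percent and swap_pct <= swap_min_percent


# Made-up sample in /proc/meminfo format: 5% memory available, 5% swap free
SAMPLE = """MemTotal:       32000000 kB
MemAvailable:    1600000 kB
SwapTotal:       2000000 kB
SwapFree:         100000 kB
"""

if __name__ == "__main__":
    print(should_kill(parse_meminfo(SAMPLE)))
```

The point of a check like this is that it triggers while the page cache
still holds some of the working set, i.e. before the system starts
thrashing, rather than after all caches have been reclaimed.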
> All I am saying is that this is not
> generally applicable heuristic because we do care about a larger variety
> of workloads. I should probably emphasise that the OOM killer is there
> as a _last resort_ hand brake when something goes terribly wrong. It
> operates at times when any user intervention would be really hard
> because there is a lack of resources to be actionable.
It is indeed a last resort solution - without it the system is unusable.
Still, accuracy matters because killing the wrong task does not fix the
problem (the task hogging memory is still running) and may break the
system anyway if something important is killed instead.
[...]
> This is a useful feedback! What was your workload? Which kernel version?
I tested it by running a Python script that processes a large amount of
data in memory (needs around 15GB of RAM). I normally run 2 instances of
that script in parallel but for testing I started 4 of them. I sometimes
experience the same issue when using multiple regular memory-intensive
desktop applications in the manner described in the first post, but that's
harder to reproduce because of the user input needed.
[ 0.000000] Linux version 5.0.0-21-generic (buildd@lgw01-amd64-036)
(gcc version 8.3.0 (Ubuntu 8.3.0-6ubuntu1)) #22-Ubuntu SMP Tue Jul 2
13:27:33 UTC 2019 (Ubuntu 5.0.0-21.22-generic 5.0.15)
AMD CPU with 4 cores, 8 threads. AMDGPU graphics stack.
Best regards,
ndrw