From: Vlastimil Babka <vbabka@suse.cz>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
"Artem S. Tashkinov" <aros@gmx.com>,
Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>
Subject: Re: Let's talk about the elephant in the room - the Linux kernel's inability to gracefully handle low memory pressure
Date: Fri, 9 Aug 2019 16:56:28 +0200
Message-ID: <6e7f0cd2-8b13-7534-1c0e-f3569f8b4c05@suse.cz>
In-Reply-To: <20190808172725.GA16900@cmpxchg.org>

On 8/8/19 7:27 PM, Johannes Weiner wrote:
> On Thu, Aug 08, 2019 at 04:47:18PM +0200, Vlastimil Babka wrote:
>> On 8/7/19 10:51 PM, Johannes Weiner wrote:
>>> From 9efda85451062dea4ea287a886e515efefeb1545 Mon Sep 17 00:00:00 2001
>>> From: Johannes Weiner <hannes@cmpxchg.org>
>>> Date: Mon, 5 Aug 2019 13:15:16 -0400
>>> Subject: [PATCH] psi: trigger the OOM killer on severe thrashing
>>
>> Thanks a lot, perhaps finally we are going to eat the elephant ;)
>>
>> I've tested this by booting with mem=8G and activating browser tabs as
>> long as I could. Then initially the system started thrashing and didn't
>> recover for minutes. Then I realized sysrq+f is disabled... Fixed that
>> up after next reboot, tried lower thresholds, also started monitoring
>> /proc/pressure/memory, and found out that after minutes of not being
>> able to move the cursor, both avg10 and avg60 shows only around 15 for
>> both some and full. Lowered thrashing_oom_level to 10 and (with
>> thrashing_oom_period of 5) the thrashing OOM finally started kicking,
>> and the system recovered by itself in reasonable time.
>
> It sounds like there is a missing annotation. The time has to be going
> somewhere, after all. One *known* missing vector I fixed recently is
> stalls in submit_bio() itself when refaulting, but it's not merged
> yet. Attaching the patch below, can you please test it?

It made a difference, but not enough, it seems. Before the patch I could
observe "io:full avg10" around 75% and "memory:full avg10" around 20%.
After the patch, "memory:full avg10" went to around 45%, while io stayed
the same. (BTW, should the refaults be discounted from the io counters, so
that the sum is still <=100%?)
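
For readers who want to reproduce the figures above: a minimal sketch of
reading the "full avg10" values from userspace, assuming the documented
PSI file layout (one "some" line and one "full" line of key=value pairs
in /proc/pressure/memory and /proc/pressure/io). The helper names
parse_psi and read_psi and the SAMPLE text are hypothetical; the sample
values only resemble the numbers quoted here and are not captured output.

```python
def parse_psi(text):
    """Parse the contents of a PSI file such as /proc/pressure/memory.

    Returns a dict like {"some": {"avg10": 1.5, ...}, "full": {...}}.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]  # kind is "some" or "full"
        values = {}
        for pair in pairs:
            key, _, value = pair.partition("=")
            # "total" is a cumulative stall time in microseconds; the
            # avg* fields are percentages with two decimals.
            values[key] = int(value) if key == "total" else float(value)
        result[kind] = values
    return result


def read_psi(resource):
    """Read /proc/pressure/<resource>; needs a kernel with PSI enabled."""
    with open(f"/proc/pressure/{resource}") as f:
        return parse_psi(f.read())


# Illustrative input only (assumption, not measured data).
SAMPLE = (
    "some avg10=47.50 avg60=30.10 avg300=12.00 total=123456789\n"
    "full avg10=45.00 avg60=28.00 avg300=10.50 total=98765432\n"
)

if __name__ == "__main__":
    print(parse_psi(SAMPLE)["full"]["avg10"])
```

On a live system, read_psi("memory")["full"]["avg10"] and
read_psi("io")["full"]["avg10"] would give the two numbers compared above.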

As a result I could change the knobs so that thrashing is detected after
10s of 40% memory pressure, and with that the system recovered
successfully.
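
To make the "10s of 40%" setting concrete, here is a userspace sketch of
a sustained-pressure check. It is an illustration only and not the logic
of the patch under discussion: the class name SustainedPressureDetector
is hypothetical, and the parameters level and period merely mirror the
roles of the thrashing_oom_level and thrashing_oom_period knobs mentioned
earlier in the thread. It assumes the caller feeds it periodic samples of
"memory:full avg10" together with a timestamp in seconds.

```python
class SustainedPressureDetector:
    """Report when pressure has stayed at or above `level` for `period` seconds."""

    def __init__(self, level, period):
        self.level = level          # pressure threshold in percent
        self.period = period        # required duration in seconds
        self._above_since = None    # timestamp of the first sample above level

    def update(self, now, pressure):
        """Feed one sample; return True once the threshold was held long enough."""
        if pressure < self.level:
            # Pressure dropped below the threshold: restart the clock.
            self._above_since = None
            return False
        if self._above_since is None:
            self._above_since = now
        return now - self._above_since >= self.period


if __name__ == "__main__":
    detector = SustainedPressureDetector(level=40.0, period=10.0)
    for t, p in [(0, 45.0), (5, 50.0), (10, 41.0)]:
        print(t, detector.update(float(t), p))
```

A single dip below the level resets the timer here, which is one possible
design choice; a real detector might instead tolerate short dips.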

Perhaps, being low on memory, we can't detect refaults so well due to the
limited number of shadow entries, or there was genuine non-refault I/O
in the mix. The detection would then probably have to look at both I/O
and memory?

Thanks,
Vlastimil