From: Nils Holland <nholland@tisys.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Chris Mason <clm@fb.com>, David Sterba <dsterba@suse.cz>,
	linux-btrfs@vger.kernel.org
Subject: Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)
Date: Tue, 27 Dec 2016 20:33:09 +0100	[thread overview]
Message-ID: <20161227193308.GA17454@boerne.fritz.box> (raw)
In-Reply-To: <20161227155532.GI1308@dhcp22.suse.cz>

On Tue, Dec 27, 2016 at 04:55:33PM +0100, Michal Hocko wrote:
> Hi,
> could you try to run with the following patch on top of the previous
> one? I do not think it will make a large change in your workload but
> I think we need something like that, so some testing under known high
> lowmem pressure would be really appreciated. If you have more time to
> play with it, then running with and without the patch with the
> mm_vmscan_direct_reclaim_{start,end} tracepoints enabled could tell
> us whether it makes any difference at all.

Of course, no problem!

First, about the events to trace: mm_vmscan_direct_reclaim_start
doesn't seem to exist, but mm_vmscan_direct_reclaim_begin does. I'm
sure that's what you meant, so I enabled that one instead (together
with mm_vmscan_direct_reclaim_end).
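
(In case it helps anybody reproduce the setup: enabling the two events
via tracefs boils down to something like the sketch below. It assumes
tracefs is mounted at /sys/kernel/tracing, or /sys/kernel/debug/tracing
on older setups, and needs root; it only illustrates the knobs, not the
exact steps I took.)

#!/usr/bin/env python3
# Sketch: toggle the two vmscan tracepoints and dump the trace buffer.
# Assumes tracefs at /sys/kernel/tracing and root privileges.
import pathlib

TRACEFS = pathlib.Path("/sys/kernel/tracing")
EVENTS = [
    "events/vmscan/mm_vmscan_direct_reclaim_begin/enable",
    "events/vmscan/mm_vmscan_direct_reclaim_end/enable",
]

def set_events(value):
    for event in EVENTS:
        (TRACEFS / event).write_text(value)

set_events("1")                       # start capturing
input("Run the workload, then press Enter to stop tracing... ")
set_events("0")                       # stop capturing
print((TRACEFS / "trace").read_text())  # yields lines like the ones below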

Then I have to admit that in both cases (once without the latest
patch, once with it) very little trace data was actually produced.
Without the patch, direct reclaim was started more often and reclaimed
a smaller number of pages each time; with the patch, it was invoked
less often, and on its last invocation it reclaimed a rather large
number of pages. I have no clue, however, whether that happened "by
chance" or whether it was actually caused by the patch and is thus an
expected change.

In both cases, my test case was: reboot, set up logging, run "emerge
firefox" (which unpacks and builds the firefox sources); then, once
the unpacking was done and the build had started, switch to another
console and untar the latest kernel, libreoffice and (once more)
firefox sources there. After that had completed, I aborted the emerge
build process and stopped tracing.
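
(The untar step was done by hand on a second console, but it amounts
to roughly the following sketch; the tarball paths below are made up
for illustration.)

#!/usr/bin/env python3
# Sketch of the parallel untar step that generates extra page cache /
# lowmem pressure while the emerge build is running.
# The tarball paths are hypothetical placeholders.
import pathlib
import tarfile

SRC = pathlib.Path.home() / "src"
TARBALLS = [
    SRC / "linux-4.9.tar.xz",
    SRC / "libreoffice.tar.xz",
    SRC / "firefox.source.tar.xz",
]

for tb in TARBALLS:
    with tarfile.open(tb) as tar:
        tar.extractall(SRC / "extracted")   # unpack next to the sources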

Here's the trace data captured without the latest patch applied:

khugepaged-22    [000] ....   566.123383: mm_vmscan_direct_reclaim_begin: order=9 may_writepage=1 gfp_flags=GFP_TRANSHUGE classzone_idx=3
khugepaged-22    [000] .N..   566.165520: mm_vmscan_direct_reclaim_end: nr_reclaimed=1100
khugepaged-22    [001] ....   587.515424: mm_vmscan_direct_reclaim_begin: order=9 may_writepage=1 gfp_flags=GFP_TRANSHUGE classzone_idx=3
khugepaged-22    [000] ....   587.596035: mm_vmscan_direct_reclaim_end: nr_reclaimed=1029
khugepaged-22    [001] ....   599.879536: mm_vmscan_direct_reclaim_begin: order=9 may_writepage=1 gfp_flags=GFP_TRANSHUGE classzone_idx=3
khugepaged-22    [000] ....   601.000812: mm_vmscan_direct_reclaim_end: nr_reclaimed=1100
khugepaged-22    [001] ....   601.228137: mm_vmscan_direct_reclaim_begin: order=9 may_writepage=1 gfp_flags=GFP_TRANSHUGE classzone_idx=3
khugepaged-22    [001] ....   601.309952: mm_vmscan_direct_reclaim_end: nr_reclaimed=1081
khugepaged-22    [001] ....   694.935267: mm_vmscan_direct_reclaim_begin: order=9 may_writepage=1 gfp_flags=GFP_TRANSHUGE classzone_idx=3
khugepaged-22    [001] .N..   695.081943: mm_vmscan_direct_reclaim_end: nr_reclaimed=1071
khugepaged-22    [001] ....   701.370707: mm_vmscan_direct_reclaim_begin: order=9 may_writepage=1 gfp_flags=GFP_TRANSHUGE classzone_idx=3
khugepaged-22    [001] ....   701.372798: mm_vmscan_direct_reclaim_end: nr_reclaimed=1089
khugepaged-22    [001] ....   764.752036: mm_vmscan_direct_reclaim_begin: order=9 may_writepage=1 gfp_flags=GFP_TRANSHUGE classzone_idx=3
khugepaged-22    [000] ....   771.047905: mm_vmscan_direct_reclaim_end: nr_reclaimed=1039
khugepaged-22    [000] ....   781.760515: mm_vmscan_direct_reclaim_begin: order=9 may_writepage=1 gfp_flags=GFP_TRANSHUGE classzone_idx=3
khugepaged-22    [001] ....   781.826543: mm_vmscan_direct_reclaim_end: nr_reclaimed=1040
khugepaged-22    [001] ....   782.595575: mm_vmscan_direct_reclaim_begin: order=9 may_writepage=1 gfp_flags=GFP_TRANSHUGE classzone_idx=3
khugepaged-22    [000] ....   782.638591: mm_vmscan_direct_reclaim_end: nr_reclaimed=1040
khugepaged-22    [001] ....   782.930455: mm_vmscan_direct_reclaim_begin: order=9 may_writepage=1 gfp_flags=GFP_TRANSHUGE classzone_idx=3
khugepaged-22    [001] ....   782.993608: mm_vmscan_direct_reclaim_end: nr_reclaimed=1040
khugepaged-22    [001] ....   783.330378: mm_vmscan_direct_reclaim_begin: order=9 may_writepage=1 gfp_flags=GFP_TRANSHUGE classzone_idx=3
khugepaged-22    [001] ....   783.369653: mm_vmscan_direct_reclaim_end: nr_reclaimed=1040

And this is the same with the patch applied:

khugepaged-22    [001] ....   523.599997: mm_vmscan_direct_reclaim_begin: order=9 may_writepage=1 gfp_flags=GFP_TRANSHUGE classzone_idx=3
khugepaged-22    [001] ....   523.683110: mm_vmscan_direct_reclaim_end: nr_reclaimed=1092
khugepaged-22    [001] ....   535.345477: mm_vmscan_direct_reclaim_begin: order=9 may_writepage=1 gfp_flags=GFP_TRANSHUGE classzone_idx=3
khugepaged-22    [001] ....   535.401189: mm_vmscan_direct_reclaim_end: nr_reclaimed=1078
khugepaged-22    [000] ....   692.876716: mm_vmscan_direct_reclaim_begin: order=9 may_writepage=1 gfp_flags=GFP_TRANSHUGE classzone_idx=3
khugepaged-22    [001] ....   703.312399: mm_vmscan_direct_reclaim_end: nr_reclaimed=197759
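
(To make the two captures easier to compare at a glance, something
like the following sketch can pair each _begin with the following _end
and print the invocation count, total nr_reclaimed and time spent; it
simply assumes each capture was saved to its own text file.)

#!/usr/bin/env python3
# Sketch: summarize captures of the two vmscan direct reclaim events.
# Pairs each _begin line with the following _end line, then reports
# invocation count, total pages reclaimed and total wall time.
import re
import sys

begin_re = re.compile(r"\s(\d+\.\d+): mm_vmscan_direct_reclaim_begin:")
end_re = re.compile(
    r"\s(\d+\.\d+): mm_vmscan_direct_reclaim_end: nr_reclaimed=(\d+)")

def summarize(path):
    start = None
    count = 0
    reclaimed = 0
    busy = 0.0
    with open(path) as fh:
        for line in fh:
            m = begin_re.search(line)
            if m:
                start = float(m.group(1))
                continue
            m = end_re.search(line)
            if m and start is not None:
                count += 1
                reclaimed += int(m.group(2))
                busy += float(m.group(1)) - start
                start = None
    print(f"{path}: {count} invocations, {reclaimed} pages reclaimed, "
          f"{busy:.2f}s spent in direct reclaim")

for path in sys.argv[1:]:
    summarize(path)

Running it as "python3 summarize.py trace-without.txt trace-with.txt"
(file names made up) would print one summary line per capture.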

If my test case (and thus the results) doesn't sound useful, I could
of course try other test cases ... like capturing for a longer period
of time, or producing more memory pressure by running more processes
at the same time, or something like that.

Besides that, I can say that the patch hasn't produced any warnings or
other issues so far, so at first glance it doesn't seem to hurt
anything.

Greetings
Nils

