linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nils Holland <nholland@tisys.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Chris Mason <clm@fb.com>, David Sterba <dsterba@suse.cz>,
	linux-btrfs@vger.kernel.org
Subject: Re: OOM: Better, but still there on
Date: Tue, 20 Dec 2016 03:08:29 +0100	[thread overview]
Message-ID: <20161220020829.GA5449@boerne.fritz.box> (raw)
In-Reply-To: <20161219134534.GC5164@dhcp22.suse.cz>

On Mon, Dec 19, 2016 at 02:45:34PM +0100, Michal Hocko wrote:

> Unfortunatelly shrink_active_list doesn't have any tracepoint so we do
> not know whether we managed to rotate those pages. If they are referenced
> quickly enough we might just keep refaulting them... Could you try to apply
> the followin diff on top what you have currently. It should add some more
> tracepoint data which might tell us more. We can reduce the amount of
> tracing data by enabling only mm_vmscan_lru_isolate,
> mm_vmscan_lru_shrink_inactive and mm_vmscan_lru_shrink_active.

So, the results are in! I applied your patch and rebuild the kernel,
then I rebooted the machine, set up tracing so that only the three
events you mentioned were being traced, and captured the output over
the network.

Things went a bit different this time: The trace events started to
appear after a while and a whole lot of them were generated, but
suddenly they stopped. A short while later, we get

[ 1661.485568] btrfs-transacti: page alloction stalls for 611058ms, order:0, mode:0x2420048(GFP_NOFS|__GFP_HARDWALL|__GFP_MOVABLE)

along with a backtrace and memory information, and then there was
silence. When I walked up to the machine, it had completely died; it
wouldn't turn on its screen on key press any more, blindly trying to
reboot via SysRequest had no effect, but the caps lock LED also wasn't
blinking, like it normally does when a kernel panic occurs. Good
question what state it was in. The OOM reaper didn't really seem to
kick in and kill processes this time, it seems.

The complete capture is up at:

http://ftp.tisys.org/pub/misc/teela_2016-12-20.log.xz

Greetings
Nils

  reply	other threads:[~2016-12-20  2:08 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-15 22:57 OOM: Better, but still there on 4.9 Nils Holland
2016-12-16  7:39 ` Michal Hocko
2016-12-16 15:58   ` OOM: Better, but still there on Michal Hocko
2016-12-16 15:58     ` [PATCH 1/2] mm: consolidate GFP_NOFAIL checks in the allocator slowpath Michal Hocko
2016-12-16 15:58     ` [PATCH 2/2] mm, oom: do not enfore OOM killer for __GFP_NOFAIL automatically Michal Hocko
2016-12-16 17:31       ` Johannes Weiner
2016-12-16 22:12         ` Michal Hocko
2016-12-17 11:17           ` Tetsuo Handa
2016-12-18 16:37             ` Michal Hocko
2016-12-16 18:47     ` OOM: Better, but still there on Nils Holland
2016-12-17  0:02       ` Michal Hocko
2016-12-17 12:59         ` Nils Holland
2016-12-17 14:44           ` Tetsuo Handa
2016-12-17 17:11             ` Nils Holland
2016-12-17 21:06             ` Nils Holland
2016-12-18  5:14               ` Tetsuo Handa
2016-12-19 13:45               ` Michal Hocko
2016-12-20  2:08                 ` Nils Holland [this message]
2016-12-21  7:36                   ` Michal Hocko
2016-12-21 11:00                     ` Tetsuo Handa
2016-12-21 11:16                       ` Michal Hocko
2016-12-21 14:04                         ` Chris Mason
2016-12-22 10:10                     ` Nils Holland
2016-12-22 10:27                       ` Michal Hocko
2016-12-22 10:35                         ` Nils Holland
2016-12-22 10:46                           ` Tetsuo Handa
2016-12-22 19:17                       ` Michal Hocko
2016-12-22 21:46                         ` Nils Holland
2016-12-23 10:51                           ` Michal Hocko
2016-12-23 12:18                             ` Nils Holland
2016-12-23 12:57                               ` Michal Hocko
2016-12-23 14:47                                 ` [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on) Michal Hocko
2016-12-23 22:26                                   ` Nils Holland
2016-12-26 12:48                                     ` Michal Hocko
2016-12-26 18:57                                       ` Nils Holland
2016-12-27  8:08                                         ` Michal Hocko
2016-12-27 11:23                                           ` Nils Holland
2016-12-27 11:27                                             ` Michal Hocko
2016-12-27 15:55                                       ` Michal Hocko
2016-12-27 16:28                                         ` [PATCH] mm, vmscan: consider eligible zones in get_scan_count kbuild test robot
2016-12-28  8:51                                           ` Michal Hocko
2016-12-27 19:33                                         ` [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on) Nils Holland
2016-12-28  8:57                                           ` Michal Hocko
2016-12-29  1:20                                         ` Minchan Kim
2016-12-29  9:04                                           ` Michal Hocko
2016-12-30  2:05                                             ` Minchan Kim
2016-12-30 10:40                                               ` Michal Hocko
2016-12-29  0:31                                       ` Minchan Kim
2016-12-29  0:48                                         ` Minchan Kim
2016-12-29  8:52                                           ` Michal Hocko
2016-12-30 10:19                                       ` Mel Gorman
2016-12-30 11:05                                         ` Michal Hocko
2016-12-30 12:43                                           ` Mel Gorman
2016-12-25 22:25                                   ` [lkp-developer] [mm, memcg] d18e2b2aca: WARNING:at_mm/memcontrol.c:#mem_cgroup_update_lru_size kernel test robot
2016-12-26 12:26                                     ` Michal Hocko
2016-12-26 12:50                                       ` Michal Hocko
2016-12-18  0:28             ` OOM: Better, but still there on Xin Zhou
2016-12-16 18:15   ` OOM: Better, but still there on 4.9 Chris Mason
2016-12-16 22:14     ` Michal Hocko
2016-12-16 22:47       ` Chris Mason
2016-12-16 23:31         ` Michal Hocko
2016-12-16 19:50   ` Chris Mason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161220020829.GA5449@boerne.fritz.box \
    --to=nholland@tisys.org \
    --cc=clm@fb.com \
    --cc=dsterba@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).