From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@kernel.org
Cc: linux-mm@kvack.org, rientjes@google.com,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	oleg@redhat.com
Subject: Re: [PATCH 2/3] oom, oom_reaper: Try to reap tasks which skip regular OOM killer path
Date: Wed, 13 Apr 2016 20:08:24 +0900	[thread overview]
Message-ID: <201604132008.CHC00016.FOQVOFtMJLSHOF@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20160411134321.GI23157@dhcp22.suse.cz>

Michal Hocko wrote:
> There are many other possible reasons for these symptoms. Have you
> actually seen any _evidence_ that the hang they are seeing is due to
> an oom deadlock, though? A single crash dump or consistent sysrq output
> which would point in that direction.

Yes. I have seen several OOM livelock cases occur on customers' servers.

One case where I was able to identify the cause was the request_module()
local DoS ( https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2012-4398 ).
That server was running a Java-based enterprise application. The Java
process tried to create an IPv6 socket on a system where IPv6 was
configured to be disabled, so request_module() was called from the
socket() syscall every time in order to try to load the ipv6.ko module.
The OOM killer was invoked and the Java process was selected as the OOM
victim, but since request_module() was not killable, the system ended up
in an OOM livelock.
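
For illustration only, here is a minimal user-space sketch of that trigger
pattern (hypothetical code, not the customer's application): on a kernel
where ipv6.ko cannot be loaded, every socket(AF_INET6, ...) attempt makes
the kernel issue a synchronous request_module("net-pf-10") before the call
fails with EAFNOSUPPORT, so a retry loop keeps re-entering that unkillable
path.

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	for (;;) {
		int fd = socket(AF_INET6, SOCK_STREAM, 0);

		if (fd >= 0) {
			close(fd);	/* IPv6 turned out to be available */
			break;
		}
		/* Each failed attempt already paid the modprobe round trip. */
		fprintf(stderr, "socket(AF_INET6): %s\n", strerror(errno));
		sleep(1);
	}
	return 0;
}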

Another case I saw was interrupts from virtio being disabled due to a bug
in qemu-kvm. The cron daemon kept starting cron jobs even after storage
I/O started stalling (because qemu-kvm had stopped delivering interrupts),
so all memory was consumed by cron jobs and asynchronous file write
requests, and the OOM killer was invoked. Since the cron job that was
selected as the OOM victim while trying to write to a file was unable to
terminate because it was waiting for fs writeback, the system ended up in
an OOM livelock.

Yet another case I saw was a hang where a process was blocked at
down_read(&mm->mmap_sem) in __access_remote_vm() while reading /proc/pid/
entries. Since I had no knowledge about OOM livelock at that time, I was
not able to tell whether it was an OOM livelock or not.
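
To make that blocking pattern concrete, below is a hedged sketch of the
reader side (the PID argument is a placeholder): reading /proc/<pid>/cmdline
goes through access_remote_vm() and __access_remote_vm(), which takes
down_read(&mm->mmap_sem) of the target task, so if the target holds
mmap_sem for write while waiting for memory, this read() blocks in
uninterruptible sleep.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	char path[64], buf[4096];
	ssize_t len;
	int fd;

	snprintf(path, sizeof(path), "/proc/%s/cmdline",
		 argc > 1 ? argv[1] : "self");
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return 1;
	/* This is where a monitoring tool can get stuck in D state. */
	len = read(fd, buf, sizeof(buf));
	if (len > 0)
		fwrite(buf, 1, len, stdout);
	close(fd);
	return 0;
}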

There were probably more, but I can't recall them because I left the
support center a year ago and have no chance to re-examine those cases.

But in general, it is rare that I can find the OOM killer messages,
because the servers are forcibly rebooted without capturing a kdump or
SysRq output. The hints I can use are limited to /var/log/messages (which
lacks suspicious messages), /var/log/sa/ (which shows that there was
little free memory), and /proc/sys/kernel/hung_task_warnings already
being 0 (if a sosreport is also provided).
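
As an aside on why a zero hung_task_warnings value is itself a hint, here
is a simplified, self-contained sketch of the hung task detector's warning
budget (paraphrasing kernel/hung_task.c, not quoting it): each warning
printed decrements the sysctl, and once it reaches zero the detector stays
silent, so reading 0 from a sosreport means the budget was already
exhausted before the data was collected.

#include <stdio.h>

static int sysctl_hung_task_warnings = 10;	/* kernel's default budget */

static void check_hung_task_sketch(void)
{
	if (!sysctl_hung_task_warnings)
		return;		/* budget spent: detector stays silent */
	if (sysctl_hung_task_warnings > 0)
		sysctl_hung_task_warnings--;	/* -1 would mean unlimited */
	printf("INFO: task ... blocked for more than 120 seconds.\n");
}

int main(void)
{
	int i;

	/* Fifteen detector passes: only the first 10 print a report. */
	for (i = 0; i < 15; i++)
		check_hung_task_sketch();
	printf("hung_task_warnings is now %d\n", sysctl_hung_task_warnings);
	return 0;
}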

> > I'm suggesting that you at least emit diagnostic messages when something
> > goes wrong. That is what kmallocwd is for. And if you do not want to emit
> > diagnostic messages, I'm fine with a timeout based approach.
>
> I am all for more diagnostics, but what you were proposing was so
> heavyweight it doesn't really seem worth it.

I suspect that the reason hung_task_warnings becomes 0 is related to the
use of the same watermark for GFP_KERNEL/GFP_NOFS/GFP_NOIO allocations,
but I can't ask customers to replace their kernels for debugging. So the
first step is to merge kmallocwd upstream, then wait until customers
start using such a kernel (that may be within a few months if they are
about to deploy a new server, but it may be 10 years away if they have
already decided not to update kernels for their servers' lifetime).

> Anyway yet again this is getting largely off-topic...

OK. I'll stop posting to this thread.

Thread overview:
2016-04-06 14:13 [PATCH 0/3] oom reaper follow ups v1 Michal Hocko
2016-04-06 14:13 ` [PATCH 1/3] mm, oom: move GFP_NOFS check to out_of_memory Michal Hocko
2016-04-06 14:13 ` [PATCH 2/3] oom, oom_reaper: Try to reap tasks which skip regular OOM killer path Michal Hocko
2016-04-07 11:38   ` Tetsuo Handa
2016-04-08 11:19     ` Tetsuo Handa
2016-04-08 11:50       ` Michal Hocko
2016-04-09  4:39         ` [PATCH 2/3] oom, oom_reaper: Try to reap tasks which skipregular " Tetsuo Handa
2016-04-11 12:02           ` Michal Hocko
2016-04-11 13:26             ` [PATCH 2/3] oom, oom_reaper: Try to reap tasks which skip regular " Tetsuo Handa
2016-04-11 13:43               ` Michal Hocko
2016-04-13 11:08                 ` Tetsuo Handa [this message]
2016-04-08 11:34     ` Michal Hocko
2016-04-08 13:14   ` Michal Hocko
2016-04-06 14:13 ` [PATCH 3/3] mm, oom_reaper: clear TIF_MEMDIE for all tasks queued for oom_reaper Michal Hocko
2016-04-07 11:55   ` Tetsuo Handa
2016-04-08 11:34     ` Michal Hocko
2016-04-16  2:51       ` Tetsuo Handa
2016-04-17 11:54         ` Michal Hocko
2016-04-18 11:59           ` Tetsuo Handa
2016-04-19 14:17             ` Michal Hocko
2016-04-19 15:07               ` Tetsuo Handa
2016-04-19 19:32                 ` Michal Hocko
2016-04-08 13:07   ` Michal Hocko
