linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Qian Cai <cai@lca.pw>
To: Edward Chron <echron@arista.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Andrew Morton <akpm@linux-foundation.org>,
	Roman Gushchin <guro@fb.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	David Rientjes <rientjes@google.com>,
	Shakeel Butt <shakeelb@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Ivan Delalande <colona@arista.com>
Subject: Re: [PATCH 00/10] OOM Debug print selection and additional information
Date: Thu, 29 Aug 2019 14:44:01 -0400	[thread overview]
Message-ID: <1567104241.5576.30.camel@lca.pw> (raw)
In-Reply-To: <CAM3twVSgJdFKbzkg1V+7voFMi-SYQTCz6jCBobLBQ72Cg8k5VQ@mail.gmail.com>

On Thu, 2019-08-29 at 09:09 -0700, Edward Chron wrote:

> > Feel like you are going in circles to "sell" without any new information. If
> > you
> > need to deal with OOM that often, it might also worth working with FB on
> > oomd.
> > 
> > https://github.com/facebookincubator/oomd
> > 
> > It is well-known that kernel OOM could be slow and painful to deal with, so
> > I
> > don't buy-in the argument that kernel OOM recover is better/faster than a
> > kdump
> > reboot.
> > 
> > It is not unusual that when the system is triggering a kernel OOM, it is
> > almost
> > trashed/dead. Although developers are working hard to improve the recovery
> > after
> > OOM, there are still many error-paths that are not going to survive which
> > would
> > leak memories, introduce undefined behaviors, corrupt memory etc.
> 
> But as you have pointed out many people are happy with current OOM processing
> which is the report and recovery so for those people a kdump reboot is
> overkill.
> Making the OOM report at least optionally a bit more informative has value.
> Also
> making sure it doesn't produce excessive output is desirable.
> 
> I do agree for developers having to have all the system state a kdump
> provides that
> and as long as you can reproduce the OOM event that works well. But
> that is not the
> common case as has already been discussed.
> 
> Also, OOM events that are due to kernel bugs could leak memory and over time
> and cause a crash, true. But that is not what we typically see. In
> fact we've had
> customers come back and report issues on systems that have been in continuous
> operation for years. No point in crashing their system. Linux if
> properly maintained
> is thankfully quite stable. But OOMs do happen and root causing them to
> prevent
> future occurrences is desired.

This is not what I meant. After an OOM event happens, many kernel memory
allocations could fail. Since very few people are testing those error-paths due
to allocation failures, it is considered one of those most buggy areas in the
kernel. Developers have mostly been focus on making sure the kernel OOM should
not happen in the first place.

I still think the time is better spending on improving things like eBPF, oomd
and kdump etc to solve your problem, but leave the kernel OOM report code alone.


  reply	other threads:[~2019-08-29 18:44 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-26 19:36 [PATCH 00/10] OOM Debug print selection and additional information Edward Chron
2019-08-26 19:36 ` [PATCH 01/10] mm/oom_debug: Add Debug base code Edward Chron
2019-08-27 13:28   ` kbuild test robot
2019-08-26 19:36 ` [PATCH 02/10] mm/oom_debug: Add System State Summary Edward Chron
2019-08-26 19:36 ` [PATCH 03/10] mm/oom_debug: Add Tasks Summary Edward Chron
2019-08-26 19:36 ` [PATCH 04/10] mm/oom_debug: Add ARP and ND Table Summary usage Edward Chron
2019-08-26 19:36 ` [PATCH 05/10] mm/oom_debug: Add Select Slabs Print Edward Chron
2019-08-26 19:36 ` [PATCH 06/10] mm/oom_debug: Add Select Vmalloc Entries Print Edward Chron
2019-08-26 19:36 ` [PATCH 07/10] mm/oom_debug: Add Select Process " Edward Chron
2019-08-26 19:36 ` [PATCH 08/10] mm/oom_debug: Add Slab Select Always Print Enable Edward Chron
2019-08-26 19:36 ` [PATCH 09/10] mm/oom_debug: Add Enhanced Slab Print Information Edward Chron
2019-08-26 19:36 ` [PATCH 10/10] mm/oom_debug: Add Enhanced Process " Edward Chron
2019-08-28  0:21   ` kbuild test robot
2019-08-27  7:15 ` [PATCH 00/10] OOM Debug print selection and additional information Michal Hocko
2019-08-27 10:10   ` Tetsuo Handa
2019-08-27 10:38     ` Michal Hocko
2019-08-28  1:07   ` Edward Chron
2019-08-28  6:59     ` Michal Hocko
     [not found]       ` <CAM3twVR_OLffQ1U-SgQOdHxuByLNL5sicfnObimpGpPQ1tJ0FQ@mail.gmail.com>
2019-08-28 20:18         ` Qian Cai
2019-08-28 21:17           ` Edward Chron
2019-08-28 21:34             ` Qian Cai
2019-08-29  7:11         ` Michal Hocko
2019-08-29 10:14           ` Tetsuo Handa
2019-08-29 11:56             ` Michal Hocko
2019-08-29 14:09               ` Tetsuo Handa
2019-08-29 15:48                 ` Edward Chron
2019-08-29 15:03               ` Edward Chron
2019-08-29 15:42                 ` Qian Cai
2019-08-29 16:09                   ` Edward Chron
2019-08-29 18:44                     ` Qian Cai [this message]
2019-08-29 22:41                       ` Edward Chron
2019-08-29 16:17                 ` Michal Hocko
2019-08-29 16:35                   ` Edward Chron
2019-08-29 15:20           ` Edward Chron
2019-08-27 12:40 ` Qian Cai
     [not found]   ` <CAM3twVQEMGWMQEC0dduri0JWt3gH6F2YsSqOmk55VQz+CZDVKg@mail.gmail.com>
2019-08-28  0:50     ` Qian Cai
2019-08-28  1:13       ` Edward Chron
2019-08-28  1:32         ` Qian Cai
2019-08-28  2:47           ` Edward Chron
2019-08-28  7:08             ` Michal Hocko
2019-08-28 10:12               ` Tetsuo Handa
2019-08-28 10:32                 ` Michal Hocko
2019-08-28 10:56                   ` Tetsuo Handa
2019-08-28 11:12                     ` Michal Hocko
2019-08-28 20:04                 ` Edward Chron
2019-08-29  3:31                   ` Edward Chron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1567104241.5576.30.camel@lca.pw \
    --to=cai@lca.pw \
    --cc=akpm@linux-foundation.org \
    --cc=colona@arista.com \
    --cc=echron@arista.com \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=rientjes@google.com \
    --cc=shakeelb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).