linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Shakeel Butt <shakeelb@google.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Mina Almasry <almasrymina@google.com>,
	"Theodore Ts'o" <tytso@mit.edu>, Greg Thelen <gthelen@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>, Roman Gushchin <guro@fb.com>,
	Johannes Weiner <hannes@cmpxchg.org>, Tejun Heo <tj@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Muchun Song <songmuchun@bytedance.com>,
	riel@surriel.com, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, cgroups@vger.kernel.org
Subject: Re: [PATCH v3 2/4] mm/oom: handle remote ooms
Date: Mon, 15 Nov 2021 09:32:42 -0800	[thread overview]
Message-ID: <CALvZod7+P-2JR=Wsi_eYDymabAt=DFz4eR4w2KNtC-k1+AUtBg@mail.gmail.com> (raw)
In-Reply-To: <YZI9ZbRVdRtE2m70@dhcp22.suse.cz>

On Mon, Nov 15, 2021 at 2:58 AM Michal Hocko <mhocko@suse.com> wrote:
>
[...]
> >
> > The behavior I saw returning ENOMEM for this edge case was that the
> > code was forever looping the pagefault, and I was (seemingly
> > incorrectly) under the impression that a suggestion to forever loop
> > the pagefault would be completely fundamentally unacceptable.
>
> Well, I have to say I am not entirely sure what is the best way to
> handle this situation. Another option would be to treat this similar to
> ENOSPACE situation. This would result into SIGBUS IIRC.
>
> The main problem with OOM killer is that it will not resolve the
> underlying problem in most situations. Shmem files would likely stay
> laying around and their charge along with them.

This and similar topics were discussed during LSFMM 2019
(https://lwn.net/Articles/787626/).

> Killing the allocating
> task has problems on its own because this could be just a DoS vector by
> other unrelated tasks sharing the shmem mount point without a gracefull
> fallback. Retrying the page fault is hard to detect. SIGBUS might be
> something that helps with the latest. The question is how to communicate
> this requerement down to the memcg code to know that the memory reclaim
> should happen (Should it? How hard we should try?) but do not invoke the
> oom killer. The more I think about this the nastier this is.
> --

IMHO we should punt the resolution to the userspace and keep the
kernel simple. This is an opt-in feature and the user is expected to
know and handle exceptional scenarios. The kernel just needs to tell
the userspace that this exceptional situation is happening somehow.

How about for remote ooms irrespective of page fault path or not, keep
the allocator looping but keep incrementing a new memcg event
MEMCG_OOM_NO_VICTIM? The userspace will get to know the situation
either through inotify or polling and can handle the situation by
either increasing the limit or by releasing the memory of the
monitored memcg.

thanks,
Shakeel

  reply	other threads:[~2021-11-16  2:01 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20211111234203.1824138-1-almasrymina@google.com>
2021-11-11 23:42 ` [PATCH v3 1/4] mm/shmem: support deterministic charging of tmpfs Mina Almasry
2021-11-11 23:42 ` [PATCH v3 2/4] mm/oom: handle remote ooms Mina Almasry
2021-11-12  7:51   ` Michal Hocko
2021-11-12  8:12     ` Mina Almasry
2021-11-12  8:36       ` Michal Hocko
2021-11-12 17:59         ` Mina Almasry
2021-11-15 10:58           ` Michal Hocko
2021-11-15 17:32             ` Shakeel Butt [this message]
2021-11-16  0:58             ` Mina Almasry
2021-11-16  9:28               ` Michal Hocko
2021-11-16  9:39                 ` Michal Hocko
2021-11-16 10:17                 ` Mina Almasry
2021-11-16 11:29                   ` Michal Hocko
2021-11-16 21:27                     ` Mina Almasry
2021-11-16 21:55                       ` Shakeel Butt
2021-11-18  8:48                         ` Michal Hocko
2021-11-19 22:32                           ` Mina Almasry
2021-11-18  8:47                       ` Michal Hocko
2021-11-11 23:42 ` [PATCH v3 3/4] mm, shmem: add tmpfs memcg= option documentation Mina Almasry
2021-11-11 23:42 ` [PATCH v3 4/4] mm, shmem, selftests: add tmpfs memcg= mount option tests Mina Almasry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALvZod7+P-2JR=Wsi_eYDymabAt=DFz4eR4w2KNtC-k1+AUtBg@mail.gmail.com' \
    --to=shakeelb@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=almasrymina@google.com \
    --cc=cgroups@vger.kernel.org \
    --cc=gthelen@google.com \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=riel@surriel.com \
    --cc=songmuchun@bytedance.com \
    --cc=tj@kernel.org \
    --cc=tytso@mit.edu \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).