From: Michal Hocko <mhocko@kernel.org>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux MM <linux-mm@kvack.org>
Subject: Re: [PATCH] mm, memcg: clear page protection when memcg oom group happens
Date: Tue, 26 Nov 2019 11:22:13 +0100
Message-ID: <20191126102213.GD20912@dhcp22.suse.cz>
In-Reply-To: <CALOAHbBfCk-1qt4z3C5dCjWr__exKNvd15hXZ3_Wo3cLS7jdOw@mail.gmail.com>

On Tue 26-11-19 18:02:27, Yafang Shao wrote:
> On Tue, Nov 26, 2019 at 5:50 PM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Tue 26-11-19 17:35:59, Yafang Shao wrote:
> > > On Tue, Nov 26, 2019 at 3:31 PM Michal Hocko <mhocko@kernel.org> wrote:
> > > >
> > > > On Tue 26-11-19 11:52:19, Yafang Shao wrote:
> > > > > On Mon, Nov 25, 2019 at 10:42 PM Johannes Weiner <hannes@cmpxchg.org> wrote:
> > > > > >
> > > > > > On Mon, Nov 25, 2019 at 03:21:50PM +0100, Michal Hocko wrote:
> > > > > > > On Mon 25-11-19 22:11:15, Yafang Shao wrote:
> > > > > > > > When there are no processes, we don't need to protect the pages. You
> > > > > > > > can consider it as 'fault tolerance'.
> > > > > > >
> > > > > > > I have already tried to explain why this is a bold statement that
> > > > > > > doesn't really hold universally and that the kernel doesn't really have
> > > > > > > enough information to make an educated guess.
> > > > > >
> > > > > > I agree, this is not obviously true. And the kernel shouldn't try to
> > > > > > guess whether the explicit userspace configuration is still desirable
> > > > > > to userspace or not. Should we also delete the cgroup when it becomes
> > > > > > empty for example?
> > > > > >
> > > > > > It's better to implement these kinds of policy decisions from
> > > > > > userspace.
> > > > > >
> > > > > > There is a cgroup.events file that can be polled, and its "populated"
> > > > > > field shows conveniently whether there are tasks in a subtree or
> > > > > > not. You can use that to clear protection settings.
> > > > >
> > > > > Why isn't force_empty supported in cgroup2 ?
> > > >
> > > > There wasn't any sound usecase AFAIR.
> > > >
> > > > > In this case we can free the protected file pages immediately with force_empty.
> > > >
> > > > You can do the same thing by setting the hard limit to 0.
> > >
> > > I looked through the code, and the difference between setting the hard
> > > limit to 0 and force_empty is that setting the hard limit to 0 will
> > > generate some OOM reports, which should not happen in this case.
> > > I think we should make a small improvement as below,
> >
> > Yes, if you are not able to reclaim all of the memory then the OOM
> > killer is triggered. And that was not the case with force_empty. I
> > didn't mean that the two are equivalent, sorry if I misled you.
> > I merely wanted to point out that you have the means to clean up the memcg
> > with the existing API.
> >
> > > @@ -6137,9 +6137,11 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
> > >  			continue;
> > >  		}
> > >
> > > -		memcg_memory_event(memcg, MEMCG_OOM);
> > > -		if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> > > -			break;
> > > +		if (cgroup_is_populated(memcg->css.cgroup)) {
> > > +			memcg_memory_event(memcg, MEMCG_OOM);
> > > +			if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> > > +				break;
> > > +		}
> > >  	}
> >
> > If there are no killable tasks then
> > "Out of memory and no killable processes..."
> > is printed and that really reflects the situation and is the right thing
> > to do. Your above patch would suppress that information which might be
> > important.
> >
>
> It is not only this output.
> Please see dump_header(): it prints much more, and even worse,
> dump_stack() is also executed.

Yes, there will be the full oom report. I have singled out the "no
killable" part because it is the main thing that distinguishes the
"no tasks" case.

> > > Well, if someone doesn't want to kill processes but only wants to drop
> > > page caches, setting the hard limit to 0 won't work.
> >
> > Could you be more specific about a real world example when somebody
> > wants to drop per-memcg pagecache?
>
> For example, if one memcg has lots of negative dentries, which causes
> the file page cache to be reclaimed continuously, we want to drop all
> these negative dentries. force_empty is the better workaround so far, and
> it gives us more chance to analyze why the negative dentries are
> generated.

force_empty sounds like a brute-force way to clean up negative dentries
TBH. And it is not really that different from shrinking the hard limit.

Why doesn't normal reclaim work in that situation? Anyway, this is
getting rather tangential to the original topic, so I would suggest
starting a new email thread with a clear description of the problem you
are facing and we can go from there.
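
For reference, the userspace approach Johannes pointed at earlier in the
thread (watch cgroup.events and clear the protection once "populated"
drops to 0) could look roughly like the sketch below. This is only an
illustration, not tested tooling: the helper names parse_populated(),
write_knob() and clear_protection_if_empty() are made up for this
example, it assumes a cgroup v2 hierarchy where memory.min and
memory.low are present in the given directory, and it performs a single
check rather than waiting for change notifications on cgroup.events.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse the "populated" field from the contents of a cgroup v2
 * cgroup.events file. Returns 1 or 0, or -1 if the field is missing.
 */
static int parse_populated(const char *events)
{
	const char *line = events;

	while (line && *line) {
		int val;

		if (sscanf(line, "populated %d", &val) == 1)
			return val != 0;
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Write a short string to <dir>/<file>. Returns 0 on success, -1 on error. */
static int write_knob(const char *dir, const char *file, const char *val)
{
	char path[4096];
	FILE *f;
	int ret = 0;

	if (snprintf(path, sizeof(path), "%s/%s", dir, file) >= (int)sizeof(path))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs(val, f) == EOF)
		ret = -1;
	if (fclose(f) == EOF)
		ret = -1;
	return ret;
}

/*
 * Hypothetical policy helper: if the cgroup at 'dir' has no tasks left
 * in its subtree, reset memory.min and memory.low to 0.
 * Returns 1 if protection was cleared, 0 if the cgroup is still
 * populated, -1 on error.
 */
static int clear_protection_if_empty(const char *dir)
{
	char path[4096], buf[512];
	size_t n;
	FILE *f;

	if (snprintf(path, sizeof(path), "%s/cgroup.events", dir) >= (int)sizeof(path))
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	switch (parse_populated(buf)) {
	case 0:
		if (write_knob(dir, "memory.min", "0") ||
		    write_knob(dir, "memory.low", "0"))
			return -1;
		return 1;
	case 1:
		return 0;
	default:
		return -1;
	}
}
```

A management daemon would call clear_protection_if_empty() with the
cgroup directory whenever it notices a change in cgroup.events. Whether
dropping the protection is the right policy at that point is exactly the
decision that is better left to userspace.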
--
Michal Hocko
SUSE Labs