From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20191125123123.GL31714@dhcp22.suse.cz> <20191125124553.GM31714@dhcp22.suse.cz> <20191125142150.GP31714@dhcp22.suse.cz> <20191125144213.GB602168@cmpxchg.org>
 <20191126073129.GA20912@dhcp22.suse.cz> <20191126095033.GC20912@dhcp22.suse.cz>
In-Reply-To: <20191126095033.GC20912@dhcp22.suse.cz>
From: Yafang Shao
Date: Tue, 26 Nov 2019 18:02:27 +0800
Message-ID:
Subject: Re: [PATCH] mm, memcg: clear page protection when memcg oom group happens
To: Michal Hocko
Cc: Johannes Weiner , Vladimir Davydov , Andrew Morton , Linux MM
Content-Type: text/plain; charset="UTF-8"

On Tue, Nov 26, 2019 at 5:50 PM Michal Hocko wrote:
>
> On Tue 26-11-19 17:35:59, Yafang Shao wrote:
> > On Tue, Nov 26, 2019 at 3:31 PM Michal Hocko wrote:
> > >
> > > On Tue 26-11-19 11:52:19, Yafang Shao wrote:
> > > > On Mon, Nov 25, 2019 at 10:42 PM Johannes Weiner wrote:
> > > > >
> > > > > On Mon, Nov 25, 2019 at 03:21:50PM +0100, Michal Hocko wrote:
> > > > > > On Mon 25-11-19 22:11:15, Yafang Shao wrote:
> > > > > > > When there're no processes, we don't need to protect the pages. You
> > > > > > > can consider it as 'fault tolerance' .
> > > > > >
> > > > > > I have already tried to explain why this is a bold statement that
> > > > > > doesn't really hold universally and that the kernel doesn't really have
> > > > > > enough information to make an educated guess.
> > > > >
> > > > > I agree, this is not obviously true. And the kernel shouldn't try to
> > > > > guess whether the explicit userspace configuration is still desirable
> > > > > to userspace or not. Should we also delete the cgroup when it becomes
> > > > > empty for example?
> > > > >
> > > > > It's better to implement these kinds of policy decisions from
> > > > > userspace.
> > > > >
> > > > > There is a cgroup.events file that can be polled, and its "populated"
> > > > > field shows conveniently whether there are tasks in a subtree or
> > > > > not. You can use that to clear protection settings.
> > > >
> > > > Why isn't force_empty supported in cgroup2?
> > >
> > > There wasn't any sound usecase AFAIR.
> > >
> > > > In this case we can free the protected file pages immediately with force_empty.
> > >
> > > You can do the same thing by setting the hard limit to 0.
> >
> > I looked through the code, and the difference between setting the hard
> > limit to 0 and force_empty is that setting the hard limit to 0 will
> > generate some OOM reports, which should not happen in this case.
> > I think we should make a little improvement as below,
>
> Yes, if you are not able to reclaim all of the memory then the OOM
> killer is triggered. And that was not the case with force_empty. I
> didn't mean that the two are equivalent, sorry if I misled you.
> I merely wanted to point out that you have means to clean up the memcg
> with the existing API.
>
> > @@ -6137,9 +6137,11 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
> >                         continue;
> >                 }
> >
> > -               memcg_memory_event(memcg, MEMCG_OOM);
> > -               if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> > -                       break;
> > +               if (cgroup_is_populated(memcg->css.cgroup)) {
> > +                       memcg_memory_event(memcg, MEMCG_OOM);
> > +                       if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> > +                               break;
> > +               }
> >         }
>
> If there are no killable tasks then
> "Out of memory and no killable processes..."
> is printed and that really reflects the situation and is the right thing
> to do. Your above patch would suppress that information which might be
> important.
>

It is not only this output. Please see dump_header(): it prints a lot
more, and even worse, dump_stack() is executed as well.

> > Well, if someone doesn't want to kill processes but only wants to drop
> > page caches, setting the hard limit to 0 won't work.
>
> Could you be more specific about a real world example when somebody
> wants to drop per-memcg pagecache?

For example, if one memcg has lots of negative dentries, which cause the
file page cache to be reclaimed continuously, we want to drop all of
these negative dentries. force_empty is the better workaround so far,
and it gives us more of a chance to analyze why the negative dentries
are generated.

Thanks
Yafang
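
For reference, the userspace approach Johannes describes in the quoted text (watch cgroup.events and clear the protection once "populated" drops to 0) could look roughly like the sketch below. It is only an illustration under a few assumptions: cgroup v2 with the memory controller enabled, protection configured through memory.min and memory.low, and change notification on cgroup.events delivered as POLLPRI. The function names are made up for this example and are not part of any existing tool.

```python
import os
import select


def parse_populated(events_text: str) -> bool:
    """Return True if the 'populated' key in cgroup.events content is 1."""
    for line in events_text.splitlines():
        key, _, value = line.partition(" ")
        if key == "populated":
            return value.strip() == "1"
    raise ValueError("no 'populated' key found")


def clear_protection(cgroup_dir: str) -> None:
    """Write 0 to memory.min and memory.low so the pages become reclaimable."""
    for knob in ("memory.min", "memory.low"):
        with open(os.path.join(cgroup_dir, knob), "w") as f:
            f.write("0\n")


def clear_if_empty(cgroup_dir: str) -> bool:
    """Clear protection if the subtree has no tasks; return True if cleared."""
    with open(os.path.join(cgroup_dir, "cgroup.events")) as f:
        populated = parse_populated(f.read())
    if not populated:
        clear_protection(cgroup_dir)
    return not populated


def watch(cgroup_dir: str) -> None:
    """Block until the cgroup becomes empty, then clear its protection.

    Assumption: a change to cgroup.events wakes up poll() with POLLPRI.
    """
    with open(os.path.join(cgroup_dir, "cgroup.events")) as f:
        poller = select.poll()
        poller.register(f.fileno(), select.POLLPRI)
        while True:
            f.seek(0)  # re-read the file from the start on every wakeup
            if not parse_populated(f.read()):
                clear_protection(cgroup_dir)
                return
            poller.poll()  # sleep until the file reports a change
```

Calling watch() on the directory of the memcg in question would keep the policy decision in userspace, which is what the quoted replies argue for. Note that this only drops the protection; it does not reclaim the pages itself.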