Date: Mon, 4 May 2020 16:11:36 +0200
From: Michal Hocko
To: Shakeel Butt
Cc: Johannes Weiner, Roman Gushchin, Greg Thelen, Andrew Morton,
 Linux MM, Cgroups, LKML
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max
Message-ID: <20200504141136.GR22838@dhcp22.suse.cz>
References: <20200430182712.237526-1-shakeelb@google.com>
 <20200504065600.GA22838@dhcp22.suse.cz>

On Mon 04-05-20 06:54:40, Shakeel Butt wrote:
> On Sun, May 3, 2020 at 11:56 PM Michal Hocko wrote:
> >
> > On Thu 30-04-20 11:27:12, Shakeel Butt wrote:
> > > Lowering memory.max can trigger an oom-kill if the reclaim does not
> > > succeed. However, if the oom-killer does not find a process to kill,
> > > it dumps a lot of warnings.
> >
> > It shouldn't dump much more than the regular OOM report AFAICS. Sure,
> > there is the "Out of memory and no killable processes..." message
> > printed as well, but is that a real problem?
> >
> > > Deleting a memcg does not reclaim memory from it and the memory can
> > > linger till there is memory pressure. One normal way to proactively
> > > reclaim such memory is to set memory.max to 0 just before deleting
> > > the memcg. However, if some of the memcg's memory is pinned by
> > > others, this operation can trigger an oom-kill without any process
> > > and thus can log a lot of un-needed warnings. So, ignore all such
> > > warnings from memory.max.
> >
> > OK, I can see why you might want to use memory.max for that purpose,
> > but I do not really understand why the oom report is a problem here.
>
> It may not be a problem for an individual or small-scale deployment,
> but when "sweep before tear down" is part of the workflow for
> thousands of machines cycling through hundreds of thousands of
> cgroups, then we can potentially flood the logs with dumps that are
> not useful and may hide (or overflow) any useful information in the
> logs.

If you are doing this at a large scale and the oom report is really a
problem, then you shouldn't be resetting the hard limit to 0 in the
first place.

> > memory.max can trigger the oom kill and the user should be expecting
> > the oom report under that condition. Why is "no eligible task" so
> > special? Is it because you know that there won't be any tasks for
> > your particular case? What about other use cases where memory.max is
> > not used as a "sweep before tear down"?
>
> What other such use-cases would there be? The only use-case I can
> envision for adjusting the limits of a live cgroup dynamically is
> resource managers. However, for cgroup v2, memory.high is the
> recommended way to limit the usage, so why would resource managers be
> changing memory.max instead of memory.high? I am not sure. What do
> you think?

There are different reasons to use the hard limit, mostly to contain
potential runaways. While the high limit might be a sufficient measure
to achieve that as well, the hard limit is the last resort. And it
clearly has the oom killer semantic, so I am not really sure why you
are comparing the two.

> FB is moving away from limits setting, so, not sure if they have
> thought of these cases.
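For concreteness, the "sweep before tear down" sequence being discussed
amounts to something like the sketch below (userspace side, assuming
cgroup v2; the path and helper name are made up for illustration and
error handling is trimmed):

/* sweep_and_remove: force reclaim, then delete the cgroup.
 * Hypothetical helper; assumes cgroup2 mounted at /sys/fs/cgroup. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int sweep_and_remove(const char *cg_path)
{
	char max_path[512];
	int fd;

	snprintf(max_path, sizeof(max_path), "%s/memory.max", cg_path);

	/* Writing "0" asks the kernel to reclaim everything still
	 * charged to the cgroup. If some memory is pinned by others,
	 * this is the write that can end in an oom with no eligible
	 * task, i.e. the warning under discussion. */
	fd = open(max_path, O_WRONLY);
	if (fd < 0)
		return -1;
	if (write(fd, "0", 1) < 0)
		perror("write memory.max");
	close(fd);

	/* The cgroup is expected to be empty of tasks by now, so it
	 * can simply be removed. */
	return rmdir(cg_path);
}

Multiplied across hundreds of thousands of cgroups, every pinned page
turns that write into a full oom report, which is the log-flooding
concern above.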
> > BTW for such use-cases, shouldn't we be taking the memcg's oom_lock?

This is a good question. I would have to go and double check the code
but I suspect that this is an omission.
-- 
Michal Hocko
SUSE Labs
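For reference, the "Out of memory and no killable processes..." line
quoted above is printed from the tail of out_of_memory() in
mm/oom_kill.c when no victim could be chosen. A rough sketch of how a
suppression flag might hook in there -- the no_warn field is
hypothetical, for illustration only, and the posted patch may well do
this differently:

/* Fragment modeled on the 5.x shape of out_of_memory().
 * oc->no_warn does not exist upstream; it stands in for whatever
 * mechanism a caller like memory_max_write() would use to mark
 * the oom as "expected, do not warn". */
if (!oc->chosen) {
	if (!oc->no_warn) {
		dump_header(oc, NULL);
		pr_warn("Out of memory and no killable processes...\n");
	}
	/* Only genuine allocation paths need to panic here;
	 * sysrq- and memcg-triggered ooms can just return. */
	if (!is_sysrq_oom(oc) && !is_memcg_oom(oc))
		panic("System is deadlocked on memory\n");
}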