From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7219EC54E8D for ; Mon, 11 May 2020 15:57:08 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 367F9206D7 for ; Mon, 11 May 2020 15:57:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=cmpxchg-org.20150623.gappssmtp.com header.i=@cmpxchg-org.20150623.gappssmtp.com header.b="oj6nl8Et" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 367F9206D7 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=cmpxchg.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id C588A900063; Mon, 11 May 2020 11:57:07 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id BE130900036; Mon, 11 May 2020 11:57:07 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AD0DE900063; Mon, 11 May 2020 11:57:07 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0098.hostedemail.com [216.40.44.98]) by kanga.kvack.org (Postfix) with ESMTP id 927C0900036 for ; Mon, 11 May 2020 11:57:07 -0400 (EDT) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 53339180AD80F for ; Mon, 11 May 2020 15:57:07 +0000 (UTC) X-FDA: 76804892094.15.step47_5b3752195985d X-HE-Tag: step47_5b3752195985d X-Filterd-Recvd-Size: 6173 Received: from mail-qt1-f196.google.com (mail-qt1-f196.google.com [209.85.160.196]) by imf42.hostedemail.com (Postfix) with ESMTP for ; Mon, 11 May 2020 15:57:06 +0000 (UTC) Received: by mail-qt1-f196.google.com with SMTP id v4so7413816qte.3 for ; Mon, 11 May 2020 08:57:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cmpxchg-org.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=0mkmAQYTPxUwTzaufmQ28FUE0wAOW0XuP/PajjBfBuI=; b=oj6nl8Etq+Cp7D5OexjUrLRZfJ6Boh1fr3GR6ETsCeQgMNybVUKDK0SlyrXGit3FGs Kk4Yshjf2ZNzTx3DiZrm/f03A9DD5twUHxDzBbOQtkSX+qhqGRELd8hPWbxYR6XcP7c3 AqZYcLK6CdIjDSO+xG1LoDdPVgEkch+Eb627llInyJOkzFEe2tTFHu4jBztuP60v8bx+ EY2XURnrX3zUqwtOy1Iu5TwM6uN1XpzGxwBv+RXpvyYam2nJ8ocqSqgbtTUnSltdPVTU gEXP2FC/T77cVQ2cXgLbdF4X+1qS94vQvbOEXrkDdjbjGNU59A7GMlhFsW8TmO0UPNM1 4WDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=0mkmAQYTPxUwTzaufmQ28FUE0wAOW0XuP/PajjBfBuI=; b=RCqdfb/EdCuMcxS9xmtmBCJmLA3dbN03KdaSLK4hBsefYacZF0dGOkWquPOaAtgR14 ufj+zRvGwz/iu/7jKs99qnXfsZyZsv+eCqDuIpgl9Mdy2LNvOYWvkE6plsiSj9SZLxm/ /vnzO+DbGGKPcMStRdAzAlnL/GYz6by/VBSqyUWRKJeSjYVEbsBi7KVBlq4AjNuffvu8 H69VdbElWqyUfB86M3vZEp1+MMw5uU2rqWn1EZeOOL6VwBS3yAGOnjegtOfSToRKjkeu 9temJ5ffqnCCfddEUSWbkHgZY9GTMVoOyI0tocxVCCSfQqrJQ4AUhGJFag8/ST+6uC/Z YbSA== X-Gm-Message-State: AGi0PubDckqu4Bahrpb0yI6unLvsS0pQNubzVqkjWwyuq0q5QhJaj+wR CK/yfy49s8bHaA77AcI4yAHZIQ== X-Google-Smtp-Source: APiQypKs7/83y9Fy2tkC+anfR2Z5fLWFUYcXTlQfFzq+XRDx4OvJDfxmyT3B/ozTjq84cb6AEEtRrQ== X-Received: by 2002:ac8:4686:: with SMTP id g6mr17447463qto.144.1589212625970; Mon, 11 May 2020 08:57:05 -0700 (PDT) Received: from localhost ([2620:10d:c091:480::1:2627]) by smtp.gmail.com with ESMTPSA id q17sm8637971qkq.111.2020.05.11.08.57.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 11 May 2020 08:57:05 -0700 (PDT) Date: Mon, 11 May 2020 11:56:46 -0400 From: Johannes Weiner To: Shakeel Butt Cc: Michal Hocko , Roman Gushchin , Greg Thelen , Andrew Morton , Linux MM , Cgroups , LKML Subject: Re: [PATCH] memcg: effective memory.high reclaim for remote charging Message-ID: <20200511155646.GB306292@cmpxchg.org> References: <20200507163301.229070-1-shakeelb@google.com> <20200507164653.GM6345@dhcp22.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Thu, May 07, 2020 at 10:00:07AM -0700, Shakeel Butt wrote: > On Thu, May 7, 2020 at 9:47 AM Michal Hocko wrote: > > > > On Thu 07-05-20 09:33:01, Shakeel Butt wrote: > > [...] > > > @@ -2600,8 +2596,23 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask, > > > schedule_work(&memcg->high_work); > > > break; > > > } > > > - current->memcg_nr_pages_over_high += batch; > > > - set_notify_resume(current); > > > + > > > + if (gfpflags_allow_blocking(gfp_mask)) > > > + reclaim_over_high(memcg, gfp_mask, batch); > > > + > > > + if (page_counter_read(&memcg->memory) <= > > > + READ_ONCE(memcg->high)) > > > + break; > > > > I am half way to a long weekend so bear with me. Shouldn't this be continue? The > > parent memcg might be still in excess even the child got reclaimed, > > right? > > > > The reclaim_high() actually already does this walk up to the root and > reclaim from ones who are still over their high limit. Though having > 'continue' here is correct too. If reclaim was weak and failed to bring the child back in line, we still do set_notify_resume(). We should do that for ancestors too. But it seems we keep adding hierarchy walks and it's getting somewhat convoluted: page_counter does it, then we check high overage recursively, and now we add the call to reclaim which itself is a walk up the ancestor line. Can we hitchhike on the page_counter_try_charge() walk, which already has the concept of identifying counters with overage? Rename the @fail to @limited and return the first counter that is in excess of its high as well, even when the function succeeds? Then we could ditch the entire high checking loop here and simply replace it with done_restock: ... if (*limited) { if (gfpflags_allow_blocking()) reclaim_over_high(memcg_from_counter(limited)); /* Reclaim may not be able to do much, ... */ set_notify_resume(); // or schedule_work() }; In the long-term, the best thing might be to integrate memory.high reclaim with the regular reclaim that try_charge() is already doing. Especially the part where it retries several times - we currently give up on memory.high unnecessarily early. Make page_counter_try_charge() fail on high and max equally, and after several reclaim cycles, instead of invoking the OOM killer, inject the penalty sleep and force the charges. OOM killing and throttling is supposed to be the only difference between the two, anyway, and yet the code diverges far more than that for no apparent reason. But I also appreciate that this is a cleanup beyond the scope of this patch here, so it's up to you how far you want to take it.