From: Shakeel Butt
Date: Mon, 11 May 2020 14:44:26 -0700
Subject: Re: [PATCH] memcg: effective memory.high reclaim for remote charging
To: Johannes Weiner
Cc: Michal Hocko, Roman Gushchin, Greg Thelen, Andrew Morton, Linux MM, Cgroups, LKML

On Mon, May 11, 2020 at 8:57 AM Johannes Weiner wrote:
>
> On Thu, May 07, 2020 at 10:00:07AM -0700, Shakeel Butt wrote:
> > On Thu, May 7, 2020 at 9:47 AM Michal Hocko wrote:
> > >
> > > On Thu 07-05-20 09:33:01, Shakeel Butt wrote:
> > > [...]
> > > > @@ -2600,8 +2596,23 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
> > > >                          schedule_work(&memcg->high_work);
> > > >                          break;
> > > >                  }
> > > > -                current->memcg_nr_pages_over_high += batch;
> > > > -                set_notify_resume(current);
> > > > +
> > > > +                if (gfpflags_allow_blocking(gfp_mask))
> > > > +                        reclaim_over_high(memcg, gfp_mask, batch);
> > > > +
> > > > +                if (page_counter_read(&memcg->memory) <=
> > > > +                    READ_ONCE(memcg->high))
> > > > +                        break;
> > >
> > > I am half way to a long weekend so bear with me. Shouldn't this be continue? The
> > > parent memcg might be still in excess even the child got reclaimed,
> > > right?
> > >
> >
> > The reclaim_high() actually already does this walk up to the root and
> > reclaim from ones who are still over their high limit. Though having
> > 'continue' here is correct too.
>
> If reclaim was weak and failed to bring the child back in line, we
> still do set_notify_resume(). We should do that for ancestors too.
>
> But it seems we keep adding hierarchy walks and it's getting somewhat
> convoluted: page_counter does it, then we check high overage
> recursively, and now we add the call to reclaim which itself is a walk
> up the ancestor line.
>
> Can we hitchhike on the page_counter_try_charge() walk, which already
> has the concept of identifying counters with overage? Rename the @fail
> to @limited and return the first counter that is in excess of its high
> as well, even when the function succeeds?
>
> Then we could ditch the entire high checking loop here and simply
> replace it with
>
> done_restock:
>         ...
>
>         if (*limited) {
>                 if (gfpflags_allow_blocking())
>                         reclaim_over_high(memcg_from_counter(limited));
>                 /* Reclaim may not be able to do much, ... */
>                 set_notify_resume(); // or schedule_work()
>         };
>

I will try to code the above and will give a shot to the following
long-term suggestion as well.

> In the long-term, the best thing might be to integrate memory.high
> reclaim with the regular reclaim that try_charge() is already
> doing. Especially the part where it retries several times - we
> currently give up on memory.high unnecessarily early. Make
> page_counter_try_charge() fail on high and max equally, and after
> several reclaim cycles, instead of invoking the OOM killer, inject the
> penalty sleep and force the charges. OOM killing and throttling is
> supposed to be the only difference between the two, anyway, and yet
> the code diverges far more than that for no apparent reason.
>
> But I also appreciate that this is a cleanup beyond the scope of this
> patch here, so it's up to you how far you want to take it.

Thanks,
Shakeel
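
To make the "hitchhike on the page_counter_try_charge() walk" idea concrete, here is a rough userspace model of what I understand the suggestion to be. It is only a sketch, not kernel code: struct counter, counter_try_charge(), actions_after_charge() and the ACT_* flags are hypothetical names made up for illustration, and reclaim itself is not modeled. The single walk up the ancestor line both enforces max and remembers the first counter that ends up over its high, so the caller needs no second loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in; NOT the kernel's struct page_counter. */
struct counter {
	long usage;
	long high;
	long max;
	struct counter *parent;
};

/*
 * Charge nr_pages to c and every ancestor in a single walk.
 *
 * Failure: some level would exceed max. All partial charges are undone,
 * *limited points at the counter that hit max, and false is returned.
 *
 * Success: *limited points at the first (lowest) counter now above its
 * high, or is NULL if nothing is over high, and true is returned.
 */
static bool counter_try_charge(struct counter *c, long nr_pages,
			       struct counter **limited)
{
	struct counter *iter, *undo, *over_high = NULL;

	assert(c != NULL && limited != NULL && nr_pages > 0);
	*limited = NULL;

	for (iter = c; iter; iter = iter->parent) {
		if (iter->usage + nr_pages > iter->max) {
			/* Roll back the levels charged so far. */
			for (undo = c; undo != iter; undo = undo->parent)
				undo->usage -= nr_pages;
			*limited = iter;
			return false;
		}
		iter->usage += nr_pages;
		if (!over_high && iter->usage > iter->high)
			over_high = iter;
	}

	*limited = over_high;
	return true;
}

/* What the caller would do after a successful charge (assumed flags). */
enum { ACT_NONE = 0, ACT_RECLAIM = 1, ACT_NOTIFY_RESUME = 2 };

static int actions_after_charge(const struct counter *limited, bool may_block)
{
	int actions = ACT_NONE;

	if (limited) {
		if (may_block)
			actions |= ACT_RECLAIM;	/* reclaim_over_high() */
		/* Reclaim may not be able to do much, so always defer too. */
		actions |= ACT_NOTIFY_RESUME;	/* or schedule_work() */
	}
	return actions;
}
```

In this model a child that is within its own high but whose parent is over high gets the parent back in *limited, which is the case Michal asked about with 'continue'.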