From: Suren Baghdasaryan
Date: Tue, 12 Nov 2019 10:45:44 -0800
Subject: Re: [PATCH 2/3] mm: vmscan: detect file thrashing at the reclaim root
To: Johannes Weiner
Cc: Andrew Morton, Andrey Ryabinin, Shakeel Butt, Rik van Riel, Michal Hocko, linux-mm, cgroups mailinglist, LKML, kernel-team@fb.com
References: <20191107205334.158354-1-hannes@cmpxchg.org> <20191107205334.158354-3-hannes@cmpxchg.org> <20191112174533.GA178331@cmpxchg.org>
In-Reply-To: <20191112174533.GA178331@cmpxchg.org>

On Tue, Nov 12, 2019 at 9:45 AM Johannes Weiner wrote:
>
> On Sun, Nov 10, 2019 at 06:01:18PM -0800, Suren Baghdasaryan wrote:
> > On Thu, Nov 7, 2019 at 12:53 PM Johannes Weiner wrote:
> > >
> > > We use refault information to determine whether the cache workingset
> > > is stable or transitioning, and dynamically adjust the inactive:active
> > > file LRU ratio so as to maximize protection from one-off cache during
> > > stable periods, and minimize IO during transitions.
> > >
> > > With cgroups and their nested LRU lists, we currently don't do this
> > > correctly. While recursive cgroup reclaim establishes a relative LRU
> > > order among the pages of all involved cgroups, refaults only affect
> > > the local LRU order in the cgroup in which they are occurring. As a
> > > result, cache transitions can take longer in a cgrouped system as the
> > > active pages of sibling cgroups aren't challenged when they should be.
> > >
> > > [ Right now, this is somewhat theoretical, because the siblings, under
> > >   continued regular reclaim pressure, should eventually run out of
> > >   inactive pages - and since inactive:active *size* balancing is also
> > >   done on a cgroup-local level, we will challenge the active pages
> > >   eventually in most cases. But the next patch will move that relative
> > >   size enforcement to the reclaim root as well, and then this patch
> > >   here will be necessary to propagate refault pressure to siblings. ]
> > >
> > > This patch moves refault detection to the root of reclaim. Instead of
> > > remembering the cgroup owner of an evicted page, remember the cgroup
> > > that caused the reclaim to happen. When refaults later occur, they'll
> > > correctly influence the cross-cgroup LRU order that reclaim follows.
> >
> > I spent some time thinking about the idea of calculating refault
> > distance using target_memcg's inactive_age and then activating
> > refaulted page in (possibly) another memcg and I am still having
> > trouble convincing myself that this should work correctly. However I
> > also was unable to convince myself otherwise... We use refault
> > distance to calculate the deficit in inactive LRU space and then
> > activate the refaulted page if that distance is less than
> > active+inactive LRU size. However making that decision based on LRU
> > sizes of one memcg and then activating the page in another one seems
> > very counterintuitive to me. Maybe that's just me though...
>
> It's not activating in a random, unrelated memcg - it's the parental
> relationship that makes it work.
>
> If you have a cgroup tree
>
>         root
>          |
>          A
>         / \
>        B1  B2
>
> and reclaim is driven by a limit in A, we are reclaiming the pages in
> B1 and B2 as if they were on a single LRU list A (it's approximated by
> the round-robin reclaim and has some caveats, but that's the idea).
>
> So when a page that belongs to B2 gets evicted, it gets evicted from
> virtual LRU list A. When it refaults later, we make the (in)active
> size and distance comparisons against virtual LRU list A as well.
>
> The pages on the physical LRU list B2 are not just ordered relative to
> its B2 peers, they are also ordered relative to the pages in B1. And
> that of course is necessary if we want fair competition between them
> under shared reclaim pressure from A.

Thanks for the clarification.
The test case in your description, where group B has a large inactive
cache that does not get reclaimed while its sibling group A has to drop
its active cache, gave me the impression that sibling cgroups (B1 and B2
in your reply above) can cause memory pressure in each other. Maybe
that's not a legitimate case, and B1 cannot cause pressure in B2 without
also causing pressure in their shared parent A? It makes more sense to
me now, and I want to confirm that this is the case.
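
To make the "virtual LRU list A" argument quoted above more concrete, here is a toy model in Python. It is only a sketch and not kernel code: the names Memcg, Shadow, evict(), refault() and demo() are hypothetical, and the activation rule (refault distance compared against the active pages of the whole subtree under the remembered cgroup) is a simplifying assumption. It illustrates just one point from the thread: the shadow entry remembers the cgroup that drove reclaim, the distance and size comparison is made against that cgroup, and the activation itself lands in the cgroup that owns the page.

```python
from dataclasses import dataclass, field
from typing import List, NamedTuple, Optional


@dataclass(eq=False)
class Memcg:
    """Toy cgroup with a per-group file LRU; sizes are in pages."""
    name: str
    parent: Optional["Memcg"] = field(default=None, repr=False)
    active: int = 0
    inactive: int = 0
    inactive_age: int = 0  # evictions seen at or below this level
    children: List["Memcg"] = field(default_factory=list, repr=False)

    def __post_init__(self) -> None:
        if self.parent is not None:
            self.parent.children.append(self)

    def subtree_active(self) -> int:
        """Active pages of this group plus all of its descendants."""
        return self.active + sum(c.subtree_active() for c in self.children)


class Shadow(NamedTuple):
    """What is left behind when a page is evicted (hypothetical layout)."""
    memcg: Memcg  # the cgroup the refault will be judged against
    age: int      # that cgroup's inactive_age at eviction time


def evict(owner: Memcg, target: Memcg) -> Shadow:
    """Evict one inactive page of `owner`; `target` is remembered in the shadow."""
    assert owner.inactive > 0, "nothing left to evict"
    owner.inactive -= 1
    # Assumption: an eviction ages the owner and every ancestor, so that a
    # parent's counter reflects evictions anywhere below it.
    node: Optional[Memcg] = owner
    while node is not None:
        node.inactive_age += 1
        node = node.parent
    return Shadow(memcg=target, age=target.inactive_age)


def refault(owner: Memcg, shadow: Shadow) -> bool:
    """Judge the refault against the remembered cgroup, activate in `owner`."""
    distance = shadow.memcg.inactive_age - shadow.age
    activate = distance <= shadow.memcg.subtree_active()
    if activate:
        owner.active += 1
    else:
        owner.inactive += 1
    return activate


def demo(use_reclaim_root: bool) -> bool:
    """B2 cycles through its cache while sibling B1 holds a large active set."""
    root = Memcg("root")
    a = Memcg("A", parent=root)
    Memcg("B1", parent=a, active=100)
    b2 = Memcg("B2", parent=a, active=10, inactive=50)
    # Remember either the reclaim root A or the page's own cgroup B2.
    target = a if use_reclaim_root else b2
    shadow = evict(b2, target)
    for _ in range(20):  # 20 more B2 pages are evicted before the refault
        evict(b2, target)
    return refault(b2, shadow)
```

In this model demo(True) returns True: the refault distance of 20 fits within the 110 active pages under A, so the page is activated in B2 and B1's active pages get challenged. demo(False), which remembers the owning cgroup instead, returns False because the same distance exceeds B2's own 10 active pages.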