From: Shakeel Butt
Date: Sun, 14 Mar 2021 13:58:33 -0700
Subject: Re: [PATCH] vmscan: retry without cache trim mode if nothing scanned
To: "Huang, Ying"
Cc: Tejun Heo, Andrew Morton, Linux MM, LKML, Mel Gorman, Johannes Weiner,
 Vladimir Davydov, Michal Hocko, Joonsoo Kim
References: <20210311004449.1170308-1-ying.huang@intel.com> <87v99yvzq8.fsf@yhuang-dev.intel.com>
In-Reply-To: <87v99yvzq8.fsf@yhuang-dev.intel.com>

On Thu, Mar 11, 2021 at 12:52 AM Huang, Ying wrote:
> Hi, Butt,
>
> Shakeel Butt writes:
>
> > On Wed, Mar 10, 2021 at 4:47 PM Huang, Ying wrote:
> >>
> >> From: Huang Ying
> >>
> >> In shrink_node(), to determine whether to enable cache trim mode, the
> >> LRU size is obtained via lruvec_page_state(), which reads the value
> >> from a per-CPU counter (mem_cgroup_per_node->lruvec_stat[]). The error
> >> in the per-CPU counter, from CPU-local counting and the descendant
> >> memory cgroups, may cause some issues. We ran into this in the 0-Day
> >> performance test.
> >>
> >> 0-Day uses a RAM file system as the root file system, so the number of
> >> reclaimable file pages is very small. In the swap testing, the
> >> inactive file LRU list soon becomes almost empty. But the size of the
> >> inactive file LRU list read from the per-CPU counter may stay at a
> >> much larger value (say, 33, 50, etc.). This enables cache trim mode,
> >> although nothing can actually be scanned. The following pattern
> >> repeats for a long time in the test:
> >>
> >>   priority  inactive_file_size  cache_trim_mode
> >>         12                  33                0
> >>         11                  33                0
> >>        ...
> >>          6                  33                0
> >>          5                  33                1
> >>        ...
> >>          1                  33                1
> >>
> >> That is, cache_trim_mode is wrongly enabled once the scan priority
> >> decreases to 5, and the situation is not recovered from for a long
> >> time.
> >>
> >> It's hard to get a more accurate size of the inactive file list
> >> without much more overhead. And it's hard to estimate the error of
> >> the per-CPU counter too, because there may be many descendant memory
> >> cgroups. But after the actual scanning, if nothing could be scanned
> >> with cache trim mode enabled, it must have been wrong to enable it.
> >> So we can retry with cache trim mode disabled. This patch implements
> >> this policy.
> >
> > Instead of playing with the already complicated heuristics, we should
> > improve the accuracy of the lruvec stats. Johannes already fixed the
> > memcg stats using the rstat infrastructure, and Tejun has suggestions
> > on how to use the rstat infrastructure efficiently for lruvec stats at
> > https://lore.kernel.org/linux-mm/YCFgr300eRiEZwpL@slm.duckdns.org/.
>
> Thanks for your information! It would be better if we could improve the
> accuracy of the lruvec stats without much overhead, but that may not be
> an easy task.
>
> If my understanding is correct, what Tejun suggested is to add a fast
> read interface to rstat to be used in the hot path, with accuracy
> similar to that of a traditional per-CPU counter. But if we can
> regularly update the lruvec rstat with something like vmstat_update(),
> that should be OK for the issue described in this patch.
>

This is also my understanding. Tejun, please correct us if we
misunderstood you.

BTW Johannes was working on an rstat-based lruvec stats patch. Johannes,
are you planning to work on the optimization Tejun suggested?
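
For anyone skimming the thread, here is a minimal userspace sketch of
the retry policy the patch proposes, not the kernel code itself: struct
scan_control is pared down to two fields, and scan_lruvecs() with its
counter values is a hypothetical stand-in for the real reclaim path in
mm/vmscan.c.

#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-in for the kernel's struct scan_control. */
struct scan_control {
	bool cache_trim_mode;		/* only scan the inactive file LRU */
	unsigned long nr_scanned;	/* pages scanned so far */
};

/*
 * Hypothetical scan pass. With cache trim mode on and an (almost)
 * empty inactive file LRU -- the 0-Day RAM-fs scenario -- nothing is
 * scanned even though the stale per-CPU counter reported ~33
 * reclaimable file pages.
 */
static void scan_lruvecs(struct scan_control *sc)
{
	if (sc->cache_trim_mode)
		return;			/* inactive file LRU is empty */
	sc->nr_scanned += 32;		/* anon pages are still scannable */
}

static void shrink_node(struct scan_control *sc)
{
	unsigned long scanned;

again:
	scanned = sc->nr_scanned;
	scan_lruvecs(sc);

	/*
	 * The policy from the patch: if cache trim mode was enabled
	 * based on an inaccurate per-CPU LRU size but the pass scanned
	 * nothing, the mode was chosen wrongly; retry with it disabled
	 * instead of waiting for the scan priority to drop.
	 */
	if (sc->cache_trim_mode && sc->nr_scanned == scanned) {
		sc->cache_trim_mode = false;
		goto again;
	}
}

int main(void)
{
	struct scan_control sc = { .cache_trim_mode = true };

	shrink_node(&sc);
	printf("scanned %lu pages, cache_trim_mode=%d\n",
	       sc.nr_scanned, sc.cache_trim_mode);
	return 0;
}

Built with a plain "cc sketch.c", this prints that 32 pages were scanned
with cache_trim_mode cleared: the wrongly enabled mode is corrected
within a single shrink_node() pass rather than persisting across
priorities as in the table above.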