Date: Fri, 17 Jul 2020 13:17:50 +0100
From: Chris Down
To: David Rientjes
Cc: Andrew Morton, Yang Shi, Michal Hocko, Shakeel Butt, Yang Shi,
 Roman Gushchin, Greg Thelen, Johannes Weiner, Vladimir Davydov,
 cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch] mm, memcg: provide a stat to describe reclaimable memory
Message-ID: <20200717121750.GA367633@chrisdown.name>
References: <20200715131048.GA176092@chrisdown.name>

Hi David,

David Rientjes writes:
>With the proposed anon_reclaimable, do you have any reliability concerns?
>This would be the amount of lazy freeable memory and memory that can be
>uncharged if compound pages from the deferred split queue are split under
>memory pressure.  It seems to be a very precise value (as slab_reclaimable
>already in memory.stat is), so I'm not sure why there is a reliability
>concern.  Maybe you can elaborate?

Ability to reclaim a page is largely about context at the time of reclaim.
For example, if you are running at the edge of swap, a metric that truly
describes "reclaimable memory" will contain vastly different numbers from
one second to the next as cluster and page availability increases and
decreases. We may also have to do things like look for youngness at reclaim
time, so I'm not convinced metrics like this make sense in the general case.

>Today, this information is indeed possible to calculate from userspace.
>The idea is to present this information that will be backwards compatible,
>however, as the kernel implementation changes.  When lazy freeable memory
>was added, for instance, userspace likely would not have preemptively been
>doing an "active_file + inactive_file - file" calculation to factor that
>in as reclaimable anon :)

I agree it's hard to calculate from userspace without assistance, but I
also think that exposing a highly nuanced and situational value to
userspace is a recipe for confusion. The user either knows mm internals and
can understand it, or doesn't and will probably only misunderstand it.
There is a non-zero cognitive cost to adding more metrics like this, which
is why I'm interested in knowing more about the userspace usage semantics
intended :-)

>The example I gave earlier in the thread showed how dramatically different
>memory.current is before and after the introduction of deferred split
>queues.  Userspace sees ballooning memcg usage and alerts on it (suspects
>a memory leak, for example) when in reality this is purely reclaimable
>memory under pressure and is the result of a kernel implementation detail.

Again, I'm curious why this can't be solved by artificial workingset
pressurisation and monitoring. Generally, the most reliable reclaim metrics
come from operating reclaim itself.
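
As a rough illustration of what "artificial workingset pressurisation and monitoring" could look like from userspace, here is a minimal sketch. It is not taken from this thread and is not a definitive implementation. It assumes a cgroup v2 hierarchy with the memory.current, memory.high and memory.pressure files available, and enough privilege to write memory.high. The function names, the 0.1 pressure target, the 1% step and the 64 MiB floor are all hypothetical choices made for the example.

```python
from pathlib import Path


def parse_psi_some_avg10(text: str) -> float:
    """Return avg10 from the 'some' line of a PSI file such as memory.pressure."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "some":
            for field in fields[1:]:
                key, _, value = field.partition("=")
                if key == "avg10":
                    return float(value)
    raise ValueError("no 'some ... avg10=' entry found")


def next_limit(limit: int, usage: int, pressure: float,
               target: float = 0.1, step_pct: int = 1,
               floor: int = 64 << 20) -> int:
    """One probing step (hypothetical policy, chosen for illustration only).

    Below the pressure target, squeeze the limit slightly under the smaller
    of the current limit and current usage. At or above the target, relax
    the limit again. The result never drops below `floor` bytes.
    """
    if pressure < target:
        candidate = min(limit, usage) * (100 - step_pct) // 100
    else:
        candidate = limit * (100 + step_pct) // 100
    return max(floor, candidate)


def probe_once(cgroup: Path) -> int:
    """Read the cgroup's state, apply one step, and write the new memory.high."""
    usage = int((cgroup / "memory.current").read_text())
    raw_high = (cgroup / "memory.high").read_text().strip()
    # memory.high reads "max" when no limit is set; start from current usage.
    limit = usage if raw_high == "max" else int(raw_high)
    pressure = parse_psi_some_avg10((cgroup / "memory.pressure").read_text())
    new_limit = next_limit(limit, usage, pressure)
    (cgroup / "memory.high").write_text(str(new_limit))
    return new_limit
```

Calling probe_once() periodically on a cgroup directory makes the limit settle near the point where the workload just begins to stall on memory. Memory the kernel gives up without raising pressure is, in practice, what was reclaimable at that moment, which is the sense in which the measurement comes from operating reclaim rather than from a precomputed counter.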