From: Yosry Ahmed
Date: Tue, 17 May 2022 13:11:13 -0700
Subject: Re: [RFC] Add swappiness argument to memory.reclaim
To: Roman Gushchin
Cc: Johannes Weiner, Michal Hocko, Shakeel Butt, Andrew Morton,
    David Rientjes, cgroups@vger.kernel.org, Tejun Heo, Linux-MM,
    Yu Zhao, Wei Xu, Greg Thelen, Chen Wandun

On Tue, May 17, 2022 at 12:49 PM Roman Gushchin wrote:
>
> On Tue, May 17, 2022 at 11:13:10AM -0700, Yosry Ahmed wrote:
> > On Tue, May 17, 2022 at 9:05 AM Roman Gushchin wrote:
> > >
> > > On Mon, May 16, 2022 at 03:29:42PM -0700, Yosry Ahmed wrote:
> > > > The discussions on the patch series [1] to add memory.reclaim has
> > > > shown that it is desirable to add an argument to control the type of
> > > > memory being reclaimed by invoked proactive reclaim using
> > > > memory.reclaim.
> > > >
> > > > I am proposing adding a swappiness optional argument to the interface.
> > > > If set, it overwrites vm.swappiness and per-memcg swappiness. This
> > > > provides a way to enforce user policy on a stateless per-reclaim
> > > > basis. We can make policy decisions to perform reclaim differently for
> > > > tasks of different app classes based on their individual QoS needs. It
> > > > also helps for use cases when particularly page cache is high and we
> > > > want to mainly hit that without swapping out.
> > > >
> > > > The interface would be something like this (utilizing the nested-keyed
> > > > interface we documented earlier):
> > > >
> > > > $ echo "200M swappiness=30" > memory.reclaim
> > >
> > > What are the anticipated use cases except swappiness == 0 and
> > > swappiness == system_default?
> > >
> > > IMO it's better to allow specifying the type of memory to reclaim,
> > > e.g. type="file"/"anon"/"slab", it's a way more clear what to expect.
> >
> > I imagined swappiness would give user space flexibility to reclaim a
> > ratio of file vs. anon as it sees fit based on app class or userspace
> > policy, but I agree that the guarantees of swappiness are weak and we
> > might want an explicit argument that directly controls the return
> > value of get_scan_count() or whether or not we call shrink_slab().
> > My fear is that this interface may be less flexible, for example if we
> > only want to avoid reclaiming file pages, but we are fine with anon or
> > slab.
> > Maybe in the future we will have a new type of memory to
> > reclaim, does it get implicitly reclaimed when other types are
> > specified or not?
> >
> > Maybe we can use one argument per type instead? E.g.
> > $ echo "200M file=no anon=yes slab=yes" > memory.reclaim
> >
> > The default value would be "yes" for all types unless stated
> > otherwise. This also leaves room for future extensions (maybe
> > file=clean to reclaim clean file pages only?). Interested to hear your
> > thoughts on this!
>
> The question to answer is do you want the code which is determining
> the balance of scanning be a part of the interface?
>
> If not, I'd stick with explicitly specifying a type of memory to scan
> (and the "I don't care" mode, where you simply ask to reclaim X bytes).
>
> Otherwise you need to describe how the artificial memory pressure will
> be distributed over different memory types. And with time it might
> start being significantly different to what the generic reclaim code does,
> because the reclaim path is free to do what's better, there are no
> user-visible guarantees.

My understanding is that your question is about the swappiness
argument, and I agree it can get complicated. I am on board with
explicitly specifying the type(s) to reclaim. I think an interface
with one argument per type (whitelist/blacklist approach) could be
more flexible: it allows specifying multiple types per invocation
(smaller race window between reading usages and writing to
memory.reclaim), and it has room for future extensions (e.g.
file=clean). However, if you still think a type=file/anon/slab
parameter is better, we can also go with this.

I imagine this will be an enum/flags that will be passed to
try_to_free_mem_cgroup_pages() instead of may_swap, and then we can
map it to one-bit flags in struct scan_control.
The anon/file flags will be used to control which LRU lists are
scanned in shrink_lruvec() (via get_scan_count()) and in
mem_cgroup_soft_limit_reclaim(), and the slab flag will be used to
control calls to shrink_slab().

This is orthogonal, but while we are at it we can also add a
"controlled_reclaim" flag that we use to control whether we call
vmpressure or not. I assume we don't want to count vmpressure for
controlled reclaim, similar to PSI. We can then also revert
e22c6ed90aa9 ("mm: memcontrol: don't count limit-setting reclaim as
memory pressure") and use the same flag to control calls to PSI.

> > >
> > > E.g. what
> > > $ echo "200M swappiness=1" > memory.reclaim
> > > means if there is only 10M of pagecache? How much of anon memory will
> > > be reclaimed?
> >
> > Good point. I agree that the type argument or per-type arguments have
> > multiple advantages over swappiness.
>
> If a user wants to select multiple types of memory, can they just run several
> requests in parallel? Or one by one?
>
> Thanks!
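
To make the per-type proposal a bit more concrete, here is a rough
user-space sketch of how a request such as "200M file=no anon=yes
slab=yes" could be turned into a size plus a set of one-bit flags. This
is only an illustration under my own assumptions: ReclaimType,
parse_size() and parse_reclaim_request() are hypothetical names, not
kernel API, and the binary K/M/G suffix handling is assumed rather than
taken from any patch. The real parsing would of course live in the
kernel's memory.reclaim write handler, in C.

```python
import enum
import re


class ReclaimType(enum.Flag):
    """Hypothetical one-bit-per-type flags (not an existing kernel type)."""
    NONE = 0
    FILE = enum.auto()
    ANON = enum.auto()
    SLAB = enum.auto()


# Per the proposal, every type defaults to "yes" unless stated otherwise.
ALL_TYPES = ReclaimType.FILE | ReclaimType.ANON | ReclaimType.SLAB

# Explicit key table so that only the three proposed keys are accepted.
_TYPE_BY_KEY = {
    "file": ReclaimType.FILE,
    "anon": ReclaimType.ANON,
    "slab": ReclaimType.SLAB,
}

# Assumption: suffixes are binary multiples (K = 1024 bytes, and so on).
_SIZE_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a string such as '200M' into a byte count."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.upper())
    if match is None:
        raise ValueError(f"invalid size: {text!r}")
    return int(match.group(1)) * _SIZE_UNITS[match.group(2)]


def parse_reclaim_request(request):
    """Return (nr_bytes, ReclaimType) for a memory.reclaim-style string."""
    fields = request.split()
    if not fields:
        raise ValueError("empty request")
    nr_bytes = parse_size(fields[0])

    types = ALL_TYPES
    for arg in fields[1:]:
        key, sep, value = arg.partition("=")
        if not sep or key not in _TYPE_BY_KEY:
            raise ValueError(f"unknown argument: {arg!r}")
        flag = _TYPE_BY_KEY[key]
        if value == "yes":
            types |= flag
        elif value == "no":
            types &= ~flag
        else:
            raise ValueError(f"invalid value for {key}: {value!r}")
    return nr_bytes, types
```

With this sketch, parse_reclaim_request("200M file=no") leaves only the
anon and slab bits set, which is the "avoid file pages but allow
everything else" case discussed above. A future value such as
file=clean would need more than one bit per type, so it is deliberately
rejected here.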