From: Yu Zhao
Date: Mon, 21 Feb 2022 18:47:25 -0700
Subject: Re: [PATCH v7 12/12] mm: multigenerational LRU: documentation
To: Mike Rapoport
Cc: Andrew Morton, Johannes Weiner, Mel Gorman, Michal Hocko, Andi Kleen,
    Aneesh Kumar, Barry Song <21cnbao@gmail.com>, Catalin Marinas,
    Dave Hansen, Hillf Danton, Jens Axboe, Jesse Barnes, Jonathan Corbet,
    Linus Torvalds, Matthew Wilcox, Michael Larabel, Rik van Riel,
    Vlastimil Babka, Will Deacon, Ying Huang, Linux ARM,
    "open list:DOCUMENTATION", linux-kernel, Linux-MM,
    Kernel Page Reclaim v2, "the arch/x86 maintainers", Brian Geffon,
    Jan Alexander Steffens, Oleksandr Natalenko, Steven Barrett,
    Suleiman Souhlal, Daniel Byrne, Donald Carr, Holger Hoffstätte,
    Konstantin Kharlamov, Shuang Zhai, Sofia Trinh
References: <20220208081902.3550911-1-yuzhao@google.com>
    <20220208081902.3550911-13-yuzhao@google.com>

On Mon, Feb 21, 2022 at 2:02 AM Mike Rapoport wrote:
>
> On Tue, Feb 15, 2022 at 08:22:10PM -0700, Yu Zhao wrote:
> > On Mon, Feb 14, 2022 at 12:28:56PM +0200, Mike Rapoport wrote:
> >
> > > > +====== ========
> > > > +Values Features
> > > > +====== ========
> > > > +0x0001 the multigenerational LRU
> > >
> > > The multigenerational LRU what?
> >
> > Itself? This depends on the POV, and I'm trying to determine what would
> > be the natural way to present it.
> >
> > MGLRU itself could be seen as an add-on atop the existing page reclaim
> > or an alternative in parallel. The latter would be similar to sl[aou]b,
> > and that's how I personally see it.
> >
> > But here I presented it more like the former because I feel this way is
> > more natural to users because they are like switches on a single panel.
>
> Then I think it should be described as "enable multigenerational LRU" or
> something like this.

Will do.

> > > What will happen if I write 0x2 to this file?
> >
> > Just like turning on a branch breaker while leaving the main breaker
> > off in a circuit breaker box. This is how I see it, and I'm totally
> > fine with changing it to whatever you'd recommend.
>
> That was my guess that when bit 0 is clear the rest do not matter :)
> What's important, IMO, is that it is stated explicitly in the description.

Will do.

> > > Please consider splitting "enable" and "features" attributes.
> >
> > How about s/Features/Components/?
>
> I meant to use two attributes:
>
> /sys/kernel/mm/lru_gen/enable for the main breaker, and
> /sys/kernel/mm/lru_gen/features (or components) for the branch breakers

It's a bit superfluous for my taste. I generally consider multiple
items to fall into the same category if they can be expressed by a
type of array, and I usually pack an array into a single file.

From your last review, I gauged this would be too overloaded for your
taste. So I'd be happy to make the change if you think two files look
more intuitive from a user's perspective.

> > > > +0x0002 clear the accessed bit in leaf page table entries **in large
> > > > +       batches**, when MMU sets it (e.g., on x86)
> > >
> > > Is extra markup really needed here...
> > >
> > > > +0x0004 clear the accessed bit in non-leaf page table entries **as
> > > > +       well**, when MMU sets it (e.g., on x86)
> > >
> > > ... and here?
> >
> > Will do.
> >
> > > As for the descriptions, what is the user-visible effect of these features?
> > > How different modes of clearing the access bit are reflected in, say, GUI
> > > responsiveness, database TPS, or probability of OOM?
> >
> > These remain to be seen :) I just added these switches in v7, per Mel's
> > request from the meeting we had. These were never tested in the field.
>
> I see :)
>
> It would be nice to have a description and/or examples of user-visible
> effects when there will be some insight on what these features do.

How does the following sound?

Clearing the accessed bit in large batches can theoretically cause
lock contention (mmap_lock), and if that happens the 0x0002 switch can
disable this feature. In this case the multigenerational LRU suffers a
minor performance degradation.

Clearing the accessed bit in non-leaf page table entries was only
verified on Intel and AMD, and if it causes problems on other x86
varieties the 0x0004 switch can disable this feature.
In this case the multigenerational LRU suffers a negligible performance
degradation.

> > > > +:Debugfs interface: ``/sys/kernel/debug/lru_gen`` has the following
> > >
> > > Is debugfs interface relevant only for datacenters?
> >
> > For the moment, yes.
>
> And what will happen if somebody uses these interfaces outside
> datacenters? As soon as there is a sysfs interface, somebody will surely
> play with it.
>
> I think the job schedulers might be the most important user of that
> interface, but the documentation should not presume it is the only user.

Other ideas are more like brainstorming than concrete use cases, e.g.,
for desktop users, these interfaces can in theory speed up hibernation
(suspend to disk); for VM users, they can again in theory support auto
ballooning. These niches are really minor and less explored compared
with the data center use cases, which have been dominant.

I was hoping we could focus on the essentials and take one step at a
time. Later on, if there is additional demand and resources, we can
expand to cover more use cases.

> > > > + job scheduler writes to this file at a certain time interval to
> > > > + create new generations, and it ranks available servers based on the
> > > > + sizes of their cold memory defined by this time interval. For
> > > > + proactive reclaim, a job scheduler writes to this file before it
> > > > + tries to land a new job, and if it fails to materialize the cold
> > > > + memory without impacting the existing jobs, it retries on the next
> > > > + server according to the ranking result.
> > >
> > > Is this knob only relevant for a job scheduler? Or it can be used in other
> > > use-cases as well?
> >
> > There are other concrete use cases but I'm not ready to discuss them
> > yet.
>
> Here as well, as soon as there is an interface it's not necessarily "job
> scheduler" that will "write to this file", anybody can write to that file.
> Please adjust the documentation to be more neutral regarding the use-cases.
Will do.
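
As a rough illustration of the breaker analogy discussed above, the
sketch below decodes a value written to the switch file into the set of
features that would actually be active. It assumes, as agreed in this
thread, that bits 0x0002 and 0x0004 have no effect while bit 0x0001 is
clear. The names LruGenCaps, CORE, LEAF_BATCH, NONLEAF and
effective_features are hypothetical and exist only for this example;
nothing here touches a real sysfs file, and the final kernel semantics
may differ.

```python
from enum import IntFlag


class LruGenCaps(IntFlag):
    """Hypothetical names for the bit values listed in the quoted table."""
    CORE = 0x0001        # the multigenerational LRU itself ("main breaker")
    LEAF_BATCH = 0x0002  # clear accessed bit in leaf PTEs in large batches
    NONLEAF = 0x0004     # clear accessed bit in non-leaf entries as well


def effective_features(value: int) -> LruGenCaps:
    """Return the features that take effect for a written bitmask.

    Assumption from the thread: when bit 0 (CORE) is clear, the
    remaining bits do not matter, so the result is empty.
    """
    # Keep only the three bits described in the table.
    caps = LruGenCaps(value & 0x0007)
    if not caps & LruGenCaps.CORE:
        return LruGenCaps(0)
    return caps


if __name__ == "__main__":
    for v in (0x0, 0x1, 0x2, 0x5, 0x7):
        print(f"{v:#06x} -> {effective_features(v)!r}")
```

Under these assumptions, effective_features(0x2) yields an empty set,
which is the "branch breaker on, main breaker off" case raised in the
review.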