From: Arnd Bergmann
Date: Mon, 9 Mar 2020 16:04:54 +0100
Subject: Re: [PATCH] vfs: keep inodes with page cache off the inode shrinker LRU
To: Russell King - ARM Linux admin
Cc: Nishanth Menon, Santosh Shilimkar, Tero Kristo, Linux ARM,
 Michal Hocko, Rik van Riel, Catalin Marinas, Dave Chinner,
 Linux Kernel Mailing List, Linux-MM, Yafang Shao, Al Viro,
 Johannes Weiner, linux-fsdevel, kernel-team@fb.com,
 Kishon Vijay Abraham I, Linus Torvalds, Andrew Morton, Roman Gushchin
In-Reply-To: <20200309140439.GL25745@shell.armlinux.org.uk>
References: <20200212085004.GL25745@shell.armlinux.org.uk>
 <671b05bc-7237-7422-3ece-f1a4a3652c92@oracle.com>
 <7c4c1459-60d5-24c8-6eb9-da299ead99ea@oracle.com>
 <20200306203439.peytghdqragjfhdx@kahuna>
 <20200308141923.GI25745@shell.armlinux.org.uk>
 <20200309140439.GL25745@shell.armlinux.org.uk>
Content-Type: text/plain; charset="UTF-8"
On Mon, Mar 9, 2020 at 3:05 PM Russell King - ARM Linux admin wrote:
>
> On Mon, Mar 09, 2020 at 02:33:26PM +0100, Arnd Bergmann wrote:
> > On Sun, Mar 8, 2020 at 3:20 PM Russell King - ARM Linux admin wrote:
> > > On Sun, Mar 08, 2020 at 11:58:52AM +0100, Arnd Bergmann wrote:
> > > > On Fri, Mar 6, 2020 at 9:36 PM Nishanth Menon wrote:
> > > > > On 13:11-20200226, santosh.shilimkar@oracle.com wrote:
> > > >
> > > > - extend zswap to use all the available high memory for swap space
> > > >   when highmem is disabled.
> > >
> > > I don't think that's a good idea. Running debian stable kernels on my
> > > 8GB laptop, I have problems when leaving firefox running long before
> > > even half the 16GB of swap gets consumed - the entire machine slows
> > > down very quickly when it starts swapping more than about 2 or so GB.
> > > It seems the kernel has become quite bad at selecting pages to
> > > evict.
> > >
> > > It gets to the point where any git operation has a battle to fight
> > > for RAM, despite not touching anything other than git.
> > >
> > > The behaviour is much like firefox is locking memory into core, but
> > > that doesn't seem to be what's actually going on. I've never really
> > > got to the bottom of it though.
> > >
> > > This is with a 64-bit kernel and userspace.
> >
> > I agree there is something going wrong on your machine, but I
> > don't really see how that relates to my suggestion.
>
> You are suggesting that a 4GB machine use 2GB of RAM for normal
> usage via an optimised virtual space layout, and 2GB of RAM for
> swap using zswap, rather than having all 4GB of RAM available via
> the present highmem/lowmem system.

No, that would not be good. The cases where I would hope to get
improvements out of zswap are:

- 1GB of RAM with VMSPLIT_3G, when VMSPLIT_3G_OPT and VMSPLIT_2G
  don't work because of user address space requirements
- 2GB of RAM with VMSPLIT_2G
- 4GB of RAM if we add VMSPLIT_4G_4G

> > - A lot of embedded systems are configured to have no swap at all,
> >   which can be for good or not-so-good reasons. Having some
> >   swap space available often improves things, even if it comes
> >   out of RAM.
>
> How do you come up with that assertion? What is the evidence behind
> it?

The idea of zswap is that it's faster to compress/uncompress data
than to actually access a slow disk. So if you already have a swap
space, it gives you another performance tier in between direct-mapped
pages and the slow swap. If you don't have a physical swap space,
then reserving a little bit of RAM for compressed swap means that
rarely used pages take up less space, and you end up with more RAM
available for the workload you want to run.
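To put rough numbers on that last point, here is a back-of-the-envelope sketch of the effective working set when a slice of RAM backs compressed swap. The sizes and the 2:1 compression ratio are illustrative assumptions, not measurements from any real workload:

```shell
#!/bin/sh
# Effective memory when part of RAM is reserved as a compressed swap pool.
# All numbers below are illustrative assumptions, not measurements.
ram_mb=2048          # total RAM
zswap_pool_mb=256    # RAM set aside for the compressed pool
ratio=2              # assumed 2:1 compression ratio

# Uncompressed pages + (pool size * ratio) worth of compressed pages:
effective_mb=$(( ram_mb - zswap_pool_mb + zswap_pool_mb * ratio ))
echo "effective working set: ${effective_mb} MB"
```

With these assumed numbers, giving up 12% of a 2GB machine buys roughly an extra 256MB of effective space (2048 - 256 + 512 = 2304 MB), at the cost of decompression latency whenever one of those pages is touched.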
> This is kind of the crux of my point in the previous email: Linux
> with swap performs way worse for me - if I had 16GB of RAM in my
> laptop, I bet it would perform better than my current 8GB with a
> 16GB swap file - where, when the swap file gets around 8GB full,
> the system as a whole starts to struggle.
>
> That's about a 50/50 split of VM space between RAM and swap.

As I said above, I agree that very few workloads would behave better
using 1.75GB of RAM plus 2.25GB of zswap (storing maybe 6GB of data)
compared to highmem. To deal with 4GB systems, we probably need
either highmem or VMSPLIT_4G_4G.

> > - A particularly important case to optimize for is 2GB of RAM with
> >   LPAE enabled. With CONFIG_VMSPLIT_2G and highmem, this
> >   leads to a paradoxical -ENOMEM when the 256MB of highmem are
> >   full while plenty of lowmem is available. With highmem disabled,
> >   you avoid that at the cost of losing 12% of RAM.
>
> What happened to requests for memory from highmem being able to be
> sourced from lowmem if highmem wasn't available? That used to be
> standard kernel behaviour.

AFAICT this is how it's supposed to work, but for some reason it
doesn't always. I don't know the details, but have heard of recent
complaints about it. I don't think it's the actual get_free_pages
call failing, but rather some heuristic looking at the number of
free pages.

> > - With 4GB+ of RAM and CONFIG_VMSPLIT_2G or
> >   CONFIG_VMSPLIT_3G, using gigabytes of RAM for swap
> >   space would usually be worse than highmem, but once
> >   we have VMSPLIT_4G_4G, it's the same situation as above,
> >   with 6% of RAM used for zswap instead of highmem.
>
> I think the chances of that happening are very slim - I doubt there
> is the will to invest the energy amongst what is left of the 32-bit
> ARM community.

Right. But I think it makes sense to discuss what it would take to
do it anyway, and to see who would be interested in funding or
implementing VMSPLIT_4G_4G.
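For reference, the user/kernel address space division behind each of the VMSPLIT options discussed above follows directly from its PAGE_OFFSET value. The values below are the standard arch/arm Kconfig choices; the sketch deliberately ignores the vmalloc region carved out of the kernel half, so the real lowmem limit is a couple of hundred MB lower than the kernel-side number printed:

```shell
#!/bin/sh
# User vs. kernel address space for the common ARM VMSPLIT options.
# PAGE_OFFSET values are the standard arch/arm Kconfig choices; the
# vmalloc area inside the kernel half is ignored in this sketch.
for entry in VMSPLIT_3G:0xC0000000 VMSPLIT_3G_OPT:0xB0000000 \
             VMSPLIT_2G:0x80000000 VMSPLIT_1G:0x40000000; do
  name=${entry%%:*}
  page_offset=${entry##*:}
  user_mb=$(( page_offset / 1048576 ))                    # below PAGE_OFFSET
  kernel_mb=$(( (0x100000000 - page_offset) / 1048576 ))  # above PAGE_OFFSET
  printf '%-14s user %4d MB, kernel (lowmem + vmalloc) %4d MB\n' \
         "$name" "$user_mb" "$kernel_mb"
done
```

This makes the tradeoff in the 2GB LPAE case visible: VMSPLIT_2G gives the kernel a 2048MB half, so nearly all of 2GB of RAM fits in lowmem, while VMSPLIT_3G leaves only a 1024MB kernel half and pushes the excess into highmem.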
Whether it happens or not comes down to another tradeoff: without
it, we have to keep highmem around for a much longer time to support
systems with 4GB of RAM along with systems that need both 2GB of
physical RAM and 3GB of user address space, while adding
VMSPLIT_4G_4G soon means we can probably kill off highmem after
everybody with 8GB of RAM or more has stopped upgrading kernels.
Unlike the 2GB case, this is something we can realistically plan
for.

What is going to be much harder, I fear, is finding someone to
implement it on MIPS32, which seems to be a decade ahead of 32-bit
ARM in its decline, and also has a small number of users with 4GB or
more; architecturally it seems harder to implement, or impossible,
depending on the type of MIPS MMU.

      Arnd