From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20200212085004.GL25745@shell.armlinux.org.uk>
 <671b05bc-7237-7422-3ece-f1a4a3652c92@oracle.com>
 <7c4c1459-60d5-24c8-6eb9-da299ead99ea@oracle.com>
 <20200306203439.peytghdqragjfhdx@kahuna>
 <20200308141923.GI25745@shell.armlinux.org.uk>
 <20200309140439.GL25745@shell.armlinux.org.uk>
In-Reply-To: <20200309140439.GL25745@shell.armlinux.org.uk>
From: Arnd Bergmann
Date: Mon, 9 Mar 2020 16:04:54 +0100
Subject: Re: [PATCH] vfs: keep inodes with page cache off the inode shrinker LRU
To: Russell King - ARM Linux admin
Cc: Nishanth Menon, Santosh Shilimkar, Tero Kristo, Linux ARM,
 Michal Hocko, Rik van Riel, Catalin Marinas, Santosh Shilimkar,
 Dave Chinner, Linux Kernel Mailing List, Linux-MM, Yafang Shao,
 Al Viro, Johannes Weiner, linux-fsdevel, kernel-team@fb.com,
 Kishon Vijay Abraham I, Linus Torvalds, Andrew Morton, Roman Gushchin
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 9, 2020 at 3:05 PM Russell King - ARM Linux
admin wrote:
>
> On Mon, Mar 09, 2020 at 02:33:26PM +0100, Arnd Bergmann wrote:
> > On Sun, Mar 8, 2020 at 3:20 PM Russell King - ARM Linux admin
> > wrote:
> > > On Sun, Mar 08, 2020 at 11:58:52AM +0100, Arnd Bergmann wrote:
> > > > On Fri, Mar 6, 2020 at 9:36 PM Nishanth Menon wrote:
> > > > > On 13:11-20200226, santosh.shilimkar@oracle.com wrote:
> > >
> > > > - extend zswap to use all the available high memory for swap space
> > > >   when highmem is disabled.
> > >
> > > I don't think that's a good idea. Running debian stable kernels on my
> > > 8GB laptop, I have problems when leaving firefox running long before
> > > even half the 16GB of swap gets consumed - the entire machine slows
> > > down very quickly when it starts swapping more than about 2 or so GB.
> > > It seems either the kernel has become quite bad at selecting pages to
> > > evict.
> > >
> > > It gets to the point where any git operation has a battle to fight
> > > for RAM, despite not touching anything else other than git.
> > >
> > > The behaviour is much like firefox is locking memory into core, but
> > > that doesn't seem to be what's actually going on. I've never really
> > > got to the bottom of it though.
> > >
> > > This is with 64-bit kernel and userspace.
> >
> > I agree there is something going wrong on your machine, but I
> > don't really see how that relates to my suggestion.
>
> You are suggesting for a 4GB machine to use 2GB of RAM for normal
> usage via an optimised virtual space layout, and 2GB of RAM for
> swap using ZSWAP, rather than having 4GB of RAM available via the
> present highmem / lowmem system.

No, that would not be good.
The cases where I would hope to get improvements out of zswap are:

- 1GB of RAM with VMSPLIT_3G, when VMSPLIT_3G_OPT and VMSPLIT_2G
  don't work because of user address space requirements

- 2GB of RAM with VMSPLIT_2G

- 4GB of RAM if we add VMSPLIT_4G_4G

> > - A lot of embedded systems are configured to have no swap at all,
> >   which can be for good or not-so-good reasons. Having some
> >   swap space available often improves things, even if it comes
> >   out of RAM.
>
> How do you come up with that assertion? What is the evidence behind
> it?

The idea of zswap is that it's faster to compress/uncompress
data than to actually access a slow disk. So if you already have
a swap space, it gives you another performance tier in between
direct-mapped pages and the slow swap.

If you don't have a physical swap space, then reserving a little
bit of RAM for compressed swap means that rarely used pages
take up less space and you end up with more RAM available
for the workload you want to run.

> This is kind of the crux of my point in the previous email: Linux
> with swap performs way worse for me - if I had 16GB of RAM in my
> laptop, I bet it would perform better than my current 8GB with a
> 16GB swap file - where, when the swap file gets around 8GB full,
> the system as a whole starts to struggle.
>
> That's about a 50/50 split of VM space between RAM and swap.

As I said above, I agree that very few workloads would behave better
from using 1.75GB RAM plus 2.25GB zswap (storing maybe 6GB of data)
compared to highmem. To deal with 4GB systems, we probably need either
highmem or VMSPLIT_4G_4G.

> > - A particularly important case to optimize for is 2GB of RAM with
> >   LPAE enabled. With CONFIG_VMSPLIT_2G and highmem, this
> >   leads to the paradox -ENOMEM when 256MB of highmem are
> >   full while plenty of lowmem is available. With highmem disabled,
> >   you avoid that at the cost of losing 12% of RAM.
>
> What happened to requests for memory from highmem being able to be
> sourced from lowmem if highmem wasn't available? That used to be
> standard kernel behaviour.

AFAICT this is how it's supposed to work, but for some reason it
doesn't always. I don't know the details, but have heard of recent
complaints about it. I don't think it's the actual get_free_pages
failing, but rather some heuristic looking at the number of free pages.

> > - With 4GB+ of RAM and CONFIG_VMSPLIT_2G or
> >   CONFIG_VMSPLIT_3G, using gigabytes of RAM for swap
> >   space would usually be worse than highmem, but once
> >   we have VMSPLIT_4G_4G, it's the same situation as above
> >   with 6% of RAM used for zswap instead of highmem.
>
> I think the chances of that happening are very slim - I doubt there
> is the will to invest the energy amongst what is left of the 32-bit
> ARM community.

Right. But I think it makes sense to discuss what it would take
to do it anyway, and to see who would be interested in funding or
implementing VMSPLIT_4G_4G. Whether it happens or not comes
down to another tradeoff: Without it, we have to keep highmem around
for a much longer time to support systems with 4GB of RAM along with
systems that need both 2GB of physical RAM and 3GB of user address
space, while adding VMSPLIT_4G_4G soon means we can probably
kill off highmem after everybody with 8GB of RAM or more has
stopped upgrading kernels. Unlike the 2GB case, this is something
we can realistically plan for.

What is going to be much harder, I fear, is finding someone to
implement it on MIPS32, which seems to be a decade ahead of 32-bit
ARM in its decline and also has a small number of users with 4GB or
more. Architecturally it also seems harder to implement, or even
impossible, depending on the type of MIPS MMU.

       Arnd
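
The figures quoted in this thread (12% of RAM lost with 2GB and
VMSPLIT_2G, 6% with 4GB and VMSPLIT_4G_4G, and the 1.75GB plus 2.25GB
split for a 4GB machine) can be reproduced with a little arithmetic.
The sketch below is only an illustration. It assumes that roughly
256MB of the kernel's virtual address space is reserved for vmalloc and
similar mappings; that figure is not stated in the mail, but it is
consistent with the numbers given there. The function names are made up
for this example and are not kernel interfaces.

```python
# Assumption (not stated in the mail): about 256 MiB of the kernel's
# virtual address space is reserved for vmalloc and similar mappings,
# so it cannot be used to map RAM directly.
RESERVED_KERNEL_VA_MIB = 256


def lowmem_mib(ram_mib, kernel_va_mib):
    """RAM the kernel can map directly, capped by its usable address space."""
    return min(ram_mib, kernel_va_mib - RESERVED_KERNEL_VA_MIB)


def beyond_lowmem_mib(ram_mib, kernel_va_mib):
    """RAM that needs highmem, or is lost (or left for zswap) without it."""
    return ram_mib - lowmem_mib(ram_mib, kernel_va_mib)


def percent_of_ram(part_mib, ram_mib):
    """Express part_mib as a percentage of total RAM."""
    return 100.0 * part_mib / ram_mib


def implied_compression_ratio(data_mib, pool_mib):
    """Compression ratio needed to store data_mib in a pool of pool_mib."""
    return data_mib / pool_mib


if __name__ == "__main__":
    # (label, RAM in MiB, kernel virtual address space in MiB)
    cases = [
        ("2GB RAM, VMSPLIT_2G", 2 * 1024, 2 * 1024),
        # Hypothetical: assumes VMSPLIT_4G_4G gives the kernel a full 4 GiB.
        ("4GB RAM, VMSPLIT_4G_4G", 4 * 1024, 4 * 1024),
        ("4GB RAM, VMSPLIT_2G", 4 * 1024, 2 * 1024),
    ]
    for label, ram, kernel_va in cases:
        extra = beyond_lowmem_mib(ram, kernel_va)
        pct = percent_of_ram(extra, ram)
        print(f"{label}: {extra} MiB beyond lowmem ({pct:.2f}% of RAM)")

    # "1.75GB RAM plus 2.25GB zswap (storing maybe 6GB of data)"
    ratio = implied_compression_ratio(6 * 1024, beyond_lowmem_mib(4096, 2048))
    print(f"implied zswap compression ratio: {ratio:.2f}:1")
```

Under that assumption the first two cases both leave 256 MiB beyond
lowmem, which is 12.5% of 2GB and 6.25% of 4GB (the mail rounds these to
12% and 6%). The third case gives the 1.75GB / 2.25GB split, and storing
6GB in the 2.25GB remainder implies a compression ratio of about 2.67:1.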