From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2DCF1C3F2D1 for ; Mon, 9 Mar 2020 19:46:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0B26B24654 for ; Mon, 9 Mar 2020 19:46:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726569AbgCITqi (ORCPT ); Mon, 9 Mar 2020 15:46:38 -0400 Received: from mout.kundenserver.de ([212.227.126.131]:46091 "EHLO mout.kundenserver.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725992AbgCITqi (ORCPT ); Mon, 9 Mar 2020 15:46:38 -0400 Received: from mail-qk1-f181.google.com ([209.85.222.181]) by mrelayeu.kundenserver.de (mreue009 [212.227.15.129]) with ESMTPSA (Nemesis) id 1MMGZS-1isD500wxK-00JM4f; Mon, 09 Mar 2020 20:46:36 +0100 Received: by mail-qk1-f181.google.com with SMTP id c145so4695230qke.12; Mon, 09 Mar 2020 12:46:35 -0700 (PDT) X-Gm-Message-State: ANhLgQ3rq6yzydjwfH0UH1dKg+FjDRwxNNz1MOIEGKZue9mEMqQln0ao MvGgtLDZC+DAzcC1iQE1mJ6Li6ZvtnQs/WBOCCc= X-Google-Smtp-Source: ADFU+vvmBvMz+V6RDUU++ZwSagdLLK5RIhqNTJ7v8Q8rLCrbads695yLqa65IiOBHxFXCnfQzEWO8L12fWPy8qQjUb0= X-Received: by 2002:a37:6455:: with SMTP id y82mr7117037qkb.286.1583783194930; Mon, 09 Mar 2020 12:46:34 -0700 (PDT) MIME-Version: 1.0 References: <20200211164701.4ac88d9222e23d1e8cc57c51@linux-foundation.org> <20200212085004.GL25745@shell.armlinux.org.uk> <671b05bc-7237-7422-3ece-f1a4a3652c92@oracle.com> <7c4c1459-60d5-24c8-6eb9-da299ead99ea@oracle.com> <20200306203439.peytghdqragjfhdx@kahuna> <20200309155945.GA4124965@arrakis.emea.arm.com> 
<20200309160919.GM25745@shell.armlinux.org.uk> In-Reply-To: <20200309160919.GM25745@shell.armlinux.org.uk> From: Arnd Bergmann Date: Mon, 9 Mar 2020 20:46:18 +0100 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH] vfs: keep inodes with page cache off the inode shrinker LRU To: Russell King - ARM Linux admin Cc: Catalin Marinas , Nishanth Menon , Santosh Shilimkar , Tero Kristo , Linux ARM , Michal Hocko , Rik van Riel , Santosh Shilimkar , Dave Chinner , Linux Kernel Mailing List , Linux-MM , Yafang Shao , Al Viro , Johannes Weiner , linux-fsdevel , kernel-team@fb.com, Kishon Vijay Abraham I , Linus Torvalds , Andrew Morton , Roman Gushchin Content-Type: text/plain; charset="UTF-8" X-Provags-ID: V03:K1:bBz45YywharMfMeD3eAylez7Ounvj65HPQEEQib3QyJAizH9t4T +vrg8YPRfP/wkc3x9uVrXC2fg1vn05X8acW5S8uqkcfR611JV6haZdbjIhjZOb5VXAEIU7T fybLcm2xzaHPSwnjc1kRg9m+mHJtEcSLp71hOXsnfW8NT9ZsZ+77b5tzERPVbdF5887rNI9 R/z93tWt8v4tBo0ketAlA== X-UI-Out-Filterresults: notjunk:1;V03:K0:3oD3vabjtEQ=:uf5dhXYr9zoCuZp/lYOLA5 PMvgD5mIkR064YxlKT9lB+1ZGq9kjrwQgLDsImiD6z35fFl31b+HKsYetkDDL1KgDNsGs8M7k R1oaj4voGgjyPF+t/AH1FC1GG21eGzAlGvdYZUA/ttauCDUVjk4pmuHZ6VyeRnl1Ot+RJfeOw TQOmSiRDpq3XKzdxVLTEWSaViFdPTlimJ3fkutFcsSgBQGiZfOqrpOl8TT8JCl1pqupugg9hQ rwCyvTyDjnvaZ7XwMjJN9ItTabaTDn9Vx7UCDraGDVCoBZfxJTNaIGABGpOlw7SUXZUK2IxXc i6cNH6p26cVgZE4vSvRteQJI2n7d69OjCsdybhGfx9TyRi5lJO0DpPdPYu9zEETpKU4chZ4f4 +Y5zL0Sj11hxzvJB7WJ3HNTrjQ36RqNHQJyOk0+BBS8G4lmd+vCsfx0escAzCknKVz+GllMtD T2krPnnyoNH29/Jjzj9+YogN+uuV0TtadjHmXr81PmSab1wmnbQd+PDVzQIBjEdF8Ylv8pHZP T4vscLw+V3j6KzvOhE8jqoOo/Cc915U/2we96med5LcBIDZWVPfqIw9EktqpjyiCb4B6dYrqE UUN260UVYq2b+emfbsHeXhR89Zb9G6y8DkaL4LdlNHbRS4tdEVL/LRBSE7NKbx2IVtICC/I+4 CSNaLWcxGUeMYw8bu9a70dy5x8VswT8xhcLEIApDzhChO/frkrPgWvK8tn3qVK6Of5/a865PQ EMNJbhkUIdHZM31hdABugNVrNCiucPw4T7qtSeT/8yrURvJj/oag3CwyRAN6QBCto4EHOrCdc Hb4ckkyulIukGhi2eVQv4bNmqFLqSnIhrJrsDbsFMERWMa2DQQ= Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: 
linux-kernel@vger.kernel.org

On Mon, Mar 9, 2020 at 5:09 PM Russell King - ARM Linux admin wrote:
> On Mon, Mar 09, 2020 at 03:59:45PM +0000, Catalin Marinas wrote:
> > On Sun, Mar 08, 2020 at 11:58:52AM +0100, Arnd Bergmann wrote:
> > > - revisit CONFIG_VMSPLIT_4G_4G for arm32 (and maybe mips32)
> > >   to see if it can be done, and what the overhead is. This is probably
> > >   more work than the others combined, but also the most promising
> > >   as it allows the most user address space and physical ram to be used.
> >
> > A rough outline of such support (and likely to miss some corner cases):
> >
> > 1. Kernel runs with its own ASID and non-global page tables.
> >
> > 2. Trampoline code on exception entry/exit to handle the TTBR0 switching
> >    between user and kernel.
> >
> > 3. uaccess routines need to be reworked to pin the user pages in memory
> >    (get_user_pages()) and access them via the kernel address space.
> >
> > Point 3 is probably the ugliest and it would introduce a noticeable
> > slowdown in certain syscalls.

There are probably a number of ways to do the basic design. The idea
I had (again, probably missing more corner cases than either of you
two that actually understand the details of the mmu):

- Assuming we have LPAE, run the kernel vmlinux and modules inside
  the vmalloc space, in the top 256MB or 512MB on TTBR1

- Map all the physical RAM (up to 3.75GB) into a reserved ASID
  with TTBR0

- Flip TTBR0 on kernel entry/exit, and again during user access.

This is probably more work to implement than your idea, but
I would hope this has a lower overhead on most microarchitectures
as it doesn't require pinning the pages. Depending on the
microarchitecture, I'd hope the overhead would be comparable
to that of ARM64_SW_TTBR0_PAN.

> We also need to consider that it has implications for the single-kernel
> support; a kernel doing this kind of switching would likely be horrid
> for a kernel supporting v6+ with VIPT aliasing caches. Would we be
> adding a new red line between kernels supporting VIPT-aliasing caches
> (present in earlier v6 implementations) and kernels using this system?

I would initially do it for LPAE only, given that this is already an
incompatible config option. I don't think there are any v6 machines with
more than 1GB of RAM (the maximum for AST2500), and the only distro
that ships a v6+ multiplatform kernel is Raspbian, which in turn needs
a separate LPAE kernel for the large-memory machines anyway.

Only doing it for LPAE would still cover the vast majority of systems that
actually shipped with more than 2GB. There are a couple of exceptions,
i.e. early Cubox i4x4, the Calxeda Highbank developer system and the
Novena Laptop, which I would guess have a limited life expectancy
(before users stop updating kernels) no longer than the 8GB
Keystone-2.

Based on that, I would hope that the ARMv7 distros can keep shipping
the two kernel images they already ship:

- The non-LPAE kernel modified to VMSPLIT_2G_OPT, not using highmem
  on anything up to 2GB, but still supporting the handful of remaining
  Cortex-A9s with 4GB using highmem until they are completely obsolete.

- The LPAE kernel modified to use a newly added VMSPLIT_4G_4G,
  with details to be worked out.

Most new systems tend to be based on Cortex-A7 with no more than 2GB,
so those could run either configuration well. If we find the 2GB of user
address space too limiting for the non-LPAE config, or I missed some
important pre-LPAE systems with 4GB that need to be supported for longer
than other highmem systems, that can probably be added later.
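
For illustration only, the address-space arithmetic behind the layout
sketched above (kernel window at the top of the 32-bit space, everything
below it available to the TTBR0 linear map) could be written out as
follows. This is a minimal sketch, not code from this thread or from the
kernel tree: the macro and function names are made up, and the 512MB
result is simply the same subtraction applied to the other window size
mentioned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; nothing here exists in the kernel. */
#define SKETCH_SZ_1M (UINT64_C(1) << 20)
#define SKETCH_SZ_4G (UINT64_C(1) << 32)

/*
 * Lowest virtual address of a kernel window of window_mb megabytes
 * placed at the very top of a 32-bit virtual address space.
 */
uint64_t kernel_window_start(uint64_t window_mb)
{
	/* The window must be non-empty and smaller than the whole space. */
	assert(window_mb > 0 && window_mb * SKETCH_SZ_1M < SKETCH_SZ_4G);
	return SKETCH_SZ_4G - window_mb * SKETCH_SZ_1M;
}

/*
 * Megabytes of virtual space left below the kernel window, i.e. the
 * most RAM a linear map in that range could cover (assumption: the
 * linear map starts at address 0 and nothing else is reserved).
 */
uint64_t linear_map_mb(uint64_t window_mb)
{
	return kernel_window_start(window_mb) / SKETCH_SZ_1M;
}
```

With a 256MB window, kernel_window_start(256) is 0xF0000000 and
linear_map_mb(256) is 3840, which matches the "up to 3.75GB" figure;
a 512MB window would leave 3584MB (3.5GB) under the same assumptions.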

Arnd