From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 11 Mar 2020 17:26:31 +0000
From: Catalin Marinas
To: Arnd Bergmann
Cc: Russell King - ARM Linux admin, Nishanth Menon, Santosh Shilimkar,
	Tero Kristo, Linux ARM, Michal Hocko, Rik van Riel,
	Santosh Shilimkar, Dave Chinner, Linux Kernel Mailing List,
	Linux-MM, Yafang Shao, Al Viro, Johannes Weiner, linux-fsdevel,
	kernel-team@fb.com, Kishon Vijay Abraham I, Linus Torvalds,
	Andrew Morton, Roman Gushchin
Subject: Re: [PATCH] vfs: keep inodes with page cache off the inode shrinker LRU
Message-ID: <20200311172631.GN3216816@arrakis.emea.arm.com>
References: <671b05bc-7237-7422-3ece-f1a4a3652c92@oracle.com>
 <7c4c1459-60d5-24c8-6eb9-da299ead99ea@oracle.com>
 <20200306203439.peytghdqragjfhdx@kahuna>
 <20200309155945.GA4124965@arrakis.emea.arm.com>
 <20200309160919.GM25745@shell.armlinux.org.uk>
 <20200311142905.GI3216816@arrakis.emea.arm.com>

On Wed, Mar 11, 2020 at 05:59:53PM +0100, Arnd Bergmann wrote:
> On Wed, Mar 11, 2020 at 3:29 PM Catalin Marinas wrote:
> > > > - Flip TTBR0 on kernel entry/exit, and again during user access.
> > >
> > > This is probably more work to implement than your idea, but I would
> > > hope this has a lower overhead on most microarchitectures as it
> > > doesn't require pinning the pages. Depending on the
> > > microarchitecture, I'd hope the overhead would be comparable to
> > > that of ARM64_SW_TTBR0_PAN.
> >
> > This still doesn't solve the copy_{from,to}_user() case where both
> > address spaces need to be available during the copy. So you either
> > pin the user pages in memory and access them via the kernel mapping,
> > or you temporarily map (kmap?) the destination/source kernel address.
> > I'd expect the overhead to be significantly greater than
> > ARM64_SW_TTBR0_PAN for the uaccess routines. For user entry/exit,
> > your suggestion is probably comparable with SW PAN.
>
> Good point, that is indeed a larger overhead. The simplest
> implementation I had in mind would use the code from
> arch/arm/lib/copy_from_user.S and flip ttbr0 between each ldm and stm
> (up to 32 bytes), but I have no idea of the cost of storing to ttbr0,
> so this might be even more expensive. Do you have an estimate of how
> long writing to TTBR0_64 takes on Cortex-A7 and A15, respectively?

I don't have numbers, but it's usually not cheap since you need an ISB
to synchronise the context after the TTBR0 update (basically flushing
the pipeline), and your loop would take two of those per 32-byte block;
see the first sketch at the end.

> Another way might be to use a temporary buffer that is already mapped,
> and add a memcpy() through the L1 cache to reduce the number of ttbr0
> changes. The buffer would probably have to be on the stack, which
> limits the size, but for large copies get_user_pages()+memcpy() may
> end up being faster anyway.

IIRC, the x86 attempt from Ingo some years ago used get_user_pages()
for uaccess. Depending on the size of the buffer, this may be faster
than copying twice.
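To put the per-ldm/stm flipping in concrete terms, each 32-byte block
would pay for two TTBR0 writes, each followed by an ISB. A rough,
untested C sketch (set_ttbr0(), user_ttbr and kernel_ttbr are made-up
names; the real thing would stay in copy_from_user.S and keep the block
in registers rather than on the stack):

#include <linux/string.h>
#include <linux/uaccess.h>

static inline void set_ttbr0(u64 ttbr)
{
	/* TTBR0 is a 64-bit register on LPAE, written via MCRR */
	asm volatile("mcrr p15, 0, %Q0, %R0, c2" : : "r" (ttbr));
	/* the expensive part: synchronise the new translation regime */
	asm volatile("isb" : : : "memory");
}

/*
 * Assumes kernel text/stack stay mapped via TTBR1 while the user
 * window is open; tail bytes (< 32) left to the caller for brevity.
 */
static void copy_from_user_blocks(void *to, const void __user *from,
				  unsigned long n,
				  u64 user_ttbr, u64 kernel_ttbr)
{
	u8 block[32];		/* stands in for r2-r9 in the ldm/stm loop */

	while (n >= sizeof(block)) {
		set_ttbr0(user_ttbr);	/* open the user window: ISB #1 */
		memcpy(block, (const void __force *)from, sizeof(block));
		set_ttbr0(kernel_ttbr);	/* close it again: ISB #2 */
		memcpy(to, block, sizeof(block));
		from += sizeof(block);
		to += sizeof(block);
		n -= sizeof(block);
	}
}

On an in-order core like Cortex-A7, each of those ISBs drains the
pipeline, so I'd expect the two switches to dominate the cost of the
32-byte copy itself.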
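For comparison, the pin-and-copy route needs no TTBR0 writes at all:
pin the user pages once, then copy through the kernel mapping. Rough
shape only (untested; pinned_copy_from_user() is an invented name and
partial-pin/fault handling is reduced to "give up"):

#include <linux/highmem.h>
#include <linux/kernel.h>
#include <linux/mm.h>

/* returns the number of bytes left uncopied, like copy_from_user() */
static long pinned_copy_from_user(void *to, unsigned long uaddr, long n)
{
	while (n > 0) {
		unsigned long offset = uaddr & ~PAGE_MASK;
		long chunk = min_t(long, n, PAGE_SIZE - offset);
		struct page *page;
		void *kaddr;

		if (get_user_pages_fast(uaddr, 1, 0, &page) != 1)
			return n;

		kaddr = kmap_atomic(page);
		memcpy(to, kaddr + offset, chunk);
		kunmap_atomic(kaddr);
		put_page(page);

		to += chunk;
		uaddr += chunk;
		n -= chunk;
	}
	return 0;
}

The gup/kmap cost here is per page rather than per 32 bytes, so whether
it beats the bounce buffer's double memcpy() presumably depends on the
copy size.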
-- 
Catalin