From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-15.3 required=3.0 tests=BAYES_00,
 HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
 MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
 USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 65AE1C636CA
 for ; Fri, 16 Jul 2021 10:02:55 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by mail.kernel.org (Postfix) with ESMTP id 4E991613F0
 for ; Fri, 16 Jul 2021 10:02:55 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S238006AbhGPKFs (ORCPT ); Fri, 16 Jul 2021 06:05:48 -0400
Received: from foss.arm.com ([217.140.110.172]:37230 "EHLO foss.arm.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S234095AbhGPKFq (ORCPT ); Fri, 16 Jul 2021 06:05:46 -0400
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id BC2F06D;
 Fri, 16 Jul 2021 03:02:50 -0700 (PDT)
Received: from [192.168.1.179] (unknown [172.31.20.19])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 338CB3F7D8;
 Fri, 16 Jul 2021 03:02:49 -0700 (PDT)
Subject: Re: [RFC 06/10] arm64/mm: Add FEAT_LPA2 specific encoding
To: Anshuman Khandual , linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: akpm@linux-foundation.org, suzuki.poulose@arm.com, mark.rutland@arm.com,
 will@kernel.org, catalin.marinas@arm.com, maz@kernel.org,
 james.morse@arm.com
References: <1626229291-6569-1-git-send-email-anshuman.khandual@arm.com>
 <1626229291-6569-7-git-send-email-anshuman.khandual@arm.com>
 <9f0d9925-3694-3fae-0d09-00adbecd1878@arm.com>
From: Steven Price
Message-ID:
Date: Fri, 16 Jul 2021 11:02:48 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
 Thunderbird/78.11.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Language: en-GB
Content-Transfer-Encoding: 7bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 16/07/2021 08:20, Anshuman Khandual wrote:
>
>
> On 7/14/21 9:08 PM, Steven Price wrote:
>> On 14/07/2021 03:21, Anshuman Khandual wrote:
>>> FEAT_LPA2 requires different PTE representation formats for both 4K and 16K
>>> page size config. This adds FEAT_LPA2 specific new PTE encodings as per ARM
>>> ARM (0487G.A) which updates [pte|phys]_to_[phys|pte](). The updated helpers
>>> would be used when FEAT_LPA2 gets enabled via CONFIG_ARM64_PA_BITS_52 on 4K
>>> and 16K page size. Although TTBR encoding and phys_to_ttbr() helper remains
>>> the same as FEAT_LPA for FEAT_LPA2 as well. It updates 'phys_to_pte' helper
>>> to accept a temporary variable and changes impacted call sites.
>>>
>>> Signed-off-by: Anshuman Khandual
>>> ---
>>>  arch/arm64/include/asm/assembler.h     | 23 +++++++++++++++++++----
>>>  arch/arm64/include/asm/pgtable-hwdef.h |  4 ++++
>>>  arch/arm64/include/asm/pgtable.h       |  4 ++++
>>>  arch/arm64/kernel/head.S               | 25 +++++++++++++------------
>>>  4 files changed, 40 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
>>> index fedc202..0492543 100644
>>> --- a/arch/arm64/include/asm/assembler.h
>>> +++ b/arch/arm64/include/asm/assembler.h
>>> @@ -606,7 +606,7 @@ alternative_endif
>>>  #endif
>>>         .endm
>>>
>>> -       .macro  phys_to_pte, pte, phys
>>> +       .macro  phys_to_pte, pte, phys, tmp
>>>  #ifdef CONFIG_ARM64_PA_BITS_52_LPA
>>>         /*
>>>          * We assume \phys is 64K aligned and this is guaranteed by only
>>> @@ -614,6 +614,17 @@ alternative_endif
>>>          */
>>>         orr     \pte, \phys, \phys, lsr #36
>>>         and     \pte, \pte, #PTE_ADDR_MASK
>>> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
>>> +       orr     \pte, \phys, \phys, lsr #42
>>> +
>>> +       /*
>>> +        * The 'tmp' is being used here to just prepare
>>> +        * and hold PTE_ADDR_MASK which cannot be passed
>>> +        * to the subsequent 'and' instruction.
>>> +        */
>>> +       mov     \tmp, #PTE_ADDR_LOW
>>> +       orr     \tmp, \tmp, #PTE_ADDR_HIGH
>>> +       and     \pte, \pte, \tmp
>> Rather than adding an extra temporary register (and the fallout of
>> various other macros needing an extra register), this can be done with
>> two AND instructions:
>
> I would really like to get rid of the 'tmp' variable here as
> well but did not figure out any method of accomplishing it.
>
>>
>>      /* PTE_ADDR_MASK cannot be encoded as an immediate, so
>>       * mask off all but two bits, followed by masking the
>>       * extra two bits
>>       */
>>      and     \pte, \pte, #PTE_ADDR_MASK | (3 << 10)
>>      and     \pte, \pte, #~(3 << 10)
>
> Did this change as suggested
>
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -626,9 +626,8 @@ alternative_endif
>          * and hold PTE_ADDR_MASK which cannot be passed
>          * to the subsequent 'and' instruction.
>          */
> -       mov     \tmp, #PTE_ADDR_LOW
> -       orr     \tmp, \tmp, #PTE_ADDR_HIGH
> -       and     \pte, \pte, \tmp
> +       and     \pte, \pte, #PTE_ADDR_MASK | (0x3 << 10)
> +       and     \pte, \pte, #~(0x3 << 10)
>
>  .Lskip_lpa2\@:
>         mov     \pte, \phys
>
>
> but still fails to build (tested on 16K)
>
> arch/arm64/kernel/head.S: Assembler messages:
> arch/arm64/kernel/head.S:377: Error: immediate out of range at operand 3 -- `and x6,x6,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
> arch/arm64/kernel/head.S:390: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
> arch/arm64/kernel/head.S:390: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
> arch/arm64/kernel/head.S:404: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
> arch/arm64/kernel/head.S:404: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
>

Ah, I'd only tested this for 4k. 16k would require a different set of masks.

So the bits we need to cover are those from just below PAGE_SHIFT down to
just above the top of PTE_ADDR_HIGH (bit 10). So we can compute the mask
for both 4k and 16k with GENMASK(PAGE_SHIFT-1, 10):

        and     \pte, \pte, #PTE_ADDR_MASK | GENMASK(PAGE_SHIFT - 1, 10)
        and     \pte, \pte, #~GENMASK(PAGE_SHIFT - 1, 10)

This compiles (for both 4k and 16k) and the assembly looks correct, but
I've not done any other testing.

Steve
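
The mask arithmetic in the final suggestion can be checked outside the kernel tree. What follows is only a sketch under stated assumptions, not code from the patch: `genmask`, `pte_addr_mask` and `is_bitmask_immediate` are hypothetical helper names, the PTE_ADDR_LOW/PTE_ADDR_HIGH layout is taken from the expanded expression in the 16K assembler error above and assumed to follow the same pattern for 4K (PAGE_SHIFT = 12), and the encodability test is my reading of the AArch64 logical-immediate rule (a rotated run of ones in a 2..64-bit element, replicated across the register).

```python
FULL64 = (1 << 64) - 1

def genmask(high, low):
    """Bits high..low set, inclusive (same meaning as the kernel's GENMASK)."""
    return ((1 << (high - low + 1)) - 1) << low

def pte_addr_mask(page_shift):
    """Assumed LPA2 layout: PA bits 49:PAGE_SHIFT in place, PA bits 51:50 at 9:8."""
    pte_addr_low = ((1 << (50 - page_shift)) - 1) << page_shift
    pte_addr_high = 0x3 << 8
    return pte_addr_low | pte_addr_high

def is_bitmask_immediate(value, width=64):
    """True if value looks encodable as an AArch64 logical (bitmask) immediate."""
    full = (1 << width) - 1
    value &= full
    if value in (0, full):
        return False  # all-zeros and all-ones have no encoding
    size = 2
    while size <= width:
        emask = (1 << size) - 1
        elem = value & emask
        replicated = all(((value >> s) & emask) == elem
                         for s in range(0, width, size))
        if replicated and elem not in (0, emask):
            # A rotated contiguous run of ones has exactly two bit
            # transitions when the element is viewed cyclically.
            rotated = ((elem >> 1) | ((elem & 1) << (size - 1))) & emask
            if bin(elem ^ rotated).count("1") == 2:
                return True
        size *= 2
    return False

if __name__ == "__main__":
    for page_shift in (12, 14):  # 4K and 16K pages
        mask = pte_addr_mask(page_shift)
        gap = genmask(page_shift - 1, 10)  # hole between the two fields
        first = mask | gap                 # immediate of the first AND
        second = ~gap & FULL64             # immediate of the second AND
        print(page_shift,
              is_bitmask_immediate(mask),
              is_bitmask_immediate(first),
              is_bitmask_immediate(second),
              (first & second) == mask)
```

Filling the hole between the two address fields turns the first immediate into one contiguous run of bits, and the second AND clears the hole again, so the pair has the same effect as a single AND with PTE_ADDR_MASK without needing a scratch register. The fixed `(3 << 10)` filler only closes the hole when PAGE_SHIFT is 12, which would be consistent with the 16K build failure quoted in the thread.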