Subject: Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary
From: "Leizhen (ThunderTown)"
To: Ard Biesheuvel, Russell King - ARM Linux admin
CC: Jianguo Chen, Kefeng Wang, Catalin Marinas, Daniel Lezcano, linux-kernel, Libin, Thomas Gleixner, Andrew Morton, linux-arm-kernel, patches-armlinux
Date: Mon, 21 Sep 2020 11:34:58 +0800
References: <20200915131615.3138-1-thunder.leizhen@huawei.com> <20200915131615.3138-3-thunder.leizhen@huawei.com> <20200915190143.GP1551@shell.armlinux.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2020/9/17 22:00, Ard Biesheuvel wrote:
> On Tue, 15 Sep 2020 at 22:06, Russell King - ARM Linux admin
> wrote:
>>
>> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote:
>>> Currently, only support the kernels where the base of physical memory is
>>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits
>>> unrotated value. But we can use one more "add/sub" instructions to handle
>>> bits 23-16. The performance will be slightly affected.
>>>
>>> Since most boards meet 16 MiB alignment, so add a new configuration
>>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if
>>> anyone really needs it.
>>>
>>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are
>>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in
>>> the whole head.S file. So choose it.
>>>
>>> Because the calculation of "y = x + __pv_offset[63:24]" have been done,
>>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why
>>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub()
>>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t"
>>> (above y).
>>>
>>> Signed-off-by: Zhen Lei
>>> ---
>>>  arch/arm/Kconfig              | 18 +++++++++++++++++-
>>>  arch/arm/include/asm/memory.h | 16 +++++++++++++---
>>>  arch/arm/kernel/head.S        | 25 +++++++++++++++++++------
>>>  3 files changed, 49 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>>> index e00d94b16658765..19fc2c746e2ce29 100644
>>> --- a/arch/arm/Kconfig
>>> +++ b/arch/arm/Kconfig
>>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT
>>>  	  kernel in system memory.
>>>
>>>  	  This can only be used with non-XIP MMU kernels where the base
>>> -	  of physical memory is at a 16MB boundary.
>>> +	  of physical memory is at a 16MiB boundary.
>>>
>>>  	  Only disable this option if you know that you do not require
>>>  	  this feature (eg, building a kernel for a single machine) and
>>>  	  you need to shrink the kernel to the minimal size.
>>>
>>> +config ARM_PATCH_PHYS_VIRT_RADICAL
>>> +	bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary"
>>> +	default n
>>
>> Please drop the "default n" - this is the default anyway.
>>
>>> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x)
>>>  	 * in place where 'r' 32 bit operand is expected.
>>>  	 */
>>>  	__pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24);
>>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL
>>> +	__pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16);
>>
>> t is already unsigned long, so this cast is not necessary.
>>
>> I've been debating whether it would be better to use "movw" for this
>> for ARMv7. In other words:
>>
>>	movw	tmp, #16-bit
>>	adds	%Q0, %1, tmp, lsl #16
>>	adc	%R0, %R0, #0
>>
>> It would certainly be less instructions, but at the cost of an
>> additional register - and we'd have to change the fixup code to
>> know about movw.
>>
>> Thoughts?
>>
>
> Since LPAE implies v7, we can use movw unconditionally, which is nice.
>
> There is no need to use an additional temp register, as we can use the
> register holding the high word. (There is no need for the mov_hi macro
> to be separate)
>
> 0:	movw	%R0, #low offset >> 16
>	adds	%Q0, %1, %R0, lsl #16
> 1:	mov	%R0, #high offset
>	adc	%R0, %R0, #0
>	.pushsection .pv_table,"a"
>	.long 0b, 1b
>	.popsection
>
> The only problem is distinguishing the two mov instructions from each

We could also use movw for the #high offset; the only difference is two
bytes in the Thumb-2 case. We can encode different imm16 values in the
high_offset and low_offset instructions, so that __fixup_a_pv_table() can
tell them apart. This should make the final implementation clearer and
more consistent, especially for THUMB2. Let me try it.

> other, but that should not be too hard I think.
>
> .
>
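
To make the "different imm16 values" idea above a bit more concrete, here is a
rough user-space model in C. It is only a sketch under assumptions, not the
code in head.S: the tag values PV_TAG_LOW and PV_TAG_HIGH and the helper names
(movw_encode, pv_fixup, pv_add) are made up for illustration, only the A32
(ARM state) MOVW encoding is modelled (Thumb-2 lays out the immediate
differently), and the offset is assumed to be non-negative with a high word
that fits in 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical imm16 tags placed in the unpatched instructions so the
 * fixup code can tell the two movw sites apart. Values are arbitrary. */
#define PV_TAG_LOW  0x0001u   /* site that receives offset bits 31:16 */
#define PV_TAG_HIGH 0x0002u   /* site that receives offset bits 47:32 */

/* A32 MOVW (cond = AL): 1110 0011 0000 imm4 Rd imm12, imm16 = imm4:imm12.
 * Example: movw r0, #0x1234 encodes as 0xE3010234. */
static inline uint32_t movw_encode(unsigned rd, uint16_t imm16)
{
    assert(rd < 16);
    return 0xE3000000u | ((uint32_t)(imm16 & 0xF000u) << 4) |
           ((uint32_t)rd << 12) | (imm16 & 0x0FFFu);
}

static inline int is_movw(uint32_t insn)
{
    return (insn & 0x0FF00000u) == 0x03000000u;
}

static inline uint16_t movw_imm16(uint32_t insn)
{
    return (uint16_t)(((insn >> 4) & 0xF000u) | (insn & 0x0FFFu));
}

static inline uint32_t movw_set_imm16(uint32_t insn, uint16_t imm16)
{
    return (insn & ~0x000F0FFFu) | ((uint32_t)(imm16 & 0xF000u) << 4) |
           (imm16 & 0x0FFFu);
}

/* One-shot fixup of a single table entry: look at the tag, then patch in
 * the matching slice of the 64-bit offset. Returns 0 on success. */
static inline int pv_fixup(uint32_t *insn, uint64_t offset)
{
    if (!is_movw(*insn))
        return -1;
    switch (movw_imm16(*insn)) {
    case PV_TAG_LOW:
        *insn = movw_set_imm16(*insn, (uint16_t)(offset >> 16));
        return 0;
    case PV_TAG_HIGH:
        *insn = movw_set_imm16(*insn, (uint16_t)(offset >> 32));
        return 0;
    default:
        return -1;
    }
}

/* Model of: movw hi, #low; adds lo, virt, hi, lsl #16;
 *           movw hi, #high; adc hi, hi, #0 */
static inline uint64_t pv_add(uint32_t virt, uint32_t insn_low,
                              uint32_t insn_high)
{
    uint32_t lo = virt + ((uint32_t)movw_imm16(insn_low) << 16);
    uint32_t carry = lo < virt;            /* carry out of the 32-bit add */
    uint32_t hi = movw_imm16(insn_high) + carry;
    return ((uint64_t)hi << 32) | lo;
}
```

For example, with an offset of 0x180000000 (64KiB aligned), the two tagged
instructions would be patched to carry 0x8000 and 0x0001, and pv_add() on
0xC0010000 then yields 0x240010000, the same as the plain 64-bit addition.
Because a patched immediate may happen to equal one of the tags, each table
entry must be fixed up exactly once in this scheme.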