* [PATCH v2 0/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary
@ 2020-09-15 13:16 ` Zhen Lei
  0 siblings, 0 replies; 30+ messages in thread
From: Zhen Lei @ 2020-09-15 13:16 UTC (permalink / raw)
To: Daniel Lezcano, Thomas Gleixner, Andrew Morton, Russell King,
    Catalin Marinas, linux-arm-kernel, patches-armlinux, linux-kernel
Cc: Zhen Lei, Libin, Kefeng Wang, Jianguo Chen

v1 --> v2:
Nothing changed, but add mailing list: patches@armlinux.org.uk

v1:
Currently, we only support kernels where the base of physical memory is
at a 16MiB boundary, because the add/sub instructions can only encode an
8-bit unrotated value. But we can use one more "add/sub" instruction to
handle bits 23-16, to support a PHYS_OFFSET aligned at a minimum of a
64KiB boundary. This feature is required at least by some Huawei boards,
such as the Hi1380 board, because the kernel Image is loaded at a 2MiB
boundary.

Zhen Lei (2):
  ARM: fix trivial comments in head.S
  ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary

 arch/arm/Kconfig              | 18 +++++++++++++++++-
 arch/arm/include/asm/memory.h | 16 +++++++++++++---
 arch/arm/kernel/head.S        | 31 ++++++++++++++++++++++---------
 3 files changed, 52 insertions(+), 13 deletions(-)

--
1.8.3
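The 16MiB constraint described in the cover letter comes from the ARM add/sub immediate format: one patched instruction can only apply offset bits 31-24, so the physical base offset must have bits 23-0 clear. A minimal C sketch of that alignment check (helper names are invented for illustration; this is not kernel code):

```c
#include <assert.h>
#include <stdint.h>

/* One add/sub with an 8-bit immediate rotated into bits 31-24 can
 * only apply a p2v offset whose low 24 bits are zero (16MiB aligned). */
static int fits_one_insn(uint32_t pv_offset)
{
	return (pv_offset & 0x00ffffffu) == 0;
}

/* A second add/sub covering bits 23-16 relaxes the requirement to
 * "low 16 bits zero", i.e. 64KiB alignment -- what this series adds. */
static int fits_two_insns(uint32_t pv_offset)
{
	return (pv_offset & 0x0000ffffu) == 0;
}
```

With this model, a kernel loaded at a 2MiB boundary (offset such as 0x20200000) fails the one-instruction check but passes the two-instruction one, matching the Hi1380 case above.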
* [PATCH v2 1/2] ARM: fix trivial comments in head.S
  2020-09-15 13:16 ` Zhen Lei
@ 2020-09-15 13:16 ` Zhen Lei
  -1 siblings, 0 replies; 30+ messages in thread
From: Zhen Lei @ 2020-09-15 13:16 UTC (permalink / raw)
To: Daniel Lezcano, Thomas Gleixner, Andrew Morton, Russell King,
    Catalin Marinas, linux-arm-kernel, patches-armlinux, linux-kernel
Cc: Zhen Lei, Libin, Kefeng Wang, Jianguo Chen

1. Change pv_offset to __pv_offset.
2. Change PHYS_OFFSET to PHYS_PFN_OFFSET.

commit e26a9e00afc4 ("ARM: Better virt_to_page() handling") replaced
__pv_phys_offset with __pv_phys_pfn_offset, but forgot to update the
related PHYS_OFFSET to PHYS_PFN_OFFSET.

#define PHYS_PFN_OFFSET	(__pv_phys_pfn_offset)

Fixes: f52bb722547f ("ARM: mm: Correct virt_to_phys patching for 64 bit physical addresses")
Fixes: e26a9e00afc4 ("ARM: Better virt_to_page() handling")
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm/kernel/head.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index f8904227e7fdc44..02d78c9198d0e8d 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -612,7 +612,7 @@ __fixup_pv_table:
 	add	r6, r6, r3	@ adjust __pv_phys_pfn_offset address
 	add	r7, r7, r3	@ adjust __pv_offset address
 	mov	r0, r8, lsr #PAGE_SHIFT	@ convert to PFN
-	str	r0, [r6]	@ save computed PHYS_OFFSET to __pv_phys_pfn_offset
+	str	r0, [r6]	@ save computed PHYS_PFN_OFFSET to __pv_phys_pfn_offset
 	strcc	ip, [r7, #HIGH_OFFSET]	@ save to __pv_offset high bits
 	mov	r6, r3, lsr #24	@ constant for add/sub instructions
 	teq	r3, r6, lsl #24	@ must be 16MiB aligned
@@ -634,8 +634,8 @@ __fixup_a_pv_table:
 	adr	r0, 3f
 	ldr	r6, [r0]
 	add	r6, r6, r3
-	ldr	r0, [r6, #HIGH_OFFSET]	@ pv_offset high word
-	ldr	r6, [r6, #LOW_OFFSET]	@ pv_offset low word
+	ldr	r0, [r6, #HIGH_OFFSET]	@ __pv_offset high word
+	ldr	r6, [r6, #LOW_OFFSET]	@ __pv_offset low word
 	mov	r6, r6, lsr #24
 	cmn	r0, #1
 #ifdef CONFIG_THUMB2_KERNEL
--
1.8.3
* [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary
  2020-09-15 13:16 ` Zhen Lei
@ 2020-09-15 13:16 ` Zhen Lei
  -1 siblings, 0 replies; 30+ messages in thread
From: Zhen Lei @ 2020-09-15 13:16 UTC (permalink / raw)
To: Daniel Lezcano, Thomas Gleixner, Andrew Morton, Russell King,
    Catalin Marinas, linux-arm-kernel, patches-armlinux, linux-kernel
Cc: Zhen Lei, Libin, Kefeng Wang, Jianguo Chen

Currently, we only support kernels where the base of physical memory is
at a 16MiB boundary, because the add/sub instructions can only encode an
8-bit unrotated value. But we can use one more "add/sub" instruction to
handle bits 23-16. The performance will be slightly affected.

Since most boards meet 16MiB alignment, add a new configuration option
ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y only if you
really need it.

All of r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase)
are used in __fixup_a_pv_table() now, but the callee-saved r11 is not
used anywhere in head.S, so choose it.

Because the calculation of "y = x + __pv_offset[63:24]" has already been
done, we only need to calculate "y = y + __pv_offset[23:16]". That is why
the "to" and "from" parameters of __pv_stub() and __pv_add_carry_stub()
in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t"
(the above y).

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm/Kconfig              | 18 +++++++++++++++++-
 arch/arm/include/asm/memory.h | 16 +++++++++++++---
 arch/arm/kernel/head.S        | 25 +++++++++++++++++++------
 3 files changed, 49 insertions(+), 10 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index e00d94b16658765..19fc2c746e2ce29 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT
 	  kernel in system memory.

 	  This can only be used with non-XIP MMU kernels where the base
-	  of physical memory is at a 16MB boundary.
+	  of physical memory is at a 16MiB boundary.

 	  Only disable this option if you know that you do not require
 	  this feature (eg, building a kernel for a single machine) and
 	  you need to shrink the kernel to the minimal size.

+config ARM_PATCH_PHYS_VIRT_RADICAL
+	bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary"
+	default n
+	depends on ARM_PATCH_PHYS_VIRT
+	depends on !THUMB2_KERNEL
+	help
+	  This can only be used with non-XIP MMU kernels where the base
+	  of physical memory is at a 64KiB boundary.
+
+	  Compared with ARM_PATCH_PHYS_VIRT, one or two more instructions
+	  need to be added to implement the conversion of bits 23-16 of
+	  the VA/PA in phys-to-virt and virt-to-phys. The performance is
+	  slightly affected.
+
+	  If unsure say N here.
+
 config NEED_MACH_IO_H
 	bool
 	help
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 99035b5891ef442..71b3a60eeb1b1c6 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -173,6 +173,7 @@
  * so that all we need to do is modify the 8-bit constant field.
  */
 #define __PV_BITS_31_24	0x81000000
+#define __PV_BITS_23_16	0x00810000
 #define __PV_BITS_7_0	0x81

 extern unsigned long __pv_phys_pfn_offset;
@@ -201,7 +202,7 @@
 	: "=r" (t)					\
 	: "I" (__PV_BITS_7_0))

-#define __pv_add_carry_stub(x, y)			\
+#define __pv_add_carry_stub(x, y, type)			\
 	__asm__ volatile("@ __pv_add_carry_stub\n"	\
 	"1:	adds	%Q0, %1, %2\n"			\
 	"	adc	%R0, %R0, #0\n"			\
@@ -209,7 +210,7 @@
 	"	.long	1b\n"				\
 	"	.popsection\n"				\
 	: "+r" (y)					\
-	: "r" (x), "I" (__PV_BITS_31_24)		\
+	: "r" (x), "I" (type)				\
 	: "cc")

 static inline phys_addr_t __virt_to_phys_nodebug(unsigned long x)
@@ -218,9 +219,15 @@ static inline phys_addr_t __virt_to_phys_nodebug(unsigned long x)

 	if (sizeof(phys_addr_t) == 4) {
 		__pv_stub(x, t, "add", __PV_BITS_31_24);
+#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL
+		__pv_stub(t, t, "add", __PV_BITS_23_16);
+#endif
 	} else {
 		__pv_stub_mov_hi(t);
-		__pv_add_carry_stub(x, t);
+		__pv_add_carry_stub(x, t, __PV_BITS_31_24);
+#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL
+		__pv_add_carry_stub(t, t, __PV_BITS_23_16);
+#endif
 	}
 	return t;
 }
@@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x)
 	 * in place where 'r' 32 bit operand is expected.
 	 */
 	__pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24);
+#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL
+	__pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16);
+#endif
 	return t;
 }

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 02d78c9198d0e8d..d9fb226a24d43ae 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -120,7 +120,7 @@ ENTRY(stext)
 	bl	__fixup_smp
 #endif
 #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
-	bl	__fixup_pv_table
+	bl	__fixup_pv_table	@ r11 will be used
 #endif
 	bl	__create_page_tables
@@ -614,8 +614,13 @@ __fixup_pv_table:
 	mov	r0, r8, lsr #PAGE_SHIFT	@ convert to PFN
 	str	r0, [r6]	@ save computed PHYS_PFN_OFFSET to __pv_phys_pfn_offset
 	strcc	ip, [r7, #HIGH_OFFSET]	@ save to __pv_offset high bits
+#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL
+	mov	r6, r3, lsr #16	@ constant for add/sub instructions
+	teq	r3, r6, lsl #16	@ must be 64KiB aligned
+#else
 	mov	r6, r3, lsr #24	@ constant for add/sub instructions
 	teq	r3, r6, lsl #24	@ must be 16MiB aligned
+#endif
 THUMB(	it	ne		@ cross section branch )
 	bne	__error
 	str	r3, [r7, #LOW_OFFSET]	@ save to __pv_offset low bits
@@ -636,7 +641,9 @@ __fixup_a_pv_table:
 	add	r6, r6, r3
 	ldr	r0, [r6, #HIGH_OFFSET]	@ __pv_offset high word
 	ldr	r6, [r6, #LOW_OFFSET]	@ __pv_offset low word
-	mov	r6, r6, lsr #24
+	mov	r11, r6, lsl #8
+	mov	r11, r11, lsr #24	@ bits 23-16
+	mov	r6, r6, lsr #24		@ bits 31-24
 	cmn	r0, #1
 #ifdef CONFIG_THUMB2_KERNEL
 	moveq	r0, #0x200000	@ set bit 21, mov to mvn instruction
@@ -682,14 +689,20 @@ ARM_BE8(rev16	ip, ip)
 #ifdef CONFIG_CPU_ENDIAN_BE8
 	@ in BE8, we load data in BE, but instructions still in LE
 	bic	ip, ip, #0xff000000
-	tst	ip, #0x000f0000	@ check the rotation field
+	tst	ip, #0x00040000	@ check the rotation field
 	orrne	ip, ip, r6, lsl #24	@ mask in offset bits 31-24
+	tst	ip, #0x00080000	@ check the rotation field
+	orrne	ip, ip, r11, lsl #24	@ mask in offset bits 23-16
+	tst	ip, #0x000f0000	@ check the rotation field
 	biceq	ip, ip, #0x00004000	@ clear bit 22
 	orreq	ip, ip, r0	@ mask in offset bits 7-0
 #else
 	bic	ip, ip, #0x000000ff
-	tst	ip, #0xf00	@ check the rotation field
+	tst	ip, #0x400	@ check the rotation field
 	orrne	ip, ip, r6	@ mask in offset bits 31-24
+	tst	ip, #0x800	@ check the rotation field
+	orrne	ip, ip, r11	@ mask in offset bits 23-16
+	tst	ip, #0xf00	@ check the rotation field
 	biceq	ip, ip, #0x400000	@ clear bit 22
 	orreq	ip, ip, r0	@ mask in offset bits 7-0
 #endif
@@ -705,12 +718,12 @@ ENDPROC(__fixup_a_pv_table)
 3:	.long __pv_offset

 ENTRY(fixup_pv_table)
-	stmfd	sp!, {r4 - r7, lr}
+	stmfd	sp!, {r4 - r7, r11, lr}
 	mov	r3, #0			@ no offset
 	mov	r4, r0			@ r0 = table start
 	add	r5, r0, r1		@ r1 = table size
 	bl	__fixup_a_pv_table
-	ldmfd	sp!, {r4 - r7, pc}
+	ldmfd	sp!, {r4 - r7, r11, pc}
 ENDPROC(fixup_pv_table)

 	.data
--
1.8.3
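The patched conversion this commit message describes can be modelled in plain C. This is a hedged sketch of the 32-bit phys_addr_t case only, not the kernel's actual generated code: the two masked additions stand in for the two runtime-patched add instructions (bits 31-24, then bits 23-16 of __pv_offset).

```c
#include <assert.h>
#include <stdint.h>

/* Model of the CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL virt-to-phys path
 * (32-bit phys_addr_t). Each addition corresponds to one add
 * instruction whose immediate the boot-time fixup code patches in. */
static uint32_t virt_to_phys_patched(uint32_t x, uint32_t pv_offset)
{
	uint32_t t;

	t = x + (pv_offset & 0xff000000u);	/* add t, x, #bits 31-24 */
	t = t + (pv_offset & 0x00ff0000u);	/* add t, t, #bits 23-16 */
	return t;
}
```

Because __fixup_pv_table rejects any offset that is not 64KiB aligned, pv_offset's low 16 bits are always zero, so the two additions together apply the full offset exactly.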
* Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary
  2020-09-15 13:16 ` Zhen Lei
@ 2020-09-15 19:01 ` Russell King - ARM Linux admin
  -1 siblings, 0 replies; 30+ messages in thread
From: Russell King - ARM Linux admin @ 2020-09-15 19:01 UTC (permalink / raw)
To: Zhen Lei
Cc: Daniel Lezcano, Thomas Gleixner, Andrew Morton, Catalin Marinas,
    linux-arm-kernel, patches-armlinux, linux-kernel, Libin,
    Kefeng Wang, Jianguo Chen

On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote:
> Currently, only support the kernels where the base of physical memory is
> at a 16MiB boundary. Because the add/sub instructions only contains 8bits
> unrotated value. But we can use one more "add/sub" instructions to handle
> bits 23-16. The performance will be slightly affected.
>
> Since most boards meet 16 MiB alignment, so add a new configuration
> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if
> anyone really needs it.
>
> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are
> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in
> the whole head.S file. So choose it.
>
> Because the calculation of "y = x + __pv_offset[63:24]" have been done,
> so we only need to calculate "y = y + __pv_offset[23:16]", that's why
> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub()
> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t"
> (above y).
>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  arch/arm/Kconfig              | 18 +++++++++++++++++-
>  arch/arm/include/asm/memory.h | 16 +++++++++++++---
>  arch/arm/kernel/head.S        | 25 +++++++++++++++++++------
>  3 files changed, 49 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index e00d94b16658765..19fc2c746e2ce29 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT
>  	  kernel in system memory.
>
>  	  This can only be used with non-XIP MMU kernels where the base
> -	  of physical memory is at a 16MB boundary.
> +	  of physical memory is at a 16MiB boundary.
>
>  	  Only disable this option if you know that you do not require
>  	  this feature (eg, building a kernel for a single machine) and
>  	  you need to shrink the kernel to the minimal size.
>
> +config ARM_PATCH_PHYS_VIRT_RADICAL
> +	bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary"
> +	default n

Please drop the "default n" - this is the default anyway.

> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x)
>  	 * in place where 'r' 32 bit operand is expected.
>  	 */
>  	__pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24);
> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL
> +	__pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16);

t is already unsigned long, so this cast is not necessary.

I've been debating whether it would be better to use "movw" for this
for ARMv7. In other words:

	movw	tmp, #16-bit
	adds	%Q0, %1, tmp, lsl #16
	adc	%R0, %R0, #0

It would certainly be less instructions, but at the cost of an
additional register - and we'd have to change the fixup code to
know about movw.

Thoughts?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
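The "8bits unrotated value" limitation being discussed here is the A32 data-processing modified-immediate format: an 8-bit value rotated right by an even amount. A small illustrative C sketch of the encodability check (not kernel code) shows why a single add/sub can apply offset bits 31-24 but not an arbitrary 24-bit-wide constant:

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero if v can be an A32 data-processing immediate,
 * i.e. an 8-bit value rotated right by an even amount. We test every
 * even rotation by rotating v left and checking it fits in 8 bits. */
static int encodable_arm_imm(uint32_t v)
{
	for (int rot = 0; rot < 32; rot += 2) {
		uint32_t r = rot ? ((v << rot) | (v >> (32 - rot))) : v;
		if (r <= 0xffu)
			return 1;
	}
	return 0;
}
```

Any value confined to bits 31-24 (such as __PV_BITS_31_24, 0x81000000) or to bits 23-16 (such as __PV_BITS_23_16, 0x00810000) is encodable on its own, but a value spanning both byte ranges is not, hence the series' second patched instruction.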
* Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary
  2020-09-15 19:01 ` Russell King - ARM Linux admin
@ 2020-09-16  1:57 ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 30+ messages in thread
From: Leizhen (ThunderTown) @ 2020-09-16  1:57 UTC (permalink / raw)
To: Russell King - ARM Linux admin
Cc: Daniel Lezcano, Thomas Gleixner, Andrew Morton, Catalin Marinas,
    linux-arm-kernel, patches-armlinux, linux-kernel, Libin,
    Kefeng Wang, Jianguo Chen

On 2020/9/16 3:01, Russell King - ARM Linux admin wrote:
> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote:
>> Currently, only support the kernels where the base of physical memory is
>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits
>> unrotated value. But we can use one more "add/sub" instructions to handle
>> bits 23-16. The performance will be slightly affected.
>>
>> Since most boards meet 16 MiB alignment, so add a new configuration
>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if
>> anyone really needs it.
>>
>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are
>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in
>> the whole head.S file. So choose it.
>>
>> Because the calculation of "y = x + __pv_offset[63:24]" have been done,
>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why
>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub()
>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t"
>> (above y).
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>  arch/arm/Kconfig              | 18 +++++++++++++++++-
>>  arch/arm/include/asm/memory.h | 16 +++++++++++++---
>>  arch/arm/kernel/head.S        | 25 +++++++++++++++++++------
>>  3 files changed, 49 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>> index e00d94b16658765..19fc2c746e2ce29 100644
>> --- a/arch/arm/Kconfig
>> +++ b/arch/arm/Kconfig
>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT
>>  	  kernel in system memory.
>>
>>  	  This can only be used with non-XIP MMU kernels where the base
>> -	  of physical memory is at a 16MB boundary.
>> +	  of physical memory is at a 16MiB boundary.
>>
>>  	  Only disable this option if you know that you do not require
>>  	  this feature (eg, building a kernel for a single machine) and
>>  	  you need to shrink the kernel to the minimal size.
>>
>> +config ARM_PATCH_PHYS_VIRT_RADICAL
>> +	bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary"
>> +	default n
>
> Please drop the "default n" - this is the default anyway.

OK, I will remove it.

>
>> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x)
>>  	 * in place where 'r' 32 bit operand is expected.
>>  	 */
>>  	__pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24);
>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL
>> +	__pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16);
>
> t is already unsigned long, so this cast is not necessary.

Oh, yes, yes, I copied from the above statement, but forgot to remove it.

>
> I've been debating whether it would be better to use "movw" for this
> for ARMv7. In other words:
>
> 	movw	tmp, #16-bit
> 	adds	%Q0, %1, tmp, lsl #16
> 	adc	%R0, %R0, #0
>
> It would certainly be less instructions, but at the cost of an
> additional register - and we'd have to change the fixup code to
> know about movw.

It's one less instruction for a 64KiB boundary && (sizeof(phys_addr_t) == 8),
and no increase or decrease for a 64KiB boundary && (sizeof(phys_addr_t) == 4),
but one more instruction for a 16MiB boundary. And 16MiB is widely used,
while 64KiB is rarely used. So I'm inclined to keep the current revision.

>
> Thoughts?
>
* Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary 2020-09-16 1:57 ` Leizhen (ThunderTown) @ 2020-09-16 7:57 ` Russell King - ARM Linux admin -1 siblings, 0 replies; 30+ messages in thread From: Russell King - ARM Linux admin @ 2020-09-16 7:57 UTC (permalink / raw) To: Leizhen (ThunderTown) Cc: Daniel Lezcano, Thomas Gleixner, Andrew Morton, Catalin Marinas, linux-arm-kernel, patches-armlinux, linux-kernel, Libin, Kefeng Wang, Jianguo Chen On Wed, Sep 16, 2020 at 09:57:15AM +0800, Leizhen (ThunderTown) wrote: > On 2020/9/16 3:01, Russell King - ARM Linux admin wrote: > > On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote: > >> Currently, only support the kernels where the base of physical memory is > >> at a 16MiB boundary. Because the add/sub instructions only contains 8bits > >> unrotated value. But we can use one more "add/sub" instructions to handle > >> bits 23-16. The performance will be slightly affected. > >> > >> Since most boards meet 16 MiB alignment, so add a new configuration > >> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if > >> anyone really needs it. > >> > >> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are > >> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in > >> the whole head.S file. So choose it. > >> > >> Because the calculation of "y = x + __pv_offset[63:24]" have been done, > >> so we only need to calculate "y = y + __pv_offset[23:16]", that's why > >> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub() > >> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t" > >> (above y). 
> >> > >> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> > >> --- > >> arch/arm/Kconfig | 18 +++++++++++++++++- > >> arch/arm/include/asm/memory.h | 16 +++++++++++++--- > >> arch/arm/kernel/head.S | 25 +++++++++++++++++++------ > >> 3 files changed, 49 insertions(+), 10 deletions(-) > >> > >> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig > >> index e00d94b16658765..19fc2c746e2ce29 100644 > >> --- a/arch/arm/Kconfig > >> +++ b/arch/arm/Kconfig > >> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT > >> kernel in system memory. > >> > >> This can only be used with non-XIP MMU kernels where the base > >> - of physical memory is at a 16MB boundary. > >> + of physical memory is at a 16MiB boundary. > >> > >> Only disable this option if you know that you do not require > >> this feature (eg, building a kernel for a single machine) and > >> you need to shrink the kernel to the minimal size. > >> > >> +config ARM_PATCH_PHYS_VIRT_RADICAL > >> + bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary" > >> + default n > > > > Please drop the "default n" - this is the default anyway. > > OK, I will remove it. > > > > >> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x) > >> * in place where 'r' 32 bit operand is expected. > >> */ > >> __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24); > >> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL > >> + __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16); > > > > t is already unsigned long, so this cast is not necessary. > > Oh, yes, yes, I copied from the above statement, but forgot to remove it. > > > > > I've been debating whether it would be better to use "movw" for this > > for ARMv7. In other words: > > > > movw tmp, #16-bit > > adds %Q0, %1, tmp, lsl #16 > > adc %R0, %R0, #0 > > > > It would certainly be less instructions, but at the cost of an > > additional register - and we'd have to change the fixup code to > > know about movw. 
>
> It's one less instruction for 64KiB boundary && (sizeof(phys_addr_t) == 8),
> and no increase or decrease for 64KiB boundary && (sizeof(phys_addr_t) == 4),
> but one more instruction for 16MiB boundary.
>
> And maybe: 16MiB is widely used, but 64KiB is rarely used.
>
> So I'm inclined to the current revision.

Multiplatform kernels (which will be what distros build) will have to
enable this option if they wish to support this platform. So, in that
case it doesn't just impact a single platform, but all platforms.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
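For context on the cost being weighed here: with ARM_PATCH_PHYS_VIRT_RADICAL, the patched 32-bit translation becomes two masked subtractions instead of one, at every call site. A plain C model of the patched sequence (a sketch; the offset value is illustrative):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the patched 32-bit phys-to-virt translation: one sub
 * covering __pv_offset bits 31-24, plus the extra sub for bits 23-16
 * that the RADICAL option adds.  All arithmetic is modulo 2^32, as on
 * the CPU.
 */
static uint32_t phys_to_virt_patched(uint32_t x, uint32_t pv_offset)
{
	uint32_t y = x - (pv_offset & 0xff000000u); /* sub, __PV_BITS_31_24 */

	y -= pv_offset & 0x00ff0000u;               /* sub, __PV_BITS_23_16 */
	return y;
}
```

With a 64KiB-aligned offset (bits 15-0 zero) the two subtractions together recover exactly x - pv_offset.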
* Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary 2020-09-16 7:57 ` Russell King - ARM Linux admin @ 2020-09-17 3:26 ` Leizhen (ThunderTown) -1 siblings, 0 replies; 30+ messages in thread From: Leizhen (ThunderTown) @ 2020-09-17 3:26 UTC (permalink / raw) To: Russell King - ARM Linux admin Cc: Daniel Lezcano, Thomas Gleixner, Andrew Morton, Catalin Marinas, linux-arm-kernel, patches-armlinux, linux-kernel, Libin, Kefeng Wang, Jianguo Chen On 2020/9/16 15:57, Russell King - ARM Linux admin wrote: > On Wed, Sep 16, 2020 at 09:57:15AM +0800, Leizhen (ThunderTown) wrote: >> On 2020/9/16 3:01, Russell King - ARM Linux admin wrote: >>> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote: >>>> Currently, only support the kernels where the base of physical memory is >>>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits >>>> unrotated value. But we can use one more "add/sub" instructions to handle >>>> bits 23-16. The performance will be slightly affected. >>>> >>>> Since most boards meet 16 MiB alignment, so add a new configuration >>>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if >>>> anyone really needs it. >>>> >>>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are >>>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in >>>> the whole head.S file. So choose it. >>>> >>>> Because the calculation of "y = x + __pv_offset[63:24]" have been done, >>>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why >>>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub() >>>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t" >>>> (above y). 
>>>> >>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> >>>> --- >>>> arch/arm/Kconfig | 18 +++++++++++++++++- >>>> arch/arm/include/asm/memory.h | 16 +++++++++++++--- >>>> arch/arm/kernel/head.S | 25 +++++++++++++++++++------ >>>> 3 files changed, 49 insertions(+), 10 deletions(-) >>>> >>>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig >>>> index e00d94b16658765..19fc2c746e2ce29 100644 >>>> --- a/arch/arm/Kconfig >>>> +++ b/arch/arm/Kconfig >>>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT >>>> kernel in system memory. >>>> >>>> This can only be used with non-XIP MMU kernels where the base >>>> - of physical memory is at a 16MB boundary. >>>> + of physical memory is at a 16MiB boundary. >>>> >>>> Only disable this option if you know that you do not require >>>> this feature (eg, building a kernel for a single machine) and >>>> you need to shrink the kernel to the minimal size. >>>> >>>> +config ARM_PATCH_PHYS_VIRT_RADICAL >>>> + bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary" >>>> + default n >>> >>> Please drop the "default n" - this is the default anyway. >> >> OK, I will remove it. >> >>> >>>> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x) >>>> * in place where 'r' 32 bit operand is expected. >>>> */ >>>> __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24); >>>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL >>>> + __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16); >>> >>> t is already unsigned long, so this cast is not necessary. >> >> Oh, yes, yes, I copied from the above statement, but forgot to remove it. >> >>> >>> I've been debating whether it would be better to use "movw" for this >>> for ARMv7. In other words: >>> >>> movw tmp, #16-bit >>> adds %Q0, %1, tmp, lsl #16 >>> adc %R0, %R0, #0 >>> >>> It would certainly be less instructions, but at the cost of an >>> additional register - and we'd have to change the fixup code to >>> know about movw. 
>>
>> It's one less instruction for 64KiB boundary && (sizeof(phys_addr_t) == 8),
>> and no increase or decrease for 64KiB boundary && (sizeof(phys_addr_t) == 4),
>> but one more instruction for 16MiB boundary.
>>
>> And maybe: 16MiB is widely used, but 64KiB is rarely used.
>>
>> So I'm inclined to the current revision.
>
> Multiplatform kernels (which will be what distros build) will have to
> enable this option if they wish to support this platform. So, in that
> case it doesn't just impact a single platform, but all platforms.

I will try movw. But it may take a few days, because I feel the changes
will be fairly large.
* Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary 2020-09-15 19:01 ` Russell King - ARM Linux admin @ 2020-09-17 14:00 ` Ard Biesheuvel -1 siblings, 0 replies; 30+ messages in thread From: Ard Biesheuvel @ 2020-09-17 14:00 UTC (permalink / raw) To: Russell King - ARM Linux admin Cc: Zhen Lei, Jianguo Chen, Kefeng Wang, Catalin Marinas, Daniel Lezcano, linux-kernel, Libin, Thomas Gleixner, Andrew Morton, linux-arm-kernel, patches-armlinux On Tue, 15 Sep 2020 at 22:06, Russell King - ARM Linux admin <linux@armlinux.org.uk> wrote: > > On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote: > > Currently, only support the kernels where the base of physical memory is > > at a 16MiB boundary. Because the add/sub instructions only contains 8bits > > unrotated value. But we can use one more "add/sub" instructions to handle > > bits 23-16. The performance will be slightly affected. > > > > Since most boards meet 16 MiB alignment, so add a new configuration > > option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if > > anyone really needs it. > > > > All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are > > used in __fixup_a_pv_table() now, but the callee saved r11 is not used in > > the whole head.S file. So choose it. > > > > Because the calculation of "y = x + __pv_offset[63:24]" have been done, > > so we only need to calculate "y = y + __pv_offset[23:16]", that's why > > the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub() > > in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t" > > (above y). 
> > > > Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> > > --- > > arch/arm/Kconfig | 18 +++++++++++++++++- > > arch/arm/include/asm/memory.h | 16 +++++++++++++--- > > arch/arm/kernel/head.S | 25 +++++++++++++++++++------ > > 3 files changed, 49 insertions(+), 10 deletions(-) > > > > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig > > index e00d94b16658765..19fc2c746e2ce29 100644 > > --- a/arch/arm/Kconfig > > +++ b/arch/arm/Kconfig > > @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT > > kernel in system memory. > > > > This can only be used with non-XIP MMU kernels where the base > > - of physical memory is at a 16MB boundary. > > + of physical memory is at a 16MiB boundary. > > > > Only disable this option if you know that you do not require > > this feature (eg, building a kernel for a single machine) and > > you need to shrink the kernel to the minimal size. > > > > +config ARM_PATCH_PHYS_VIRT_RADICAL > > + bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary" > > + default n > > Please drop the "default n" - this is the default anyway. > > > @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x) > > * in place where 'r' 32 bit operand is expected. > > */ > > __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24); > > +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL > > + __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16); > > t is already unsigned long, so this cast is not necessary. > > I've been debating whether it would be better to use "movw" for this > for ARMv7. In other words: > > movw tmp, #16-bit > adds %Q0, %1, tmp, lsl #16 > adc %R0, %R0, #0 > > It would certainly be less instructions, but at the cost of an > additional register - and we'd have to change the fixup code to > know about movw. > > Thoughts? > Since LPAE implies v7, we can use movw unconditionally, which is nice. There is no need to use an additional temp register, as we can use the register holding the high word. 
(There is no need for the mov_hi macro to be separate)

0:	movw	%R0, #low offset >> 16
	adds	%Q0, %1, %R0, lsl #16
1:	mov	%R0, #high offset
	adc	%R0, %R0, #0
	.pushsection .pv_table,"a"
	.long 0b, 1b
	.popsection

The only problem is distinguishing the two mov instructions from each
other, but that should not be too hard I think.
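The carry handling in the sequence above can be sanity-checked with a small C model of the data-processing steps, treating %Q0/%R0 as the low/high words of the 64-bit result (a sketch with illustrative values, not kernel code):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of:
 *	movw %R0, #lo16			; low offset >> 16
 *	adds %Q0, %1, %R0, lsl #16
 *	mov  %R0, #hi			; high offset
 *	adc  %R0, %R0, #0
 * for a 64KiB-aligned __pv_offset = ((uint64_t)hi << 32) | (lo16 << 16),
 * where x is the 32-bit virtual address being translated.
 */
static uint64_t pv_virt_to_phys(uint32_t x, uint16_t lo16, uint32_t hi)
{
	uint64_t sum = (uint64_t)x + ((uint32_t)lo16 << 16); /* adds */
	uint32_t lo = (uint32_t)sum;
	uint32_t carry = (uint32_t)(sum >> 32);

	return ((uint64_t)(hi + carry) << 32) | lo;          /* adc */
}
```

E.g. translating 0xc0000000 with __pv_offset = 0x2_80a7_0000 carries out of the low word, and the adc folds that carry into the high word: the result is 0x3_40a7_0000.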
* Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary 2020-09-17 14:00 ` Ard Biesheuvel @ 2020-09-21 3:34 ` Leizhen (ThunderTown) -1 siblings, 0 replies; 30+ messages in thread From: Leizhen (ThunderTown) @ 2020-09-21 3:34 UTC (permalink / raw) To: Ard Biesheuvel, Russell King - ARM Linux admin Cc: Jianguo Chen, Kefeng Wang, Catalin Marinas, Daniel Lezcano, linux-kernel, Libin, Thomas Gleixner, Andrew Morton, linux-arm-kernel, patches-armlinux On 2020/9/17 22:00, Ard Biesheuvel wrote: > On Tue, 15 Sep 2020 at 22:06, Russell King - ARM Linux admin > <linux@armlinux.org.uk> wrote: >> >> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote: >>> Currently, only support the kernels where the base of physical memory is >>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits >>> unrotated value. But we can use one more "add/sub" instructions to handle >>> bits 23-16. The performance will be slightly affected. >>> >>> Since most boards meet 16 MiB alignment, so add a new configuration >>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if >>> anyone really needs it. >>> >>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are >>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in >>> the whole head.S file. So choose it. >>> >>> Because the calculation of "y = x + __pv_offset[63:24]" have been done, >>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why >>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub() >>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t" >>> (above y). 
>>> >>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> >>> --- >>> arch/arm/Kconfig | 18 +++++++++++++++++- >>> arch/arm/include/asm/memory.h | 16 +++++++++++++--- >>> arch/arm/kernel/head.S | 25 +++++++++++++++++++------ >>> 3 files changed, 49 insertions(+), 10 deletions(-) >>> >>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig >>> index e00d94b16658765..19fc2c746e2ce29 100644 >>> --- a/arch/arm/Kconfig >>> +++ b/arch/arm/Kconfig >>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT >>> kernel in system memory. >>> >>> This can only be used with non-XIP MMU kernels where the base >>> - of physical memory is at a 16MB boundary. >>> + of physical memory is at a 16MiB boundary. >>> >>> Only disable this option if you know that you do not require >>> this feature (eg, building a kernel for a single machine) and >>> you need to shrink the kernel to the minimal size. >>> >>> +config ARM_PATCH_PHYS_VIRT_RADICAL >>> + bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary" >>> + default n >> >> Please drop the "default n" - this is the default anyway. >> >>> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x) >>> * in place where 'r' 32 bit operand is expected. >>> */ >>> __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24); >>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL >>> + __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16); >> >> t is already unsigned long, so this cast is not necessary. >> >> I've been debating whether it would be better to use "movw" for this >> for ARMv7. In other words: >> >> movw tmp, #16-bit >> adds %Q0, %1, tmp, lsl #16 >> adc %R0, %R0, #0 >> >> It would certainly be less instructions, but at the cost of an >> additional register - and we'd have to change the fixup code to >> know about movw. >> >> Thoughts? >> > > Since LPAE implies v7, we can use movw unconditionally, which is nice. > > There is no need to use an additional temp register, as we can use the > register holding the high word. 
> (There is no need for the mov_hi macro
> to be separate)
>
> 0:	movw	%R0, #low offset >> 16
> 	adds	%Q0, %1, %R0, lsl #16
> 1:	mov	%R0, #high offset
> 	adc	%R0, %R0, #0
> 	.pushsection .pv_table,"a"
> 	.long 0b, 1b
> 	.popsection
>
> The only problem is distinguishing the two mov instructions from each

The #high offset could also use movw; it just saves two bytes in the
Thumb-2 scenario. We can store different imm16 values for high_offset
and low_offset, so that we can distinguish them in __fixup_a_pv_table().
This will make the final implementation of the code look clearer and
more consistent, especially for Thumb-2. Let me try it.

> other, but that should not be too hard I think.
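One way the fixup could tell the two instructions apart, as suggested above, is to seed each movw with a distinct imm16 tag and dispatch on it. A C sketch using the ARM (A1) MOVW encoding, where imm16 is split into imm4 (bits 19-16) and imm12 (bits 11-0); the tag values are hypothetical, and the Thumb-2 encoding splits the field differently:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical seed values placed in the unpatched instructions. */
enum { PV_TAG_LOW = 1, PV_TAG_HIGH = 2 };

/* imm16 of an ARM-encoding MOVW is imm4:imm12 (bits 19-16 and 11-0). */
static uint16_t movw_get_imm16(uint32_t insn)
{
	return (uint16_t)(((insn >> 4) & 0xf000) | (insn & 0x0fff));
}

static uint32_t movw_set_imm16(uint32_t insn, uint16_t imm16)
{
	insn &= ~0x000f0fffu;                     /* clear imm4 and imm12 */
	return insn | (((uint32_t)imm16 & 0xf000) << 4) | (imm16 & 0x0fff);
}

/* Patch one stub: pick the half of __pv_offset this movw should load. */
static uint32_t fixup_pv_movw(uint32_t insn, uint64_t pv_offset)
{
	switch (movw_get_imm16(insn)) {
	case PV_TAG_LOW:   /* bits 31:16 of the offset (15:0 are zero) */
		return movw_set_imm16(insn, (uint16_t)(pv_offset >> 16));
	case PV_TAG_HIGH:  /* bits 63:32 of the offset */
		return movw_set_imm16(insn, (uint16_t)(pv_offset >> 32));
	default:
		return insn;   /* not a seeded pv stub */
	}
}
```

For example, a stub seeded with PV_TAG_LOW receives the offset's bits 31:16 while one seeded with PV_TAG_HIGH receives bits 63:32, without the fixup needing to know which copy of the instruction it is looking at.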
* Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary 2020-09-21 3:34 ` Leizhen (ThunderTown) @ 2020-09-21 6:47 ` Ard Biesheuvel -1 siblings, 0 replies; 30+ messages in thread From: Ard Biesheuvel @ 2020-09-21 6:47 UTC (permalink / raw) To: Leizhen (ThunderTown) Cc: Russell King - ARM Linux admin, Jianguo Chen, Kefeng Wang, Catalin Marinas, Daniel Lezcano, linux-kernel, Libin, Thomas Gleixner, Andrew Morton, linux-arm-kernel, patches-armlinux On Mon, 21 Sep 2020 at 05:35, Leizhen (ThunderTown) <thunder.leizhen@huawei.com> wrote: > > > > On 2020/9/17 22:00, Ard Biesheuvel wrote: > > On Tue, 15 Sep 2020 at 22:06, Russell King - ARM Linux admin > > <linux@armlinux.org.uk> wrote: > >> > >> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote: > >>> Currently, only support the kernels where the base of physical memory is > >>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits > >>> unrotated value. But we can use one more "add/sub" instructions to handle > >>> bits 23-16. The performance will be slightly affected. > >>> > >>> Since most boards meet 16 MiB alignment, so add a new configuration > >>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if > >>> anyone really needs it. > >>> > >>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are > >>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in > >>> the whole head.S file. So choose it. > >>> > >>> Because the calculation of "y = x + __pv_offset[63:24]" have been done, > >>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why > >>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub() > >>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t" > >>> (above y). 
> >>> > >>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> > >>> --- > >>> arch/arm/Kconfig | 18 +++++++++++++++++- > >>> arch/arm/include/asm/memory.h | 16 +++++++++++++--- > >>> arch/arm/kernel/head.S | 25 +++++++++++++++++++------ > >>> 3 files changed, 49 insertions(+), 10 deletions(-) > >>> > >>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig > >>> index e00d94b16658765..19fc2c746e2ce29 100644 > >>> --- a/arch/arm/Kconfig > >>> +++ b/arch/arm/Kconfig > >>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT > >>> kernel in system memory. > >>> > >>> This can only be used with non-XIP MMU kernels where the base > >>> - of physical memory is at a 16MB boundary. > >>> + of physical memory is at a 16MiB boundary. > >>> > >>> Only disable this option if you know that you do not require > >>> this feature (eg, building a kernel for a single machine) and > >>> you need to shrink the kernel to the minimal size. > >>> > >>> +config ARM_PATCH_PHYS_VIRT_RADICAL > >>> + bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary" > >>> + default n > >> > >> Please drop the "default n" - this is the default anyway. > >> > >>> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x) > >>> * in place where 'r' 32 bit operand is expected. > >>> */ > >>> __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24); > >>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL > >>> + __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16); > >> > >> t is already unsigned long, so this cast is not necessary. > >> > >> I've been debating whether it would be better to use "movw" for this > >> for ARMv7. In other words: > >> > >> movw tmp, #16-bit > >> adds %Q0, %1, tmp, lsl #16 > >> adc %R0, %R0, #0 > >> > >> It would certainly be less instructions, but at the cost of an > >> additional register - and we'd have to change the fixup code to > >> know about movw. > >> > >> Thoughts? > >> > > > > Since LPAE implies v7, we can use movw unconditionally, which is nice. 
> > > > There is no need to use an additional temp register, as we can use the > > register holding the high word. (There is no need for the mov_hi macro > > to be separate) > > > > 0: movw %R0, #low offset >> 16 > > adds %Q0, %1, %R0, lsl #16 > > 1: mov %R0, #high offset > > adc %R0, %R0, #0 > > .pushsection .pv_table,"a" > > .long 0b, 1b > > .popsection > > > > The only problem is distinguishing the two mov instructions from each > > The #high offset can also consider use movw, it just save two bytes in > the thumb2 scenario. We can store different imm16 value for high_offset > and low_offset, so that we can distinguish them in __fixup_a_pv_table(). > > This will make the final implementation of the code look more clear and > consistent, especially THUMB2. > > Let me try it. > Hello Zhen Lei, I am looking into this as well: https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm-p2v-v2 Could you please test this version on your hardware? ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary 2020-09-21 6:47 ` Ard Biesheuvel @ 2020-09-21 8:53 ` Leizhen (ThunderTown) -1 siblings, 0 replies; 30+ messages in thread From: Leizhen (ThunderTown) @ 2020-09-21 8:53 UTC (permalink / raw) To: Ard Biesheuvel Cc: Russell King - ARM Linux admin, Jianguo Chen, Kefeng Wang, Catalin Marinas, Daniel Lezcano, linux-kernel, Libin, Thomas Gleixner, Andrew Morton, linux-arm-kernel, patches-armlinux On 2020/9/21 14:47, Ard Biesheuvel wrote: > On Mon, 21 Sep 2020 at 05:35, Leizhen (ThunderTown) > <thunder.leizhen@huawei.com> wrote: >> >> >> >> On 2020/9/17 22:00, Ard Biesheuvel wrote: >>> On Tue, 15 Sep 2020 at 22:06, Russell King - ARM Linux admin >>> <linux@armlinux.org.uk> wrote: >>>> >>>> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote: >>>>> Currently, only support the kernels where the base of physical memory is >>>>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits >>>>> unrotated value. But we can use one more "add/sub" instructions to handle >>>>> bits 23-16. The performance will be slightly affected. >>>>> >>>>> Since most boards meet 16 MiB alignment, so add a new configuration >>>>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if >>>>> anyone really needs it. >>>>> >>>>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are >>>>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in >>>>> the whole head.S file. So choose it. >>>>> >>>>> Because the calculation of "y = x + __pv_offset[63:24]" have been done, >>>>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why >>>>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub() >>>>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t" >>>>> (above y). 
>>>>> >>>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> >>>>> --- >>>>> arch/arm/Kconfig | 18 +++++++++++++++++- >>>>> arch/arm/include/asm/memory.h | 16 +++++++++++++--- >>>>> arch/arm/kernel/head.S | 25 +++++++++++++++++++------ >>>>> 3 files changed, 49 insertions(+), 10 deletions(-) >>>>> >>>>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig >>>>> index e00d94b16658765..19fc2c746e2ce29 100644 >>>>> --- a/arch/arm/Kconfig >>>>> +++ b/arch/arm/Kconfig >>>>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT >>>>> kernel in system memory. >>>>> >>>>> This can only be used with non-XIP MMU kernels where the base >>>>> - of physical memory is at a 16MB boundary. >>>>> + of physical memory is at a 16MiB boundary. >>>>> >>>>> Only disable this option if you know that you do not require >>>>> this feature (eg, building a kernel for a single machine) and >>>>> you need to shrink the kernel to the minimal size. >>>>> >>>>> +config ARM_PATCH_PHYS_VIRT_RADICAL >>>>> + bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary" >>>>> + default n >>>> >>>> Please drop the "default n" - this is the default anyway. >>>> >>>>> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x) >>>>> * in place where 'r' 32 bit operand is expected. >>>>> */ >>>>> __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24); >>>>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL >>>>> + __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16); >>>> >>>> t is already unsigned long, so this cast is not necessary. >>>> >>>> I've been debating whether it would be better to use "movw" for this >>>> for ARMv7. In other words: >>>> >>>> movw tmp, #16-bit >>>> adds %Q0, %1, tmp, lsl #16 >>>> adc %R0, %R0, #0 >>>> >>>> It would certainly be less instructions, but at the cost of an >>>> additional register - and we'd have to change the fixup code to >>>> know about movw. >>>> >>>> Thoughts? >>>> >>> >>> Since LPAE implies v7, we can use movw unconditionally, which is nice. 
>>> >>> There is no need to use an additional temp register, as we can use the >>> register holding the high word. (There is no need for the mov_hi macro >>> to be separate) >>> >>> 0: movw %R0, #low offset >> 16 >>> adds %Q0, %1, %R0, lsl #16 >>> 1: mov %R0, #high offset >>> adc %R0, %R0, #0 >>> .pushsection .pv_table,"a" >>> .long 0b, 1b >>> .popsection >>> >>> The only problem is distinguishing the two mov instructions from each >> >> The #high offset can also consider use movw, it just save two bytes in >> the thumb2 scenario. We can store different imm16 value for high_offset >> and low_offset, so that we can distinguish them in __fixup_a_pv_table(). >> >> This will make the final implementation of the code look more clear and >> consistent, especially THUMB2. >> >> Let me try it. >> > > Hello Zhen Lei, > > I am looking into this as well: > > https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm-p2v-v2 > > Could you please test this version on your hardware? OK, I will test it on my boards. > > . > ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary 2020-09-21 8:53 ` Leizhen (ThunderTown) @ 2020-09-22 12:30 ` Leizhen (ThunderTown) -1 siblings, 0 replies; 30+ messages in thread From: Leizhen (ThunderTown) @ 2020-09-22 12:30 UTC (permalink / raw) To: Ard Biesheuvel Cc: Jianguo Chen, Kefeng Wang, Catalin Marinas, Daniel Lezcano, Russell King - ARM Linux admin, linux-kernel, Libin, Thomas Gleixner, Andrew Morton, linux-arm-kernel, patches-armlinux On 2020/9/21 16:53, Leizhen (ThunderTown) wrote: > > > On 2020/9/21 14:47, Ard Biesheuvel wrote: >> On Mon, 21 Sep 2020 at 05:35, Leizhen (ThunderTown) >> <thunder.leizhen@huawei.com> wrote: >>> >>> >>> >>> On 2020/9/17 22:00, Ard Biesheuvel wrote: >>>> On Tue, 15 Sep 2020 at 22:06, Russell King - ARM Linux admin >>>> <linux@armlinux.org.uk> wrote: >>>>> >>>>> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote: >>>>>> Currently, only support the kernels where the base of physical memory is >>>>>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits >>>>>> unrotated value. But we can use one more "add/sub" instructions to handle >>>>>> bits 23-16. The performance will be slightly affected. >>>>>> >>>>>> Since most boards meet 16 MiB alignment, so add a new configuration >>>>>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if >>>>>> anyone really needs it. >>>>>> >>>>>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are >>>>>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in >>>>>> the whole head.S file. So choose it. >>>>>> >>>>>> Because the calculation of "y = x + __pv_offset[63:24]" have been done, >>>>>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why >>>>>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub() >>>>>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t" >>>>>> (above y). 
>>>>>> >>>>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> >>>>>> --- >>>>>> arch/arm/Kconfig | 18 +++++++++++++++++- >>>>>> arch/arm/include/asm/memory.h | 16 +++++++++++++--- >>>>>> arch/arm/kernel/head.S | 25 +++++++++++++++++++------ >>>>>> 3 files changed, 49 insertions(+), 10 deletions(-) >>>>>> >>>>>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig >>>>>> index e00d94b16658765..19fc2c746e2ce29 100644 >>>>>> --- a/arch/arm/Kconfig >>>>>> +++ b/arch/arm/Kconfig >>>>>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT >>>>>> kernel in system memory. >>>>>> >>>>>> This can only be used with non-XIP MMU kernels where the base >>>>>> - of physical memory is at a 16MB boundary. >>>>>> + of physical memory is at a 16MiB boundary. >>>>>> >>>>>> Only disable this option if you know that you do not require >>>>>> this feature (eg, building a kernel for a single machine) and >>>>>> you need to shrink the kernel to the minimal size. >>>>>> >>>>>> +config ARM_PATCH_PHYS_VIRT_RADICAL >>>>>> + bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary" >>>>>> + default n >>>>> >>>>> Please drop the "default n" - this is the default anyway. >>>>> >>>>>> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x) >>>>>> * in place where 'r' 32 bit operand is expected. >>>>>> */ >>>>>> __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24); >>>>>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL >>>>>> + __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16); >>>>> >>>>> t is already unsigned long, so this cast is not necessary. >>>>> >>>>> I've been debating whether it would be better to use "movw" for this >>>>> for ARMv7. In other words: >>>>> >>>>> movw tmp, #16-bit >>>>> adds %Q0, %1, tmp, lsl #16 >>>>> adc %R0, %R0, #0 >>>>> >>>>> It would certainly be less instructions, but at the cost of an >>>>> additional register - and we'd have to change the fixup code to >>>>> know about movw. >>>>> >>>>> Thoughts? 
>>>>> >>>> >>>> Since LPAE implies v7, we can use movw unconditionally, which is nice. >>>> >>>> There is no need to use an additional temp register, as we can use the >>>> register holding the high word. (There is no need for the mov_hi macro >>>> to be separate) >>>> >>>> 0: movw %R0, #low offset >> 16 >>>> adds %Q0, %1, %R0, lsl #16 >>>> 1: mov %R0, #high offset >>>> adc %R0, %R0, #0 >>>> .pushsection .pv_table,"a" >>>> .long 0b, 1b >>>> .popsection >>>> >>>> The only problem is distinguishing the two mov instructions from each >>> >>> The #high offset can also consider use movw, it just save two bytes in >>> the thumb2 scenario. We can store different imm16 value for high_offset >>> and low_offset, so that we can distinguish them in __fixup_a_pv_table(). >>> >>> This will make the final implementation of the code look more clear and >>> consistent, especially THUMB2. >>> >>> Let me try it. >>> >> >> Hello Zhen Lei, >> >> I am looking into this as well: >> >> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm-p2v-v2 >> >> Could you please test this version on your hardware? > > OK, I will test it on my boards. Hi Ard Biesheuvel: I have tested it on 16MiB aligned + LE board, it works well. I've asked my colleagues from other departments to run it on 2MiB aligned + BE board. He will do it tomorrow. > >> >> . >> > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > > . > ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary @ 2020-09-22 12:30 ` Leizhen (ThunderTown) 0 siblings, 0 replies; 30+ messages in thread From: Leizhen (ThunderTown) @ 2020-09-22 12:30 UTC (permalink / raw) To: Ard Biesheuvel Cc: Jianguo Chen, Kefeng Wang, Catalin Marinas, Daniel Lezcano, Russell King - ARM Linux admin, linux-kernel, Libin, Thomas Gleixner, Andrew Morton, linux-arm-kernel, patches-armlinux On 2020/9/21 16:53, Leizhen (ThunderTown) wrote: > > > On 2020/9/21 14:47, Ard Biesheuvel wrote: >> On Mon, 21 Sep 2020 at 05:35, Leizhen (ThunderTown) >> <thunder.leizhen@huawei.com> wrote: >>> >>> >>> >>> On 2020/9/17 22:00, Ard Biesheuvel wrote: >>>> On Tue, 15 Sep 2020 at 22:06, Russell King - ARM Linux admin >>>> <linux@armlinux.org.uk> wrote: >>>>> >>>>> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote: >>>>>> Currently, only support the kernels where the base of physical memory is >>>>>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits >>>>>> unrotated value. But we can use one more "add/sub" instructions to handle >>>>>> bits 23-16. The performance will be slightly affected. >>>>>> >>>>>> Since most boards meet 16 MiB alignment, so add a new configuration >>>>>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if >>>>>> anyone really needs it. >>>>>> >>>>>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are >>>>>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in >>>>>> the whole head.S file. So choose it. >>>>>> >>>>>> Because the calculation of "y = x + __pv_offset[63:24]" have been done, >>>>>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why >>>>>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub() >>>>>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t" >>>>>> (above y). 
>>>>>> >>>>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> >>>>>> --- >>>>>> arch/arm/Kconfig | 18 +++++++++++++++++- >>>>>> arch/arm/include/asm/memory.h | 16 +++++++++++++--- >>>>>> arch/arm/kernel/head.S | 25 +++++++++++++++++++------ >>>>>> 3 files changed, 49 insertions(+), 10 deletions(-) >>>>>> >>>>>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig >>>>>> index e00d94b16658765..19fc2c746e2ce29 100644 >>>>>> --- a/arch/arm/Kconfig >>>>>> +++ b/arch/arm/Kconfig >>>>>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT >>>>>> kernel in system memory. >>>>>> >>>>>> This can only be used with non-XIP MMU kernels where the base >>>>>> - of physical memory is at a 16MB boundary. >>>>>> + of physical memory is at a 16MiB boundary. >>>>>> >>>>>> Only disable this option if you know that you do not require >>>>>> this feature (eg, building a kernel for a single machine) and >>>>>> you need to shrink the kernel to the minimal size. >>>>>> >>>>>> +config ARM_PATCH_PHYS_VIRT_RADICAL >>>>>> + bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary" >>>>>> + default n >>>>> >>>>> Please drop the "default n" - this is the default anyway. >>>>> >>>>>> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x) >>>>>> * in place where 'r' 32 bit operand is expected. >>>>>> */ >>>>>> __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24); >>>>>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL >>>>>> + __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16); >>>>> >>>>> t is already unsigned long, so this cast is not necessary. >>>>> >>>>> I've been debating whether it would be better to use "movw" for this >>>>> for ARMv7. In other words: >>>>> >>>>> movw tmp, #16-bit >>>>> adds %Q0, %1, tmp, lsl #16 >>>>> adc %R0, %R0, #0 >>>>> >>>>> It would certainly be less instructions, but at the cost of an >>>>> additional register - and we'd have to change the fixup code to >>>>> know about movw. >>>>> >>>>> Thoughts? 
>>>>> >>>> >>>> Since LPAE implies v7, we can use movw unconditionally, which is nice. >>>> >>>> There is no need to use an additional temp register, as we can use the >>>> register holding the high word. (There is no need for the mov_hi macro >>>> to be separate) >>>> >>>> 0: movw %R0, #low offset >> 16 >>>> adds %Q0, %1, %R0, lsl #16 >>>> 1: mov %R0, #high offset >>>> adc %R0, %R0, #0 >>>> .pushsection .pv_table,"a" >>>> .long 0b, 1b >>>> .popsection >>>> >>>> The only problem is distinguishing the two mov instructions from each >>> >>> The #high offset can also consider use movw, it just save two bytes in >>> the thumb2 scenario. We can store different imm16 value for high_offset >>> and low_offset, so that we can distinguish them in __fixup_a_pv_table(). >>> >>> This will make the final implementation of the code look more clear and >>> consistent, especially THUMB2. >>> >>> Let me try it. >>> >> >> Hello Zhen Lei, >> >> I am looking into this as well: >> >> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm-p2v-v2 >> >> Could you please test this version on your hardware? > > OK, I will test it on my boards. Hi Ard Biesheuvel: I have tested it on 16MiB aligned + LE board, it works well. I've asked my colleagues from other departments to run it on 2MiB aligned + BE board. He will do it tomorrow. > >> >> . >> > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > > . > _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 30+ messages in thread
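For readers following the thread, the arithmetic behind the 16 MiB limit and the extra-instruction workaround can be sketched in a few lines. This is an illustrative model only: the helper names and the example PAGE_OFFSET and physical base are assumptions, and the real kernel performs this translation with boot-time-patched add/sub instructions, not function calls.

```python
# Toy model of the phys-to-virt translation under discussion.  An A32
# data-processing immediate is an 8-bit value with an even rotation, so a
# single patched sub can only apply bits [31:24] of __pv_offset -- hence
# the 16 MiB alignment requirement on PHYS_OFFSET.
MASK32 = 0xFFFFFFFF

def virt_one_stub(phys, pv_offset):
    """One patched sub: only bits 31:24 of the offset are applied."""
    return (phys - (pv_offset & 0xFF000000)) & MASK32

def virt_two_stubs(phys, pv_offset):
    """A second patched sub handles bits 23:16, so any 64 KiB-aligned
    offset (bits 15:0 clear) becomes representable."""
    t = (phys - (pv_offset & 0xFF000000)) & MASK32
    return (t - (pv_offset & 0x00FF0000)) & MASK32

# Illustrative layout: PAGE_OFFSET = 0x80000000 with a 2 MiB-aligned
# physical base of 0x40200000, as on the Hi1380 board mentioned earlier.
pv_offset = (0x40200000 - 0x80000000) & MASK32   # 0xC0200000

assert virt_two_stubs(0x40208000, pv_offset) == 0x80008000  # correct
assert virt_one_stub(0x40208000, pv_offset) == 0x80208000   # off by 2 MiB
```

With a 16 MiB-aligned base the single stub already suffices, which is why the extra instruction is gated behind a separate config option in the patch.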
* Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary 2020-09-22 12:30 ` Leizhen (ThunderTown) @ 2020-09-28 1:30 ` Leizhen (ThunderTown) -1 siblings, 0 replies; 30+ messages in thread From: Leizhen (ThunderTown) @ 2020-09-28 1:30 UTC (permalink / raw) To: Ard Biesheuvel Cc: Jianguo Chen, Kefeng Wang, Catalin Marinas, Daniel Lezcano, Russell King - ARM Linux admin, linux-kernel, Libin, Thomas Gleixner, Andrew Morton, linux-arm-kernel, patches-armlinux On 2020/9/22 20:30, Leizhen (ThunderTown) wrote: > > > On 2020/9/21 16:53, Leizhen (ThunderTown) wrote: >> >> >> On 2020/9/21 14:47, Ard Biesheuvel wrote: >>> On Mon, 21 Sep 2020 at 05:35, Leizhen (ThunderTown) >>> <thunder.leizhen@huawei.com> wrote: >>>> >>>> >>>> >>>> On 2020/9/17 22:00, Ard Biesheuvel wrote: >>>>> On Tue, 15 Sep 2020 at 22:06, Russell King - ARM Linux admin >>>>> <linux@armlinux.org.uk> wrote: >>>>>> >>>>>> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote: >>>>>>> Currently, only support the kernels where the base of physical memory is >>>>>>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits >>>>>>> unrotated value. But we can use one more "add/sub" instructions to handle >>>>>>> bits 23-16. The performance will be slightly affected. >>>>>>> >>>>>>> Since most boards meet 16 MiB alignment, so add a new configuration >>>>>>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if >>>>>>> anyone really needs it. >>>>>>> >>>>>>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are >>>>>>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in >>>>>>> the whole head.S file. So choose it. 
>>>>>>> >>>>>>> Because the calculation of "y = x + __pv_offset[63:24]" have been done, >>>>>>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why >>>>>>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub() >>>>>>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t" >>>>>>> (above y). >>>>>>> >>>>>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> >>>>>>> --- >>>>>>> arch/arm/Kconfig | 18 +++++++++++++++++- >>>>>>> arch/arm/include/asm/memory.h | 16 +++++++++++++--- >>>>>>> arch/arm/kernel/head.S | 25 +++++++++++++++++++------ >>>>>>> 3 files changed, 49 insertions(+), 10 deletions(-) >>>>>>> >>>>>>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig >>>>>>> index e00d94b16658765..19fc2c746e2ce29 100644 >>>>>>> --- a/arch/arm/Kconfig >>>>>>> +++ b/arch/arm/Kconfig >>>>>>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT >>>>>>> kernel in system memory. >>>>>>> >>>>>>> This can only be used with non-XIP MMU kernels where the base >>>>>>> - of physical memory is at a 16MB boundary. >>>>>>> + of physical memory is at a 16MiB boundary. >>>>>>> >>>>>>> Only disable this option if you know that you do not require >>>>>>> this feature (eg, building a kernel for a single machine) and >>>>>>> you need to shrink the kernel to the minimal size. >>>>>>> >>>>>>> +config ARM_PATCH_PHYS_VIRT_RADICAL >>>>>>> + bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary" >>>>>>> + default n >>>>>> >>>>>> Please drop the "default n" - this is the default anyway. >>>>>> >>>>>>> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x) >>>>>>> * in place where 'r' 32 bit operand is expected. >>>>>>> */ >>>>>>> __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24); >>>>>>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL >>>>>>> + __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16); >>>>>> >>>>>> t is already unsigned long, so this cast is not necessary. 
>>>>>> >>>>>> I've been debating whether it would be better to use "movw" for this >>>>>> for ARMv7. In other words: >>>>>> >>>>>> movw tmp, #16-bit >>>>>> adds %Q0, %1, tmp, lsl #16 >>>>>> adc %R0, %R0, #0 >>>>>> >>>>>> It would certainly be less instructions, but at the cost of an >>>>>> additional register - and we'd have to change the fixup code to >>>>>> know about movw. >>>>>> >>>>>> Thoughts? >>>>>> >>>>> >>>>> Since LPAE implies v7, we can use movw unconditionally, which is nice. >>>>> >>>>> There is no need to use an additional temp register, as we can use the >>>>> register holding the high word. (There is no need for the mov_hi macro >>>>> to be separate) >>>>> >>>>> 0: movw %R0, #low offset >> 16 >>>>> adds %Q0, %1, %R0, lsl #16 >>>>> 1: mov %R0, #high offset >>>>> adc %R0, %R0, #0 >>>>> .pushsection .pv_table,"a" >>>>> .long 0b, 1b >>>>> .popsection >>>>> >>>>> The only problem is distinguishing the two mov instructions from each >>>> >>>> The #high offset can also consider use movw, it just save two bytes in >>>> the thumb2 scenario. We can store different imm16 value for high_offset >>>> and low_offset, so that we can distinguish them in __fixup_a_pv_table(). >>>> >>>> This will make the final implementation of the code look more clear and >>>> consistent, especially THUMB2. >>>> >>>> Let me try it. >>>> >>> >>> Hello Zhen Lei, >>> >>> I am looking into this as well: >>> >>> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm-p2v-v2 >>> >>> Could you please test this version on your hardware? >> >> OK, I will test it on my boards. > Hi Ard Biesheuvel: > I have tested it on 16MiB aligned + LE board, it works well. I've asked my colleagues > from other departments to run it on 2MiB aligned + BE board. He will do it tomorrow. Hi, Ard Biesheuvel: I'm sorry to keep you waiting so long. Your patch series works well on the 2MiB aligned + BE board also. I spent a lot of time, because our 2MiB aligned + BE board loads zImage. 
Therefore, special processing is required for the following code:

arch/arm/boot/compressed/head.S:
#ifdef CONFIG_AUTO_ZRELADDR
		mov	r4, pc
		and	r4, r4, #0xf8000000	// currently only supports 128MiB alignment
		add	r4, r4, #TEXT_OFFSET
#else

This is a special scenario that does not conflict with your code framework. So I'm trying to fix it.

Tested-by: Zhen Lei <thunder.leizhen@huawei.com> ^ permalink raw reply [flat|nested] 30+ messages in thread
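The 128 MiB restriction called out in that head.S snippet can be checked numerically. This is an illustrative sketch only: the function name is invented, and only the `0xf8000000` mask comes from the quoted code.

```python
MASK_128MIB = 0xF8000000  # mask used by the quoted AUTO_ZRELADDR code

def auto_zreladdr_base(pc):
    # 'and r4, r4, #0xf8000000': round the current PC down to a
    # 128 MiB boundary to guess the physical RAM base
    return pc & MASK_128MIB

# RAM based at a 128 MiB boundary: the mask recovers the base correctly
assert auto_zreladdr_base(0x48001000) == 0x48000000

# RAM based at a 2 MiB boundary (e.g. 0x40200000): the low bits of the
# base are masked away, so the computed relocation address is wrong
assert auto_zreladdr_base(0x40201000) == 0x40000000  # not 0x40200000
```

This is why a zImage loaded on a board whose RAM base is only 2 MiB aligned needs the decompressor fix referenced later in the thread.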
* Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary 2020-09-28 1:30 ` Leizhen (ThunderTown) @ 2020-09-28 9:30 ` Leizhen (ThunderTown) -1 siblings, 0 replies; 30+ messages in thread From: Leizhen (ThunderTown) @ 2020-09-28 9:30 UTC (permalink / raw) To: Ard Biesheuvel Cc: Jianguo Chen, Kefeng Wang, Catalin Marinas, Daniel Lezcano, Russell King - ARM Linux admin, linux-kernel, Libin, Thomas Gleixner, Andrew Morton, linux-arm-kernel, patches-armlinux On 2020/9/28 9:30, Leizhen (ThunderTown) wrote: > > > On 2020/9/22 20:30, Leizhen (ThunderTown) wrote: >> >> >> On 2020/9/21 16:53, Leizhen (ThunderTown) wrote: >>> >>> >>> On 2020/9/21 14:47, Ard Biesheuvel wrote: >>>> On Mon, 21 Sep 2020 at 05:35, Leizhen (ThunderTown) >>>> <thunder.leizhen@huawei.com> wrote: >>>>> >>>>> >>>>> >>>>> On 2020/9/17 22:00, Ard Biesheuvel wrote: >>>>>> On Tue, 15 Sep 2020 at 22:06, Russell King - ARM Linux admin >>>>>> <linux@armlinux.org.uk> wrote: >>>>>>> >>>>>>> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote: >>>>>>>> Currently, only support the kernels where the base of physical memory is >>>>>>>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits >>>>>>>> unrotated value. But we can use one more "add/sub" instructions to handle >>>>>>>> bits 23-16. The performance will be slightly affected. >>>>>>>> >>>>>>>> Since most boards meet 16 MiB alignment, so add a new configuration >>>>>>>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if >>>>>>>> anyone really needs it. >>>>>>>> >>>>>>>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are >>>>>>>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in >>>>>>>> the whole head.S file. So choose it. 
>>>>>>>> >>>>>>>> Because the calculation of "y = x + __pv_offset[63:24]" have been done, >>>>>>>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why >>>>>>>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub() >>>>>>>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t" >>>>>>>> (above y). >>>>>>>> >>>>>>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> >>>>>>>> --- >>>>>>>> arch/arm/Kconfig | 18 +++++++++++++++++- >>>>>>>> arch/arm/include/asm/memory.h | 16 +++++++++++++--- >>>>>>>> arch/arm/kernel/head.S | 25 +++++++++++++++++++------ >>>>>>>> 3 files changed, 49 insertions(+), 10 deletions(-) >>>>>>>> >>>>>>>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig >>>>>>>> index e00d94b16658765..19fc2c746e2ce29 100644 >>>>>>>> --- a/arch/arm/Kconfig >>>>>>>> +++ b/arch/arm/Kconfig >>>>>>>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT >>>>>>>> kernel in system memory. >>>>>>>> >>>>>>>> This can only be used with non-XIP MMU kernels where the base >>>>>>>> - of physical memory is at a 16MB boundary. >>>>>>>> + of physical memory is at a 16MiB boundary. >>>>>>>> >>>>>>>> Only disable this option if you know that you do not require >>>>>>>> this feature (eg, building a kernel for a single machine) and >>>>>>>> you need to shrink the kernel to the minimal size. >>>>>>>> >>>>>>>> +config ARM_PATCH_PHYS_VIRT_RADICAL >>>>>>>> + bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary" >>>>>>>> + default n >>>>>>> >>>>>>> Please drop the "default n" - this is the default anyway. >>>>>>> >>>>>>>> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x) >>>>>>>> * in place where 'r' 32 bit operand is expected. >>>>>>>> */ >>>>>>>> __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24); >>>>>>>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL >>>>>>>> + __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16); >>>>>>> >>>>>>> t is already unsigned long, so this cast is not necessary. 
>>>>>>> >>>>>>> I've been debating whether it would be better to use "movw" for this >>>>>>> for ARMv7. In other words: >>>>>>> >>>>>>> movw tmp, #16-bit >>>>>>> adds %Q0, %1, tmp, lsl #16 >>>>>>> adc %R0, %R0, #0 >>>>>>> >>>>>>> It would certainly be less instructions, but at the cost of an >>>>>>> additional register - and we'd have to change the fixup code to >>>>>>> know about movw. >>>>>>> >>>>>>> Thoughts? >>>>>>> >>>>>> >>>>>> Since LPAE implies v7, we can use movw unconditionally, which is nice. >>>>>> >>>>>> There is no need to use an additional temp register, as we can use the >>>>>> register holding the high word. (There is no need for the mov_hi macro >>>>>> to be separate) >>>>>> >>>>>> 0: movw %R0, #low offset >> 16 >>>>>> adds %Q0, %1, %R0, lsl #16 >>>>>> 1: mov %R0, #high offset >>>>>> adc %R0, %R0, #0 >>>>>> .pushsection .pv_table,"a" >>>>>> .long 0b, 1b >>>>>> .popsection >>>>>> >>>>>> The only problem is distinguishing the two mov instructions from each >>>>> >>>>> The #high offset can also consider use movw, it just save two bytes in >>>>> the thumb2 scenario. We can store different imm16 value for high_offset >>>>> and low_offset, so that we can distinguish them in __fixup_a_pv_table(). >>>>> >>>>> This will make the final implementation of the code look more clear and >>>>> consistent, especially THUMB2. >>>>> >>>>> Let me try it. >>>>> >>>> >>>> Hello Zhen Lei, >>>> >>>> I am looking into this as well: >>>> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm-p2v-v2 >>>> >>>> Could you please test this version on your hardware? >>> >>> OK, I will test it on my boards. >> Hi Ard Biesheuvel: >> I have tested it on 16MiB aligned + LE board, it works well. I've asked my colleagues >> from other departments to run it on 2MiB aligned + BE board. He will do it tomorrow. > > Hi, Ard Biesheuvel: > I'm sorry to keep you waiting so long. You patch series works well on 2MiB aligned + BE board > also. 
I spent a lot of time, because our 2MiB aligned + BE board loads zImage. Therefore, special > processing is required for the following code: > > arch/arm/boot/compressed/head.S: > #ifdef CONFIG_AUTO_ZRELADDR > mov r4, pc > and r4, r4, #0xf8000000 //currently only support 128MiB alignment > add r4, r4, #TEXT_OFFSET > #else > > This is a special scenario that does not conflict with your code framework. So I'm trying to fix it. > > Tested-by: Zhen Lei <thunder.leizhen@huawei.com> Hi, Ard Biesheuvel: I just sent the above problem's fix patch. [PATCH 0/2] ARM: decompressor: relax the loading restriction of the decompressed kernel > > >> >> >>> >>>> >>>> . >>>> >>> >>> >>> _______________________________________________ >>> linux-arm-kernel mailing list >>> linux-arm-kernel@lists.infradead.org >>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel >>> >>> . >>> ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 0/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary 2020-09-15 13:16 ` Zhen Lei @ 2020-09-15 13:31 ` Russell King - ARM Linux admin -1 siblings, 0 replies; 30+ messages in thread From: Russell King - ARM Linux admin @ 2020-09-15 13:31 UTC (permalink / raw) To: Zhen Lei Cc: Daniel Lezcano, Thomas Gleixner, Andrew Morton, Catalin Marinas, linux-arm-kernel, linux-kernel, Libin, Kefeng Wang, Jianguo Chen On Tue, Sep 15, 2020 at 09:16:13PM +0800, Zhen Lei wrote: > v1 --> v2: > Nothing changed, but add mail list: patches@armlinux.org.uk It isn't a mailing list, it's a bot, and it should only be copied when you're ready to submit the patches, and only after they've been reviewed. It queues the patches for me to eventually apply, so I don't have to wade through tens of thousands of emails to find (and likely miss) the appropriate patches. It also wants to have a KernelVersion: tag somewhere in every patch email, which has proven to be extremely valuable when applying. Thanks. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last! ^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2020-09-28 9:32 UTC | newest] Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-09-15 13:16 [PATCH v2 0/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary Zhen Lei 2020-09-15 13:16 ` Zhen Lei 2020-09-15 13:16 ` [PATCH v2 1/2] ARM: fix trivial comments in head.S Zhen Lei 2020-09-15 13:16 ` Zhen Lei 2020-09-15 13:16 ` [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary Zhen Lei 2020-09-15 13:16 ` Zhen Lei 2020-09-15 19:01 ` Russell King - ARM Linux admin 2020-09-15 19:01 ` Russell King - ARM Linux admin 2020-09-16 1:57 ` Leizhen (ThunderTown) 2020-09-16 1:57 ` Leizhen (ThunderTown) 2020-09-16 7:57 ` Russell King - ARM Linux admin 2020-09-16 7:57 ` Russell King - ARM Linux admin 2020-09-17 3:26 ` Leizhen (ThunderTown) 2020-09-17 3:26 ` Leizhen (ThunderTown) 2020-09-17 14:00 ` Ard Biesheuvel 2020-09-17 14:00 ` Ard Biesheuvel 2020-09-21 3:34 ` Leizhen (ThunderTown) 2020-09-21 3:34 ` Leizhen (ThunderTown) 2020-09-21 6:47 ` Ard Biesheuvel 2020-09-21 6:47 ` Ard Biesheuvel 2020-09-21 8:53 ` Leizhen (ThunderTown) 2020-09-21 8:53 ` Leizhen (ThunderTown) 2020-09-22 12:30 ` Leizhen (ThunderTown) 2020-09-22 12:30 ` Leizhen (ThunderTown) 2020-09-28 1:30 ` Leizhen (ThunderTown) 2020-09-28 1:30 ` Leizhen (ThunderTown) 2020-09-28 9:30 ` Leizhen (ThunderTown) 2020-09-28 9:30 ` Leizhen (ThunderTown) 2020-09-15 13:31 ` [PATCH v2 0/2] " Russell King - ARM Linux admin 2020-09-15 13:31 ` Russell King - ARM Linux admin