linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/3] arm64: clean up trampoline vector loads
@ 2020-03-16 12:40 Rémi Denis-Courmont
  2020-03-17 22:30 ` Will Deacon
  2020-03-18 17:57 ` Catalin Marinas
  0 siblings, 2 replies; 15+ messages in thread
From: Rémi Denis-Courmont @ 2020-03-16 12:40 UTC (permalink / raw)
  To: catalin.marinas, will, linux-arm-kernel; +Cc: mark.rutland, linux-kernel

From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>

This switches from custom instruction patterns to the regular large
memory model sequence with ADRP and LDR. In doing so, the ADD
instruction can be eliminated in the SDEI handler, and the code no
longer assumes that the trampoline vectors and the vectors address both
start on a page boundary.

Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
---
 arch/arm64/kernel/entry.S | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index e5d4e30ee242..24f828739696 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -805,9 +805,9 @@ alternative_else_nop_endif
 2:
 	tramp_map_kernel	x30
 #ifdef CONFIG_RANDOMIZE_BASE
-	adr	x30, tramp_vectors + PAGE_SIZE
+	adrp	x30, tramp_vectors + PAGE_SIZE
 alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
-	ldr	x30, [x30]
+	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
 #else
 	ldr	x30, =vectors
 #endif
@@ -953,9 +953,8 @@ SYM_CODE_START(__sdei_asm_entry_trampoline)
 1:	str	x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]
 
 #ifdef CONFIG_RANDOMIZE_BASE
-	adr	x4, tramp_vectors + PAGE_SIZE
-	add	x4, x4, #:lo12:__sdei_asm_trampoline_next_handler
-	ldr	x4, [x4]
+	adrp	x4, tramp_vectors + PAGE_SIZE
+	ldr	x4, [x4, #:lo12:__sdei_asm_trampoline_next_handler]
 #else
 	ldr	x4, =__sdei_asm_handler
 #endif
-- 
2.25.1



* Re: [PATCH 1/3] arm64: clean up trampoline vector loads
  2020-03-16 12:40 [PATCH 1/3] arm64: clean up trampoline vector loads Rémi Denis-Courmont
@ 2020-03-17 22:30 ` Will Deacon
  2020-03-18 17:57 ` Catalin Marinas
  1 sibling, 0 replies; 15+ messages in thread
From: Will Deacon @ 2020-03-17 22:30 UTC (permalink / raw)
  To: Rémi Denis-Courmont
  Cc: catalin.marinas, linux-arm-kernel, mark.rutland, linux-kernel

On Mon, Mar 16, 2020 at 02:40:44PM +0200, Rémi Denis-Courmont wrote:
> From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> 
> This switches from custom instruction patterns to the regular large
> memory model sequence with ADRP and LDR. In doing so, the ADD
> instruction can be eliminated in the SDEI handler, and the code no
> longer assumes that the trampoline vectors and the vectors address both
> start on a page boundary.
> 
> Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> ---
>  arch/arm64/kernel/entry.S | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index e5d4e30ee242..24f828739696 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -805,9 +805,9 @@ alternative_else_nop_endif
>  2:
>  	tramp_map_kernel	x30
>  #ifdef CONFIG_RANDOMIZE_BASE
> -	adr	x30, tramp_vectors + PAGE_SIZE
> +	adrp	x30, tramp_vectors + PAGE_SIZE
>  alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> -	ldr	x30, [x30]
> +	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
>  #else
>  	ldr	x30, =vectors
>  #endif
> @@ -953,9 +953,8 @@ SYM_CODE_START(__sdei_asm_entry_trampoline)
>  1:	str	x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]
>  
>  #ifdef CONFIG_RANDOMIZE_BASE
> -	adr	x4, tramp_vectors + PAGE_SIZE
> -	add	x4, x4, #:lo12:__sdei_asm_trampoline_next_handler
> -	ldr	x4, [x4]
> +	adrp	x4, tramp_vectors + PAGE_SIZE
> +	ldr	x4, [x4, #:lo12:__sdei_asm_trampoline_next_handler]
>  #else
>  	ldr	x4, =__sdei_asm_handler
>  #endif

Acked-by: Will Deacon <will@kernel.org>

Will


* Re: [PATCH 1/3] arm64: clean up trampoline vector loads
  2020-03-16 12:40 [PATCH 1/3] arm64: clean up trampoline vector loads Rémi Denis-Courmont
  2020-03-17 22:30 ` Will Deacon
@ 2020-03-18 17:57 ` Catalin Marinas
  2020-03-18 18:06   ` Catalin Marinas
  1 sibling, 1 reply; 15+ messages in thread
From: Catalin Marinas @ 2020-03-18 17:57 UTC (permalink / raw)
  To: Rémi Denis-Courmont
  Cc: will, linux-arm-kernel, mark.rutland, linux-kernel

On Mon, Mar 16, 2020 at 02:40:44PM +0200, Rémi Denis-Courmont wrote:
> From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> 
> This switches from custom instruction patterns to the regular large
> memory model sequence with ADRP and LDR. In doing so, the ADD
> instruction can be eliminated in the SDEI handler, and the code no
> longer assumes that the trampoline vectors and the vectors address both
> start on a page boundary.
> 
> Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>

I queued the 3 trampoline patches for 5.7. Thanks.

-- 
Catalin


* Re: [PATCH 1/3] arm64: clean up trampoline vector loads
  2020-03-18 17:57 ` Catalin Marinas
@ 2020-03-18 18:06   ` Catalin Marinas
  2020-03-18 18:29     ` Rémi Denis-Courmont
  0 siblings, 1 reply; 15+ messages in thread
From: Catalin Marinas @ 2020-03-18 18:06 UTC (permalink / raw)
  To: Rémi Denis-Courmont
  Cc: mark.rutland, will, linux-kernel, linux-arm-kernel

On Wed, Mar 18, 2020 at 05:57:09PM +0000, Catalin Marinas wrote:
> On Mon, Mar 16, 2020 at 02:40:44PM +0200, Rémi Denis-Courmont wrote:
> > From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > 
> > This switches from custom instruction patterns to the regular large
> > memory model sequence with ADRP and LDR. In doing so, the ADD
> > instruction can be eliminated in the SDEI handler, and the code no
> > longer assumes that the trampoline vectors and the vectors address both
> > start on a page boundary.
> > 
> > Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> 
> I queued the 3 trampoline patches for 5.7. Thanks.

... and removed. I applied them on top of arm64 for-next/asm-annotations
and with defconfig I get:

  LD      .tmp_vmlinux1
arch/arm64/kernel/entry.o: in function `tramp_vectors':
arch/arm64/kernel/entry.S:838:(.entry.tramp.text+0x43c): relocation truncated to fit: R_AARCH64_LDST64_ABS_LO12_NC against symbol `__entry_tramp_data_start' defined in .rodata section in arch/arm64/kernel/entry.o
ld: arch/arm64/kernel/entry.S:838: warning: one possible cause of this error is that the symbol is being referenced in the indicated code as if it had a larger alignment than was declared where it was defined
arch/arm64/kernel/entry.S:839:(.entry.tramp.text+0x4bc): relocation truncated to fit: R_AARCH64_LDST64_ABS_LO12_NC against symbol `__entry_tramp_data_start' defined in .rodata section in arch/arm64/kernel/entry.o
ld: arch/arm64/kernel/entry.S:839: warning: one possible cause of this error is that the symbol is being referenced in the indicated code as if it had a larger alignment than was declared where it was defined
arch/arm64/kernel/entry.S:840:(.entry.tramp.text+0x53c): relocation truncated to fit: R_AARCH64_LDST64_ABS_LO12_NC against symbol `__entry_tramp_data_start' defined in .rodata section in arch/arm64/kernel/entry.o
ld: arch/arm64/kernel/entry.S:840: warning: one possible cause of this error is that the symbol is being referenced in the indicated code as if it had a larger alignment than was declared where it was defined
arch/arm64/kernel/entry.S:841:(.entry.tramp.text+0x5bc): relocation truncated to fit: R_AARCH64_LDST64_ABS_LO12_NC against symbol `__entry_tramp_data_start' defined in .rodata section in arch/arm64/kernel/entry.o
ld: arch/arm64/kernel/entry.S:841: warning: one possible cause of this error is that the symbol is being referenced in the indicated code as if it had a larger alignment than was declared where it was defined
arch/arm64/kernel/entry.S:843:(.entry.tramp.text+0x638): relocation truncated to fit: R_AARCH64_LDST64_ABS_LO12_NC against symbol `__entry_tramp_data_start' defined in .rodata section in arch/arm64/kernel/entry.o
ld: arch/arm64/kernel/entry.S:843: warning: one possible cause of this error is that the symbol is being referenced in the indicated code as if it had a larger alignment than was declared where it was defined
arch/arm64/kernel/entry.S:844:(.entry.tramp.text+0x6b8): relocation truncated to fit: R_AARCH64_LDST64_ABS_LO12_NC against symbol `__entry_tramp_data_start' defined in .rodata section in arch/arm64/kernel/entry.o
ld: arch/arm64/kernel/entry.S:844: warning: one possible cause of this error is that the symbol is being referenced in the indicated code as if it had a larger alignment than was declared where it was defined
arch/arm64/kernel/entry.S:845:(.entry.tramp.text+0x738): relocation truncated to fit: R_AARCH64_LDST64_ABS_LO12_NC against symbol `__entry_tramp_data_start' defined in .rodata section in arch/arm64/kernel/entry.o
ld: arch/arm64/kernel/entry.S:845: warning: one possible cause of this error is that the symbol is being referenced in the indicated code as if it had a larger alignment than was declared where it was defined
arch/arm64/kernel/entry.S:846:(.entry.tramp.text+0x7b8): relocation truncated to fit: R_AARCH64_LDST64_ABS_LO12_NC against symbol `__entry_tramp_data_start' defined in .rodata section in arch/arm64/kernel/entry.o
ld: arch/arm64/kernel/entry.S:846: warning: one possible cause of this error is that the symbol is being referenced in the indicated code as if it had a larger alignment than was declared where it was defined
make[1]: *** [Makefile:1077: vmlinux] Error 1

I haven't bisected to see which patch caused this issue.

$ gcc --version
gcc (Debian 9.2.1-30) 9.2.1 20200224

$ ld --version
GNU ld (GNU Binutils for Debian) 2.34

-- 
Catalin


* Re: [PATCH 1/3] arm64: clean up trampoline vector loads
  2020-03-18 18:06   ` Catalin Marinas
@ 2020-03-18 18:29     ` Rémi Denis-Courmont
  2020-03-18 19:48       ` Remi Denis-Courmont
  0 siblings, 1 reply; 15+ messages in thread
From: Rémi Denis-Courmont @ 2020-03-18 18:29 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: mark.rutland, will, linux-kernel, linux-arm-kernel

On Wednesday, 18 March 2020, 20:06:30 EET, Catalin Marinas wrote:
> On Wed, Mar 18, 2020 at 05:57:09PM +0000, Catalin Marinas wrote:
> > On Mon, Mar 16, 2020 at 02:40:44PM +0200, Rémi Denis-Courmont wrote:
> > > From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > 
> > > This switches from custom instruction patterns to the regular large
> > > memory model sequence with ADRP and LDR. In doing so, the ADD
> > > instruction can be eliminated in the SDEI handler, and the code no
> > > longer assumes that the trampoline vectors and the vectors address both
> > > start on a page boundary.
> > > 
> > > Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > 
> > I queued the 3 trampoline patches for 5.7. Thanks.
> 
> ... and removed. I applied them on top of arm64 for-next/asm-annotations
> and with defconfig I get:
> 
>   LD      .tmp_vmlinux1
> arch/arm64/kernel/entry.o: in function `tramp_vectors':
> arch/arm64/kernel/entry.S:838:(.entry.tramp.text+0x43c): relocation
> truncated to fit: R_AARCH64_LDST64_ABS_LO12_NC against symbol
> `__entry_tramp_data_start' defined in .rodata section in
> 
> I haven't bisected to see which patch caused this issue.

Uh-oh, right :-( It only builds with SDEI enabled :-$

I'll check further.

-- 
Rémi Denis-Courmont
http://www.remlab.net/





* Re: [PATCH 1/3] arm64: clean up trampoline vector loads
  2020-03-18 18:29     ` Rémi Denis-Courmont
@ 2020-03-18 19:48       ` Remi Denis-Courmont
  0 siblings, 0 replies; 15+ messages in thread
From: Remi Denis-Courmont @ 2020-03-18 19:48 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: mark.rutland, will, linux-kernel, linux-arm-kernel

On 2020-03-18 20:29, Rémi Denis-Courmont wrote:
> On Wednesday, 18 March 2020, 20:06:30 EET, Catalin Marinas wrote:
>> On Wed, Mar 18, 2020 at 05:57:09PM +0000, Catalin Marinas wrote:
>> > On Mon, Mar 16, 2020 at 02:40:44PM +0200, Rémi Denis-Courmont wrote:
>> > > From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
>> > >
>> > > This switches from custom instruction patterns to the regular large
>> > > memory model sequence with ADRP and LDR. In doing so, the ADD
>> > > instruction can be eliminated in the SDEI handler, and the code no
>> > > longer assumes that the trampoline vectors and the vectors address both
>> > > start on a page boundary.
>> > >
>> > > Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
>> >
>> > I queued the 3 trampoline patches for 5.7. Thanks.
>> 
>> ... and removed. I applied them on top of arm64 
>> for-next/asm-annotations
>> and with defconfig I get:
>> 
>>   LD      .tmp_vmlinux1
>> arch/arm64/kernel/entry.o: in function `tramp_vectors':
>> arch/arm64/kernel/entry.S:838:(.entry.tramp.text+0x43c): relocation
>> truncated to fit: R_AARCH64_LDST64_ABS_LO12_NC against symbol
>> `__entry_tramp_data_start' defined in .rodata section in
>> 
>> I haven't bisected to see which patch caused this issue.

It's the third patch.

> Uh-oh, right :-( It only builds with SDEI enabled :-$
> 
> I'll check further.

It seems that the SYM_DATA_START macro does not align the data on its
natural boundary. I guess that is all fine on x86 where data need not
be aligned, but it leads to this kind of mischief on arm64. Though even
then, the address is of course actually aligned correctly on an 8-byte
boundary, so I suppose binutils is just being pointlessly pedantic here?
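
For reference, the relocation named in the linker output applies to a 64-bit LDR with an unsigned immediate, which encodes the offset divided by 8 in a 12-bit field. The following is a minimal sketch of that encoding rule only; the function names and addresses are made up for illustration, it is not binutils code, and it does not settle whether the linker was right to complain in this particular build.

```python
# Illustrative model (not binutils code) of the constraint behind
# R_AARCH64_LDST64_ABS_LO12_NC: a 64-bit LDR with an unsigned immediate
# stores offset / 8 in a 12-bit field, so the low 12 bits of the symbol
# address must be a multiple of 8 to be encodable.

def lo12(addr: int) -> int:
    """Low 12 bits of an address, i.e. what #:lo12:sym resolves to."""
    return addr & 0xFFF


def encode_ldr64_imm12(addr: int) -> int:
    """Return the scaled imm12 field, or raise if it cannot be encoded."""
    offset = lo12(addr)
    if offset % 8 != 0:
        raise ValueError(f"lo12 offset {offset:#x} is not 8-byte aligned")
    return offset // 8


# Hypothetical symbol addresses, chosen only for illustration.
aligned_sym = 0xFFFF800010A43008
misaligned_sym = 0xFFFF800010A43004

print(encode_ldr64_imm12(aligned_sym))
try:
    encode_ldr64_imm12(misaligned_sym)
except ValueError as err:
    print("cannot encode:", err)
```

An offset that is only 4-byte aligned has no valid encoding in this instruction form, which is why the alignment of the target matters to the linker at all.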

-- 
Rémi Denis-Courmont


* Re: [PATCH 1/3] arm64: clean up trampoline vector loads
  2020-03-24 10:52             ` Mark Rutland
@ 2020-03-24 11:23               ` Catalin Marinas
  0 siblings, 0 replies; 15+ messages in thread
From: Catalin Marinas @ 2020-03-24 11:23 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Rémi Denis-Courmont, will, james.morse, linux-arm-kernel,
	linux-kernel

On Tue, Mar 24, 2020 at 10:52:17AM +0000, Mark Rutland wrote:
> On Mon, Mar 23, 2020 at 10:42:30PM +0200, Rémi Denis-Courmont wrote:
> > On Monday, 23 March 2020, 21:04:09 EET, Catalin Marinas wrote:
> > > Should we just use adrp on __entry_tramp_data_start? Anyway, the diff
> > > below doesn't solve the issue I'm seeing (only reverting patch 3).
> > 
> > AFAIU, the preexisting code uses the manual PAGE_SIZE offset because the offset 
> > in the main vmlinux does not match the architected offset inside the fixmap. If 
> > so, then using the symbol directly will not work at all.
> 
> Indeed. I can't see a neat way of avoiding this right now, so should we
> drop these patches and leave the code as-is (but with comments as to the
> special requirements that it has)?

I'm going to drop these three patches from -next for now but I can take
any updated comments (they are pretty much missing from this code).

Thanks.

-- 
Catalin


* Re: [PATCH 1/3] arm64: clean up trampoline vector loads
  2020-03-23 20:42           ` Rémi Denis-Courmont
  2020-03-24 10:37             ` Catalin Marinas
@ 2020-03-24 10:52             ` Mark Rutland
  2020-03-24 11:23               ` Catalin Marinas
  1 sibling, 1 reply; 15+ messages in thread
From: Mark Rutland @ 2020-03-24 10:52 UTC (permalink / raw)
  To: Rémi Denis-Courmont
  Cc: Catalin Marinas, will, james.morse, linux-arm-kernel, linux-kernel

On Mon, Mar 23, 2020 at 10:42:30PM +0200, Rémi Denis-Courmont wrote:
> On Monday, 23 March 2020, 21:04:09 EET, Catalin Marinas wrote:
> > On Mon, Mar 23, 2020 at 12:14:37PM +0000, Mark Rutland wrote:
> > > On Mon, Mar 23, 2020 at 02:08:53PM +0200, Rémi Denis-Courmont wrote:
> > > > On Monday, 23 March 2020, 14:07:00 EET, Mark Rutland wrote:
> > > > > On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> > > > > > From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > > > > 
> > > > > > This switches from custom instruction patterns to the regular large
> > > > > > memory model sequence with ADRP and LDR. In doing so, the ADD
> > > > > > instruction can be eliminated in the SDEI handler, and the code no
> > > > > > longer assumes that the trampoline vectors and the vectors address
> > > > > > both
> > > > > > start on a page boundary.
> > > > > > 
> > > > > > Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > > > > ---
> > > > > > 
> > > > > >  arch/arm64/kernel/entry.S | 9 ++++-----
> > > > > >  1 file changed, 4 insertions(+), 5 deletions(-)
> > > > > > 
> > > > > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > > > > > index e5d4e30ee242..24f828739696 100644
> > > > > > --- a/arch/arm64/kernel/entry.S
> > > > > > +++ b/arch/arm64/kernel/entry.S
> > > > > > @@ -805,9 +805,9 @@ alternative_else_nop_endif
> > > > > > 
> > > > > >  2:
> > > > > >  	tramp_map_kernel	x30
> > > > > >  
> > > > > >  #ifdef CONFIG_RANDOMIZE_BASE
> > > > > > 
> > > > > > -	adr	x30, tramp_vectors + PAGE_SIZE
> > > > > > +	adrp	x30, tramp_vectors + PAGE_SIZE
> > > > > > 
> > > > > >  alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> > > > > > 
> > > > > > -	ldr	x30, [x30]
> > > > > > +	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
> > > > > 
> > > > > I think this is busted for !4K kernels once we reduce the alignment of
> > > > > __entry_tramp_data_start.
> > > > > 
> > > > > The ADRP gives us a 64K aligned address (with bits 15:0 clear). The
> > > > > lo12
> > > > > relocation gives us bits 11:0, so we haven't accounted for bits 15:12.
> > > > 
> > > > IMU, ADRP gives a 4K aligned value, regardless of MMU (TCR) settings.
> > > 
> > > Sorry, I had erroneously assumed tramp_vectors was page aligned. The
> > > issue still stands -- we haven't accounted for bits 15:12, as those can
> > > differ between tramp_vectors and __entry_tramp_data_start.
> 
> Does that mean that the SDEI code never worked with page size > 4 KiB?

I think this happens to work, but is fragile. Because nothing happens to
get placed in .rodata between the __entry_tramp_data_start data and the
__sdei_asm_trampoline_next_handler data, the
__sdei_asm_trampoline_next_handler data doesn't spill into a separate
page from the __entry_tramp_data_start data.

If we did start adding stuff into .rodata between those two, there'd be
a bigger risk of things going wrong. That was why I suggested a
.entry.tramp.data section previously.

> > Should we just use adrp on __entry_tramp_data_start? Anyway, the diff
> > below doesn't solve the issue I'm seeing (only reverting patch 3).
> 
> AFAIU, the preexisting code uses the manual PAGE_SIZE offset because the offset 
> in the main vmlinux does not match the architected offset inside the fixmap. If 
> so, then using the symbol directly will not work at all.

Indeed. I can't see a neat way of avoiding this right now, so should we
drop these patches and leave the code as-is (but with comments as to the
special requirements that it has)?

Thanks,
Mark.


* Re: [PATCH 1/3] arm64: clean up trampoline vector loads
  2020-03-23 20:42           ` Rémi Denis-Courmont
@ 2020-03-24 10:37             ` Catalin Marinas
  2020-03-24 10:52             ` Mark Rutland
  1 sibling, 0 replies; 15+ messages in thread
From: Catalin Marinas @ 2020-03-24 10:37 UTC (permalink / raw)
  To: Rémi Denis-Courmont
  Cc: Mark Rutland, will, james.morse, linux-arm-kernel, linux-kernel

On Mon, Mar 23, 2020 at 10:42:30PM +0200, Rémi Denis-Courmont wrote:
> On Monday, 23 March 2020, 21:04:09 EET, Catalin Marinas wrote:
> > Should we just use adrp on __entry_tramp_data_start? Anyway, the diff
> > below doesn't solve the issue I'm seeing (only reverting patch 3).
> 
> AFAIU, the preexisting code uses the manual PAGE_SIZE offset because the offset 
> in the main vmlinux does not match the architected offset inside the fixmap. If 
> so, then using the symbol directly will not work at all.

You are right, it broke the defconfig as well.

-- 
Catalin


* Re: [PATCH 1/3] arm64: clean up trampoline vector loads
  2020-03-23 19:04         ` Catalin Marinas
@ 2020-03-23 20:42           ` Rémi Denis-Courmont
  2020-03-24 10:37             ` Catalin Marinas
  2020-03-24 10:52             ` Mark Rutland
  0 siblings, 2 replies; 15+ messages in thread
From: Rémi Denis-Courmont @ 2020-03-23 20:42 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Mark Rutland, will, james.morse, linux-arm-kernel, linux-kernel

On Monday, 23 March 2020, 21:04:09 EET, Catalin Marinas wrote:
> On Mon, Mar 23, 2020 at 12:14:37PM +0000, Mark Rutland wrote:
> > On Mon, Mar 23, 2020 at 02:08:53PM +0200, Rémi Denis-Courmont wrote:
> > > On Monday, 23 March 2020, 14:07:00 EET, Mark Rutland wrote:
> > > > On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> > > > > From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > > > 
> > > > > This switches from custom instruction patterns to the regular large
> > > > > memory model sequence with ADRP and LDR. In doing so, the ADD
> > > > > instruction can be eliminated in the SDEI handler, and the code no
> > > > > longer assumes that the trampoline vectors and the vectors address
> > > > > both
> > > > > start on a page boundary.
> > > > > 
> > > > > Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > > > ---
> > > > > 
> > > > >  arch/arm64/kernel/entry.S | 9 ++++-----
> > > > >  1 file changed, 4 insertions(+), 5 deletions(-)
> > > > > 
> > > > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > > > > index e5d4e30ee242..24f828739696 100644
> > > > > --- a/arch/arm64/kernel/entry.S
> > > > > +++ b/arch/arm64/kernel/entry.S
> > > > > @@ -805,9 +805,9 @@ alternative_else_nop_endif
> > > > > 
> > > > >  2:
> > > > >  	tramp_map_kernel	x30
> > > > >  
> > > > >  #ifdef CONFIG_RANDOMIZE_BASE
> > > > > 
> > > > > -	adr	x30, tramp_vectors + PAGE_SIZE
> > > > > +	adrp	x30, tramp_vectors + PAGE_SIZE
> > > > > 
> > > > >  alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> > > > > 
> > > > > -	ldr	x30, [x30]
> > > > > +	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
> > > > 
> > > > I think this is busted for !4K kernels once we reduce the alignment of
> > > > __entry_tramp_data_start.
> > > > 
> > > > The ADRP gives us a 64K aligned address (with bits 15:0 clear). The
> > > > lo12
> > > > relocation gives us bits 11:0, so we haven't accounted for bits 15:12.
> > > 
> > > IMU, ADRP gives a 4K aligned value, regardless of MMU (TCR) settings.
> > 
> > Sorry, I had erroneously assumed tramp_vectors was page aligned. The
> > issue still stands -- we haven't accounted for bits 15:12, as those can
> > differ between tramp_vectors and __entry_tramp_data_start.

Does that mean that the SDEI code never worked with page size > 4 KiB?

> Should we just use adrp on __entry_tramp_data_start? Anyway, the diff
> below doesn't solve the issue I'm seeing (only reverting patch 3).

AFAIU, the preexisting code uses the manual PAGE_SIZE offset because the offset 
in the main vmlinux does not match the architected offset inside the fixmap. If 
so, then using the symbol directly will not work at all.
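
To make the point concrete, here is a rough sketch of why a PC-relative reference to the data symbol itself would miss when the trampoline runs from its fixmap alias. All addresses are hypothetical, and the layout (data page mapped one page after the trampoline text in the fixmap) is taken from the `tramp_vectors + PAGE_SIZE` idiom in the existing code rather than verified against the fixmap definitions.

```python
PAGE_SIZE = 4096  # assuming 4 KiB pages for this example

# Hypothetical link-time (vmlinux) addresses.
VMLINUX_TRAMP_TEXT = 0xFFFF800010011000  # tramp_vectors
VMLINUX_TRAMP_DATA = 0xFFFF800010A45000  # __entry_tramp_data_start in .rodata

# Hypothetical fixmap alias the trampoline actually executes from.
FIXMAP_TRAMP_TEXT = 0xFFFFFDFFFE5F9000
FIXMAP_TRAMP_DATA = FIXMAP_TRAMP_TEXT + PAGE_SIZE


def pc_relative(runtime_pc: int, link_pc: int, link_target: int) -> int:
    """The linker bakes in (link_target - link_pc); the CPU adds it to the PC."""
    return runtime_pc + (link_target - link_pc)


# Referencing the symbol directly reuses the vmlinux distance (wrong here).
via_symbol = pc_relative(FIXMAP_TRAMP_TEXT, VMLINUX_TRAMP_TEXT,
                         VMLINUX_TRAMP_DATA)

# Referencing tramp_vectors + PAGE_SIZE reuses the fixmap distance.
via_page_offset = pc_relative(FIXMAP_TRAMP_TEXT, VMLINUX_TRAMP_TEXT,
                              VMLINUX_TRAMP_TEXT + PAGE_SIZE)

print(hex(via_symbol), hex(via_page_offset), hex(FIXMAP_TRAMP_DATA))
```

Only the second form lands on the aliased data page, because the distance between text and data in the fixmap is fixed at one page while the distance in vmlinux is whatever the linker script produces.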




> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index ca1340eb46d8..4cc9d1df3985 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -810,7 +810,7 @@ alternative_else_nop_endif
>  2:
>  	tramp_map_kernel	x30
>  #ifdef CONFIG_RANDOMIZE_BASE
> -	adrp	x30, tramp_vectors + PAGE_SIZE
> +	adrp	x30, __entry_tramp_data_start
>  alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
>  	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
>  #else
> @@ -964,7 +964,7 @@ SYM_CODE_START(__sdei_asm_entry_trampoline)
>  1:	str	x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]
> 
>  #ifdef CONFIG_RANDOMIZE_BASE
> -	adrp	x4, tramp_vectors + PAGE_SIZE
> +	adrp	x4, __sdei_asm_trampoline_next_handler
>  	ldr	x4, [x4, #:lo12:__sdei_asm_trampoline_next_handler]
>  #else
>  	ldr	x4, =__sdei_asm_handler


-- 
雷米‧德尼-库尔蒙
http://www.remlab.net/





* Re: [PATCH 1/3] arm64: clean up trampoline vector loads
  2020-03-23 12:14       ` Mark Rutland
@ 2020-03-23 19:04         ` Catalin Marinas
  2020-03-23 20:42           ` Rémi Denis-Courmont
  0 siblings, 1 reply; 15+ messages in thread
From: Catalin Marinas @ 2020-03-23 19:04 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Rémi Denis-Courmont, will, james.morse, linux-arm-kernel,
	linux-kernel

On Mon, Mar 23, 2020 at 12:14:37PM +0000, Mark Rutland wrote:
> On Mon, Mar 23, 2020 at 02:08:53PM +0200, Rémi Denis-Courmont wrote:
> > On Monday, 23 March 2020, 14:07:00 EET, Mark Rutland wrote:
> > > On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> > > > From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > > 
> > > > This switches from custom instruction patterns to the regular large
> > > > memory model sequence with ADRP and LDR. In doing so, the ADD
> > > > instruction can be eliminated in the SDEI handler, and the code no
> > > > longer assumes that the trampoline vectors and the vectors address both
> > > > start on a page boundary.
> > > > 
> > > > Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > > ---
> > > > 
> > > >  arch/arm64/kernel/entry.S | 9 ++++-----
> > > >  1 file changed, 4 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > > > index e5d4e30ee242..24f828739696 100644
> > > > --- a/arch/arm64/kernel/entry.S
> > > > +++ b/arch/arm64/kernel/entry.S
> > > > @@ -805,9 +805,9 @@ alternative_else_nop_endif
> > > > 
> > > >  2:
> > > >  	tramp_map_kernel	x30
> > > >  
> > > >  #ifdef CONFIG_RANDOMIZE_BASE
> > > > 
> > > > -	adr	x30, tramp_vectors + PAGE_SIZE
> > > > +	adrp	x30, tramp_vectors + PAGE_SIZE
> > > > 
> > > >  alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> > > > 
> > > > -	ldr	x30, [x30]
> > > > +	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
> > > 
> > > I think this is busted for !4K kernels once we reduce the alignment of
> > > __entry_tramp_data_start.
> > > 
> > > The ADRP gives us a 64K aligned address (with bits 15:0 clear). The lo12
> > > relocation gives us bits 11:0, so we haven't accounted for bits 15:12.
> > 
> > IMU, ADRP gives a 4K aligned value, regardless of MMU (TCR) settings.
> 
> Sorry, I had erroneously assumed tramp_vectors was page aligned. The
> issue still stands -- we haven't accounted for bits 15:12, as those can
> differ between tramp_vectors and __entry_tramp_data_start.

Should we just use adrp on __entry_tramp_data_start? Anyway, the diff
below doesn't solve the issue I'm seeing (only reverting patch 3).

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index ca1340eb46d8..4cc9d1df3985 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -810,7 +810,7 @@ alternative_else_nop_endif
 2:
 	tramp_map_kernel	x30
 #ifdef CONFIG_RANDOMIZE_BASE
-	adrp	x30, tramp_vectors + PAGE_SIZE
+	adrp	x30, __entry_tramp_data_start
 alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
 	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
 #else
@@ -964,7 +964,7 @@ SYM_CODE_START(__sdei_asm_entry_trampoline)
 1:	str	x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]
 
 #ifdef CONFIG_RANDOMIZE_BASE
-	adrp	x4, tramp_vectors + PAGE_SIZE
+	adrp	x4, __sdei_asm_trampoline_next_handler
 	ldr	x4, [x4, #:lo12:__sdei_asm_trampoline_next_handler]
 #else
 	ldr	x4, =__sdei_asm_handler

-- 
Catalin


* Re: [PATCH 1/3] arm64: clean up trampoline vector loads
  2020-03-23 12:08     ` Rémi Denis-Courmont
@ 2020-03-23 12:14       ` Mark Rutland
  2020-03-23 19:04         ` Catalin Marinas
  0 siblings, 1 reply; 15+ messages in thread
From: Mark Rutland @ 2020-03-23 12:14 UTC (permalink / raw)
  To: Rémi Denis-Courmont
  Cc: catalin.marinas, will, linux-arm-kernel, james.morse, linux-kernel

On Mon, Mar 23, 2020 at 02:08:53PM +0200, Rémi Denis-Courmont wrote:
> On Monday, 23 March 2020, 14:07:00 EET, Mark Rutland wrote:
> > On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> > > From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > 
> > > This switches from custom instruction patterns to the regular large
> > > memory model sequence with ADRP and LDR. In doing so, the ADD
> > > instruction can be eliminated in the SDEI handler, and the code no
> > > longer assumes that the trampoline vectors and the vectors address both
> > > start on a page boundary.
> > > 
> > > Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > ---
> > > 
> > >  arch/arm64/kernel/entry.S | 9 ++++-----
> > >  1 file changed, 4 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > > index e5d4e30ee242..24f828739696 100644
> > > --- a/arch/arm64/kernel/entry.S
> > > +++ b/arch/arm64/kernel/entry.S
> > > @@ -805,9 +805,9 @@ alternative_else_nop_endif
> > > 
> > >  2:
> > >  	tramp_map_kernel	x30
> > >  
> > >  #ifdef CONFIG_RANDOMIZE_BASE
> > > 
> > > -	adr	x30, tramp_vectors + PAGE_SIZE
> > > +	adrp	x30, tramp_vectors + PAGE_SIZE
> > > 
> > >  alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> > > 
> > > -	ldr	x30, [x30]
> > > +	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
> > 
> > I think this is busted for !4K kernels once we reduce the alignment of
> > __entry_tramp_data_start.
> > 
> > The ADRP gives us a 64K aligned address (with bits 15:0 clear). The lo12
> > relocation gives us bits 11:0, so we haven't accounted for bits 15:12.
> 
> IMU, ADRP gives a 4K aligned value, regardless of MMU (TCR) settings.

Sorry, I had erroneously assumed tramp_vectors was page aligned. The
issue still stands -- we haven't accounted for bits 15:12, as those can
differ between tramp_vectors and __entry_tramp_data_start.
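
A small sketch of the arithmetic, assuming 64K pages and assuming the data page is reached through a PAGE_SIZE-aligned alias in which the symbol keeps its in-page offset. The helper names and addresses are hypothetical; this is only a model of the bit arithmetic, not kernel code.

```python
PAGE_SIZE_64K = 1 << 16


def intended_address(alias_base: int, sym: int) -> int:
    """Alias base plus the symbol's full in-page offset (bits 15:0)."""
    return alias_base + (sym & (PAGE_SIZE_64K - 1))


def adrp_lo12_address(alias_base: int, sym: int) -> int:
    """What ADRP on the alias plus #:lo12:sym yields: only bits 11:0 of sym."""
    return (alias_base & ~0xFFF) + (sym & 0xFFF)


# Hypothetical addresses, for illustration only.
alias_base = 0xFFFFFDFFFE5F0000      # 64K-aligned alias of the data page
sym_low_in_page = 0xFFFF800010A40010  # bits 15:12 are zero
sym_high_in_page = 0xFFFF800010A45010  # bits 15:12 are 0x5

for sym in (sym_low_in_page, sym_high_in_page):
    want = intended_address(alias_base, sym)
    got = adrp_lo12_address(alias_base, sym)
    print(hex(sym), "ok" if want == got else f"off by {want - got:#x}")
```

The two results agree only while bits 15:12 of the symbol's in-page offset are zero, which is the case the current layout happens to satisfy.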

Thanks,
Mark.


* Re: [PATCH 1/3] arm64: clean up trampoline vector loads
  2020-03-23 12:07   ` Mark Rutland
@ 2020-03-23 12:08     ` Rémi Denis-Courmont
  2020-03-23 12:14       ` Mark Rutland
  0 siblings, 1 reply; 15+ messages in thread
From: Rémi Denis-Courmont @ 2020-03-23 12:08 UTC (permalink / raw)
  To: Mark Rutland
  Cc: catalin.marinas, will, linux-arm-kernel, james.morse, linux-kernel

On Monday, 23 March 2020, 14:07:00 EET, Mark Rutland wrote:
> On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> > From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > 
> > This switches from custom instruction patterns to the regular large
> > memory model sequence with ADRP and LDR. In doing so, the ADD
> > instruction can be eliminated in the SDEI handler, and the code no
> > longer assumes that the trampoline vectors and the vectors address both
> > start on a page boundary.
> > 
> > Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > ---
> > 
> >  arch/arm64/kernel/entry.S | 9 ++++-----
> >  1 file changed, 4 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > index e5d4e30ee242..24f828739696 100644
> > --- a/arch/arm64/kernel/entry.S
> > +++ b/arch/arm64/kernel/entry.S
> > @@ -805,9 +805,9 @@ alternative_else_nop_endif
> > 
> >  2:
> >  	tramp_map_kernel	x30
> >  
> >  #ifdef CONFIG_RANDOMIZE_BASE
> > 
> > -	adr	x30, tramp_vectors + PAGE_SIZE
> > +	adrp	x30, tramp_vectors + PAGE_SIZE
> > 
> >  alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> > 
> > -	ldr	x30, [x30]
> > +	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
> 
> I think this is busted for !4K kernels once we reduce the alignment of
> __entry_tramp_data_start.
> 
> The ADRP gives us a 64K aligned address (with bits 15:0 clear). The lo12
> relocation gives us bits 11:0, so we haven't accounted for bits 15:12.

IMU, ADRP gives a 4K aligned value, regardless of MMU (TCR) settings.

I rather suspect that the problem is with my C code diff assuming that 
PAGE_MASK is 4095.
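
Editor's sketch of the ADRP behaviour described above, as a small C model. The function name is hypothetical, and the model ignores the instruction's limited immediate range. The shift of 12 is fixed by the instruction and does not depend on the kernel's configured page size.

```c
#include <stdint.h>

#define ADRP_PAGE_SHIFT 12 /* fixed by the instruction, not by PAGE_SIZE */

/* Model of "adrp xN, target" executed at address pc (range limits ignored). */
static uint64_t adrp_result(uint64_t pc, uint64_t target)
{
	/* 4K-page delta that the assembler/linker would encode as the immediate. */
	uint64_t page_delta = (target >> ADRP_PAGE_SHIFT) - (pc >> ADRP_PAGE_SHIFT);

	/* ADRP clears the low 12 bits of PC, then adds the shifted immediate. */
	return (pc & ~UINT64_C(0xfff)) + (page_delta << ADRP_PAGE_SHIFT);
}
```

The result always equals the target with bits 11:0 cleared, so it is 4K aligned whether the kernel is built for 4K, 16K or 64K pages.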

-- 
Rémi Denis-Courmont
http://www.remlab.net/




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] arm64: clean up trampoline vector loads
  2020-03-19  9:14 ` [PATCH 1/3] arm64: clean up trampoline vector loads Rémi Denis-Courmont
@ 2020-03-23 12:07   ` Mark Rutland
  2020-03-23 12:08     ` Rémi Denis-Courmont
  0 siblings, 1 reply; 15+ messages in thread
From: Mark Rutland @ 2020-03-23 12:07 UTC (permalink / raw)
  To: Rémi Denis-Courmont
  Cc: catalin.marinas, will, linux-arm-kernel, james.morse, linux-kernel

On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> 
> This switches from custom instruction patterns to the regular large
> memory model sequence with ADRP and LDR. In doing so, the ADD
> instruction can be eliminated in the SDEI handler, and the code no
> longer assumes that the trampoline vectors and the vectors address both
> start on a page boundary.
> 
> Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> ---
>  arch/arm64/kernel/entry.S | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index e5d4e30ee242..24f828739696 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -805,9 +805,9 @@ alternative_else_nop_endif
>  2:
>  	tramp_map_kernel	x30
>  #ifdef CONFIG_RANDOMIZE_BASE
> -	adr	x30, tramp_vectors + PAGE_SIZE
> +	adrp	x30, tramp_vectors + PAGE_SIZE
>  alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> -	ldr	x30, [x30]
> +	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]

I think this is busted for !4K kernels once we reduce the alignment of
__entry_tramp_data_start.

The ADRP gives us a 64K aligned address (with bits 15:0 clear). The lo12
relocation gives us bits 11:0, so we haven't accounted for bits 15:12.
I think that's what's causing the hang Catalin sees with 64K pages (and
would also be a problem for 16K pages).

Ideally, we'd account for those bits with the ADRP, but I'm not sure
that an ELF relocation can encode symbol + addr + symbol:15-12, so we
likely need more instructions to explicitly mask that in.

... either that, or leave this page aligned.
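
Editor's sketch of what "accounting for bits 15:12" means numerically. It compares the offset a lo12 relocation carries with the full offset of a symbol inside its page for each page size. The helper names and the sample address are hypothetical. This is arithmetic only, not a proposed instruction sequence.

```c
#include <stdint.h>

/* Offset carried by a #:lo12: relocation: bits 11:0 of the symbol. */
static uint64_t lo12(uint64_t sym)
{
	return sym & UINT64_C(0xfff);
}

/* Full offset of the symbol within a page of the given (power-of-two) size. */
static uint64_t in_page_offset(uint64_t sym, uint64_t page_size)
{
	return sym & (page_size - 1);
}

/* Part of the in-page offset that lo12 alone does not carry. */
static uint64_t missing_bits(uint64_t sym, uint64_t page_size)
{
	return in_page_offset(sym, page_size) - lo12(sym);
}
```

For a sample address such as 0xffff800010a91008, missing_bits() is 0 with 4K pages and 0x1000 with 16K or 64K pages. That is the part that would need extra instructions unless the data stays page aligned.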

>  #else
>  	ldr	x30, =vectors
>  #endif
> @@ -953,9 +953,8 @@ SYM_CODE_START(__sdei_asm_entry_trampoline)
>  1:	str	x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]
>  
>  #ifdef CONFIG_RANDOMIZE_BASE
> -	adr	x4, tramp_vectors + PAGE_SIZE
> -	add	x4, x4, #:lo12:__sdei_asm_trampoline_next_handler
> -	ldr	x4, [x4]
> +	adrp	x4, tramp_vectors + PAGE_SIZE
> +	ldr	x4, [x4, #:lo12:__sdei_asm_trampoline_next_handler]

Likewise here.

Thanks,
Mark.

>  #else
>  	ldr	x4, =__sdei_asm_handler
>  #endif
> -- 
> 2.26.0.rc2
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/3] arm64: clean up trampoline vector loads
  2020-03-19  9:12 [PATCHv3 0/3] clean up KPTI / SDEI trampoline data alignment Rémi Denis-Courmont
@ 2020-03-19  9:14 ` Rémi Denis-Courmont
  2020-03-23 12:07   ` Mark Rutland
  0 siblings, 1 reply; 15+ messages in thread
From: Rémi Denis-Courmont @ 2020-03-19  9:14 UTC (permalink / raw)
  To: catalin.marinas, will, linux-arm-kernel
  Cc: mark.rutland, james.morse, linux-kernel

From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>

This switches from custom instruction patterns to the regular large
memory model sequence with ADRP and LDR. In doing so, the ADD
instruction can be eliminated in the SDEI handler, and the code no
longer assumes that the trampoline vectors and the vectors address both
start on a page boundary.

Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
---
 arch/arm64/kernel/entry.S | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index e5d4e30ee242..24f828739696 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -805,9 +805,9 @@ alternative_else_nop_endif
 2:
 	tramp_map_kernel	x30
 #ifdef CONFIG_RANDOMIZE_BASE
-	adr	x30, tramp_vectors + PAGE_SIZE
+	adrp	x30, tramp_vectors + PAGE_SIZE
 alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
-	ldr	x30, [x30]
+	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
 #else
 	ldr	x30, =vectors
 #endif
@@ -953,9 +953,8 @@ SYM_CODE_START(__sdei_asm_entry_trampoline)
 1:	str	x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]
 
 #ifdef CONFIG_RANDOMIZE_BASE
-	adr	x4, tramp_vectors + PAGE_SIZE
-	add	x4, x4, #:lo12:__sdei_asm_trampoline_next_handler
-	ldr	x4, [x4]
+	adrp	x4, tramp_vectors + PAGE_SIZE
+	ldr	x4, [x4, #:lo12:__sdei_asm_trampoline_next_handler]
 #else
 	ldr	x4, =__sdei_asm_handler
 #endif
-- 
2.26.0.rc2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-03-24 11:23 UTC | newest]

Thread overview: 15+ messages
2020-03-16 12:40 [PATCH 1/3] arm64: clean up trampoline vector loads Rémi Denis-Courmont
2020-03-17 22:30 ` Will Deacon
2020-03-18 17:57 ` Catalin Marinas
2020-03-18 18:06   ` Catalin Marinas
2020-03-18 18:29     ` Rémi Denis-Courmont
2020-03-18 19:48       ` Remi Denis-Courmont
2020-03-19  9:12 [PATCHv3 0/3] clean up KPTI / SDEI trampoline data alignment Rémi Denis-Courmont
2020-03-19  9:14 ` [PATCH 1/3] arm64: clean up trampoline vector loads Rémi Denis-Courmont
2020-03-23 12:07   ` Mark Rutland
2020-03-23 12:08     ` Rémi Denis-Courmont
2020-03-23 12:14       ` Mark Rutland
2020-03-23 19:04         ` Catalin Marinas
2020-03-23 20:42           ` Rémi Denis-Courmont
2020-03-24 10:37             ` Catalin Marinas
2020-03-24 10:52             ` Mark Rutland
2020-03-24 11:23               ` Catalin Marinas
