All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
@ 2020-09-14  9:56 ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:56 UTC (permalink / raw)
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Russell King, Linus Walleij,
	Nicolas Pitre, Nick Desaulniers, Stefan Agner, Peter Smith,
	Marc Zyngier, Will Deacon

This is a respin of the adr_l/ldr_l code I wrote some years ago in the
context of my KASLR proof of concept for 32-bit ARM.

A new use case came up, in the form of Clang, which does not implement
the 'adrl' pseudo-instruction in its assembler, and so for PC-relative
references that don't fit into a ARM adr instruction, we need something
else. Patch #2 addresses an actual Clang build issue of this nature, by
replacing an occurrence of adrl with adr_l.

I have included my existing cleanup patches that were built on top of the
adr_l macro, which replace several occurrences of open coded arithmetic to
calculate runtime addresses based on link time virtual addresses stored
in literals.

Note that all of these patches with the exception of #2 were reviewed or
acked by Nico before, but given that this was a while ago (and the fact
that neither of us work for Linaro anymore), I have dropped these. Note
that only patch #1 deviates significantly from the last version that I
sent out, the remaining ones were just freshened up (and their commit
logs slightly expanded).

Cc: Russell King <linux@armlinux.org.uk>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Peter Smith <Peter.Smith@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>

Ard Biesheuvel (12):
  ARM: assembler: introduce adr_l, ldr_l and str_l macros
  ARM: efistub: replace adrl pseudo-op with adr_l macro invocation
  ARM: module: add support for place relative relocations
  ARM: head-common.S: use PC-relative insn sequence for __proc_info
  ARM: head-common.S: use PC-relative insn sequence for idmap creation
  ARM: head.S: use PC-relative insn sequence for secondary_data
  ARM: kernel: use relative references for UP/SMP alternatives
  ARM: head: use PC-relative insn sequence for __smp_alt
  ARM: sleep.S: use PC-relative insn sequence for
    sleep_save_sp/mpidr_hash
  ARM: head.S: use PC-relative insn sequences for __fixup_pv_table
  ARM: head.S: use PC relative insn sequence to calculate PHYS_OFFSET
  ARM: kvm: replace open coded VA->PA calculations with adr_l call

 arch/arm/boot/compressed/head.S  | 18 +---
 arch/arm/include/asm/assembler.h | 88 ++++++++++++++++++-
 arch/arm/include/asm/elf.h       |  5 ++
 arch/arm/include/asm/processor.h |  2 +-
 arch/arm/kernel/head-common.S    | 22 ++---
 arch/arm/kernel/head.S           | 90 +++++---------------
 arch/arm/kernel/hyp-stub.S       | 27 +++---
 arch/arm/kernel/module.c         | 20 ++++-
 arch/arm/kernel/sleep.S          | 19 ++---
 9 files changed, 159 insertions(+), 132 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
@ 2020-09-14  9:56 ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:56 UTC (permalink / raw)
  To: linux-efi
  Cc: Nicolas Pitre, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Russell King, Stefan Agner, Peter Smith, Will Deacon,
	Ard Biesheuvel, linux-arm-kernel

This is a respin of the adr_l/ldr_l code I wrote some years ago in the
context of my KASLR proof of concept for 32-bit ARM.

A new use case came up, in the form of Clang, which does not implement
the 'adrl' pseudo-instruction in its assembler, and so for PC-relative
references that don't fit into a ARM adr instruction, we need something
else. Patch #2 addresses an actual Clang build issue of this nature, by
replacing an occurrence of adrl with adr_l.

I have included my existing cleanup patches that were built on top of the
adr_l macro, which replace several occurrences of open coded arithmetic to
calculate runtime addresses based on link time virtual addresses stored
in literals.

Note that all of these patches with the exception of #2 were reviewed or
acked by Nico before, but given that this was a while ago (and the fact
that neither of us work for Linaro anymore), I have dropped these. Note
that only patch #1 deviates significantly from the last version that I
sent out, the remaining ones were just freshened up (and their commit
logs slightly expanded).

Cc: Russell King <linux@armlinux.org.uk>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Peter Smith <Peter.Smith@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>

Ard Biesheuvel (12):
  ARM: assembler: introduce adr_l, ldr_l and str_l macros
  ARM: efistub: replace adrl pseudo-op with adr_l macro invocation
  ARM: module: add support for place relative relocations
  ARM: head-common.S: use PC-relative insn sequence for __proc_info
  ARM: head-common.S: use PC-relative insn sequence for idmap creation
  ARM: head.S: use PC-relative insn sequence for secondary_data
  ARM: kernel: use relative references for UP/SMP alternatives
  ARM: head: use PC-relative insn sequence for __smp_alt
  ARM: sleep.S: use PC-relative insn sequence for
    sleep_save_sp/mpidr_hash
  ARM: head.S: use PC-relative insn sequences for __fixup_pv_table
  ARM: head.S: use PC relative insn sequence to calculate PHYS_OFFSET
  ARM: kvm: replace open coded VA->PA calculations with adr_l call

 arch/arm/boot/compressed/head.S  | 18 +---
 arch/arm/include/asm/assembler.h | 88 ++++++++++++++++++-
 arch/arm/include/asm/elf.h       |  5 ++
 arch/arm/include/asm/processor.h |  2 +-
 arch/arm/kernel/head-common.S    | 22 ++---
 arch/arm/kernel/head.S           | 90 +++++---------------
 arch/arm/kernel/hyp-stub.S       | 27 +++---
 arch/arm/kernel/module.c         | 20 ++++-
 arch/arm/kernel/sleep.S          | 19 ++---
 9 files changed, 159 insertions(+), 132 deletions(-)

-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 01/12] ARM: assembler: introduce adr_l, ldr_l and str_l macros
  2020-09-14  9:56 ` Ard Biesheuvel
@ 2020-09-14  9:56   ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:56 UTC (permalink / raw)
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Russell King, Linus Walleij,
	Nicolas Pitre, Nick Desaulniers, Stefan Agner, Peter Smith,
	Marc Zyngier, Will Deacon

Like arm64, ARM supports position independent code sequences that
produce symbol references with a greater reach than the ordinary
adr/ldr instructions. Since on ARM, the adrl pseudo-instruction is
only supported in ARM mode (and not at all when using Clang), having
a adr_l macro like we do on arm64 is useful, and increases symmetry
as well.

Currently, we use open coded instruction sequences involving literals
and arithmetic operations. Instead, we can use movw/movt pairs on v7
CPUs, circumventing the D-cache entirely.

E.g., on v7+ CPUs, we can emit a PC-relative reference as follows:

       movw         <reg>, #:lower16:<sym> - (1f + 8)
       movt         <reg>, #:upper16:<sym> - (1f + 8)
  1:   add          <reg>, <reg>, pc

For older CPUs, we can emit the literal into a subsection, allowing it
to be emitted out of line while retaining the ability to perform
arithmetic on label offsets.

E.g., on pre-v7 CPUs, we can emit a PC-relative reference as follows:

       ldr          <reg>, 2f
  1:   add          <reg>, <reg>, pc
       .subsection  1
  2:   .long        <sym> - (1b + 8)
       .previous

This is allowed by the assembler because, unlike ordinary sections,
subsections are combined into a single section in the object file, and
so the label references are not true cross-section references that are
visible as relocations. (Subsections have been available in binutils
since 2004 at least, so they should not cause any issues with older
toolchains.)

So use the above to implement the macros mov_l, adr_l, ldr_l and str_l,
all of which will use movw/movt pairs on v7 and later CPUs, and use
PC-relative literals otherwise.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/include/asm/assembler.h | 84 ++++++++++++++++++++
 1 file changed, 84 insertions(+)

diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index feac2c8b86f2..39e972eaaa3f 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -494,4 +494,88 @@ THUMB(	orr	\reg , \reg , #PSR_T_BIT	)
 #define _ASM_NOKPROBE(entry)
 #endif
 
+	.macro		__adldst_l, op, reg, sym, tmp, c
+	.if		__LINUX_ARM_ARCH__ < 7
+	ldr\c		\tmp, .La\@
+	.subsection	1
+	.align		2
+.La\@:	.long		\sym - .Lpc\@
+	.previous
+	.else
+	.ifnb		\c
+ THUMB(	ittt		\c			)
+	.endif
+	movw\c		\tmp, #:lower16:\sym - .Lpc\@
+	movt\c		\tmp, #:upper16:\sym - .Lpc\@
+	.endif
+
+#ifndef CONFIG_THUMB2_KERNEL
+	.set		.Lpc\@, . + 8			// PC bias
+	.ifc		\op, add
+	add\c		\reg, \tmp, pc
+	.else
+	\op\c		\reg, [\tmp, pc]
+	.endif
+#else
+.Lb\@:	add\c		\tmp, \tmp, pc
+	/*
+	 * In Thumb-2 builds, the PC bias depends on whether we are currently
+	 * emitting into a .arm or a .thumb section. The size of the add opcode
+	 * above will be 2 bytes when emitting in Thumb mode and 4 bytes when
+	 * emitting in ARM mode, so let's use this to account for the bias.
+	 */
+	.set		.Lpc\@, . + (. - .Lb\@)
+
+	.ifnc		\op, add
+	\op\c		\reg, [\tmp]
+	.endif
+#endif
+	.endm
+
+	/*
+	 * mov_l - move a constant value or [relocated] address into a register
+	 */
+	.macro		mov_l, dst:req, imm:req
+	.if		__LINUX_ARM_ARCH__ < 7
+	ldr		\dst, =\imm
+	.else
+	movw		\dst, #:lower16:\imm
+	movt		\dst, #:upper16:\imm
+	.endif
+	.endm
+
+	/*
+	 * adr_l - adr pseudo-op with unlimited range
+	 *
+	 * @dst: destination register
+	 * @sym: name of the symbol
+	 * @cond: conditional opcode suffix
+	 */
+	.macro		adr_l, dst:req, sym:req, cond
+	__adldst_l	add, \dst, \sym, \dst, \cond
+	.endm
+
+	/*
+	 * ldr_l - ldr <literal> pseudo-op with unlimited range
+	 *
+	 * @dst: destination register
+	 * @sym: name of the symbol
+	 * @cond: conditional opcode suffix
+	 */
+	.macro		ldr_l, dst:req, sym:req, cond
+	__adldst_l	ldr, \dst, \sym, \dst, \cond
+	.endm
+
+	/*
+	 * str_l - str <literal> pseudo-op with unlimited range
+	 *
+	 * @src: source register
+	 * @sym: name of the symbol
+	 * @tmp: mandatory scratch register
+	 * @cond: conditional opcode suffix
+	 */
+	.macro		str_l, src:req, sym:req, tmp:req, cond
+	__adldst_l	str, \src, \sym, \tmp, \cond
+	.endm
+
 #endif /* __ASM_ASSEMBLER_H__ */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 01/12] ARM: assembler: introduce adr_l, ldr_l and str_l macros
@ 2020-09-14  9:56   ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:56 UTC (permalink / raw)
  To: linux-efi
  Cc: Nicolas Pitre, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Russell King, Stefan Agner, Peter Smith, Will Deacon,
	Ard Biesheuvel, linux-arm-kernel

Like arm64, ARM supports position independent code sequences that
produce symbol references with a greater reach than the ordinary
adr/ldr instructions. Since on ARM, the adrl pseudo-instruction is
only supported in ARM mode (and not at all when using Clang), having
a adr_l macro like we do on arm64 is useful, and increases symmetry
as well.

Currently, we use open coded instruction sequences involving literals
and arithmetic operations. Instead, we can use movw/movt pairs on v7
CPUs, circumventing the D-cache entirely.

E.g., on v7+ CPUs, we can emit a PC-relative reference as follows:

       movw         <reg>, #:lower16:<sym> - (1f + 8)
       movt         <reg>, #:upper16:<sym> - (1f + 8)
  1:   add          <reg>, <reg>, pc

For older CPUs, we can emit the literal into a subsection, allowing it
to be emitted out of line while retaining the ability to perform
arithmetic on label offsets.

E.g., on pre-v7 CPUs, we can emit a PC-relative reference as follows:

       ldr          <reg>, 2f
  1:   add          <reg>, <reg>, pc
       .subsection  1
  2:   .long        <sym> - (1b + 8)
       .previous

This is allowed by the assembler because, unlike ordinary sections,
subsections are combined into a single section in the object file, and
so the label references are not true cross-section references that are
visible as relocations. (Subsections have been available in binutils
since 2004 at least, so they should not cause any issues with older
toolchains.)

So use the above to implement the macros mov_l, adr_l, ldr_l and str_l,
all of which will use movw/movt pairs on v7 and later CPUs, and use
PC-relative literals otherwise.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/include/asm/assembler.h | 84 ++++++++++++++++++++
 1 file changed, 84 insertions(+)

diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index feac2c8b86f2..39e972eaaa3f 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -494,4 +494,88 @@ THUMB(	orr	\reg , \reg , #PSR_T_BIT	)
 #define _ASM_NOKPROBE(entry)
 #endif
 
+	.macro		__adldst_l, op, reg, sym, tmp, c
+	.if		__LINUX_ARM_ARCH__ < 7
+	ldr\c		\tmp, .La\@
+	.subsection	1
+	.align		2
+.La\@:	.long		\sym - .Lpc\@
+	.previous
+	.else
+	.ifnb		\c
+ THUMB(	ittt		\c			)
+	.endif
+	movw\c		\tmp, #:lower16:\sym - .Lpc\@
+	movt\c		\tmp, #:upper16:\sym - .Lpc\@
+	.endif
+
+#ifndef CONFIG_THUMB2_KERNEL
+	.set		.Lpc\@, . + 8			// PC bias
+	.ifc		\op, add
+	add\c		\reg, \tmp, pc
+	.else
+	\op\c		\reg, [\tmp, pc]
+	.endif
+#else
+.Lb\@:	add\c		\tmp, \tmp, pc
+	/*
+	 * In Thumb-2 builds, the PC bias depends on whether we are currently
+	 * emitting into a .arm or a .thumb section. The size of the add opcode
+	 * above will be 2 bytes when emitting in Thumb mode and 4 bytes when
+	 * emitting in ARM mode, so let's use this to account for the bias.
+	 */
+	.set		.Lpc\@, . + (. - .Lb\@)
+
+	.ifnc		\op, add
+	\op\c		\reg, [\tmp]
+	.endif
+#endif
+	.endm
+
+	/*
+	 * mov_l - move a constant value or [relocated] address into a register
+	 */
+	.macro		mov_l, dst:req, imm:req
+	.if		__LINUX_ARM_ARCH__ < 7
+	ldr		\dst, =\imm
+	.else
+	movw		\dst, #:lower16:\imm
+	movt		\dst, #:upper16:\imm
+	.endif
+	.endm
+
+	/*
+	 * adr_l - adr pseudo-op with unlimited range
+	 *
+	 * @dst: destination register
+	 * @sym: name of the symbol
+	 * @cond: conditional opcode suffix
+	 */
+	.macro		adr_l, dst:req, sym:req, cond
+	__adldst_l	add, \dst, \sym, \dst, \cond
+	.endm
+
+	/*
+	 * ldr_l - ldr <literal> pseudo-op with unlimited range
+	 *
+	 * @dst: destination register
+	 * @sym: name of the symbol
+	 * @cond: conditional opcode suffix
+	 */
+	.macro		ldr_l, dst:req, sym:req, cond
+	__adldst_l	ldr, \dst, \sym, \dst, \cond
+	.endm
+
+	/*
+	 * str_l - str <literal> pseudo-op with unlimited range
+	 *
+	 * @src: source register
+	 * @sym: name of the symbol
+	 * @tmp: mandatory scratch register
+	 * @cond: conditional opcode suffix
+	 */
+	.macro		str_l, src:req, sym:req, tmp:req, cond
+	__adldst_l	str, \src, \sym, \tmp, \cond
+	.endm
+
 #endif /* __ASM_ASSEMBLER_H__ */
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 02/12] ARM: efistub: replace adrl pseudo-op with adr_l macro invocation
  2020-09-14  9:56 ` Ard Biesheuvel
@ 2020-09-14  9:56   ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:56 UTC (permalink / raw)
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Russell King, Linus Walleij,
	Nicolas Pitre, Nick Desaulniers, Stefan Agner, Peter Smith,
	Marc Zyngier, Will Deacon

The ARM 'adrl' pseudo instruction is a bit problematic, as it does not
exist in Thumb mode, and it is not implemented by Clang either. Since
the Thumb variant has a slightly bigger range, it is sometimes necessary
to emit the 'adrl' variant in ARM mode where Thumb mode can use adr just
fine. However, that still leaves the Clang issue, which does not appear
to be supporting this any time soon.

So let's switch to the adr_l macro, which works for both ARM and Thumb,
and has unlimited range.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/boot/compressed/head.S | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
index 434a16982e34..6a430d1f4d31 100644
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -1444,8 +1444,7 @@ ENTRY(efi_enter_kernel)
 		mov	r4, r0			@ preserve image base
 		mov	r8, r1			@ preserve DT pointer
 
- ARM(		adrl	r0, call_cache_fn	)
- THUMB(		adr	r0, call_cache_fn	)
+		adr_l	r0, call_cache_fn
 		adr	r1, 0f			@ clean the region of code we
 		bl	cache_clean_flush	@ may run with the MMU off
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 02/12] ARM: efistub: replace adrl pseudo-op with adr_l macro invocation
@ 2020-09-14  9:56   ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:56 UTC (permalink / raw)
  To: linux-efi
  Cc: Nicolas Pitre, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Russell King, Stefan Agner, Peter Smith, Will Deacon,
	Ard Biesheuvel, linux-arm-kernel

The ARM 'adrl' pseudo instruction is a bit problematic, as it does not
exist in Thumb mode, and it is not implemented by Clang either. Since
the Thumb variant has a slightly bigger range, it is sometimes necessary
to emit the 'adrl' variant in ARM mode where Thumb mode can use adr just
fine. However, that still leaves the Clang issue, which does not appear
to be supporting this any time soon.

So let's switch to the adr_l macro, which works for both ARM and Thumb,
and has unlimited range.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/boot/compressed/head.S | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
index 434a16982e34..6a430d1f4d31 100644
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -1444,8 +1444,7 @@ ENTRY(efi_enter_kernel)
 		mov	r4, r0			@ preserve image base
 		mov	r8, r1			@ preserve DT pointer
 
- ARM(		adrl	r0, call_cache_fn	)
- THUMB(		adr	r0, call_cache_fn	)
+		adr_l	r0, call_cache_fn
 		adr	r1, 0f			@ clean the region of code we
 		bl	cache_clean_flush	@ may run with the MMU off
 
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 03/12] ARM: module: add support for place relative relocations
  2020-09-14  9:56 ` Ard Biesheuvel
@ 2020-09-14  9:56   ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:56 UTC (permalink / raw)
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Russell King, Linus Walleij,
	Nicolas Pitre, Nick Desaulniers, Stefan Agner, Peter Smith,
	Marc Zyngier, Will Deacon

When using the new adr_l/ldr_l/str_l macros to refer to external symbols
from modules, the linker may emit place relative ELF relocations that
need to be fixed up by the module loader. So add support for these.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/include/asm/elf.h |  5 +++++
 arch/arm/kernel/module.c   | 20 ++++++++++++++++++--
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
index b078d992414b..0ac62a54b73c 100644
--- a/arch/arm/include/asm/elf.h
+++ b/arch/arm/include/asm/elf.h
@@ -51,6 +51,7 @@ typedef struct user_fp elf_fpregset_t;
 #define R_ARM_NONE		0
 #define R_ARM_PC24		1
 #define R_ARM_ABS32		2
+#define R_ARM_REL32		3
 #define R_ARM_CALL		28
 #define R_ARM_JUMP24		29
 #define R_ARM_TARGET1		38
@@ -58,11 +59,15 @@ typedef struct user_fp elf_fpregset_t;
 #define R_ARM_PREL31		42
 #define R_ARM_MOVW_ABS_NC	43
 #define R_ARM_MOVT_ABS		44
+#define R_ARM_MOVW_PREL_NC	45
+#define R_ARM_MOVT_PREL		46
 
 #define R_ARM_THM_CALL		10
 #define R_ARM_THM_JUMP24	30
 #define R_ARM_THM_MOVW_ABS_NC	47
 #define R_ARM_THM_MOVT_ABS	48
+#define R_ARM_THM_MOVW_PREL_NC	49
+#define R_ARM_THM_MOVT_PREL	50
 
 /*
  * These are used to set parameters in the core dumps.
diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index e15444b25ca0..beac45e89ba6 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -185,14 +185,24 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
 			*(u32 *)loc |= offset & 0x7fffffff;
 			break;
 
+		case R_ARM_REL32:
+			*(u32 *)loc += sym->st_value - loc;
+			break;
+
 		case R_ARM_MOVW_ABS_NC:
 		case R_ARM_MOVT_ABS:
+		case R_ARM_MOVW_PREL_NC:
+		case R_ARM_MOVT_PREL:
 			offset = tmp = __mem_to_opcode_arm(*(u32 *)loc);
 			offset = ((offset & 0xf0000) >> 4) | (offset & 0xfff);
 			offset = (offset ^ 0x8000) - 0x8000;
 
 			offset += sym->st_value;
-			if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_ABS)
+			if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_PREL ||
+			    ELF32_R_TYPE(rel->r_info) == R_ARM_MOVW_PREL_NC)
+				offset -= loc;
+			if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_ABS ||
+			    ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_PREL)
 				offset >>= 16;
 
 			tmp &= 0xfff0f000;
@@ -283,6 +293,8 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
 
 		case R_ARM_THM_MOVW_ABS_NC:
 		case R_ARM_THM_MOVT_ABS:
+		case R_ARM_THM_MOVW_PREL_NC:
+		case R_ARM_THM_MOVT_PREL:
 			upper = __mem_to_opcode_thumb16(*(u16 *)loc);
 			lower = __mem_to_opcode_thumb16(*(u16 *)(loc + 2));
 
@@ -302,7 +314,11 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
 			offset = (offset ^ 0x8000) - 0x8000;
 			offset += sym->st_value;
 
-			if (ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVT_ABS)
+			if (ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVT_PREL ||
+			    ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVW_PREL_NC)
+				offset -= loc;
+			if (ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVT_ABS ||
+			    ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVT_PREL)
 				offset >>= 16;
 
 			upper = (u16)((upper & 0xfbf0) |
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 03/12] ARM: module: add support for place relative relocations
@ 2020-09-14  9:56   ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:56 UTC (permalink / raw)
  To: linux-efi
  Cc: Nicolas Pitre, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Russell King, Stefan Agner, Peter Smith, Will Deacon,
	Ard Biesheuvel, linux-arm-kernel

When using the new adr_l/ldr_l/str_l macros to refer to external symbols
from modules, the linker may emit place relative ELF relocations that
need to be fixed up by the module loader. So add support for these.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/include/asm/elf.h |  5 +++++
 arch/arm/kernel/module.c   | 20 ++++++++++++++++++--
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
index b078d992414b..0ac62a54b73c 100644
--- a/arch/arm/include/asm/elf.h
+++ b/arch/arm/include/asm/elf.h
@@ -51,6 +51,7 @@ typedef struct user_fp elf_fpregset_t;
 #define R_ARM_NONE		0
 #define R_ARM_PC24		1
 #define R_ARM_ABS32		2
+#define R_ARM_REL32		3
 #define R_ARM_CALL		28
 #define R_ARM_JUMP24		29
 #define R_ARM_TARGET1		38
@@ -58,11 +59,15 @@ typedef struct user_fp elf_fpregset_t;
 #define R_ARM_PREL31		42
 #define R_ARM_MOVW_ABS_NC	43
 #define R_ARM_MOVT_ABS		44
+#define R_ARM_MOVW_PREL_NC	45
+#define R_ARM_MOVT_PREL		46
 
 #define R_ARM_THM_CALL		10
 #define R_ARM_THM_JUMP24	30
 #define R_ARM_THM_MOVW_ABS_NC	47
 #define R_ARM_THM_MOVT_ABS	48
+#define R_ARM_THM_MOVW_PREL_NC	49
+#define R_ARM_THM_MOVT_PREL	50
 
 /*
  * These are used to set parameters in the core dumps.
diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index e15444b25ca0..beac45e89ba6 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -185,14 +185,24 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
 			*(u32 *)loc |= offset & 0x7fffffff;
 			break;
 
+		case R_ARM_REL32:
+			*(u32 *)loc += sym->st_value - loc;
+			break;
+
 		case R_ARM_MOVW_ABS_NC:
 		case R_ARM_MOVT_ABS:
+		case R_ARM_MOVW_PREL_NC:
+		case R_ARM_MOVT_PREL:
 			offset = tmp = __mem_to_opcode_arm(*(u32 *)loc);
 			offset = ((offset & 0xf0000) >> 4) | (offset & 0xfff);
 			offset = (offset ^ 0x8000) - 0x8000;
 
 			offset += sym->st_value;
-			if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_ABS)
+			if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_PREL ||
+			    ELF32_R_TYPE(rel->r_info) == R_ARM_MOVW_PREL_NC)
+				offset -= loc;
+			if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_ABS ||
+			    ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_PREL)
 				offset >>= 16;
 
 			tmp &= 0xfff0f000;
@@ -283,6 +293,8 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
 
 		case R_ARM_THM_MOVW_ABS_NC:
 		case R_ARM_THM_MOVT_ABS:
+		case R_ARM_THM_MOVW_PREL_NC:
+		case R_ARM_THM_MOVT_PREL:
 			upper = __mem_to_opcode_thumb16(*(u16 *)loc);
 			lower = __mem_to_opcode_thumb16(*(u16 *)(loc + 2));
 
@@ -302,7 +314,11 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
 			offset = (offset ^ 0x8000) - 0x8000;
 			offset += sym->st_value;
 
-			if (ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVT_ABS)
+			if (ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVT_PREL ||
+			    ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVW_PREL_NC)
+				offset -= loc;
+			if (ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVT_ABS ||
+			    ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVT_PREL)
 				offset >>= 16;
 
 			upper = (u16)((upper & 0xfbf0) |
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 04/12] ARM: head-common.S: use PC-relative insn sequence for __proc_info
  2020-09-14  9:56 ` Ard Biesheuvel
@ 2020-09-14  9:56   ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:56 UTC (permalink / raw)
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Russell King, Linus Walleij,
	Nicolas Pitre, Nick Desaulniers, Stefan Agner, Peter Smith,
	Marc Zyngier, Will Deacon

Replace the open coded PC relative offset calculations with a pair of
adr_l invocations. This removes some open coded arithmetic involving
virtual addresses, avoids literal pools on v7+, and slightly reduces
the footprint of the code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/head-common.S | 22 ++++++--------------
 1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/arch/arm/kernel/head-common.S b/arch/arm/kernel/head-common.S
index 4a3982812a40..9a5ab6c19568 100644
--- a/arch/arm/kernel/head-common.S
+++ b/arch/arm/kernel/head-common.S
@@ -170,11 +170,12 @@ ENDPROC(lookup_processor_type)
  *	r9 = cpuid (preserved)
  */
 __lookup_processor_type:
-	adr	r3, __lookup_processor_type_data
-	ldmia	r3, {r4 - r6}
-	sub	r3, r3, r4			@ get offset between virt&phys
-	add	r5, r5, r3			@ convert virt addresses to
-	add	r6, r6, r3			@ physical address space
+	/*
+	 * Look in <asm/procinfo.h> for information about the __proc_info
+	 * structure.
+	 */
+	adr_l	r5, __proc_info_begin
+	adr_l	r6, __proc_info_end
 1:	ldmia	r5, {r3, r4}			@ value, mask
 	and	r4, r4, r9			@ mask wanted bits
 	teq	r3, r4
@@ -186,17 +187,6 @@ __lookup_processor_type:
 2:	ret	lr
 ENDPROC(__lookup_processor_type)
 
-/*
- * Look in <asm/procinfo.h> for information about the __proc_info structure.
- */
-	.align	2
-	.type	__lookup_processor_type_data, %object
-__lookup_processor_type_data:
-	.long	.
-	.long	__proc_info_begin
-	.long	__proc_info_end
-	.size	__lookup_processor_type_data, . - __lookup_processor_type_data
-
 __error_lpae:
 #ifdef CONFIG_DEBUG_LL
 	adr	r0, str_lpae
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 04/12] ARM: head-common.S: use PC-relative insn sequence for __proc_info
@ 2020-09-14  9:56   ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:56 UTC (permalink / raw)
  To: linux-efi
  Cc: Nicolas Pitre, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Russell King, Stefan Agner, Peter Smith, Will Deacon,
	Ard Biesheuvel, linux-arm-kernel

Replace the open coded PC relative offset calculations with a pair of
adr_l invocations. This removes some open coded arithmetic involving
virtual addresses, avoids literal pools on v7+, and slightly reduces
the footprint of the code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/head-common.S | 22 ++++++--------------
 1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/arch/arm/kernel/head-common.S b/arch/arm/kernel/head-common.S
index 4a3982812a40..9a5ab6c19568 100644
--- a/arch/arm/kernel/head-common.S
+++ b/arch/arm/kernel/head-common.S
@@ -170,11 +170,12 @@ ENDPROC(lookup_processor_type)
  *	r9 = cpuid (preserved)
  */
 __lookup_processor_type:
-	adr	r3, __lookup_processor_type_data
-	ldmia	r3, {r4 - r6}
-	sub	r3, r3, r4			@ get offset between virt&phys
-	add	r5, r5, r3			@ convert virt addresses to
-	add	r6, r6, r3			@ physical address space
+	/*
+	 * Look in <asm/procinfo.h> for information about the __proc_info
+	 * structure.
+	 */
+	adr_l	r5, __proc_info_begin
+	adr_l	r6, __proc_info_end
 1:	ldmia	r5, {r3, r4}			@ value, mask
 	and	r4, r4, r9			@ mask wanted bits
 	teq	r3, r4
@@ -186,17 +187,6 @@ __lookup_processor_type:
 2:	ret	lr
 ENDPROC(__lookup_processor_type)
 
-/*
- * Look in <asm/procinfo.h> for information about the __proc_info structure.
- */
-	.align	2
-	.type	__lookup_processor_type_data, %object
-__lookup_processor_type_data:
-	.long	.
-	.long	__proc_info_begin
-	.long	__proc_info_end
-	.size	__lookup_processor_type_data, . - __lookup_processor_type_data
-
 __error_lpae:
 #ifdef CONFIG_DEBUG_LL
 	adr	r0, str_lpae
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 05/12] ARM: head-common.S: use PC-relative insn sequence for idmap creation
  2020-09-14  9:56 ` Ard Biesheuvel
@ 2020-09-14  9:56   ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:56 UTC (permalink / raw)
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Russell King, Linus Walleij,
	Nicolas Pitre, Nick Desaulniers, Stefan Agner, Peter Smith,
	Marc Zyngier, Will Deacon

Replace the open coded PC relative offset calculations involving
__turn_mmu_on and __turn_mmu_on_end with a pair of adr_l invocations.
This removes some open coded arithmetic involving virtual addresses,
avoids literal pools on v7+, and slightly reduces the footprint of the
code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/head.S | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index f8904227e7fd..7d8e2a296216 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -224,11 +224,8 @@ __create_page_tables:
 	 * Create identity mapping to cater for __enable_mmu.
 	 * This identity mapping will be removed by paging_init().
 	 */
-	adr	r0, __turn_mmu_on_loc
-	ldmia	r0, {r3, r5, r6}
-	sub	r0, r0, r3			@ virt->phys offset
-	add	r5, r5, r0			@ phys __turn_mmu_on
-	add	r6, r6, r0			@ phys __turn_mmu_on_end
+	adr_l	r5, __turn_mmu_on		@ _pa(__turn_mmu_on)
+	adr_l	r6, __turn_mmu_on_end		@ _pa(__turn_mmu_on_end)
 	mov	r5, r5, lsr #SECTION_SHIFT
 	mov	r6, r6, lsr #SECTION_SHIFT
 
@@ -351,11 +348,6 @@ __create_page_tables:
 	ret	lr
 ENDPROC(__create_page_tables)
 	.ltorg
-	.align
-__turn_mmu_on_loc:
-	.long	.
-	.long	__turn_mmu_on
-	.long	__turn_mmu_on_end
 
 #if defined(CONFIG_SMP)
 	.text
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 05/12] ARM: head-common.S: use PC-relative insn sequence for idmap creation
@ 2020-09-14  9:56   ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:56 UTC (permalink / raw)
  To: linux-efi
  Cc: Nicolas Pitre, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Russell King, Stefan Agner, Peter Smith, Will Deacon,
	Ard Biesheuvel, linux-arm-kernel

Replace the open coded PC relative offset calculations involving
__turn_mmu_on and __turn_mmu_on_end with a pair of adr_l invocations.
This removes some open coded arithmetic involving virtual addresses,
avoids literal pools on v7+, and slightly reduces the footprint of the
code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/head.S | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index f8904227e7fd..7d8e2a296216 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -224,11 +224,8 @@ __create_page_tables:
 	 * Create identity mapping to cater for __enable_mmu.
 	 * This identity mapping will be removed by paging_init().
 	 */
-	adr	r0, __turn_mmu_on_loc
-	ldmia	r0, {r3, r5, r6}
-	sub	r0, r0, r3			@ virt->phys offset
-	add	r5, r5, r0			@ phys __turn_mmu_on
-	add	r6, r6, r0			@ phys __turn_mmu_on_end
+	adr_l	r5, __turn_mmu_on		@ _pa(__turn_mmu_on)
+	adr_l	r6, __turn_mmu_on_end		@ _pa(__turn_mmu_on_end)
 	mov	r5, r5, lsr #SECTION_SHIFT
 	mov	r6, r6, lsr #SECTION_SHIFT
 
@@ -351,11 +348,6 @@ __create_page_tables:
 	ret	lr
 ENDPROC(__create_page_tables)
 	.ltorg
-	.align
-__turn_mmu_on_loc:
-	.long	.
-	.long	__turn_mmu_on
-	.long	__turn_mmu_on_end
 
 #if defined(CONFIG_SMP)
 	.text
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 06/12] ARM: head.S: use PC-relative insn sequence for secondary_data
  2020-09-14  9:56 ` Ard Biesheuvel
@ 2020-09-14  9:57   ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:57 UTC (permalink / raw)
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Russell King, Linus Walleij,
	Nicolas Pitre, Nick Desaulniers, Stefan Agner, Peter Smith,
	Marc Zyngier, Will Deacon

Replace the open coded PC relative offset calculations with adr_l
and ldr_l invocations. This removes some open coded arithmetic
involving virtual addresses, avoids literal pools on v7+, and slightly
reduces the footprint of the code.

Note that it also removes a stale comment about the contents of r6.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/head.S | 19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 7d8e2a296216..40407a4727e0 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -383,10 +383,8 @@ ENTRY(secondary_startup)
 	/*
 	 * Use the page tables supplied from  __cpu_up.
 	 */
-	adr	r4, __secondary_data
-	ldmia	r4, {r5, r7, r12}		@ address to jump to after
-	sub	lr, r4, r5			@ mmu has been enabled
-	add	r3, r7, lr
+	adr_l	r3, secondary_data
+	mov_l	r12, __secondary_switched
 	ldrd	r4, r5, [r3, #0]		@ get secondary_data.pgdir
 ARM_BE8(eor	r4, r4, r5)			@ Swap r5 and r4 in BE:
 ARM_BE8(eor	r5, r4, r5)			@ it can be done in 3 steps
@@ -401,22 +399,13 @@ ARM_BE8(eor	r4, r4, r5)			@ without using a temp reg.
 ENDPROC(secondary_startup)
 ENDPROC(secondary_startup_arm)
 
-	/*
-	 * r6  = &secondary_data
-	 */
 ENTRY(__secondary_switched)
-	ldr	sp, [r7, #12]			@ get secondary_data.stack
+	ldr_l	r7, secondary_data + 12		@ get secondary_data.stack
+	mov	sp, r7
 	mov	fp, #0
 	b	secondary_start_kernel
 ENDPROC(__secondary_switched)
 
-	.align
-
-	.type	__secondary_data, %object
-__secondary_data:
-	.long	.
-	.long	secondary_data
-	.long	__secondary_switched
 #endif /* defined(CONFIG_SMP) */
 
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 06/12] ARM: head.S: use PC-relative insn sequence for secondary_data
@ 2020-09-14  9:57   ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:57 UTC (permalink / raw)
  To: linux-efi
  Cc: Nicolas Pitre, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Russell King, Stefan Agner, Peter Smith, Will Deacon,
	Ard Biesheuvel, linux-arm-kernel

Replace the open coded PC relative offset calculations with adr_l
and ldr_l invocations. This removes some open coded arithmetic
involving virtual addresses, avoids literal pools on v7+, and slightly
reduces the footprint of the code.

Note that it also removes a stale comment about the contents of r6.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/head.S | 19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 7d8e2a296216..40407a4727e0 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -383,10 +383,8 @@ ENTRY(secondary_startup)
 	/*
 	 * Use the page tables supplied from  __cpu_up.
 	 */
-	adr	r4, __secondary_data
-	ldmia	r4, {r5, r7, r12}		@ address to jump to after
-	sub	lr, r4, r5			@ mmu has been enabled
-	add	r3, r7, lr
+	adr_l	r3, secondary_data
+	mov_l	r12, __secondary_switched
 	ldrd	r4, r5, [r3, #0]		@ get secondary_data.pgdir
 ARM_BE8(eor	r4, r4, r5)			@ Swap r5 and r4 in BE:
 ARM_BE8(eor	r5, r4, r5)			@ it can be done in 3 steps
@@ -401,22 +399,13 @@ ARM_BE8(eor	r4, r4, r5)			@ without using a temp reg.
 ENDPROC(secondary_startup)
 ENDPROC(secondary_startup_arm)
 
-	/*
-	 * r6  = &secondary_data
-	 */
 ENTRY(__secondary_switched)
-	ldr	sp, [r7, #12]			@ get secondary_data.stack
+	ldr_l	r7, secondary_data + 12		@ get secondary_data.stack
+	mov	sp, r7
 	mov	fp, #0
 	b	secondary_start_kernel
 ENDPROC(__secondary_switched)
 
-	.align
-
-	.type	__secondary_data, %object
-__secondary_data:
-	.long	.
-	.long	secondary_data
-	.long	__secondary_switched
 #endif /* defined(CONFIG_SMP) */
 
 
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 07/12] ARM: kernel: use relative references for UP/SMP alternatives
  2020-09-14  9:56 ` Ard Biesheuvel
@ 2020-09-14  9:57   ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:57 UTC (permalink / raw)
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Russell King, Linus Walleij,
	Nicolas Pitre, Nick Desaulniers, Stefan Agner, Peter Smith,
	Marc Zyngier, Will Deacon

Currently, the .alt.smp.init section contains the virtual addresses
of the patch sites. Since patching may occur both before and after
switching into virtual mode, this requires some manual handling of
the address when applying the UP alternative.

Let's simplify this by using relative offsets in the table entries:
this allows us to simply add each entry's address to its contents,
regardless of whether we are running in virtual mode or not.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/include/asm/assembler.h |  4 ++--
 arch/arm/include/asm/processor.h |  2 +-
 arch/arm/kernel/head.S           | 10 +++++-----
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 39e972eaaa3f..bf7501dfcb71 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -259,7 +259,7 @@
  */
 #define ALT_UP(instr...)					\
 	.pushsection ".alt.smp.init", "a"			;\
-	.long	9998b						;\
+	.long	9998b - .					;\
 9997:	instr							;\
 	.if . - 9997b == 2					;\
 		nop						;\
@@ -270,7 +270,7 @@
 	.popsection
 #define ALT_UP_B(label)					\
 	.pushsection ".alt.smp.init", "a"			;\
-	.long	9998b						;\
+	.long	9998b - .					;\
 	W(b)	. + (label - 9998b)					;\
 	.popsection
 #else
diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h
index b9241051e5cb..9e6b97286307 100644
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -96,7 +96,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define __ALT_SMP_ASM(smp, up)						\
 	"9998:	" smp "\n"						\
 	"	.pushsection \".alt.smp.init\", \"a\"\n"		\
-	"	.long	9998b\n"					\
+	"	.long	9998b - .\n"					\
 	"	" up "\n"						\
 	"	.popsection\n"
 #else
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 40407a4727e0..3199d29f4480 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -546,14 +546,15 @@ smp_on_up:
 __do_fixup_smp_on_up:
 	cmp	r4, r5
 	reths	lr
-	ldmia	r4!, {r0, r6}
- ARM(	str	r6, [r0, r3]	)
- THUMB(	add	r0, r0, r3	)
+	ldmia	r4, {r0, r6}
+ ARM(	str	r6, [r0, r4]	)
+ THUMB(	add	r0, r0, r4	)
+	add	r4, r4, #8
 #ifdef __ARMEB__
  THUMB(	mov	r6, r6, ror #16	)	@ Convert word order for big-endian.
 #endif
  THUMB(	strh	r6, [r0], #2	)	@ For Thumb-2, store as two halfwords
- THUMB(	mov	r6, r6, lsr #16	)	@ to be robust against misaligned r3.
+ THUMB(	mov	r6, r6, lsr #16	)	@ to be robust against misaligned r0.
  THUMB(	strh	r6, [r0]	)
 	b	__do_fixup_smp_on_up
 ENDPROC(__do_fixup_smp_on_up)
@@ -562,7 +563,6 @@ ENTRY(fixup_smp)
 	stmfd	sp!, {r4 - r6, lr}
 	mov	r4, r0
 	add	r5, r0, r1
-	mov	r3, #0
 	bl	__do_fixup_smp_on_up
 	ldmfd	sp!, {r4 - r6, pc}
 ENDPROC(fixup_smp)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 07/12] ARM: kernel: use relative references for UP/SMP alternatives
@ 2020-09-14  9:57   ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:57 UTC (permalink / raw)
  To: linux-efi
  Cc: Nicolas Pitre, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Russell King, Stefan Agner, Peter Smith, Will Deacon,
	Ard Biesheuvel, linux-arm-kernel

Currently, the .alt.smp.init section contains the virtual addresses
of the patch sites. Since patching may occur both before and after
switching into virtual mode, this requires some manual handling of
the address when applying the UP alternative.

Let's simplify this by using relative offsets in the table entries:
this allows us to simply add each entry's address to its contents,
regardless of whether we are running in virtual mode or not.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/include/asm/assembler.h |  4 ++--
 arch/arm/include/asm/processor.h |  2 +-
 arch/arm/kernel/head.S           | 10 +++++-----
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 39e972eaaa3f..bf7501dfcb71 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -259,7 +259,7 @@
  */
 #define ALT_UP(instr...)					\
 	.pushsection ".alt.smp.init", "a"			;\
-	.long	9998b						;\
+	.long	9998b - .					;\
 9997:	instr							;\
 	.if . - 9997b == 2					;\
 		nop						;\
@@ -270,7 +270,7 @@
 	.popsection
 #define ALT_UP_B(label)					\
 	.pushsection ".alt.smp.init", "a"			;\
-	.long	9998b						;\
+	.long	9998b - .					;\
 	W(b)	. + (label - 9998b)					;\
 	.popsection
 #else
diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h
index b9241051e5cb..9e6b97286307 100644
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -96,7 +96,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define __ALT_SMP_ASM(smp, up)						\
 	"9998:	" smp "\n"						\
 	"	.pushsection \".alt.smp.init\", \"a\"\n"		\
-	"	.long	9998b\n"					\
+	"	.long	9998b - .\n"					\
 	"	" up "\n"						\
 	"	.popsection\n"
 #else
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 40407a4727e0..3199d29f4480 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -546,14 +546,15 @@ smp_on_up:
 __do_fixup_smp_on_up:
 	cmp	r4, r5
 	reths	lr
-	ldmia	r4!, {r0, r6}
- ARM(	str	r6, [r0, r3]	)
- THUMB(	add	r0, r0, r3	)
+	ldmia	r4, {r0, r6}
+ ARM(	str	r6, [r0, r4]	)
+ THUMB(	add	r0, r0, r4	)
+	add	r4, r4, #8
 #ifdef __ARMEB__
  THUMB(	mov	r6, r6, ror #16	)	@ Convert word order for big-endian.
 #endif
  THUMB(	strh	r6, [r0], #2	)	@ For Thumb-2, store as two halfwords
- THUMB(	mov	r6, r6, lsr #16	)	@ to be robust against misaligned r3.
+ THUMB(	mov	r6, r6, lsr #16	)	@ to be robust against misaligned r0.
  THUMB(	strh	r6, [r0]	)
 	b	__do_fixup_smp_on_up
 ENDPROC(__do_fixup_smp_on_up)
@@ -562,7 +563,6 @@ ENTRY(fixup_smp)
 	stmfd	sp!, {r4 - r6, lr}
 	mov	r4, r0
 	add	r5, r0, r1
-	mov	r3, #0
 	bl	__do_fixup_smp_on_up
 	ldmfd	sp!, {r4 - r6, pc}
 ENDPROC(fixup_smp)
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 08/12] ARM: head: use PC-relative insn sequence for __smp_alt
  2020-09-14  9:56 ` Ard Biesheuvel
@ 2020-09-14  9:57   ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:57 UTC (permalink / raw)
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Russell King, Linus Walleij,
	Nicolas Pitre, Nick Desaulniers, Stefan Agner, Peter Smith,
	Marc Zyngier, Will Deacon

Now that calling __do_fixup_smp_on_up() can be done without passing
the physical-to-virtual offset in r3, we can replace the open coded
PC relative offset calculations with a pair of adr_l invocations. This
removes some open coded arithmetic involving virtual addresses, avoids
literal pools on v7+, and slightly reduces the footprint of the code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/head.S | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 3199d29f4480..5f6436a40db1 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -520,19 +520,11 @@ ARM_BE8(rev	r0, r0)			@ byteswap if big endian
 	retne	lr
 
 __fixup_smp_on_up:
-	adr	r0, 1f
-	ldmia	r0, {r3 - r5}
-	sub	r3, r0, r3
-	add	r4, r4, r3
-	add	r5, r5, r3
+	adr_l	r4, __smpalt_begin
+	adr_l	r5, __smpalt_end
 	b	__do_fixup_smp_on_up
 ENDPROC(__fixup_smp)
 
-	.align
-1:	.word	.
-	.word	__smpalt_begin
-	.word	__smpalt_end
-
 	.pushsection .data
 	.align	2
 	.globl	smp_on_up
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 08/12] ARM: head: use PC-relative insn sequence for __smp_alt
@ 2020-09-14  9:57   ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:57 UTC (permalink / raw)
  To: linux-efi
  Cc: Nicolas Pitre, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Russell King, Stefan Agner, Peter Smith, Will Deacon,
	Ard Biesheuvel, linux-arm-kernel

Now that calling __do_fixup_smp_on_up() can be done without passing
the physical-to-virtual offset in r3, we can replace the open coded
PC relative offset calculations with a pair of adr_l invocations. This
removes some open coded arithmetic involving virtual addresses, avoids
literal pools on v7+, and slightly reduces the footprint of the code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/head.S | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 3199d29f4480..5f6436a40db1 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -520,19 +520,11 @@ ARM_BE8(rev	r0, r0)			@ byteswap if big endian
 	retne	lr
 
 __fixup_smp_on_up:
-	adr	r0, 1f
-	ldmia	r0, {r3 - r5}
-	sub	r3, r0, r3
-	add	r4, r4, r3
-	add	r5, r5, r3
+	adr_l	r4, __smpalt_begin
+	adr_l	r5, __smpalt_end
 	b	__do_fixup_smp_on_up
 ENDPROC(__fixup_smp)
 
-	.align
-1:	.word	.
-	.word	__smpalt_begin
-	.word	__smpalt_end
-
 	.pushsection .data
 	.align	2
 	.globl	smp_on_up
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 09/12] ARM: sleep.S: use PC-relative insn sequence for sleep_save_sp/mpidr_hash
  2020-09-14  9:56 ` Ard Biesheuvel
@ 2020-09-14  9:57   ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:57 UTC (permalink / raw)
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Russell King, Linus Walleij,
	Nicolas Pitre, Nick Desaulniers, Stefan Agner, Peter Smith,
	Marc Zyngier, Will Deacon

Replace the open coded PC relative offset calculations with adr_l and
ldr_l invocations. This removes some open coded PC relative arithmetic,
avoids literal pools on v7+, and slightly reduces the footprint of the
code. Note that ALT_SMP() expects a single instruction so move the macro
invocation after it.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/sleep.S | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/arch/arm/kernel/sleep.S b/arch/arm/kernel/sleep.S
index 5dc8b80bb693..43077e11dafd 100644
--- a/arch/arm/kernel/sleep.S
+++ b/arch/arm/kernel/sleep.S
@@ -72,8 +72,9 @@ ENTRY(__cpu_suspend)
 	ldr	r3, =sleep_save_sp
 	stmfd	sp!, {r0, r1}		@ save suspend func arg and pointer
 	ldr	r3, [r3, #SLEEP_SAVE_SP_VIRT]
-	ALT_SMP(ldr r0, =mpidr_hash)
+	ALT_SMP(W(nop))			@ don't use adr_l inside ALT_SMP()
 	ALT_UP_B(1f)
+	adr_l	r0, mpidr_hash
 	/* This ldmia relies on the memory layout of the mpidr_hash struct */
 	ldmia	r0, {r1, r6-r8}	@ r1 = mpidr mask (r6,r7,r8) = l[0,1,2] shifts
 	compute_mpidr_hash	r0, r6, r7, r8, r2, r1
@@ -147,9 +148,8 @@ no_hyp:
 	mov	r1, #0
 	ALT_SMP(mrc p15, 0, r0, c0, c0, 5)
 	ALT_UP_B(1f)
-	adr	r2, mpidr_hash_ptr
-	ldr	r3, [r2]
-	add	r2, r2, r3		@ r2 = struct mpidr_hash phys address
+	adr_l	r2, mpidr_hash		@ r2 = struct mpidr_hash phys address
+
 	/*
 	 * This ldmia relies on the memory layout of the mpidr_hash
 	 * struct mpidr_hash.
@@ -157,10 +157,7 @@ no_hyp:
 	ldmia	r2, { r3-r6 }	@ r3 = mpidr mask (r4,r5,r6) = l[0,1,2] shifts
 	compute_mpidr_hash	r1, r4, r5, r6, r0, r3
 1:
-	adr	r0, _sleep_save_sp
-	ldr	r2, [r0]
-	add	r0, r0, r2
-	ldr	r0, [r0, #SLEEP_SAVE_SP_PHYS]
+	ldr_l	r0, sleep_save_sp + SLEEP_SAVE_SP_PHYS
 	ldr	r0, [r0, r1, lsl #2]
 
 	@ load phys pgd, stack, resume fn
@@ -177,12 +174,6 @@ ENDPROC(cpu_resume_arm)
 ENDPROC(cpu_resume_no_hyp)
 #endif
 
-	.align 2
-_sleep_save_sp:
-	.long	sleep_save_sp - .
-mpidr_hash_ptr:
-	.long	mpidr_hash - .			@ mpidr_hash struct offset
-
 	.data
 	.align	2
 	.type	sleep_save_sp, #object
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 09/12] ARM: sleep.S: use PC-relative insn sequence for sleep_save_sp/mpidr_hash
@ 2020-09-14  9:57   ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:57 UTC (permalink / raw)
  To: linux-efi
  Cc: Nicolas Pitre, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Russell King, Stefan Agner, Peter Smith, Will Deacon,
	Ard Biesheuvel, linux-arm-kernel

Replace the open coded PC relative offset calculations with adr_l and
ldr_l invocations. This removes some open coded PC relative arithmetic,
avoids literal pools on v7+, and slightly reduces the footprint of the
code. Note that ALT_SMP() expects a single instruction so move the macro
invocation after it.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/sleep.S | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/arch/arm/kernel/sleep.S b/arch/arm/kernel/sleep.S
index 5dc8b80bb693..43077e11dafd 100644
--- a/arch/arm/kernel/sleep.S
+++ b/arch/arm/kernel/sleep.S
@@ -72,8 +72,9 @@ ENTRY(__cpu_suspend)
 	ldr	r3, =sleep_save_sp
 	stmfd	sp!, {r0, r1}		@ save suspend func arg and pointer
 	ldr	r3, [r3, #SLEEP_SAVE_SP_VIRT]
-	ALT_SMP(ldr r0, =mpidr_hash)
+	ALT_SMP(W(nop))			@ don't use adr_l inside ALT_SMP()
 	ALT_UP_B(1f)
+	adr_l	r0, mpidr_hash
 	/* This ldmia relies on the memory layout of the mpidr_hash struct */
 	ldmia	r0, {r1, r6-r8}	@ r1 = mpidr mask (r6,r7,r8) = l[0,1,2] shifts
 	compute_mpidr_hash	r0, r6, r7, r8, r2, r1
@@ -147,9 +148,8 @@ no_hyp:
 	mov	r1, #0
 	ALT_SMP(mrc p15, 0, r0, c0, c0, 5)
 	ALT_UP_B(1f)
-	adr	r2, mpidr_hash_ptr
-	ldr	r3, [r2]
-	add	r2, r2, r3		@ r2 = struct mpidr_hash phys address
+	adr_l	r2, mpidr_hash		@ r2 = struct mpidr_hash phys address
+
 	/*
 	 * This ldmia relies on the memory layout of the mpidr_hash
 	 * struct mpidr_hash.
@@ -157,10 +157,7 @@ no_hyp:
 	ldmia	r2, { r3-r6 }	@ r3 = mpidr mask (r4,r5,r6) = l[0,1,2] shifts
 	compute_mpidr_hash	r1, r4, r5, r6, r0, r3
 1:
-	adr	r0, _sleep_save_sp
-	ldr	r2, [r0]
-	add	r0, r0, r2
-	ldr	r0, [r0, #SLEEP_SAVE_SP_PHYS]
+	ldr_l	r0, sleep_save_sp + SLEEP_SAVE_SP_PHYS
 	ldr	r0, [r0, r1, lsl #2]
 
 	@ load phys pgd, stack, resume fn
@@ -177,12 +174,6 @@ ENDPROC(cpu_resume_arm)
 ENDPROC(cpu_resume_no_hyp)
 #endif
 
-	.align 2
-_sleep_save_sp:
-	.long	sleep_save_sp - .
-mpidr_hash_ptr:
-	.long	mpidr_hash - .			@ mpidr_hash struct offset
-
 	.data
 	.align	2
 	.type	sleep_save_sp, #object
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 10/12] ARM: head.S: use PC-relative insn sequences for __fixup_pv_table
  2020-09-14  9:56 ` Ard Biesheuvel
@ 2020-09-14  9:57   ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:57 UTC (permalink / raw)
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Russell King, Linus Walleij,
	Nicolas Pitre, Nick Desaulniers, Stefan Agner, Peter Smith,
	Marc Zyngier, Will Deacon

Replace the open coded PC relative offset calculations with adr_l
and mov_l invocations. This removes some open coded arithmetic involving
virtual addresses and avoids literal pools on v7+. Note that the footprint
of the code increases slightly, but the resulting code is a bit easier to
follow.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/head.S | 27 ++++++--------------
 1 file changed, 8 insertions(+), 19 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 5f6436a40db1..6f334df5d3b9 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -576,14 +576,11 @@ ENDPROC(fixup_smp)
  */
 	__HEAD
 __fixup_pv_table:
-	adr	r0, 1f
-	ldmia	r0, {r3-r7}
+	adr_l	r6, __pv_phys_pfn_offset
+	adr_l	r7, __pv_offset			@ __pa(__pv_offset)
+	mov_l	r3, __pv_offset			@ __va(__pv_offset)
 	mvn	ip, #0
-	subs	r3, r0, r3	@ PHYS_OFFSET - PAGE_OFFSET
-	add	r4, r4, r3	@ adjust table start address
-	add	r5, r5, r3	@ adjust table end address
-	add	r6, r6, r3	@ adjust __pv_phys_pfn_offset address
-	add	r7, r7, r3	@ adjust __pv_offset address
+	subs	r3, r7, r3	@ PHYS_OFFSET - PAGE_OFFSET
 	mov	r0, r8, lsr #PAGE_SHIFT	@ convert to PFN
 	str	r0, [r6]	@ save computed PHYS_OFFSET to __pv_phys_pfn_offset
 	strcc	ip, [r7, #HIGH_OFFSET]	@ save to __pv_offset high bits
@@ -592,20 +589,15 @@ __fixup_pv_table:
 THUMB(	it	ne		@ cross section branch )
 	bne	__error
 	str	r3, [r7, #LOW_OFFSET]	@ save to __pv_offset low bits
+	adr_l	r4, __pv_table_begin
+	adr_l	r5, __pv_table_end
 	b	__fixup_a_pv_table
 ENDPROC(__fixup_pv_table)
-
-	.align
-1:	.long	.
-	.long	__pv_table_begin
-	.long	__pv_table_end
-2:	.long	__pv_phys_pfn_offset
-	.long	__pv_offset
+	.ltorg
 
 	.text
 __fixup_a_pv_table:
-	adr	r0, 3f
-	ldr	r6, [r0]
+	mov_l	r6, __pv_offset
 	add	r6, r6, r3
 	ldr	r0, [r6, #HIGH_OFFSET]	@ pv_offset high word
 	ldr	r6, [r6, #LOW_OFFSET]	@ pv_offset low word
@@ -674,9 +666,6 @@ ARM_BE8(rev16	ip, ip)
 #endif
 ENDPROC(__fixup_a_pv_table)
 
-	.align
-3:	.long __pv_offset
-
 ENTRY(fixup_pv_table)
 	stmfd	sp!, {r4 - r7, lr}
 	mov	r3, #0			@ no offset
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 10/12] ARM: head.S: use PC-relative insn sequences for __fixup_pv_table
@ 2020-09-14  9:57   ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:57 UTC (permalink / raw)
  To: linux-efi
  Cc: Nicolas Pitre, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Russell King, Stefan Agner, Peter Smith, Will Deacon,
	Ard Biesheuvel, linux-arm-kernel

Replace the open coded PC relative offset calculations with adr_l
and mov_l invocations. This removes some open coded arithmetic involving
virtual addresses and avoids literal pools on v7+. Note that the footprint
of the code increases slightly, but the resulting code is a bit easier to
follow.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/head.S | 27 ++++++--------------
 1 file changed, 8 insertions(+), 19 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 5f6436a40db1..6f334df5d3b9 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -576,14 +576,11 @@ ENDPROC(fixup_smp)
  */
 	__HEAD
 __fixup_pv_table:
-	adr	r0, 1f
-	ldmia	r0, {r3-r7}
+	adr_l	r6, __pv_phys_pfn_offset
+	adr_l	r7, __pv_offset			@ __pa(__pv_offset)
+	mov_l	r3, __pv_offset			@ __va(__pv_offset)
 	mvn	ip, #0
-	subs	r3, r0, r3	@ PHYS_OFFSET - PAGE_OFFSET
-	add	r4, r4, r3	@ adjust table start address
-	add	r5, r5, r3	@ adjust table end address
-	add	r6, r6, r3	@ adjust __pv_phys_pfn_offset address
-	add	r7, r7, r3	@ adjust __pv_offset address
+	subs	r3, r7, r3	@ PHYS_OFFSET - PAGE_OFFSET
 	mov	r0, r8, lsr #PAGE_SHIFT	@ convert to PFN
 	str	r0, [r6]	@ save computed PHYS_OFFSET to __pv_phys_pfn_offset
 	strcc	ip, [r7, #HIGH_OFFSET]	@ save to __pv_offset high bits
@@ -592,20 +589,15 @@ __fixup_pv_table:
 THUMB(	it	ne		@ cross section branch )
 	bne	__error
 	str	r3, [r7, #LOW_OFFSET]	@ save to __pv_offset low bits
+	adr_l	r4, __pv_table_begin
+	adr_l	r5, __pv_table_end
 	b	__fixup_a_pv_table
 ENDPROC(__fixup_pv_table)
-
-	.align
-1:	.long	.
-	.long	__pv_table_begin
-	.long	__pv_table_end
-2:	.long	__pv_phys_pfn_offset
-	.long	__pv_offset
+	.ltorg
 
 	.text
 __fixup_a_pv_table:
-	adr	r0, 3f
-	ldr	r6, [r0]
+	mov_l	r6, __pv_offset
 	add	r6, r6, r3
 	ldr	r0, [r6, #HIGH_OFFSET]	@ pv_offset high word
 	ldr	r6, [r6, #LOW_OFFSET]	@ pv_offset low word
@@ -674,9 +666,6 @@ ARM_BE8(rev16	ip, ip)
 #endif
 ENDPROC(__fixup_a_pv_table)
 
-	.align
-3:	.long __pv_offset
-
 ENTRY(fixup_pv_table)
 	stmfd	sp!, {r4 - r7, lr}
 	mov	r3, #0			@ no offset
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 11/12] ARM: head.S: use PC relative insn sequence to calculate PHYS_OFFSET
  2020-09-14  9:56 ` Ard Biesheuvel
@ 2020-09-14  9:57   ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:57 UTC (permalink / raw)
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Russell King, Linus Walleij,
	Nicolas Pitre, Nick Desaulniers, Stefan Agner, Peter Smith,
	Marc Zyngier, Will Deacon

Replace the open coded arithmetic with a simple adr_l/sub pair. This
removes some open coded arithmetic involving virtual addresses, avoids
literal pools on v7+, and slightly reduces the footprint of the code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/head.S | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 6f334df5d3b9..2e6127b2662e 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -103,10 +103,8 @@ ENTRY(stext)
 #endif
 
 #ifndef CONFIG_XIP_KERNEL
-	adr	r3, 2f
-	ldmia	r3, {r4, r8}
-	sub	r4, r3, r4			@ (PHYS_OFFSET - PAGE_OFFSET)
-	add	r8, r8, r4			@ PHYS_OFFSET
+	adr_l	r8, _text			@ __pa(_text)
+	sub	r8, r8, #TEXT_OFFSET		@ PHYS_OFFSET
 #else
 	ldr	r8, =PLAT_PHYS_OFFSET		@ always constant in this case
 #endif
@@ -158,10 +156,6 @@ ENTRY(stext)
 1:	b	__enable_mmu
 ENDPROC(stext)
 	.ltorg
-#ifndef CONFIG_XIP_KERNEL
-2:	.long	.
-	.long	PAGE_OFFSET
-#endif
 
 /*
  * Setup the initial page tables.  We only setup the barest
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 11/12] ARM: head.S: use PC relative insn sequence to calculate PHYS_OFFSET
@ 2020-09-14  9:57   ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:57 UTC (permalink / raw)
  To: linux-efi
  Cc: Nicolas Pitre, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Russell King, Stefan Agner, Peter Smith, Will Deacon,
	Ard Biesheuvel, linux-arm-kernel

Replace the open coded arithmetic with a simple adr_l/sub pair. This
removes some open coded arithmetic involving virtual addresses, avoids
literal pools on v7+, and slightly reduces the footprint of the code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/head.S | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 6f334df5d3b9..2e6127b2662e 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -103,10 +103,8 @@ ENTRY(stext)
 #endif
 
 #ifndef CONFIG_XIP_KERNEL
-	adr	r3, 2f
-	ldmia	r3, {r4, r8}
-	sub	r4, r3, r4			@ (PHYS_OFFSET - PAGE_OFFSET)
-	add	r8, r8, r4			@ PHYS_OFFSET
+	adr_l	r8, _text			@ __pa(_text)
+	sub	r8, r8, #TEXT_OFFSET		@ PHYS_OFFSET
 #else
 	ldr	r8, =PLAT_PHYS_OFFSET		@ always constant in this case
 #endif
@@ -158,10 +156,6 @@ ENTRY(stext)
 1:	b	__enable_mmu
 ENDPROC(stext)
 	.ltorg
-#ifndef CONFIG_XIP_KERNEL
-2:	.long	.
-	.long	PAGE_OFFSET
-#endif
 
 /*
  * Setup the initial page tables.  We only setup the barest
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 12/12] ARM: kvm: replace open coded VA->PA calculations with adr_l call
  2020-09-14  9:56 ` Ard Biesheuvel
@ 2020-09-14  9:57   ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:57 UTC (permalink / raw)
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Russell King, Linus Walleij,
	Nicolas Pitre, Nick Desaulniers, Stefan Agner, Peter Smith,
	Marc Zyngier, Will Deacon

Replace the open coded calculations of the actual physical address
of the KVM stub vector table with a single adr_l invocation.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/boot/compressed/head.S | 15 ++---------
 arch/arm/kernel/hyp-stub.S      | 27 +++++++++-----------
 2 files changed, 14 insertions(+), 28 deletions(-)

diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
index 6a430d1f4d31..ab800f7e0dc1 100644
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -472,15 +472,10 @@ dtb_check_done:
 
 		/*
 		 * Compute the address of the hyp vectors after relocation.
-		 * This requires some arithmetic since we cannot directly
-		 * reference __hyp_stub_vectors in a PC-relative way.
 		 * Call __hyp_set_vectors with the new address so that we
 		 * can HVC again after the copy.
 		 */
-0:		adr	r0, 0b
-		movw	r1, #:lower16:__hyp_stub_vectors - 0b
-		movt	r1, #:upper16:__hyp_stub_vectors - 0b
-		add	r0, r0, r1
+		adr_l	r0, __hyp_stub_vectors
 		sub	r0, r0, r5
 		add	r0, r0, r10
 		bl	__hyp_set_vectors
@@ -631,17 +626,11 @@ not_relocated:	mov	r0, #0
 		cmp	r0, #HYP_MODE		@ if not booted in HYP mode...
 		bne	__enter_kernel		@ boot kernel directly
 
-		adr	r12, .L__hyp_reentry_vectors_offset
-		ldr	r0, [r12]
-		add	r0, r0, r12
-
+		adr_l	r0, __hyp_reentry_vectors
 		bl	__hyp_set_vectors
 		__HVC(0)			@ otherwise bounce to hyp mode
 
 		b	.			@ should never be reached
-
-		.align	2
-.L__hyp_reentry_vectors_offset:	.long	__hyp_reentry_vectors - .
 #else
 		b	__enter_kernel
 #endif
diff --git a/arch/arm/kernel/hyp-stub.S b/arch/arm/kernel/hyp-stub.S
index 26d8e03b1dd3..103d0bdb2b7e 100644
--- a/arch/arm/kernel/hyp-stub.S
+++ b/arch/arm/kernel/hyp-stub.S
@@ -24,41 +24,38 @@ ENTRY(__boot_cpu_mode)
 .text
 
 	/*
-	 * Save the primary CPU boot mode. Requires 3 scratch registers.
+	 * Save the primary CPU boot mode. Requires 2 scratch registers.
 	 */
-	.macro	store_primary_cpu_mode	reg1, reg2, reg3
+	.macro	store_primary_cpu_mode	reg1, reg2
 	mrs	\reg1, cpsr
 	and	\reg1, \reg1, #MODE_MASK
-	adr	\reg2, .L__boot_cpu_mode_offset
-	ldr	\reg3, [\reg2]
-	str	\reg1, [\reg2, \reg3]
+	str_l	\reg1, __boot_cpu_mode, \reg2
 	.endm
 
 	/*
 	 * Compare the current mode with the one saved on the primary CPU.
 	 * If they don't match, record that fact. The Z bit indicates
 	 * if there's a match or not.
-	 * Requires 3 additionnal scratch registers.
+	 * Requires 2 additional scratch registers.
 	 */
-	.macro	compare_cpu_mode_with_primary mode, reg1, reg2, reg3
-	adr	\reg2, .L__boot_cpu_mode_offset
-	ldr	\reg3, [\reg2]
-	ldr	\reg1, [\reg2, \reg3]
+	.macro	compare_cpu_mode_with_primary mode, reg1, reg2
+	adr_l	\reg2, __boot_cpu_mode
+	ldr	\reg1, [\reg2]
 	cmp	\mode, \reg1		@ matches primary CPU boot mode?
 	orrne	\reg1, \reg1, #BOOT_CPU_MODE_MISMATCH
-	strne	\reg1, [\reg2, \reg3]	@ record what happened and give up
+	strne	\reg1, [\reg2]		@ record what happened and give up
 	.endm
 
 #else	/* ZIMAGE */
 
-	.macro	store_primary_cpu_mode	reg1:req, reg2:req, reg3:req
+	.macro	store_primary_cpu_mode	reg1:req, reg2:req
 	.endm
 
 /*
  * The zImage loader only runs on one CPU, so we don't bother with mult-CPU
  * consistency checking:
  */
-	.macro	compare_cpu_mode_with_primary mode, reg1, reg2, reg3
+	.macro	compare_cpu_mode_with_primary mode, reg1, reg2
 	cmp	\mode, \mode
 	.endm
 
@@ -73,7 +70,7 @@ ENTRY(__boot_cpu_mode)
  */
 @ Call this from the primary CPU
 ENTRY(__hyp_stub_install)
-	store_primary_cpu_mode	r4, r5, r6
+	store_primary_cpu_mode	r4, r5
 ENDPROC(__hyp_stub_install)
 
 	@ fall through...
@@ -87,7 +84,7 @@ ENTRY(__hyp_stub_install_secondary)
 	 * If the secondary has booted with a different mode, give up
 	 * immediately.
 	 */
-	compare_cpu_mode_with_primary	r4, r5, r6, r7
+	compare_cpu_mode_with_primary	r4, r5, r6
 	retne	lr
 
 	/*
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 12/12] ARM: kvm: replace open coded VA->PA calculations with adr_l call
@ 2020-09-14  9:57   ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-14  9:57 UTC (permalink / raw)
  To: linux-efi
  Cc: Nicolas Pitre, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Russell King, Stefan Agner, Peter Smith, Will Deacon,
	Ard Biesheuvel, linux-arm-kernel

Replace the open coded calculations of the actual physical address
of the KVM stub vector table with a single adr_l invocation.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/boot/compressed/head.S | 15 ++---------
 arch/arm/kernel/hyp-stub.S      | 27 +++++++++-----------
 2 files changed, 14 insertions(+), 28 deletions(-)

diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
index 6a430d1f4d31..ab800f7e0dc1 100644
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -472,15 +472,10 @@ dtb_check_done:
 
 		/*
 		 * Compute the address of the hyp vectors after relocation.
-		 * This requires some arithmetic since we cannot directly
-		 * reference __hyp_stub_vectors in a PC-relative way.
 		 * Call __hyp_set_vectors with the new address so that we
 		 * can HVC again after the copy.
 		 */
-0:		adr	r0, 0b
-		movw	r1, #:lower16:__hyp_stub_vectors - 0b
-		movt	r1, #:upper16:__hyp_stub_vectors - 0b
-		add	r0, r0, r1
+		adr_l	r0, __hyp_stub_vectors
 		sub	r0, r0, r5
 		add	r0, r0, r10
 		bl	__hyp_set_vectors
@@ -631,17 +626,11 @@ not_relocated:	mov	r0, #0
 		cmp	r0, #HYP_MODE		@ if not booted in HYP mode...
 		bne	__enter_kernel		@ boot kernel directly
 
-		adr	r12, .L__hyp_reentry_vectors_offset
-		ldr	r0, [r12]
-		add	r0, r0, r12
-
+		adr_l	r0, __hyp_reentry_vectors
 		bl	__hyp_set_vectors
 		__HVC(0)			@ otherwise bounce to hyp mode
 
 		b	.			@ should never be reached
-
-		.align	2
-.L__hyp_reentry_vectors_offset:	.long	__hyp_reentry_vectors - .
 #else
 		b	__enter_kernel
 #endif
diff --git a/arch/arm/kernel/hyp-stub.S b/arch/arm/kernel/hyp-stub.S
index 26d8e03b1dd3..103d0bdb2b7e 100644
--- a/arch/arm/kernel/hyp-stub.S
+++ b/arch/arm/kernel/hyp-stub.S
@@ -24,41 +24,38 @@ ENTRY(__boot_cpu_mode)
 .text
 
 	/*
-	 * Save the primary CPU boot mode. Requires 3 scratch registers.
+	 * Save the primary CPU boot mode. Requires 2 scratch registers.
 	 */
-	.macro	store_primary_cpu_mode	reg1, reg2, reg3
+	.macro	store_primary_cpu_mode	reg1, reg2
 	mrs	\reg1, cpsr
 	and	\reg1, \reg1, #MODE_MASK
-	adr	\reg2, .L__boot_cpu_mode_offset
-	ldr	\reg3, [\reg2]
-	str	\reg1, [\reg2, \reg3]
+	str_l	\reg1, __boot_cpu_mode, \reg2
 	.endm
 
 	/*
 	 * Compare the current mode with the one saved on the primary CPU.
 	 * If they don't match, record that fact. The Z bit indicates
 	 * if there's a match or not.
-	 * Requires 3 additionnal scratch registers.
+	 * Requires 2 additional scratch registers.
 	 */
-	.macro	compare_cpu_mode_with_primary mode, reg1, reg2, reg3
-	adr	\reg2, .L__boot_cpu_mode_offset
-	ldr	\reg3, [\reg2]
-	ldr	\reg1, [\reg2, \reg3]
+	.macro	compare_cpu_mode_with_primary mode, reg1, reg2
+	adr_l	\reg2, __boot_cpu_mode
+	ldr	\reg1, [\reg2]
 	cmp	\mode, \reg1		@ matches primary CPU boot mode?
 	orrne	\reg1, \reg1, #BOOT_CPU_MODE_MISMATCH
-	strne	\reg1, [\reg2, \reg3]	@ record what happened and give up
+	strne	\reg1, [\reg2]		@ record what happened and give up
 	.endm
 
 #else	/* ZIMAGE */
 
-	.macro	store_primary_cpu_mode	reg1:req, reg2:req, reg3:req
+	.macro	store_primary_cpu_mode	reg1:req, reg2:req
 	.endm
 
 /*
  * The zImage loader only runs on one CPU, so we don't bother with mult-CPU
  * consistency checking:
  */
-	.macro	compare_cpu_mode_with_primary mode, reg1, reg2, reg3
+	.macro	compare_cpu_mode_with_primary mode, reg1, reg2
 	cmp	\mode, \mode
 	.endm
 
@@ -73,7 +70,7 @@ ENTRY(__boot_cpu_mode)
  */
 @ Call this from the primary CPU
 ENTRY(__hyp_stub_install)
-	store_primary_cpu_mode	r4, r5, r6
+	store_primary_cpu_mode	r4, r5
 ENDPROC(__hyp_stub_install)
 
 	@ fall through...
@@ -87,7 +84,7 @@ ENTRY(__hyp_stub_install_secondary)
 	 * If the secondary has booted with a different mode, give up
 	 * immediately.
 	 */
-	compare_cpu_mode_with_primary	r4, r5, r6, r7
+	compare_cpu_mode_with_primary	r4, r5, r6
 	retne	lr
 
 	/*
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/12] ARM: module: add support for place relative relocations
  2020-09-14  9:56   ` Ard Biesheuvel
@ 2020-09-14 13:35     ` Nicolas Pitre
  -1 siblings, 0 replies; 62+ messages in thread
From: Nicolas Pitre @ 2020-09-14 13:35 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, linux-arm-kernel, Russell King, Linus Walleij,
	Nick Desaulniers, Stefan Agner, Peter Smith, Marc Zyngier,
	Will Deacon

On Mon, 14 Sep 2020, Ard Biesheuvel wrote:

> When using the new adr_l/ldr_l/str_l macros to refer to external symbols
> from modules, the linker may emit place relative ELF relocations that
> need to be fixed up by the module loader. So add support for these.

Just wondering if that capability requirement should be added to the 
module signature somehow...?

Maybe not. The MODULE_ARCH_VERMAGIC definition only contains things that 
are configurable for a given build.


> 
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm/include/asm/elf.h |  5 +++++
>  arch/arm/kernel/module.c   | 20 ++++++++++++++++++--
>  2 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
> index b078d992414b..0ac62a54b73c 100644
> --- a/arch/arm/include/asm/elf.h
> +++ b/arch/arm/include/asm/elf.h
> @@ -51,6 +51,7 @@ typedef struct user_fp elf_fpregset_t;
>  #define R_ARM_NONE		0
>  #define R_ARM_PC24		1
>  #define R_ARM_ABS32		2
> +#define R_ARM_REL32		3
>  #define R_ARM_CALL		28
>  #define R_ARM_JUMP24		29
>  #define R_ARM_TARGET1		38
> @@ -58,11 +59,15 @@ typedef struct user_fp elf_fpregset_t;
>  #define R_ARM_PREL31		42
>  #define R_ARM_MOVW_ABS_NC	43
>  #define R_ARM_MOVT_ABS		44
> +#define R_ARM_MOVW_PREL_NC	45
> +#define R_ARM_MOVT_PREL		46
>  
>  #define R_ARM_THM_CALL		10
>  #define R_ARM_THM_JUMP24	30
>  #define R_ARM_THM_MOVW_ABS_NC	47
>  #define R_ARM_THM_MOVT_ABS	48
> +#define R_ARM_THM_MOVW_PREL_NC	49
> +#define R_ARM_THM_MOVT_PREL	50
>  
>  /*
>   * These are used to set parameters in the core dumps.
> diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
> index e15444b25ca0..beac45e89ba6 100644
> --- a/arch/arm/kernel/module.c
> +++ b/arch/arm/kernel/module.c
> @@ -185,14 +185,24 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
>  			*(u32 *)loc |= offset & 0x7fffffff;
>  			break;
>  
> +		case R_ARM_REL32:
> +			*(u32 *)loc += sym->st_value - loc;
> +			break;
> +
>  		case R_ARM_MOVW_ABS_NC:
>  		case R_ARM_MOVT_ABS:
> +		case R_ARM_MOVW_PREL_NC:
> +		case R_ARM_MOVT_PREL:
>  			offset = tmp = __mem_to_opcode_arm(*(u32 *)loc);
>  			offset = ((offset & 0xf0000) >> 4) | (offset & 0xfff);
>  			offset = (offset ^ 0x8000) - 0x8000;
>  
>  			offset += sym->st_value;
> -			if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_ABS)
> +			if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_PREL ||
> +			    ELF32_R_TYPE(rel->r_info) == R_ARM_MOVW_PREL_NC)
> +				offset -= loc;
> +			if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_ABS ||
> +			    ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_PREL)
>  				offset >>= 16;
>  
>  			tmp &= 0xfff0f000;
> @@ -283,6 +293,8 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
>  
>  		case R_ARM_THM_MOVW_ABS_NC:
>  		case R_ARM_THM_MOVT_ABS:
> +		case R_ARM_THM_MOVW_PREL_NC:
> +		case R_ARM_THM_MOVT_PREL:
>  			upper = __mem_to_opcode_thumb16(*(u16 *)loc);
>  			lower = __mem_to_opcode_thumb16(*(u16 *)(loc + 2));
>  
> @@ -302,7 +314,11 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
>  			offset = (offset ^ 0x8000) - 0x8000;
>  			offset += sym->st_value;
>  
> -			if (ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVT_ABS)
> +			if (ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVT_PREL ||
> +			    ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVW_PREL_NC)
> +				offset -= loc;
> +			if (ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVT_ABS ||
> +			    ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVT_PREL)
>  				offset >>= 16;
>  
>  			upper = (u16)((upper & 0xfbf0) |
> -- 
> 2.17.1
> 
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/12] ARM: module: add support for place relative relocations
@ 2020-09-14 13:35     ` Nicolas Pitre
  0 siblings, 0 replies; 62+ messages in thread
From: Nicolas Pitre @ 2020-09-14 13:35 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Russell King, Stefan Agner, Peter Smith, Will Deacon,
	linux-arm-kernel

On Mon, 14 Sep 2020, Ard Biesheuvel wrote:

> When using the new adr_l/ldr_l/str_l macros to refer to external symbols
> from modules, the linker may emit place relative ELF relocations that
> need to be fixed up by the module loader. So add support for these.

Just wondering if that capability requirement should be added to the 
module signature somehow...?

Maybe not. The MODULE_ARCH_VERMAGIC definition only contains things that 
are configurable for a given build.


> 
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm/include/asm/elf.h |  5 +++++
>  arch/arm/kernel/module.c   | 20 ++++++++++++++++++--
>  2 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
> index b078d992414b..0ac62a54b73c 100644
> --- a/arch/arm/include/asm/elf.h
> +++ b/arch/arm/include/asm/elf.h
> @@ -51,6 +51,7 @@ typedef struct user_fp elf_fpregset_t;
>  #define R_ARM_NONE		0
>  #define R_ARM_PC24		1
>  #define R_ARM_ABS32		2
> +#define R_ARM_REL32		3
>  #define R_ARM_CALL		28
>  #define R_ARM_JUMP24		29
>  #define R_ARM_TARGET1		38
> @@ -58,11 +59,15 @@ typedef struct user_fp elf_fpregset_t;
>  #define R_ARM_PREL31		42
>  #define R_ARM_MOVW_ABS_NC	43
>  #define R_ARM_MOVT_ABS		44
> +#define R_ARM_MOVW_PREL_NC	45
> +#define R_ARM_MOVT_PREL		46
>  
>  #define R_ARM_THM_CALL		10
>  #define R_ARM_THM_JUMP24	30
>  #define R_ARM_THM_MOVW_ABS_NC	47
>  #define R_ARM_THM_MOVT_ABS	48
> +#define R_ARM_THM_MOVW_PREL_NC	49
> +#define R_ARM_THM_MOVT_PREL	50
>  
>  /*
>   * These are used to set parameters in the core dumps.
> diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
> index e15444b25ca0..beac45e89ba6 100644
> --- a/arch/arm/kernel/module.c
> +++ b/arch/arm/kernel/module.c
> @@ -185,14 +185,24 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
>  			*(u32 *)loc |= offset & 0x7fffffff;
>  			break;
>  
> +		case R_ARM_REL32:
> +			*(u32 *)loc += sym->st_value - loc;
> +			break;
> +
>  		case R_ARM_MOVW_ABS_NC:
>  		case R_ARM_MOVT_ABS:
> +		case R_ARM_MOVW_PREL_NC:
> +		case R_ARM_MOVT_PREL:
>  			offset = tmp = __mem_to_opcode_arm(*(u32 *)loc);
>  			offset = ((offset & 0xf0000) >> 4) | (offset & 0xfff);
>  			offset = (offset ^ 0x8000) - 0x8000;
>  
>  			offset += sym->st_value;
> -			if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_ABS)
> +			if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_PREL ||
> +			    ELF32_R_TYPE(rel->r_info) == R_ARM_MOVW_PREL_NC)
> +				offset -= loc;
> +			if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_ABS ||
> +			    ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_PREL)
>  				offset >>= 16;
>  
>  			tmp &= 0xfff0f000;
> @@ -283,6 +293,8 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
>  
>  		case R_ARM_THM_MOVW_ABS_NC:
>  		case R_ARM_THM_MOVT_ABS:
> +		case R_ARM_THM_MOVW_PREL_NC:
> +		case R_ARM_THM_MOVT_PREL:
>  			upper = __mem_to_opcode_thumb16(*(u16 *)loc);
>  			lower = __mem_to_opcode_thumb16(*(u16 *)(loc + 2));
>  
> @@ -302,7 +314,11 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
>  			offset = (offset ^ 0x8000) - 0x8000;
>  			offset += sym->st_value;
>  
> -			if (ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVT_ABS)
> +			if (ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVT_PREL ||
> +			    ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVW_PREL_NC)
> +				offset -= loc;
> +			if (ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVT_ABS ||
> +			    ELF32_R_TYPE(rel->r_info) == R_ARM_THM_MOVT_PREL)
>  				offset >>= 16;
>  
>  			upper = (u16)((upper & 0xfbf0) |
> -- 
> 2.17.1
> 
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
  2020-09-14  9:56 ` Ard Biesheuvel
@ 2020-09-14 14:06   ` Nicolas Pitre
  -1 siblings, 0 replies; 62+ messages in thread
From: Nicolas Pitre @ 2020-09-14 14:06 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, linux-arm-kernel, Russell King, Linus Walleij,
	Nick Desaulniers, Stefan Agner, Peter Smith, Marc Zyngier,
	Will Deacon

On Mon, 14 Sep 2020, Ard Biesheuvel wrote:

> This is a respin of the adr_l/ldr_l code I wrote some years ago in the
> context of my KASLR proof of concept for 32-bit ARM.
> 
> A new use case came up, in the form of Clang, which does not implement
> the 'adrl' pseudo-instruction in its assembler, and so for PC-relative
> references that don't fit into a ARM adr instruction, we need something
> else. Patch #2 addresses an actual Clang build issue of this nature, by
> replacing an occurrence of adrl with adr_l.
> 
> I have included my existing cleanup patches that were built on top of the
> adr_l macro, which replace several occurrences of open coded arithmetic to
> calculate runtime addresses based on link time virtual addresses stored
> in literals.
> 
> Note that all of these patches with the exception of #2 were reviewed or
> acked by Nico before, but given that this was a while ago (and the fact

Certainly it must have been, as I didn't remember much of it.

> that neither of us work for Linaro anymore), I have dropped these. Note
> that only patch #1 deviates significantly from the last version that I
> sent out, the remaining ones were just freshened up (and their commit
> logs slightly expanded).

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>

> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Linus Walleij <linus.walleij@linaro.org>
> Cc: Nicolas Pitre <nico@fluxnic.net>
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Cc: Stefan Agner <stefan@agner.ch>
> Cc: Peter Smith <Peter.Smith@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> 
> Ard Biesheuvel (12):
>   ARM: assembler: introduce adr_l, ldr_l and str_l macros
>   ARM: efistub: replace adrl pseudo-op with adr_l macro invocation
>   ARM: module: add support for place relative relocations
>   ARM: head-common.S: use PC-relative insn sequence for __proc_info
>   ARM: head-common.S: use PC-relative insn sequence for idmap creation
>   ARM: head.S: use PC-relative insn sequence for secondary_data
>   ARM: kernel: use relative references for UP/SMP alternatives
>   ARM: head: use PC-relative insn sequence for __smp_alt
>   ARM: sleep.S: use PC-relative insn sequence for
>     sleep_save_sp/mpidr_hash
>   ARM: head.S: use PC-relative insn sequences for __fixup_pv_table
>   ARM: head.S: use PC relative insn sequence to calculate PHYS_OFFSET
>   ARM: kvm: replace open coded VA->PA calculations with adr_l call
> 
>  arch/arm/boot/compressed/head.S  | 18 +---
>  arch/arm/include/asm/assembler.h | 88 ++++++++++++++++++-
>  arch/arm/include/asm/elf.h       |  5 ++
>  arch/arm/include/asm/processor.h |  2 +-
>  arch/arm/kernel/head-common.S    | 22 ++---
>  arch/arm/kernel/head.S           | 90 +++++---------------
>  arch/arm/kernel/hyp-stub.S       | 27 +++---
>  arch/arm/kernel/module.c         | 20 ++++-
>  arch/arm/kernel/sleep.S          | 19 ++---
>  9 files changed, 159 insertions(+), 132 deletions(-)
> 
> -- 
> 2.17.1
> 
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
@ 2020-09-14 14:06   ` Nicolas Pitre
  0 siblings, 0 replies; 62+ messages in thread
From: Nicolas Pitre @ 2020-09-14 14:06 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Russell King, Stefan Agner, Peter Smith, Will Deacon,
	linux-arm-kernel

On Mon, 14 Sep 2020, Ard Biesheuvel wrote:

> This is a respin of the adr_l/ldr_l code I wrote some years ago in the
> context of my KASLR proof of concept for 32-bit ARM.
> 
> A new use case came up, in the form of Clang, which does not implement
> the 'adrl' pseudo-instruction in its assembler, and so for PC-relative
> references that don't fit into a ARM adr instruction, we need something
> else. Patch #2 addresses an actual Clang build issue of this nature, by
> replacing an occurrence of adrl with adr_l.
> 
> I have included my existing cleanup patches that were built on top of the
> adr_l macro, which replace several occurrences of open coded arithmetic to
> calculate runtime addresses based on link time virtual addresses stored
> in literals.
> 
> Note that all of these patches with the exception of #2 were reviewed or
> acked by Nico before, but given that this was a while ago (and the fact

Certainly it must have been, as I didn't remember much of it.

> that neither of us work for Linaro anymore), I have dropped these. Note
> that only patch #1 deviates significantly from the last version that I
> sent out, the remaining ones were just freshened up (and their commit
> logs slightly expanded).

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>

> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Linus Walleij <linus.walleij@linaro.org>
> Cc: Nicolas Pitre <nico@fluxnic.net>
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Cc: Stefan Agner <stefan@agner.ch>
> Cc: Peter Smith <Peter.Smith@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> 
> Ard Biesheuvel (12):
>   ARM: assembler: introduce adr_l, ldr_l and str_l macros
>   ARM: efistub: replace adrl pseudo-op with adr_l macro invocation
>   ARM: module: add support for place relative relocations
>   ARM: head-common.S: use PC-relative insn sequence for __proc_info
>   ARM: head-common.S: use PC-relative insn sequence for idmap creation
>   ARM: head.S: use PC-relative insn sequence for secondary_data
>   ARM: kernel: use relative references for UP/SMP alternatives
>   ARM: head: use PC-relative insn sequence for __smp_alt
>   ARM: sleep.S: use PC-relative insn sequence for
>     sleep_save_sp/mpidr_hash
>   ARM: head.S: use PC-relative insn sequences for __fixup_pv_table
>   ARM: head.S: use PC relative insn sequence to calculate PHYS_OFFSET
>   ARM: kvm: replace open coded VA->PA calculations with adr_l call
> 
>  arch/arm/boot/compressed/head.S  | 18 +---
>  arch/arm/include/asm/assembler.h | 88 ++++++++++++++++++-
>  arch/arm/include/asm/elf.h       |  5 ++
>  arch/arm/include/asm/processor.h |  2 +-
>  arch/arm/kernel/head-common.S    | 22 ++---
>  arch/arm/kernel/head.S           | 90 +++++---------------
>  arch/arm/kernel/hyp-stub.S       | 27 +++---
>  arch/arm/kernel/module.c         | 20 ++++-
>  arch/arm/kernel/sleep.S          | 19 ++---
>  9 files changed, 159 insertions(+), 132 deletions(-)
> 
> -- 
> 2.17.1
> 
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/12] ARM: module: add support for place relative relocations
  2020-09-14 13:35     ` Nicolas Pitre
@ 2020-09-14 16:11       ` Russell King - ARM Linux admin
  -1 siblings, 0 replies; 62+ messages in thread
From: Russell King - ARM Linux admin @ 2020-09-14 16:11 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Ard Biesheuvel, linux-efi, linux-arm-kernel, Linus Walleij,
	Nick Desaulniers, Stefan Agner, Peter Smith, Marc Zyngier,
	Will Deacon

On Mon, Sep 14, 2020 at 09:35:41AM -0400, Nicolas Pitre wrote:
> On Mon, 14 Sep 2020, Ard Biesheuvel wrote:
> 
> > When using the new adr_l/ldr_l/str_l macros to refer to external symbols
> > from modules, the linker may emit place relative ELF relocations that
> > need to be fixed up by the module loader. So add support for these.
> 
> Just wondering if that capability requirement should be added to the 
> module signature somehow...?
> 
> Maybe not. The MODULE_ARCH_VERMAGIC definition only contains things that 
> are configurable for a given build.

It doesn't need to be. If a module contains a relocation we don't know
how to handle, it will fail to load.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 03/12] ARM: module: add support for place relative relocations
@ 2020-09-14 16:11       ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 62+ messages in thread
From: Russell King - ARM Linux admin @ 2020-09-14 16:11 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: linux-efi, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Peter Smith, Stefan Agner, Will Deacon, Ard Biesheuvel,
	linux-arm-kernel

On Mon, Sep 14, 2020 at 09:35:41AM -0400, Nicolas Pitre wrote:
> On Mon, 14 Sep 2020, Ard Biesheuvel wrote:
> 
> > When using the new adr_l/ldr_l/str_l macros to refer to external symbols
> > from modules, the linker may emit place relative ELF relocations that
> > need to be fixed up by the module loader. So add support for these.
> 
> Just wondering if that capability requirement should be added to the 
> module signature somehow...?
> 
> Maybe not. The MODULE_ARCH_VERMAGIC definition only contains things that 
> are configurable for a given build.

It doesn't need to be. If a module contains a relocation we don't know
how to handle, it will fail to load.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 01/12] ARM: assembler: introduce adr_l, ldr_l and str_l macros
  2020-09-14  9:56   ` Ard Biesheuvel
@ 2020-09-15  7:35     ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-15  7:35 UTC (permalink / raw)
  To: linux-efi
  Cc: Linux ARM, Russell King, Linus Walleij, Nicolas Pitre,
	Nick Desaulniers, Stefan Agner, Peter Smith, Marc Zyngier,
	Will Deacon

On Mon, 14 Sep 2020 at 12:57, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> Like arm64, ARM supports position independent code sequences that
> produce symbol references with a greater reach than the ordinary
> adr/ldr instructions. Since on ARM, the adrl pseudo-instruction is
> only supported in ARM mode (and not at all when using Clang), having
> a adr_l macro like we do on arm64 is useful, and increases symmetry
> as well.
>
> Currently, we use open coded instruction sequences involving literals
> and arithmetic operations. Instead, we can use movw/movt pairs on v7
> CPUs, circumventing the D-cache entirely.
>
> E.g., on v7+ CPUs, we can emit a PC-relative reference as follows:
>
>        movw         <reg>, #:lower16:<sym> - (1f + 8)
>        movt         <reg>, #:upper16:<sym> - (1f + 8)
>   1:   add          <reg>, <reg>, pc
>
> For older CPUs, we can emit the literal into a subsection, allowing it
> to be emitted out of line while retaining the ability to perform
> arithmetic on label offsets.
>
> E.g., on pre-v7 CPUs, we can emit a PC-relative reference as follows:
>
>        ldr          <reg>, 2f
>   1:   add          <reg>, <reg>, pc
>        .subsection  1
>   2:   .long        <sym> - (1b + 8)
>        .previous
>
> This is allowed by the assembler because, unlike ordinary sections,
> subsections are combined into a single section in the object file, and
> so the label references are not true cross-section references that are
> visible as relocations. (Subsections have been available in binutils
> since 2004 at least, so they should not cause any issues with older
> toolchains.)
>
> So use the above to implement the macros mov_l, adr_l, ldr_l and str_l,
> all of which will use movw/movt pairs on v7 and later CPUs, and use
> PC-relative literals otherwise.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm/include/asm/assembler.h | 84 ++++++++++++++++++++
>  1 file changed, 84 insertions(+)
>
> diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
> index feac2c8b86f2..39e972eaaa3f 100644
> --- a/arch/arm/include/asm/assembler.h
> +++ b/arch/arm/include/asm/assembler.h
> @@ -494,4 +494,88 @@ THUMB(     orr     \reg , \reg , #PSR_T_BIT        )
>  #define _ASM_NOKPROBE(entry)
>  #endif
>
> +       .macro          __adldst_l, op, reg, sym, tmp, c
> +       .if             __LINUX_ARM_ARCH__ < 7
> +       ldr\c           \tmp, .La\@
> +       .subsection     1
> +       .align          2
> +.La\@: .long           \sym - .Lpc\@
> +       .previous
> +       .else
> +       .ifnb           \c
> + THUMB(        ittt            \c                      )
> +       .endif
> +       movw\c          \tmp, #:lower16:\sym - .Lpc\@
> +       movt\c          \tmp, #:upper16:\sym - .Lpc\@
> +       .endif
> +
> +#ifndef CONFIG_THUMB2_KERNEL
> +       .set            .Lpc\@, . + 8                   // PC bias
> +       .ifc            \op, add
> +       add\c           \reg, \tmp, pc
> +       .else
> +       \op\c           \reg, [\tmp, pc]

This should be

       \op\c           \reg, [pc, \tmp]

> +       .endif
> +#else
> +.Lb\@: add\c           \tmp, \tmp, pc
> +       /*
> +        * In Thumb-2 builds, the PC bias depends on whether we are currently
> +        * emitting into a .arm or a .thumb section. The size of the add opcode
> +        * above will be 2 bytes when emitting in Thumb mode and 4 bytes when
> +        * emitting in ARM mode, so let's use this to account for the bias.
> +        */
> +       .set            .Lpc\@, . + (. - .Lb\@)
> +
> +       .ifnc           \op, add
> +       \op\c           \reg, [\tmp]
> +       .endif
> +#endif
> +       .endm
> +
> +       /*
> +        * mov_l - move a constant value or [relocated] address into a register
> +        */
> +       .macro          mov_l, dst:req, imm:req
> +       .if             __LINUX_ARM_ARCH__ < 7
> +       ldr             \dst, =\imm
> +       .else
> +       movw            \dst, #:lower16:\imm
> +       movt            \dst, #:upper16:\imm
> +       .endif
> +       .endm
> +
> +       /*
> +        * adr_l - adr pseudo-op with unlimited range
> +        *
> +        * @dst: destination register
> +        * @sym: name of the symbol
> +        * @cond: conditional opcode suffix
> +        */
> +       .macro          adr_l, dst:req, sym:req, cond
> +       __adldst_l      add, \dst, \sym, \dst, \cond
> +       .endm
> +
> +       /*
> +        * ldr_l - ldr <literal> pseudo-op with unlimited range
> +        *
> +        * @dst: destination register
> +        * @sym: name of the symbol
> +        * @cond: conditional opcode suffix
> +        */
> +       .macro          ldr_l, dst:req, sym:req, cond
> +       __adldst_l      ldr, \dst, \sym, \dst, \cond
> +       .endm
> +
> +       /*
> +        * str_l - str <literal> pseudo-op with unlimited range
> +        *
> +        * @src: source register
> +        * @sym: name of the symbol
> +        * @tmp: mandatory scratch register
> +        * @cond: conditional opcode suffix
> +        */
> +       .macro          str_l, src:req, sym:req, tmp:req, cond
> +       __adldst_l      str, \src, \sym, \tmp, \cond
> +       .endm
> +
>  #endif /* __ASM_ASSEMBLER_H__ */
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 01/12] ARM: assembler: introduce adr_l, ldr_l and str_l macros
@ 2020-09-15  7:35     ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-15  7:35 UTC (permalink / raw)
  To: linux-efi
  Cc: Nicolas Pitre, Marc Zyngier, Linus Walleij, Nick Desaulniers,
	Russell King, Stefan Agner, Peter Smith, Will Deacon, Linux ARM

On Mon, 14 Sep 2020 at 12:57, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> Like arm64, ARM supports position independent code sequences that
> produce symbol references with a greater reach than the ordinary
> adr/ldr instructions. Since on ARM, the adrl pseudo-instruction is
> only supported in ARM mode (and not at all when using Clang), having
> a adr_l macro like we do on arm64 is useful, and increases symmetry
> as well.
>
> Currently, we use open coded instruction sequences involving literals
> and arithmetic operations. Instead, we can use movw/movt pairs on v7
> CPUs, circumventing the D-cache entirely.
>
> E.g., on v7+ CPUs, we can emit a PC-relative reference as follows:
>
>        movw         <reg>, #:lower16:<sym> - (1f + 8)
>        movt         <reg>, #:upper16:<sym> - (1f + 8)
>   1:   add          <reg>, <reg>, pc
>
> For older CPUs, we can emit the literal into a subsection, allowing it
> to be emitted out of line while retaining the ability to perform
> arithmetic on label offsets.
>
> E.g., on pre-v7 CPUs, we can emit a PC-relative reference as follows:
>
>        ldr          <reg>, 2f
>   1:   add          <reg>, <reg>, pc
>        .subsection  1
>   2:   .long        <sym> - (1b + 8)
>        .previous
>
> This is allowed by the assembler because, unlike ordinary sections,
> subsections are combined into a single section in the object file, and
> so the label references are not true cross-section references that are
> visible as relocations. (Subsections have been available in binutils
> since 2004 at least, so they should not cause any issues with older
> toolchains.)
>
> So use the above to implement the macros mov_l, adr_l, ldr_l and str_l,
> all of which will use movw/movt pairs on v7 and later CPUs, and use
> PC-relative literals otherwise.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm/include/asm/assembler.h | 84 ++++++++++++++++++++
>  1 file changed, 84 insertions(+)
>
> diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
> index feac2c8b86f2..39e972eaaa3f 100644
> --- a/arch/arm/include/asm/assembler.h
> +++ b/arch/arm/include/asm/assembler.h
> @@ -494,4 +494,88 @@ THUMB(     orr     \reg , \reg , #PSR_T_BIT        )
>  #define _ASM_NOKPROBE(entry)
>  #endif
>
> +       .macro          __adldst_l, op, reg, sym, tmp, c
> +       .if             __LINUX_ARM_ARCH__ < 7
> +       ldr\c           \tmp, .La\@
> +       .subsection     1
> +       .align          2
> +.La\@: .long           \sym - .Lpc\@
> +       .previous
> +       .else
> +       .ifnb           \c
> + THUMB(        ittt            \c                      )
> +       .endif
> +       movw\c          \tmp, #:lower16:\sym - .Lpc\@
> +       movt\c          \tmp, #:upper16:\sym - .Lpc\@
> +       .endif
> +
> +#ifndef CONFIG_THUMB2_KERNEL
> +       .set            .Lpc\@, . + 8                   // PC bias
> +       .ifc            \op, add
> +       add\c           \reg, \tmp, pc
> +       .else
> +       \op\c           \reg, [\tmp, pc]

This should be

       \op\c           \reg, [pc, \tmp]

> +       .endif
> +#else
> +.Lb\@: add\c           \tmp, \tmp, pc
> +       /*
> +        * In Thumb-2 builds, the PC bias depends on whether we are currently
> +        * emitting into a .arm or a .thumb section. The size of the add opcode
> +        * above will be 2 bytes when emitting in Thumb mode and 4 bytes when
> +        * emitting in ARM mode, so let's use this to account for the bias.
> +        */
> +       .set            .Lpc\@, . + (. - .Lb\@)
> +
> +       .ifnc           \op, add
> +       \op\c           \reg, [\tmp]
> +       .endif
> +#endif
> +       .endm
> +
> +       /*
> +        * mov_l - move a constant value or [relocated] address into a register
> +        */
> +       .macro          mov_l, dst:req, imm:req
> +       .if             __LINUX_ARM_ARCH__ < 7
> +       ldr             \dst, =\imm
> +       .else
> +       movw            \dst, #:lower16:\imm
> +       movt            \dst, #:upper16:\imm
> +       .endif
> +       .endm
> +
> +       /*
> +        * adr_l - adr pseudo-op with unlimited range
> +        *
> +        * @dst: destination register
> +        * @sym: name of the symbol
> +        * @cond: conditional opcode suffix
> +        */
> +       .macro          adr_l, dst:req, sym:req, cond
> +       __adldst_l      add, \dst, \sym, \dst, \cond
> +       .endm
> +
> +       /*
> +        * ldr_l - ldr <literal> pseudo-op with unlimited range
> +        *
> +        * @dst: destination register
> +        * @sym: name of the symbol
> +        * @cond: conditional opcode suffix
> +        */
> +       .macro          ldr_l, dst:req, sym:req, cond
> +       __adldst_l      ldr, \dst, \sym, \dst, \cond
> +       .endm
> +
> +       /*
> +        * str_l - str <literal> pseudo-op with unlimited range
> +        *
> +        * @src: source register
> +        * @sym: name of the symbol
> +        * @tmp: mandatory scratch register
> +        * @cond: conditional opcode suffix
> +        */
> +       .macro          str_l, src:req, sym:req, tmp:req, cond
> +       __adldst_l      str, \src, \sym, \tmp, \cond
> +       .endm
> +
>  #endif /* __ASM_ASSEMBLER_H__ */
> --
> 2.17.1
>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 01/12] ARM: assembler: introduce adr_l, ldr_l and str_l macros
  2020-09-15  7:35     ` Ard Biesheuvel
@ 2020-09-15 17:51       ` Nick Desaulniers
  -1 siblings, 0 replies; 62+ messages in thread
From: Nick Desaulniers @ 2020-09-15 17:51 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Linux ARM, Russell King, Linus Walleij, Nicolas Pitre,
	Stefan Agner, Peter Smith, Marc Zyngier, Will Deacon

On Tue, Sep 15, 2020 at 12:35 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 14 Sep 2020 at 12:57, Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
> > index feac2c8b86f2..39e972eaaa3f 100644
> > --- a/arch/arm/include/asm/assembler.h
> > +++ b/arch/arm/include/asm/assembler.h
> > @@ -494,4 +494,88 @@ THUMB(     orr     \reg , \reg , #PSR_T_BIT        )
> >  #define _ASM_NOKPROBE(entry)
> >  #endif
> >
> > +       .macro          __adldst_l, op, reg, sym, tmp, c
> > +       .if             __LINUX_ARM_ARCH__ < 7
> > +       ldr\c           \tmp, .La\@
> > +       .subsection     1
> > +       .align          2
> > +.La\@: .long           \sym - .Lpc\@
> > +       .previous
> > +       .else
> > +       .ifnb           \c
> > + THUMB(        ittt            \c                      )
> > +       .endif
> > +       movw\c          \tmp, #:lower16:\sym - .Lpc\@
> > +       movt\c          \tmp, #:upper16:\sym - .Lpc\@
> > +       .endif
> > +
> > +#ifndef CONFIG_THUMB2_KERNEL
> > +       .set            .Lpc\@, . + 8                   // PC bias
> > +       .ifc            \op, add
> > +       add\c           \reg, \tmp, pc
> > +       .else
> > +       \op\c           \reg, [\tmp, pc]
>
> This should be
>
>        \op\c           \reg, [pc, \tmp]

That looks better.  I had tested this series yesterday and wasn't able
to build with either toolchains.  Was going to ask if I was using the
wrong base or if there was an issue with my version binutils.  All
good now, I'll try to hammer on this a bit.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 01/12] ARM: assembler: introduce adr_l, ldr_l and str_l macros
@ 2020-09-15 17:51       ` Nick Desaulniers
  0 siblings, 0 replies; 62+ messages in thread
From: Nick Desaulniers @ 2020-09-15 17:51 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Nicolas Pitre, Marc Zyngier, Linus Walleij,
	Russell King, Stefan Agner, Peter Smith, Will Deacon, Linux ARM

On Tue, Sep 15, 2020 at 12:35 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 14 Sep 2020 at 12:57, Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
> > index feac2c8b86f2..39e972eaaa3f 100644
> > --- a/arch/arm/include/asm/assembler.h
> > +++ b/arch/arm/include/asm/assembler.h
> > @@ -494,4 +494,88 @@ THUMB(     orr     \reg , \reg , #PSR_T_BIT        )
> >  #define _ASM_NOKPROBE(entry)
> >  #endif
> >
> > +       .macro          __adldst_l, op, reg, sym, tmp, c
> > +       .if             __LINUX_ARM_ARCH__ < 7
> > +       ldr\c           \tmp, .La\@
> > +       .subsection     1
> > +       .align          2
> > +.La\@: .long           \sym - .Lpc\@
> > +       .previous
> > +       .else
> > +       .ifnb           \c
> > + THUMB(        ittt            \c                      )
> > +       .endif
> > +       movw\c          \tmp, #:lower16:\sym - .Lpc\@
> > +       movt\c          \tmp, #:upper16:\sym - .Lpc\@
> > +       .endif
> > +
> > +#ifndef CONFIG_THUMB2_KERNEL
> > +       .set            .Lpc\@, . + 8                   // PC bias
> > +       .ifc            \op, add
> > +       add\c           \reg, \tmp, pc
> > +       .else
> > +       \op\c           \reg, [\tmp, pc]
>
> This should be
>
>        \op\c           \reg, [pc, \tmp]

That looks better.  I had tested this series yesterday and wasn't able
to build with either toolchains.  Was going to ask if I was using the
wrong base or if there was an issue with my version binutils.  All
good now, I'll try to hammer on this a bit.
-- 
Thanks,
~Nick Desaulniers

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
  2020-09-14  9:56 ` Ard Biesheuvel
@ 2020-09-15 19:32   ` Nick Desaulniers
  -1 siblings, 0 replies; 62+ messages in thread
From: Nick Desaulniers @ 2020-09-15 19:32 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Linux ARM, Russell King, Linus Walleij, Nicolas Pitre,
	Stefan Agner, Peter Smith, Marc Zyngier, Will Deacon,
	clang-built-linux, Jian Cai

On Mon, Sep 14, 2020 at 2:57 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> This is a respin of the adr_l/ldr_l code I wrote some years ago in the
> context of my KASLR proof of concept for 32-bit ARM.
>
> A new use case came up, in the form of Clang, which does not implement
> the 'adrl' pseudo-instruction in its assembler, and so for PC-relative
> references that don't fit into a ARM adr instruction, we need something
> else. Patch #2 addresses an actual Clang build issue of this nature, by
> replacing an occurrence of adrl with adr_l.
>
> I have included my existing cleanup patches that were built on top of the
> adr_l macro, which replace several occurrences of open coded arithmetic to
> calculate runtime addresses based on link time virtual addresses stored
> in literals.
>
> Note that all of these patches with the exception of #2 were reviewed or
> acked by Nico before, but given that this was a while ago (and the fact
> that neither of us work for Linaro anymore), I have dropped these. Note
> that only patch #1 deviates significantly from the last version that I
> sent out, the remaining ones were just freshened up (and their commit
> logs slightly expanded).

Tested-by: Nick Desaulniers <ndesaulniers@google.com>

Thanks for the series, Ard.  I was able to compile and boot the
following with this series (and the fixup to 01/12 applied):

$ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make defconfig
$ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 defconfig
$ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 aspeed_g5_defconfig
$ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 multi_v5_defconfig

(So ARM v7/GCC, ARM v5,6,7/LLVM).  (Technically, the v6 is not
booting, but it's not related to this series and fails to boot without
the series as well.  Our CI on -next is red for an unrelated issue
which is masking the regression).

I was also able to build+boot the defconfig with v5 and v7 with some
configs disabled and a few hacks with LLVM_IAS=1.  This series allowed
me to get further in the build/testing, and I have a few new bugs to
go chase.  If anyone's interested:
https://github.com/ClangBuiltLinux/linux/issues/1154
https://github.com/ClangBuiltLinux/linux/issues/1155
so we're still a handful of bugs away from LLVM_IAS=1 with ARCH=arm,
but we're steadily chipping away at it, and this series is a big help.
Lest it look like there's only kernel fixes in this area, Jian's
https://reviews.llvm.org/D69411 recently was a big help, specifically
for ARCH=arm LLVM_IAS=1.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
@ 2020-09-15 19:32   ` Nick Desaulniers
  0 siblings, 0 replies; 62+ messages in thread
From: Nick Desaulniers @ 2020-09-15 19:32 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Nicolas Pitre, Marc Zyngier, Linus Walleij,
	Russell King, Stefan Agner, Peter Smith, clang-built-linux,
	Jian Cai, Will Deacon, Linux ARM

On Mon, Sep 14, 2020 at 2:57 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> This is a respin of the adr_l/ldr_l code I wrote some years ago in the
> context of my KASLR proof of concept for 32-bit ARM.
>
> A new use case came up, in the form of Clang, which does not implement
> the 'adrl' pseudo-instruction in its assembler, and so for PC-relative
> references that don't fit into a ARM adr instruction, we need something
> else. Patch #2 addresses an actual Clang build issue of this nature, by
> replacing an occurrence of adrl with adr_l.
>
> I have included my existing cleanup patches that were built on top of the
> adr_l macro, which replace several occurrences of open coded arithmetic to
> calculate runtime addresses based on link time virtual addresses stored
> in literals.
>
> Note that all of these patches with the exception of #2 were reviewed or
> acked by Nico before, but given that this was a while ago (and the fact
> that neither of us work for Linaro anymore), I have dropped these. Note
> that only patch #1 deviates significantly from the last version that I
> sent out, the remaining ones were just freshened up (and their commit
> logs slightly expanded).

Tested-by: Nick Desaulniers <ndesaulniers@google.com>

Thanks for the series, Ard.  I was able to compile and boot the
following with this series (and the fixup to 01/12 applied):

$ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make defconfig
$ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 defconfig
$ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 aspeed_g5_defconfig
$ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 multi_v5_defconfig

(So ARM v7/GCC, ARM v5,6,7/LLVM).  (Technically, the v6 is not
booting, but it's not related to this series and fails to boot without
the series as well.  Our CI on -next is red for an unrelated issue
which is masking the regression).

I was also able to build+boot the defconfig with v5 and v7 with some
configs disabled and a few hacks with LLVM_IAS=1.  This series allowed
me to get further in the build/testing, and I have a few new bugs to
go chase.  If anyone's interested:
https://github.com/ClangBuiltLinux/linux/issues/1154
https://github.com/ClangBuiltLinux/linux/issues/1155
so we're still a handful of bugs away from LLVM_IAS=1 with ARCH=arm,
but we're steadily chipping away at it, and this series is a big help.
Lest it look like there's only kernel fixes in this area, Jian's
https://reviews.llvm.org/D69411 recently was a big help, specifically
for ARCH=arm LLVM_IAS=1.
-- 
Thanks,
~Nick Desaulniers

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
  2020-09-15 19:32   ` Nick Desaulniers
@ 2020-09-15 21:29     ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-15 21:29 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-efi, Linux ARM, Russell King, Linus Walleij, Nicolas Pitre,
	Stefan Agner, Peter Smith, Marc Zyngier, Will Deacon,
	clang-built-linux, Jian Cai

On Tue, 15 Sep 2020 at 22:32, Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Mon, Sep 14, 2020 at 2:57 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > This is a respin of the adr_l/ldr_l code I wrote some years ago in the
> > context of my KASLR proof of concept for 32-bit ARM.
> >
> > A new use case came up, in the form of Clang, which does not implement
> > the 'adrl' pseudo-instruction in its assembler, and so for PC-relative
> > references that don't fit into a ARM adr instruction, we need something
> > else. Patch #2 addresses an actual Clang build issue of this nature, by
> > replacing an occurrence of adrl with adr_l.
> >
> > I have included my existing cleanup patches that were built on top of the
> > adr_l macro, which replace several occurrences of open coded arithmetic to
> > calculate runtime addresses based on link time virtual addresses stored
> > in literals.
> >
> > Note that all of these patches with the exception of #2 were reviewed or
> > acked by Nico before, but given that this was a while ago (and the fact
> > that neither of us work for Linaro anymore), I have dropped these. Note
> > that only patch #1 deviates significantly from the last version that I
> > sent out, the remaining ones were just freshened up (and their commit
> > logs slightly expanded).
>
> Tested-by: Nick Desaulniers <ndesaulniers@google.com>
>
> Thanks for the series, Ard.  I was able to compile and boot the
> following with this series (and the fixup to 01/12 applied):
>
> $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make defconfig
> $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 defconfig
> $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 aspeed_g5_defconfig
> $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 multi_v5_defconfig
>
> (So ARM v7/GCC, ARM v5,6,7/LLVM).  (Technically, the v6 is not
> booting, but it's not related to this series and fails to boot without
> the series as well.  Our CI on -next is red for an unrelated issue
> which is masking the regression).
>

Excellent, thanks for testing. Do you have any coverage for Thumb2
builds as well? (CONFIG_THUMB2_KERNEL=y)

> I was also able to build+boot the defconfig with v5 and v7 with some
> configs disabled and a few hacks with LLVM_IAS=1.  This series allowed
> me to get further in the build/testing, and I have a few new bugs to
> go chase.  If anyone's interested:
> https://github.com/ClangBuiltLinux/linux/issues/1154
> https://github.com/ClangBuiltLinux/linux/issues/1155
> so we're still a handful of bugs away from LLVM_IAS=1 with ARCH=arm,
> but we're steadily chipping away at it, and this series is a big help.
> Lest it look like there's only kernel fixes in this area, Jian's
> https://reviews.llvm.org/D69411 recently was a big help, specifically
> for ARCH=arm LLVM_IAS=1.
> --
> Thanks,
> ~Nick Desaulniers

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
@ 2020-09-15 21:29     ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-15 21:29 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-efi, Nicolas Pitre, Marc Zyngier, Linus Walleij,
	Russell King, Stefan Agner, Peter Smith, clang-built-linux,
	Jian Cai, Will Deacon, Linux ARM

On Tue, 15 Sep 2020 at 22:32, Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Mon, Sep 14, 2020 at 2:57 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > This is a respin of the adr_l/ldr_l code I wrote some years ago in the
> > context of my KASLR proof of concept for 32-bit ARM.
> >
> > A new use case came up, in the form of Clang, which does not implement
> > the 'adrl' pseudo-instruction in its assembler, and so for PC-relative
> > references that don't fit into a ARM adr instruction, we need something
> > else. Patch #2 addresses an actual Clang build issue of this nature, by
> > replacing an occurrence of adrl with adr_l.
> >
> > I have included my existing cleanup patches that were built on top of the
> > adr_l macro, which replace several occurrences of open coded arithmetic to
> > calculate runtime addresses based on link time virtual addresses stored
> > in literals.
> >
> > Note that all of these patches with the exception of #2 were reviewed or
> > acked by Nico before, but given that this was a while ago (and the fact
> > that neither of us work for Linaro anymore), I have dropped these. Note
> > that only patch #1 deviates significantly from the last version that I
> > sent out, the remaining ones were just freshened up (and their commit
> > logs slightly expanded).
>
> Tested-by: Nick Desaulniers <ndesaulniers@google.com>
>
> Thanks for the series, Ard.  I was able to compile and boot the
> following with this series (and the fixup to 01/12 applied):
>
> $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make defconfig
> $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 defconfig
> $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 aspeed_g5_defconfig
> $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 multi_v5_defconfig
>
> (So ARM v7/GCC, ARM v5,6,7/LLVM).  (Technically, the v6 is not
> booting, but it's not related to this series and fails to boot without
> the series as well.  Our CI on -next is red for an unrelated issue
> which is masking the regression).
>

Excellent, thanks for testing. Do you have any coverage for Thumb2
builds as well? (CONFIG_THUMB2_KERNEL=y)

> I was also able to build+boot the defconfig with v5 and v7 with some
> configs disabled and a few hacks with LLVM_IAS=1.  This series allowed
> me to get further in the build/testing, and I have a few new bugs to
> go chase.  If anyone's interested:
> https://github.com/ClangBuiltLinux/linux/issues/1154
> https://github.com/ClangBuiltLinux/linux/issues/1155
> so we're still a handful of bugs away from LLVM_IAS=1 with ARCH=arm,
> but we're steadily chipping away at it, and this series is a big help.
> Lest it look like there's only kernel fixes in this area, Jian's
> https://reviews.llvm.org/D69411 recently was a big help, specifically
> for ARCH=arm LLVM_IAS=1.
> --
> Thanks,
> ~Nick Desaulniers

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
  2020-09-15 21:29     ` Ard Biesheuvel
@ 2020-09-15 23:31       ` Nick Desaulniers
  -1 siblings, 0 replies; 62+ messages in thread
From: Nick Desaulniers @ 2020-09-15 23:31 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Linux ARM, Russell King, Linus Walleij, Nicolas Pitre,
	Stefan Agner, Peter Smith, Marc Zyngier, Will Deacon,
	clang-built-linux, Jian Cai

On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> Excellent, thanks for testing. Do you have any coverage for Thumb2
> builds as well? (CONFIG_THUMB2_KERNEL=y)

Ah, right, manually testing ARCH=arm defconfig with
CONFIG_THUMB2_KERNEL enabled via menuconfig:

Builds and boots for clang.

(Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u
for an unrelated issue).

For GCC, I observe:

arch/arm/vfp/vfphw.o: in function `vfp_support_entry':
(.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against
symbol `vfp_kmode_exception' defined in .text.unlikely section in
arch/arm/vfp/vfpmodule.o

There doesn't seem to be a config option to work around that for now.
Though it's not caused by your series; if I drop your series, I can
still reproduce.

Is there a different config I should be testing rather than
defconfig+manual testing, like one of the existing configs? Looks like
only milbeaut_m10v_defconfig enable THUMB2 by default.  I should add
that to my CI and kernelCI for coverage with clang.
https://github.com/ClangBuiltLinux/continuous-integration/issues/94#issuecomment-693030376

milbeaut_m10v_defconfig
builds with your series for clang.  Looks like that config may be
currently broken for GCC?
Looks like it doesn't boot with Clang, so I'll need to debug that.
Again, both of the two past sentences are regardless of your series.
So your series still LGTM.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
@ 2020-09-15 23:31       ` Nick Desaulniers
  0 siblings, 0 replies; 62+ messages in thread
From: Nick Desaulniers @ 2020-09-15 23:31 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Nicolas Pitre, Marc Zyngier, Linus Walleij,
	Russell King, Stefan Agner, Peter Smith, clang-built-linux,
	Jian Cai, Will Deacon, Linux ARM

On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> Excellent, thanks for testing. Do you have any coverage for Thumb2
> builds as well? (CONFIG_THUMB2_KERNEL=y)

Ah, right, manually testing ARCH=arm defconfig with
CONFIG_THUMB2_KERNEL enabled via menuconfig:

Builds and boots for clang.

(Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u
for an unrelated issue).

For GCC, I observe:

arch/arm/vfp/vfphw.o: in function `vfp_support_entry':
(.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against
symbol `vfp_kmode_exception' defined in .text.unlikely section in
arch/arm/vfp/vfpmodule.o

There doesn't seem to be a config option to work around that for now.
Though it's not caused by your series; if I drop your series, I can
still reproduce.

Is there a different config I should be testing rather than
defconfig+manual testing, like one of the existing configs? Looks like
only milbeaut_m10v_defconfig enable THUMB2 by default.  I should add
that to my CI and kernelCI for coverage with clang.
https://github.com/ClangBuiltLinux/continuous-integration/issues/94#issuecomment-693030376

milbeaut_m10v_defconfig
builds with your series for clang.  Looks like that config may be
currently broken for GCC?
Looks like it doesn't boot with Clang, so I'll need to debug that.
Again, both of the two past sentences are regardless of your series.
So your series still LGTM.
-- 
Thanks,
~Nick Desaulniers

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
  2020-09-15 23:31       ` Nick Desaulniers
@ 2020-09-16  5:54         ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-16  5:54 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-efi, Linux ARM, Russell King, Linus Walleij, Nicolas Pitre,
	Stefan Agner, Peter Smith, Marc Zyngier, Will Deacon,
	clang-built-linux, Jian Cai

On Wed, 16 Sep 2020 at 02:31, Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > Excellent, thanks for testing. Do you have any coverage for Thumb2
> > builds as well? (CONFIG_THUMB2_KERNEL=y)
>
> Ah, right, manually testing ARCH=arm defconfig with
> CONFIG_THUMB2_KERNEL enabled via menuconfig:
>
> Builds and boots for clang.
>
> (Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u
> for an unrelated issue).
>
> For GCC, I observe:
>
> arch/arm/vfp/vfphw.o: in function `vfp_support_entry':
> (.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against
> symbol `vfp_kmode_exception' defined in .text.unlikely section in
> arch/arm/vfp/vfpmodule.o
>

Interesting. And this is using ld.bfd ?

> There doesn't seem to be a config option to work around that for now.
> Though it's not caused by your series; if I drop your series, I can
> still reproduce.
>
> Is there a different config I should be testing rather than
> defconfig+manual testing, like one of the existing configs? Looks like
> only milbeaut_m10v_defconfig enable THUMB2 by default.  I should add
> that to my CI and kernelCI for coverage with clang.
> https://github.com/ClangBuiltLinux/continuous-integration/issues/94#issuecomment-693030376
>
> milbeaut_m10v_defconfig
> builds with your series for clang.  Looks like that config may be
> currently broken for GCC?
> Looks like it doesn't boot with Clang, so I'll need to debug that.
> Again, both of the two past sentences are regardless of your series.
> So your series still LGTM.

Cheers.

I usually build multi_v7_defconfig with/wihout CONFIG_LPAE x
with/without CONFIG_THUMB2_KERNEL (LPAE affects the size of
dma_addr_t, and uses a different page table format, so it is a useful
bit to flick in testing)

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
@ 2020-09-16  5:54         ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-16  5:54 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-efi, Nicolas Pitre, Marc Zyngier, Linus Walleij,
	Russell King, Stefan Agner, Peter Smith, clang-built-linux,
	Jian Cai, Will Deacon, Linux ARM

On Wed, 16 Sep 2020 at 02:31, Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > Excellent, thanks for testing. Do you have any coverage for Thumb2
> > builds as well? (CONFIG_THUMB2_KERNEL=y)
>
> Ah, right, manually testing ARCH=arm defconfig with
> CONFIG_THUMB2_KERNEL enabled via menuconfig:
>
> Builds and boots for clang.
>
> (Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u
> for an unrelated issue).
>
> For GCC, I observe:
>
> arch/arm/vfp/vfphw.o: in function `vfp_support_entry':
> (.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against
> symbol `vfp_kmode_exception' defined in .text.unlikely section in
> arch/arm/vfp/vfpmodule.o
>

Interesting. And this is using ld.bfd ?

> There doesn't seem to be a config option to work around that for now.
> Though it's not caused by your series; if I drop your series, I can
> still reproduce.
>
> Is there a different config I should be testing rather than
> defconfig+manual testing, like one of the existing configs? Looks like
> only milbeaut_m10v_defconfig enable THUMB2 by default.  I should add
> that to my CI and kernelCI for coverage with clang.
> https://github.com/ClangBuiltLinux/continuous-integration/issues/94#issuecomment-693030376
>
> milbeaut_m10v_defconfig
> builds with your series for clang.  Looks like that config may be
> currently broken for GCC?
> Looks like it doesn't boot with Clang, so I'll need to debug that.
> Again, both of the two past sentences are regardless of your series.
> So your series still LGTM.

Cheers.

I usually build multi_v7_defconfig with/wihout CONFIG_LPAE x
with/without CONFIG_THUMB2_KERNEL (LPAE affects the size of
dma_addr_t, and uses a different page table format, so it is a useful
bit to flick in testing)

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
  2020-09-16  5:54         ` Ard Biesheuvel
@ 2020-09-16 19:53           ` Nick Desaulniers
  -1 siblings, 0 replies; 62+ messages in thread
From: Nick Desaulniers @ 2020-09-16 19:53 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Linux ARM, Russell King, Linus Walleij, Nicolas Pitre,
	Stefan Agner, Peter Smith, Marc Zyngier, Will Deacon,
	clang-built-linux, Jian Cai

On Tue, Sep 15, 2020 at 10:55 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Wed, 16 Sep 2020 at 02:31, Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > Excellent, thanks for testing. Do you have any coverage for Thumb2
> > > builds as well? (CONFIG_THUMB2_KERNEL=y)
> >
> > Ah, right, manually testing ARCH=arm defconfig with
> > CONFIG_THUMB2_KERNEL enabled via menuconfig:
> >
> > Builds and boots for clang.
> >
> > (Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u
> > for an unrelated issue).
> >
> > For GCC, I observe:
> >
> > arch/arm/vfp/vfphw.o: in function `vfp_support_entry':
> > (.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against
> > symbol `vfp_kmode_exception' defined in .text.unlikely section in
> > arch/arm/vfp/vfpmodule.o
> >
>
> Interesting. And this is using ld.bfd ?

$ arm-linux-gnueabihf-ld --version
GNU ld (GNU Binutils for Debian) 2.34

.text.unlikely reminds me of the sections created when profiling data
is present.  That's obviously not the case here, so maybe there's
other ways this section can be created by GCC?  I may have added a
patch recently for placing this section explicitly, which was a
prerequisite for Kees' series explicitly placing all sections.
https://lore.kernel.org/lkml/159896087937.20229.4955362311782724603.tip-bot2@tip-bot2/
Maybe that can be improved?

IIRC, LLD is able to emit range extension thunks for these cases, but
BFD does not.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
@ 2020-09-16 19:53           ` Nick Desaulniers
  0 siblings, 0 replies; 62+ messages in thread
From: Nick Desaulniers @ 2020-09-16 19:53 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Nicolas Pitre, Marc Zyngier, Linus Walleij,
	Russell King, Stefan Agner, Peter Smith, clang-built-linux,
	Jian Cai, Will Deacon, Linux ARM

On Tue, Sep 15, 2020 at 10:55 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Wed, 16 Sep 2020 at 02:31, Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > Excellent, thanks for testing. Do you have any coverage for Thumb2
> > > builds as well? (CONFIG_THUMB2_KERNEL=y)
> >
> > Ah, right, manually testing ARCH=arm defconfig with
> > CONFIG_THUMB2_KERNEL enabled via menuconfig:
> >
> > Builds and boots for clang.
> >
> > (Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u
> > for an unrelated issue).
> >
> > For GCC, I observe:
> >
> > arch/arm/vfp/vfphw.o: in function `vfp_support_entry':
> > (.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against
> > symbol `vfp_kmode_exception' defined in .text.unlikely section in
> > arch/arm/vfp/vfpmodule.o
> >
>
> Interesting. And this is using ld.bfd ?

$ arm-linux-gnueabihf-ld --version
GNU ld (GNU Binutils for Debian) 2.34

.text.unlikely reminds me of the sections created when profiling data
is present.  That's obviously not the case here, so maybe there's
other ways this section can be created by GCC?  I may have added a
patch recently for placing this section explicitly, which was a
prerequisite for Kees' series explicitly placing all sections.
https://lore.kernel.org/lkml/159896087937.20229.4955362311782724603.tip-bot2@tip-bot2/
Maybe that can be improved?

IIRC, LLD is able to emit range extension thunks for these cases, but
BFD does not.
-- 
Thanks,
~Nick Desaulniers

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
  2020-09-16 19:53           ` Nick Desaulniers
@ 2020-09-16 20:45             ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-16 20:45 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-efi, Linux ARM, Russell King, Linus Walleij, Nicolas Pitre,
	Stefan Agner, Peter Smith, Marc Zyngier, Will Deacon,
	clang-built-linux, Jian Cai

On Wed, 16 Sep 2020 at 22:53, Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Tue, Sep 15, 2020 at 10:55 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Wed, 16 Sep 2020 at 02:31, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > >
> > > On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > >
> > > > Excellent, thanks for testing. Do you have any coverage for Thumb2
> > > > builds as well? (CONFIG_THUMB2_KERNEL=y)
> > >
> > > Ah, right, manually testing ARCH=arm defconfig with
> > > CONFIG_THUMB2_KERNEL enabled via menuconfig:
> > >
> > > Builds and boots for clang.
> > >
> > > (Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u
> > > for an unrelated issue).
> > >
> > > For GCC, I observe:
> > >
> > > arch/arm/vfp/vfphw.o: in function `vfp_support_entry':
> > > (.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against
> > > symbol `vfp_kmode_exception' defined in .text.unlikely section in
> > > arch/arm/vfp/vfpmodule.o
> > >
> >
> > Interesting. And this is using ld.bfd ?
>
> $ arm-linux-gnueabihf-ld --version
> GNU ld (GNU Binutils for Debian) 2.34
>
> .text.unlikely reminds me of the sections created when profiling data
> is present.  That's obviously not the case here, so maybe there's
> other ways this section can be created by GCC?  I may have added a
> patch recently for placing this section explicitly, which was a
> prerequisite for Kees' series explicitly placing all sections.
> https://lore.kernel.org/lkml/159896087937.20229.4955362311782724603.tip-bot2@tip-bot2/
> Maybe that can be improved?
>
> IIRC, LLD is able to emit range extension thunks for these cases, but
> BFD does not.

ld.bfd will usually emit veneers for branches that turn out to be out
of range in the final link stage.

Does the following help?

diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
index 4fcff9f59947..f1468702fbc9 100644
--- a/arch/arm/vfp/vfphw.S
+++ b/arch/arm/vfp/vfphw.S
@@ -82,6 +82,7 @@ ENTRY(vfp_support_entry)
        ldr     r3, [sp, #S_PSR]        @ Neither lazy restore nor FP exceptions
        and     r3, r3, #MODE_MASK      @ are supported in kernel mode
        teq     r3, #USR_MODE
+THUMB( it      ne                      )
        bne     vfp_kmode_exception     @ Returns through lr

        VFPFMRX r1, FPEXC               @ Is the VFP enabled?

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
@ 2020-09-16 20:45             ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-16 20:45 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-efi, Nicolas Pitre, Marc Zyngier, Linus Walleij,
	Russell King, Stefan Agner, Peter Smith, clang-built-linux,
	Jian Cai, Will Deacon, Linux ARM

On Wed, 16 Sep 2020 at 22:53, Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Tue, Sep 15, 2020 at 10:55 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Wed, 16 Sep 2020 at 02:31, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > >
> > > On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > >
> > > > Excellent, thanks for testing. Do you have any coverage for Thumb2
> > > > builds as well? (CONFIG_THUMB2_KERNEL=y)
> > >
> > > Ah, right, manually testing ARCH=arm defconfig with
> > > CONFIG_THUMB2_KERNEL enabled via menuconfig:
> > >
> > > Builds and boots for clang.
> > >
> > > (Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u
> > > for an unrelated issue).
> > >
> > > For GCC, I observe:
> > >
> > > arch/arm/vfp/vfphw.o: in function `vfp_support_entry':
> > > (.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against
> > > symbol `vfp_kmode_exception' defined in .text.unlikely section in
> > > arch/arm/vfp/vfpmodule.o
> > >
> >
> > Interesting. And this is using ld.bfd ?
>
> $ arm-linux-gnueabihf-ld --version
> GNU ld (GNU Binutils for Debian) 2.34
>
> .text.unlikely reminds me of the sections created when profiling data
> is present.  That's obviously not the case here, so maybe there's
> other ways this section can be created by GCC?  I may have added a
> patch recently for placing this section explicitly, which was a
> prerequisite for Kees' series explicitly placing all sections.
> https://lore.kernel.org/lkml/159896087937.20229.4955362311782724603.tip-bot2@tip-bot2/
> Maybe that can be improved?
>
> IIRC, LLD is able to emit range extension thunks for these cases, but
> BFD does not.

ld.bfd will usually emit veneers for branches that turn out to be out
of range in the final link stage.

Does the following help?

diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
index 4fcff9f59947..f1468702fbc9 100644
--- a/arch/arm/vfp/vfphw.S
+++ b/arch/arm/vfp/vfphw.S
@@ -82,6 +82,7 @@ ENTRY(vfp_support_entry)
        ldr     r3, [sp, #S_PSR]        @ Neither lazy restore nor FP exceptions
        and     r3, r3, #MODE_MASK      @ are supported in kernel mode
        teq     r3, #USR_MODE
+THUMB( it      ne                      )
        bne     vfp_kmode_exception     @ Returns through lr

        VFPFMRX r1, FPEXC               @ Is the VFP enabled?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
  2020-09-16 20:45             ` Ard Biesheuvel
@ 2020-09-16 21:25               ` Nick Desaulniers
  -1 siblings, 0 replies; 62+ messages in thread
From: Nick Desaulniers @ 2020-09-16 21:25 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Linux ARM, Russell King, Linus Walleij, Nicolas Pitre,
	Stefan Agner, Peter Smith, Marc Zyngier, Will Deacon,
	clang-built-linux, Jian Cai

On Wed, Sep 16, 2020 at 1:45 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Wed, 16 Sep 2020 at 22:53, Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > On Tue, Sep 15, 2020 at 10:55 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Wed, 16 Sep 2020 at 02:31, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > > >
> > > > On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > >
> > > > > Excellent, thanks for testing. Do you have any coverage for Thumb2
> > > > > builds as well? (CONFIG_THUMB2_KERNEL=y)
> > > >
> > > > Ah, right, manually testing ARCH=arm defconfig with
> > > > CONFIG_THUMB2_KERNEL enabled via menuconfig:
> > > >
> > > > Builds and boots for clang.
> > > >
> > > > (Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u
> > > > for an unrelated issue).
> > > >
> > > > For GCC, I observe:
> > > >
> > > > arch/arm/vfp/vfphw.o: in function `vfp_support_entry':
> > > > (.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against
> > > > symbol `vfp_kmode_exception' defined in .text.unlikely section in
> > > > arch/arm/vfp/vfpmodule.o
> > > >
> > >
> > > Interesting. And this is using ld.bfd ?
> >
> > $ arm-linux-gnueabihf-ld --version
> > GNU ld (GNU Binutils for Debian) 2.34
> >
> > .text.unlikely reminds me of the sections created when profiling data
> > is present.  That's obviously not the case here, so maybe there's
> > other ways this section can be created by GCC?  I may have added a
> > patch recently for placing this section explicitly, which was a
> > prerequisite for Kees' series explicitly placing all sections.
> > https://lore.kernel.org/lkml/159896087937.20229.4955362311782724603.tip-bot2@tip-bot2/
> > Maybe that can be improved?
> >
> > IIRC, LLD is able to emit range extension thunks for these cases, but
> > BFD does not.
>
> ld.bfd will usually emit veneers for branches that turn out to be out
> of range in the final link stage.
>
> Does the following help?
>
> diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
> index 4fcff9f59947..f1468702fbc9 100644
> --- a/arch/arm/vfp/vfphw.S
> +++ b/arch/arm/vfp/vfphw.S
> @@ -82,6 +82,7 @@ ENTRY(vfp_support_entry)
>         ldr     r3, [sp, #S_PSR]        @ Neither lazy restore nor FP exceptions
>         and     r3, r3, #MODE_MASK      @ are supported in kernel mode
>         teq     r3, #USR_MODE
> +THUMB( it      ne                      )
>         bne     vfp_kmode_exception     @ Returns through lr
>
>         VFPFMRX r1, FPEXC               @ Is the VFP enabled?

Yes, it does!  I'm curious why though?  I've seen the "it prefixes"
before (is that what they're called?), but I don't recall what they're
for.  How come that solves this issue?

(Feel free to use my tested by tag on a subsequent resulting patch
that uses that).  That fix is irrespective of this series, so you
should send it maybe separately?

Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to
boot for me in QEMU; so maybe not just a Clang bug (or maybe, more
than one bug).  (or maybe one of my command line params to QEMU is
wrong).

Stepping through start_kernel(), the call to setup_arch() seems to
hang in qemu.  For both GCC and Clang builds. A breakpoint in panic()
never gets hit.  Looks like the deepest I can get is:

Looks like:
#0  memblock_alloc_try_nid (size=108, align=4, min_addr=0, max_addr=0,
nid=-1) at mm/memblock.c:1601
#1  0xc060f0b4 in memblock_alloc (size=<optimized out>,
align=<optimized out>) at ./include/linux/memblock.h:409
#2  cma_init_reserved_mem (base=1543503872, size=67108864,
order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc
<dma_contiguous_default_area>) at mm/cma.c:190
#3  0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872,
size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0,
fixed=false, name=0xc04b9240 "reserved",
    res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352
#4  0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>,
order_per_bit=<optimized out>, name=<optimized out>,
res_cma=<optimized out>, fixed=<optimized out>,
    limit=<optimized out>, size=<optimized out>, base=<optimized out>)
at ./include/linux/cma.h:28
#5  dma_contiguous_reserve_area (size=<optimized out>, base=<optimized
out>, limit=<optimized out>, res_cma=0xc07ccbdc
<dma_contiguous_default_area>, fixed=false)
    at kernel/dma/contiguous.c:201
#6  0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at
kernel/dma/contiguous.c:171
#7  0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8
<__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230
#8  0xc060302c in setup_arch (cmdline_p=<optimized out>) at
arch/arm/kernel/setup.c:1132
#9  0xc06007d2 in start_kernel () at init/main.c:857

there's a call to memset that seems to hang.  I wonder if memset() was
defined in terms of __builtin_memset, which oft can result in infinite
loops?  (IIRC there's an AEABI related memset; this kind of thing has
hit android userspace before).

(gdb) layout asm

shows that the source call to memset is transformed into a call to mmioset.

(gdb) bt
#0  mmioset () at arch/arm/lib/memset.S:19
#1  0xc060e2dc in memblock_alloc_try_nid (size=108, align=<optimized
out>, min_addr=0, max_addr=0, nid=-1) at mm/memblock.c:1602
#2  0xc060f0b4 in memblock_alloc (size=<optimized out>,
align=<optimized out>) at ./include/linux/memblock.h:409
#3  cma_init_reserved_mem (base=1543503872, size=67108864,
order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc
<dma_contiguous_default_area>) at mm/cma.c:190
#4  0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872,
size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0,
fixed=false, name=0xc04b9240 "reserved",
    res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352
#5  0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>,
order_per_bit=<optimized out>, name=<optimized out>,
res_cma=<optimized out>, fixed=<optimized out>,
    limit=<optimized out>, size=<optimized out>, base=<optimized out>)
at ./include/linux/cma.h:28
#6  dma_contiguous_reserve_area (size=<optimized out>, base=<optimized
out>, limit=<optimized out>, res_cma=0xc07ccbdc
<dma_contiguous_default_area>, fixed=false)
    at kernel/dma/contiguous.c:201
#7  0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at
kernel/dma/contiguous.c:171
#8  0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8
<__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230
#9  0xc060302c in setup_arch (cmdline_p=<optimized out>) at
arch/arm/kernel/setup.c:1132
#10 0xc06007d2 in start_kernel () at init/main.c:857

Using `si` in gdb, it looks like we maybe hit an exception vector?
x   1202                .section .vectors, "ax", %progbits

                                            x
x   1203        .L__vectors_start:

                                            x
x   1204                W(b)    vector_rst

                                            x
x   1205                W(b)    vector_und

                                            x
x   1206                W(ldr)  pc, .L__vectors_start + 0x1000

                                            x
x  >1207                W(b)    vector_pabt

Is the last thing I see, then `si` stops working.  Not sure if `pabt`
is a recognizable acronym to anyone more familiar with ARM?

Happens regardless of your series, FWIW.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
@ 2020-09-16 21:25               ` Nick Desaulniers
  0 siblings, 0 replies; 62+ messages in thread
From: Nick Desaulniers @ 2020-09-16 21:25 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Nicolas Pitre, Marc Zyngier, Linus Walleij,
	Russell King, Stefan Agner, Peter Smith, clang-built-linux,
	Jian Cai, Will Deacon, Linux ARM

On Wed, Sep 16, 2020 at 1:45 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Wed, 16 Sep 2020 at 22:53, Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > On Tue, Sep 15, 2020 at 10:55 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Wed, 16 Sep 2020 at 02:31, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > > >
> > > > On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > >
> > > > > Excellent, thanks for testing. Do you have any coverage for Thumb2
> > > > > builds as well? (CONFIG_THUMB2_KERNEL=y)
> > > >
> > > > Ah, right, manually testing ARCH=arm defconfig with
> > > > CONFIG_THUMB2_KERNEL enabled via menuconfig:
> > > >
> > > > Builds and boots for clang.
> > > >
> > > > (Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u
> > > > for an unrelated issue).
> > > >
> > > > For GCC, I observe:
> > > >
> > > > arch/arm/vfp/vfphw.o: in function `vfp_support_entry':
> > > > (.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against
> > > > symbol `vfp_kmode_exception' defined in .text.unlikely section in
> > > > arch/arm/vfp/vfpmodule.o
> > > >
> > >
> > > Interesting. And this is using ld.bfd ?
> >
> > $ arm-linux-gnueabihf-ld --version
> > GNU ld (GNU Binutils for Debian) 2.34
> >
> > .text.unlikely reminds me of the sections created when profiling data
> > is present.  That's obviously not the case here, so maybe there's
> > other ways this section can be created by GCC?  I may have added a
> > patch recently for placing this section explicitly, which was a
> > prerequisite for Kees' series explicitly placing all sections.
> > https://lore.kernel.org/lkml/159896087937.20229.4955362311782724603.tip-bot2@tip-bot2/
> > Maybe that can be improved?
> >
> > IIRC, LLD is able to emit range extension thunks for these cases, but
> > BFD does not.
>
> ld.bfd will usually emit veneers for branches that turn out to be out
> of range in the final link stage.
>
> Does the following help?
>
> diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
> index 4fcff9f59947..f1468702fbc9 100644
> --- a/arch/arm/vfp/vfphw.S
> +++ b/arch/arm/vfp/vfphw.S
> @@ -82,6 +82,7 @@ ENTRY(vfp_support_entry)
>         ldr     r3, [sp, #S_PSR]        @ Neither lazy restore nor FP exceptions
>         and     r3, r3, #MODE_MASK      @ are supported in kernel mode
>         teq     r3, #USR_MODE
> +THUMB( it      ne                      )
>         bne     vfp_kmode_exception     @ Returns through lr
>
>         VFPFMRX r1, FPEXC               @ Is the VFP enabled?

Yes, it does!  I'm curious why though?  I've seen the "it prefixes"
before (is that what they're called?), but I don't recall what they're
for.  How come that solves this issue?

(Feel free to use my tested by tag on a subsequent resulting patch
that uses that).  That fix is irrespective of this series, so you
should send it maybe separately?

Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to
boot for me in QEMU; so maybe not just a Clang bug (or maybe, more
than one bug).  (or maybe one of my command line params to QEMU is
wrong).

Stepping through start_kernel(), the call to setup_arch() seems to
hang in qemu.  For both GCC and Clang builds. A breakpoint in panic()
never gets hit.  Looks like the deepest I can get is:

Looks like:
#0  memblock_alloc_try_nid (size=108, align=4, min_addr=0, max_addr=0,
nid=-1) at mm/memblock.c:1601
#1  0xc060f0b4 in memblock_alloc (size=<optimized out>,
align=<optimized out>) at ./include/linux/memblock.h:409
#2  cma_init_reserved_mem (base=1543503872, size=67108864,
order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc
<dma_contiguous_default_area>) at mm/cma.c:190
#3  0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872,
size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0,
fixed=false, name=0xc04b9240 "reserved",
    res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352
#4  0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>,
order_per_bit=<optimized out>, name=<optimized out>,
res_cma=<optimized out>, fixed=<optimized out>,
    limit=<optimized out>, size=<optimized out>, base=<optimized out>)
at ./include/linux/cma.h:28
#5  dma_contiguous_reserve_area (size=<optimized out>, base=<optimized
out>, limit=<optimized out>, res_cma=0xc07ccbdc
<dma_contiguous_default_area>, fixed=false)
    at kernel/dma/contiguous.c:201
#6  0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at
kernel/dma/contiguous.c:171
#7  0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8
<__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230
#8  0xc060302c in setup_arch (cmdline_p=<optimized out>) at
arch/arm/kernel/setup.c:1132
#9  0xc06007d2 in start_kernel () at init/main.c:857

there's a call to memset that seems to hang.  I wonder if memset() was
defined in terms of __builtin_memset, which oft can result in infinite
loops?  (IIRC there's an AEABI related memset; this kind of thing has
hit android userspace before).

(gdb) layout asm

shows that the source call to memset is transformed into a call to mmioset.

(gdb) bt
#0  mmioset () at arch/arm/lib/memset.S:19
#1  0xc060e2dc in memblock_alloc_try_nid (size=108, align=<optimized
out>, min_addr=0, max_addr=0, nid=-1) at mm/memblock.c:1602
#2  0xc060f0b4 in memblock_alloc (size=<optimized out>,
align=<optimized out>) at ./include/linux/memblock.h:409
#3  cma_init_reserved_mem (base=1543503872, size=67108864,
order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc
<dma_contiguous_default_area>) at mm/cma.c:190
#4  0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872,
size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0,
fixed=false, name=0xc04b9240 "reserved",
    res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352
#5  0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>,
order_per_bit=<optimized out>, name=<optimized out>,
res_cma=<optimized out>, fixed=<optimized out>,
    limit=<optimized out>, size=<optimized out>, base=<optimized out>)
at ./include/linux/cma.h:28
#6  dma_contiguous_reserve_area (size=<optimized out>, base=<optimized
out>, limit=<optimized out>, res_cma=0xc07ccbdc
<dma_contiguous_default_area>, fixed=false)
    at kernel/dma/contiguous.c:201
#7  0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at
kernel/dma/contiguous.c:171
#8  0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8
<__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230
#9  0xc060302c in setup_arch (cmdline_p=<optimized out>) at
arch/arm/kernel/setup.c:1132
#10 0xc06007d2 in start_kernel () at init/main.c:857

Using `si` in gdb, it looks like we maybe hit an exception vector?
x   1202                .section .vectors, "ax", %progbits

                                            x
x   1203        .L__vectors_start:

                                            x
x   1204                W(b)    vector_rst

                                            x
x   1205                W(b)    vector_und

                                            x
x   1206                W(ldr)  pc, .L__vectors_start + 0x1000

                                            x
x  >1207                W(b)    vector_pabt

Is the last thing I see, then `si` stops working.  Not sure if `pabt`
is a recognizable acronym to anyone more familiar with ARM?

Happens regardless of your series, FWIW.
-- 
Thanks,
~Nick Desaulniers

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
  2020-09-16 21:25               ` Nick Desaulniers
@ 2020-09-17  0:16                 ` Nick Desaulniers
  -1 siblings, 0 replies; 62+ messages in thread
From: Nick Desaulniers @ 2020-09-17  0:16 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Linux ARM, Russell King, Linus Walleij, Nicolas Pitre,
	Stefan Agner, Peter Smith, Marc Zyngier, Will Deacon,
	clang-built-linux, Jian Cai, song.bao.hua, Anders Roxell, rppt,
	Naresh Kamboju

On Wed, Sep 16, 2020 at 2:25 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to
> boot for me in QEMU; so maybe not just a Clang bug (or maybe, more
> than one bug).  (or maybe one of my command line params to QEMU is
> wrong).
>
> Stepping through start_kernel(), the call to setup_arch() seems to
> hang in qemu.  For both GCC and Clang builds. A breakpoint in panic()
> never gets hit.  Looks like the deepest I can get is:
>
> Looks like:
> #0  memblock_alloc_try_nid (size=108, align=4, min_addr=0, max_addr=0,
> nid=-1) at mm/memblock.c:1601
> #1  0xc060f0b4 in memblock_alloc (size=<optimized out>,
> align=<optimized out>) at ./include/linux/memblock.h:409
> #2  cma_init_reserved_mem (base=1543503872, size=67108864,
> order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc
> <dma_contiguous_default_area>) at mm/cma.c:190
> #3  0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872,
> size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0,
> fixed=false, name=0xc04b9240 "reserved",
>     res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352
> #4  0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>,
> order_per_bit=<optimized out>, name=<optimized out>,
> res_cma=<optimized out>, fixed=<optimized out>,
>     limit=<optimized out>, size=<optimized out>, base=<optimized out>)
> at ./include/linux/cma.h:28
> #5  dma_contiguous_reserve_area (size=<optimized out>, base=<optimized
> out>, limit=<optimized out>, res_cma=0xc07ccbdc
> <dma_contiguous_default_area>, fixed=false)
>     at kernel/dma/contiguous.c:201
> #6  0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at
> kernel/dma/contiguous.c:171
> #7  0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8
> <__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230
> #8  0xc060302c in setup_arch (cmdline_p=<optimized out>) at
> arch/arm/kernel/setup.c:1132
> #9  0xc06007d2 in start_kernel () at init/main.c:857
>
> there's a call to memset that seems to hang.  I wonder if memset() was
> defined in terms of __builtin_memset, which oft can result in infinite
> loops?  (IIRC there's an AEABI related memset; this kind of thing has
> hit android userspace before).
>
> (gdb) layout asm
>
> shows that the source call to memset is transformed into a call to mmioset.
>
> (gdb) bt
> #0  mmioset () at arch/arm/lib/memset.S:19
> #1  0xc060e2dc in memblock_alloc_try_nid (size=108, align=<optimized
> out>, min_addr=0, max_addr=0, nid=-1) at mm/memblock.c:1602
> #2  0xc060f0b4 in memblock_alloc (size=<optimized out>,
> align=<optimized out>) at ./include/linux/memblock.h:409
> #3  cma_init_reserved_mem (base=1543503872, size=67108864,
> order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc
> <dma_contiguous_default_area>) at mm/cma.c:190
> #4  0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872,
> size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0,
> fixed=false, name=0xc04b9240 "reserved",
>     res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352
> #5  0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>,
> order_per_bit=<optimized out>, name=<optimized out>,
> res_cma=<optimized out>, fixed=<optimized out>,
>     limit=<optimized out>, size=<optimized out>, base=<optimized out>)
> at ./include/linux/cma.h:28
> #6  dma_contiguous_reserve_area (size=<optimized out>, base=<optimized
> out>, limit=<optimized out>, res_cma=0xc07ccbdc
> <dma_contiguous_default_area>, fixed=false)
>     at kernel/dma/contiguous.c:201
> #7  0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at
> kernel/dma/contiguous.c:171
> #8  0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8
> <__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230
> #9  0xc060302c in setup_arch (cmdline_p=<optimized out>) at
> arch/arm/kernel/setup.c:1132
> #10 0xc06007d2 in start_kernel () at init/main.c:857
>
> Using `si` in gdb, it looks like we maybe hit an exception vector?
> x   1202                .section .vectors, "ax", %progbits
>
>                                             x
> x   1203        .L__vectors_start:
>
>                                             x
> x   1204                W(b)    vector_rst
>
>                                             x
> x   1205                W(b)    vector_und
>
>                                             x
> x   1206                W(ldr)  pc, .L__vectors_start + 0x1000
>
>                                             x
> x  >1207                W(b)    vector_pabt
>
> Is the last thing I see, then `si` stops working.  Not sure if `pabt`
> is a recognizable acronym to anyone more familiar with ARM?
>
> Happens regardless of your series, FWIW.

It seems this is affecting the ARCH=arm defconfig (regardless of
toolchain) of linux-next today.  I know you've warned me about testing
on -next before...

Maybe this is: https://lore.kernel.org/linux-next/20200916140437.GL2142832@kernel.org/
? That looks arm64 specific though.  Maybe 32b ARM needs the same or a
similar fix?  Oh man, this boots, total shot in the dark:

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 45f9d5ec2360..7118b98c1f5f 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -226,9 +226,6 @@ void __init arm_memblock_init(const struct
machine_desc *mdesc)
        early_init_fdt_reserve_self();
        early_init_fdt_scan_reserved_mem();

-       /* reserve memory for DMA contiguous allocations */
-       dma_contiguous_reserve(arm_dma_limit);
-
        arm_memblock_steal_permitted = false;
        memblock_dump_all();
 }
@@ -248,6 +245,9 @@ void __init bootmem_init(void)
         */
        sparse_init();

+       /* reserve memory for DMA contiguous allocations */
+       dma_contiguous_reserve(arm_dma_limit);
+
        /*
         * Now free the memory - free_area_init needs
         * the sparse mem_map arrays initialized by sparse_init()
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
@ 2020-09-17  0:16                 ` Nick Desaulniers
  0 siblings, 0 replies; 62+ messages in thread
From: Nick Desaulniers @ 2020-09-17  0:16 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: song.bao.hua, linux-efi, Anders Roxell, Nicolas Pitre,
	Marc Zyngier, Linus Walleij, Russell King, Stefan Agner,
	Peter Smith, clang-built-linux, rppt, Jian Cai, Will Deacon,
	Naresh Kamboju, Linux ARM

On Wed, Sep 16, 2020 at 2:25 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to
> boot for me in QEMU; so maybe not just a Clang bug (or maybe, more
> than one bug).  (or maybe one of my command line params to QEMU is
> wrong).
>
> Stepping through start_kernel(), the call to setup_arch() seems to
> hang in qemu.  For both GCC and Clang builds. A breakpoint in panic()
> never gets hit.  Looks like the deepest I can get is:
>
> Looks like:
> #0  memblock_alloc_try_nid (size=108, align=4, min_addr=0, max_addr=0,
> nid=-1) at mm/memblock.c:1601
> #1  0xc060f0b4 in memblock_alloc (size=<optimized out>,
> align=<optimized out>) at ./include/linux/memblock.h:409
> #2  cma_init_reserved_mem (base=1543503872, size=67108864,
> order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc
> <dma_contiguous_default_area>) at mm/cma.c:190
> #3  0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872,
> size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0,
> fixed=false, name=0xc04b9240 "reserved",
>     res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352
> #4  0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>,
> order_per_bit=<optimized out>, name=<optimized out>,
> res_cma=<optimized out>, fixed=<optimized out>,
>     limit=<optimized out>, size=<optimized out>, base=<optimized out>)
> at ./include/linux/cma.h:28
> #5  dma_contiguous_reserve_area (size=<optimized out>, base=<optimized
> out>, limit=<optimized out>, res_cma=0xc07ccbdc
> <dma_contiguous_default_area>, fixed=false)
>     at kernel/dma/contiguous.c:201
> #6  0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at
> kernel/dma/contiguous.c:171
> #7  0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8
> <__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230
> #8  0xc060302c in setup_arch (cmdline_p=<optimized out>) at
> arch/arm/kernel/setup.c:1132
> #9  0xc06007d2 in start_kernel () at init/main.c:857
>
> there's a call to memset that seems to hang.  I wonder if memset() was
> defined in terms of __builtin_memset, which oft can result in infinite
> loops?  (IIRC there's an AEABI related memset; this kind of thing has
> hit android userspace before).
>
> (gdb) layout asm
>
> shows that the source call to memset is transformed into a call to mmioset.
>
> (gdb) bt
> #0  mmioset () at arch/arm/lib/memset.S:19
> #1  0xc060e2dc in memblock_alloc_try_nid (size=108, align=<optimized
> out>, min_addr=0, max_addr=0, nid=-1) at mm/memblock.c:1602
> #2  0xc060f0b4 in memblock_alloc (size=<optimized out>,
> align=<optimized out>) at ./include/linux/memblock.h:409
> #3  cma_init_reserved_mem (base=1543503872, size=67108864,
> order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc
> <dma_contiguous_default_area>) at mm/cma.c:190
> #4  0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872,
> size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0,
> fixed=false, name=0xc04b9240 "reserved",
>     res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352
> #5  0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>,
> order_per_bit=<optimized out>, name=<optimized out>,
> res_cma=<optimized out>, fixed=<optimized out>,
>     limit=<optimized out>, size=<optimized out>, base=<optimized out>)
> at ./include/linux/cma.h:28
> #6  dma_contiguous_reserve_area (size=<optimized out>, base=<optimized
> out>, limit=<optimized out>, res_cma=0xc07ccbdc
> <dma_contiguous_default_area>, fixed=false)
>     at kernel/dma/contiguous.c:201
> #7  0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at
> kernel/dma/contiguous.c:171
> #8  0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8
> <__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230
> #9  0xc060302c in setup_arch (cmdline_p=<optimized out>) at
> arch/arm/kernel/setup.c:1132
> #10 0xc06007d2 in start_kernel () at init/main.c:857
>
> Using `si` in gdb, it looks like we maybe hit an exception vector?
> x   1202                .section .vectors, "ax", %progbits
>
>                                             x
> x   1203        .L__vectors_start:
>
>                                             x
> x   1204                W(b)    vector_rst
>
>                                             x
> x   1205                W(b)    vector_und
>
>                                             x
> x   1206                W(ldr)  pc, .L__vectors_start + 0x1000
>
>                                             x
> x  >1207                W(b)    vector_pabt
>
> Is the last thing I see, then `si` stops working.  Not sure if `pabt`
> is a recognizable acronym to anyone more familiar with ARM?
>
> Happens regardless of your series, FWIW.

It seems this is affecting the ARCH=arm defconfig (regardless of
toolchain) of linux-next today.  I know you've warned me about testing
on -next before...

Maybe this is: https://lore.kernel.org/linux-next/20200916140437.GL2142832@kernel.org/
? That looks arm64 specific though.  Maybe 32b ARM needs the same or a
similar fix?  Oh man, this boots, total shot in the dark:

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 45f9d5ec2360..7118b98c1f5f 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -226,9 +226,6 @@ void __init arm_memblock_init(const struct
machine_desc *mdesc)
        early_init_fdt_reserve_self();
        early_init_fdt_scan_reserved_mem();

-       /* reserve memory for DMA contiguous allocations */
-       dma_contiguous_reserve(arm_dma_limit);
-
        arm_memblock_steal_permitted = false;
        memblock_dump_all();
 }
@@ -248,6 +245,9 @@ void __init bootmem_init(void)
         */
        sparse_init();

+       /* reserve memory for DMA contiguous allocations */
+       dma_contiguous_reserve(arm_dma_limit);
+
        /*
         * Now free the memory - free_area_init needs
         * the sparse mem_map arrays initialized by sparse_init()
-- 
Thanks,
~Nick Desaulniers

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
  2020-09-17  0:16                 ` Nick Desaulniers
@ 2020-09-17  5:19                   ` Mike Rapoport
  -1 siblings, 0 replies; 62+ messages in thread
From: Mike Rapoport @ 2020-09-17  5:19 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Ard Biesheuvel, linux-efi, Linux ARM, Russell King,
	Linus Walleij, Nicolas Pitre, Stefan Agner, Peter Smith,
	Marc Zyngier, Will Deacon, clang-built-linux, Jian Cai,
	song.bao.hua, Anders Roxell, Naresh Kamboju, Mike Kravetz

(added Mike K.)

On Wed, Sep 16, 2020 at 05:16:32PM -0700, Nick Desaulniers wrote:
> On Wed, Sep 16, 2020 at 2:25 PM Nick Desaulniers
> <ndesaulniers@google.com> wrote:

...
 
> Maybe this is: https://lore.kernel.org/linux-next/20200916140437.GL2142832@kernel.org/
> ? That looks arm64 specific though.  Maybe 32b ARM needs the same or a
> similar fix?  Oh man, this boots, total shot in the dark:

The CMA change is the problem IMO and it's now removed from -mm and
-next trees. 

> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index 45f9d5ec2360..7118b98c1f5f 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -226,9 +226,6 @@ void __init arm_memblock_init(const struct
> machine_desc *mdesc)
>         early_init_fdt_reserve_self();
>         early_init_fdt_scan_reserved_mem();
> 
> -       /* reserve memory for DMA contiguous allocations */
> -       dma_contiguous_reserve(arm_dma_limit);
> -
>         arm_memblock_steal_permitted = false;
>         memblock_dump_all();
>  }
> @@ -248,6 +245,9 @@ void __init bootmem_init(void)
>          */
>         sparse_init();
> 
> +       /* reserve memory for DMA contiguous allocations */
> +       dma_contiguous_reserve(arm_dma_limit);
> +

This might be too late for ARM because in paging_init() it calls
dma_contiguous_remap() which presumes that the CMA area is already
reserved.

dma_contiguous_remap() might be NOP, so your fix will boot until it
fails eventually :) 

>         /*
>          * Now free the memory - free_area_init needs
>          * the sparse mem_map arrays initialized by sparse_init()

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
@ 2020-09-17  5:19                   ` Mike Rapoport
  0 siblings, 0 replies; 62+ messages in thread
From: Mike Rapoport @ 2020-09-17  5:19 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: song.bao.hua, linux-efi, Anders Roxell, Nicolas Pitre,
	Marc Zyngier, Linus Walleij, Russell King, Stefan Agner,
	Peter Smith, clang-built-linux, Naresh Kamboju, Jian Cai,
	Will Deacon, Ard Biesheuvel, Linux ARM, Mike Kravetz

(added Mike K.)

On Wed, Sep 16, 2020 at 05:16:32PM -0700, Nick Desaulniers wrote:
> On Wed, Sep 16, 2020 at 2:25 PM Nick Desaulniers
> <ndesaulniers@google.com> wrote:

...
 
> Maybe this is: https://lore.kernel.org/linux-next/20200916140437.GL2142832@kernel.org/
> ? That looks arm64 specific though.  Maybe 32b ARM needs the same or a
> similar fix?  Oh man, this boots, total shot in the dark:

The CMA change is the problem IMO and it's now removed from -mm and
-next trees. 

> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index 45f9d5ec2360..7118b98c1f5f 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -226,9 +226,6 @@ void __init arm_memblock_init(const struct
> machine_desc *mdesc)
>         early_init_fdt_reserve_self();
>         early_init_fdt_scan_reserved_mem();
> 
> -       /* reserve memory for DMA contiguous allocations */
> -       dma_contiguous_reserve(arm_dma_limit);
> -
>         arm_memblock_steal_permitted = false;
>         memblock_dump_all();
>  }
> @@ -248,6 +245,9 @@ void __init bootmem_init(void)
>          */
>         sparse_init();
> 
> +       /* reserve memory for DMA contiguous allocations */
> +       dma_contiguous_reserve(arm_dma_limit);
> +

This might be too late for ARM because in paging_init() it calls
dma_contiguous_remap() which presumes that the CMA area is already
reserved.

dma_contiguous_remap() might be NOP, so your fix will boot until it
fails eventually :) 

>         /*
>          * Now free the memory - free_area_init needs
>          * the sparse mem_map arrays initialized by sparse_init()

-- 
Sincerely yours,
Mike.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
  2020-09-16 21:25               ` Nick Desaulniers
@ 2020-09-17  6:01                 ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-17  6:01 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-efi, Linux ARM, Russell King, Linus Walleij, Nicolas Pitre,
	Stefan Agner, Peter Smith, Marc Zyngier, Will Deacon,
	clang-built-linux, Jian Cai

On Thu, 17 Sep 2020 at 00:25, Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Wed, Sep 16, 2020 at 1:45 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Wed, 16 Sep 2020 at 22:53, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > >
> > > On Tue, Sep 15, 2020 at 10:55 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > >
> > > > On Wed, 16 Sep 2020 at 02:31, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > > > >
> > > > > On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > >
> > > > > > Excellent, thanks for testing. Do you have any coverage for Thumb2
> > > > > > builds as well? (CONFIG_THUMB2_KERNEL=y)
> > > > >
> > > > > Ah, right, manually testing ARCH=arm defconfig with
> > > > > CONFIG_THUMB2_KERNEL enabled via menuconfig:
> > > > >
> > > > > Builds and boots for clang.
> > > > >
> > > > > (Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u
> > > > > for an unrelated issue).
> > > > >
> > > > > For GCC, I observe:
> > > > >
> > > > > arch/arm/vfp/vfphw.o: in function `vfp_support_entry':
> > > > > (.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against
> > > > > symbol `vfp_kmode_exception' defined in .text.unlikely section in
> > > > > arch/arm/vfp/vfpmodule.o
> > > > >
> > > >
> > > > Interesting. And this is using ld.bfd ?
> > >
> > > $ arm-linux-gnueabihf-ld --version
> > > GNU ld (GNU Binutils for Debian) 2.34
> > >
> > > .text.unlikely reminds me of the sections created when profiling data
> > > is present.  That's obviously not the case here, so maybe there's
> > > other ways this section can be created by GCC?  I may have added a
> > > patch recently for placing this section explicitly, which was a
> > > prerequisite for Kees' series explicitly placing all sections.
> > > https://lore.kernel.org/lkml/159896087937.20229.4955362311782724603.tip-bot2@tip-bot2/
> > > Maybe that can be improved?
> > >
> > > IIRC, LLD is able to emit range extension thunks for these cases, but
> > > BFD does not.
> >
> > ld.bfd will usually emit veneers for branches that turn out to be out
> > of range in the final link stage.
> >
> > Does the following help?
> >
> > diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
> > index 4fcff9f59947..f1468702fbc9 100644
> > --- a/arch/arm/vfp/vfphw.S
> > +++ b/arch/arm/vfp/vfphw.S
> > @@ -82,6 +82,7 @@ ENTRY(vfp_support_entry)
> >         ldr     r3, [sp, #S_PSR]        @ Neither lazy restore nor FP exceptions
> >         and     r3, r3, #MODE_MASK      @ are supported in kernel mode
> >         teq     r3, #USR_MODE
> > +THUMB( it      ne                      )
> >         bne     vfp_kmode_exception     @ Returns through lr
> >
> >         VFPFMRX r1, FPEXC               @ Is the VFP enabled?
>
> Yes, it does!  I'm curious why though?  I've seen the "it prefixes"
> before (is that what they're called?), but I don't recall what they're
> for.  How come that solves this issue?
>

It forces the use of an instruction encoding that has more space for
an immediate.

> (Feel free to use my tested by tag on a subsequent resulting patch
> that uses that).  That fix is irrespective of this series, so you
> should send it maybe separately?
>

I will, thanks.

> Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to
> boot for me in QEMU; so maybe not just a Clang bug (or maybe, more
> than one bug).  (or maybe one of my command line params to QEMU is
> wrong).
>

I understand that this is actually an existing issue in -next, but in
general, why would you expect to be able to boot
milbeaut_m10v_defconfig on anything other than a Milbeaut MV10
machine? (whatever it is) Or does QEMU emulate a milbeaut machine? If
not, better to stick with configs that are intended to boot on the
QEMU machine emulation that you are using.

Also, while I see the point of regression testing of -next, using it
as a base to test arbitrary series and then report failures against it
produces a lot of noise. -next is *not* a good base for development,
because you get everybody else's half baked crap as well. When you
test my stuff, please use a known good base so we're not off on a
goose chase every time.



> Stepping through start_kernel(), the call to setup_arch() seems to
> hang in qemu.  For both GCC and Clang builds. A breakpoint in panic()
> never gets hit.  Looks like the deepest I can get is:
>
> Looks like:
> #0  memblock_alloc_try_nid (size=108, align=4, min_addr=0, max_addr=0,
> nid=-1) at mm/memblock.c:1601
> #1  0xc060f0b4 in memblock_alloc (size=<optimized out>,
> align=<optimized out>) at ./include/linux/memblock.h:409
> #2  cma_init_reserved_mem (base=1543503872, size=67108864,
> order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc
> <dma_contiguous_default_area>) at mm/cma.c:190
> #3  0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872,
> size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0,
> fixed=false, name=0xc04b9240 "reserved",
>     res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352
> #4  0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>,
> order_per_bit=<optimized out>, name=<optimized out>,
> res_cma=<optimized out>, fixed=<optimized out>,
>     limit=<optimized out>, size=<optimized out>, base=<optimized out>)
> at ./include/linux/cma.h:28
> #5  dma_contiguous_reserve_area (size=<optimized out>, base=<optimized
> out>, limit=<optimized out>, res_cma=0xc07ccbdc
> <dma_contiguous_default_area>, fixed=false)
>     at kernel/dma/contiguous.c:201
> #6  0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at
> kernel/dma/contiguous.c:171
> #7  0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8
> <__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230
> #8  0xc060302c in setup_arch (cmdline_p=<optimized out>) at
> arch/arm/kernel/setup.c:1132
> #9  0xc06007d2 in start_kernel () at init/main.c:857
>
> there's a call to memset that seems to hang.  I wonder if memset() was
> defined in terms of __builtin_memset, which oft can result in infinite
> loops?  (IIRC there's an AEABI related memset; this kind of thing has
> hit android userspace before).
>
> (gdb) layout asm
>
> shows that the source call to memset is transformed into a call to mmioset.
>
> (gdb) bt
> #0  mmioset () at arch/arm/lib/memset.S:19
> #1  0xc060e2dc in memblock_alloc_try_nid (size=108, align=<optimized
> out>, min_addr=0, max_addr=0, nid=-1) at mm/memblock.c:1602
> #2  0xc060f0b4 in memblock_alloc (size=<optimized out>,
> align=<optimized out>) at ./include/linux/memblock.h:409
> #3  cma_init_reserved_mem (base=1543503872, size=67108864,
> order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc
> <dma_contiguous_default_area>) at mm/cma.c:190
> #4  0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872,
> size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0,
> fixed=false, name=0xc04b9240 "reserved",
>     res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352
> #5  0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>,
> order_per_bit=<optimized out>, name=<optimized out>,
> res_cma=<optimized out>, fixed=<optimized out>,
>     limit=<optimized out>, size=<optimized out>, base=<optimized out>)
> at ./include/linux/cma.h:28
> #6  dma_contiguous_reserve_area (size=<optimized out>, base=<optimized
> out>, limit=<optimized out>, res_cma=0xc07ccbdc
> <dma_contiguous_default_area>, fixed=false)
>     at kernel/dma/contiguous.c:201
> #7  0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at
> kernel/dma/contiguous.c:171
> #8  0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8
> <__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230
> #9  0xc060302c in setup_arch (cmdline_p=<optimized out>) at
> arch/arm/kernel/setup.c:1132
> #10 0xc06007d2 in start_kernel () at init/main.c:857
>
> Using `si` in gdb, it looks like we maybe hit an exception vector?
> x   1202                .section .vectors, "ax", %progbits
>
>                                             x
> x   1203        .L__vectors_start:
>
>                                             x
> x   1204                W(b)    vector_rst
>
>                                             x
> x   1205                W(b)    vector_und
>
>                                             x
> x   1206                W(ldr)  pc, .L__vectors_start + 0x1000
>
>                                             x
> x  >1207                W(b)    vector_pabt
>
> Is the last thing I see, then `si` stops working.  Not sure if `pabt`
> is a recognizable acronym to anyone more familiar with ARM?
>
> Happens regardless of your series, FWIW.
> --
> Thanks,
> ~Nick Desaulniers

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
@ 2020-09-17  6:01                 ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-17  6:01 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-efi, Nicolas Pitre, Marc Zyngier, Linus Walleij,
	Russell King, Stefan Agner, Peter Smith, clang-built-linux,
	Jian Cai, Will Deacon, Linux ARM

On Thu, 17 Sep 2020 at 00:25, Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Wed, Sep 16, 2020 at 1:45 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Wed, 16 Sep 2020 at 22:53, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > >
> > > On Tue, Sep 15, 2020 at 10:55 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > >
> > > > On Wed, 16 Sep 2020 at 02:31, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > > > >
> > > > > On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > >
> > > > > > Excellent, thanks for testing. Do you have any coverage for Thumb2
> > > > > > builds as well? (CONFIG_THUMB2_KERNEL=y)
> > > > >
> > > > > Ah, right, manually testing ARCH=arm defconfig with
> > > > > CONFIG_THUMB2_KERNEL enabled via menuconfig:
> > > > >
> > > > > Builds and boots for clang.
> > > > >
> > > > > (Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u
> > > > > for an unrelated issue).
> > > > >
> > > > > For GCC, I observe:
> > > > >
> > > > > arch/arm/vfp/vfphw.o: in function `vfp_support_entry':
> > > > > (.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against
> > > > > symbol `vfp_kmode_exception' defined in .text.unlikely section in
> > > > > arch/arm/vfp/vfpmodule.o
> > > > >
> > > >
> > > > Interesting. And this is using ld.bfd ?
> > >
> > > $ arm-linux-gnueabihf-ld --version
> > > GNU ld (GNU Binutils for Debian) 2.34
> > >
> > > .text.unlikely reminds me of the sections created when profiling data
> > > is present.  That's obviously not the case here, so maybe there's
> > > other ways this section can be created by GCC?  I may have added a
> > > patch recently for placing this section explicitly, which was a
> > > prerequisite for Kees' series explicitly placing all sections.
> > > https://lore.kernel.org/lkml/159896087937.20229.4955362311782724603.tip-bot2@tip-bot2/
> > > Maybe that can be improved?
> > >
> > > IIRC, LLD is able to emit range extension thunks for these cases, but
> > > BFD does not.
> >
> > ld.bfd will usually emit veneers for branches that turn out to be out
> > of range in the final link stage.
> >
> > Does the following help?
> >
> > diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
> > index 4fcff9f59947..f1468702fbc9 100644
> > --- a/arch/arm/vfp/vfphw.S
> > +++ b/arch/arm/vfp/vfphw.S
> > @@ -82,6 +82,7 @@ ENTRY(vfp_support_entry)
> >         ldr     r3, [sp, #S_PSR]        @ Neither lazy restore nor FP exceptions
> >         and     r3, r3, #MODE_MASK      @ are supported in kernel mode
> >         teq     r3, #USR_MODE
> > +THUMB( it      ne                      )
> >         bne     vfp_kmode_exception     @ Returns through lr
> >
> >         VFPFMRX r1, FPEXC               @ Is the VFP enabled?
>
> Yes, it does!  I'm curious why though?  I've seen the "it prefixes"
> before (is that what they're called?), but I don't recall what they're
> for.  How come that solves this issue?
>

It forces the use of an instruction encoding that has more space for
an immediate.

> (Feel free to use my tested by tag on a subsequent resulting patch
> that uses that).  That fix is irrespective of this series, so you
> should send it maybe separately?
>

I will, thanks.

> Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to
> boot for me in QEMU; so maybe not just a Clang bug (or maybe, more
> than one bug).  (or maybe one of my command line params to QEMU is
> wrong).
>

I understand that this is actually an existing issue in -next, but in
general, why would you expect to be able to boot
milbeaut_m10v_defconfig on anything other than a Milbeaut MV10
machine? (whatever it is) Or does QEMU emulate a milbeaut machine? If
not, better to stick with configs that are intended to boot on the
QEMU machine emulation that you are using.

Also, while I see the point of regression testing of -next, using it
as a base to test arbitrary series and then report failures against it
produces a lot of noise. -next is *not* a good base for development,
because you get everybody else's half baked crap as well. When you
test my stuff, please use a known good base so we're not off on a
goose chase every time.



> Stepping through start_kernel(), the call to setup_arch() seems to
> hang in qemu.  For both GCC and Clang builds. A breakpoint in panic()
> never gets hit.  Looks like the deepest I can get is:
>
> Looks like:
> #0  memblock_alloc_try_nid (size=108, align=4, min_addr=0, max_addr=0,
> nid=-1) at mm/memblock.c:1601
> #1  0xc060f0b4 in memblock_alloc (size=<optimized out>,
> align=<optimized out>) at ./include/linux/memblock.h:409
> #2  cma_init_reserved_mem (base=1543503872, size=67108864,
> order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc
> <dma_contiguous_default_area>) at mm/cma.c:190
> #3  0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872,
> size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0,
> fixed=false, name=0xc04b9240 "reserved",
>     res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352
> #4  0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>,
> order_per_bit=<optimized out>, name=<optimized out>,
> res_cma=<optimized out>, fixed=<optimized out>,
>     limit=<optimized out>, size=<optimized out>, base=<optimized out>)
> at ./include/linux/cma.h:28
> #5  dma_contiguous_reserve_area (size=<optimized out>, base=<optimized
> out>, limit=<optimized out>, res_cma=0xc07ccbdc
> <dma_contiguous_default_area>, fixed=false)
>     at kernel/dma/contiguous.c:201
> #6  0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at
> kernel/dma/contiguous.c:171
> #7  0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8
> <__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230
> #8  0xc060302c in setup_arch (cmdline_p=<optimized out>) at
> arch/arm/kernel/setup.c:1132
> #9  0xc06007d2 in start_kernel () at init/main.c:857
>
> there's a call to memset that seems to hang.  I wonder if memset() was
> defined in terms of __builtin_memset, which oft can result in infinite
> loops?  (IIRC there's an AEABI related memset; this kind of thing has
> hit android userspace before).
>
> (gdb) layout asm
>
> shows that the source call to memset is transformed into a call to mmioset.
>
> (gdb) bt
> #0  mmioset () at arch/arm/lib/memset.S:19
> #1  0xc060e2dc in memblock_alloc_try_nid (size=108, align=<optimized
> out>, min_addr=0, max_addr=0, nid=-1) at mm/memblock.c:1602
> #2  0xc060f0b4 in memblock_alloc (size=<optimized out>,
> align=<optimized out>) at ./include/linux/memblock.h:409
> #3  cma_init_reserved_mem (base=1543503872, size=67108864,
> order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc
> <dma_contiguous_default_area>) at mm/cma.c:190
> #4  0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872,
> size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0,
> fixed=false, name=0xc04b9240 "reserved",
>     res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352
> #5  0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>,
> order_per_bit=<optimized out>, name=<optimized out>,
> res_cma=<optimized out>, fixed=<optimized out>,
>     limit=<optimized out>, size=<optimized out>, base=<optimized out>)
> at ./include/linux/cma.h:28
> #6  dma_contiguous_reserve_area (size=<optimized out>, base=<optimized
> out>, limit=<optimized out>, res_cma=0xc07ccbdc
> <dma_contiguous_default_area>, fixed=false)
>     at kernel/dma/contiguous.c:201
> #7  0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at
> kernel/dma/contiguous.c:171
> #8  0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8
> <__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230
> #9  0xc060302c in setup_arch (cmdline_p=<optimized out>) at
> arch/arm/kernel/setup.c:1132
> #10 0xc06007d2 in start_kernel () at init/main.c:857
>
> Using `si` in gdb, it looks like we maybe hit an exception vector?
> x   1202                .section .vectors, "ax", %progbits
>
>                                             x
> x   1203        .L__vectors_start:
>
>                                             x
> x   1204                W(b)    vector_rst
>
>                                             x
> x   1205                W(b)    vector_und
>
>                                             x
> x   1206                W(ldr)  pc, .L__vectors_start + 0x1000
>
>                                             x
> x  >1207                W(b)    vector_pabt
>
> Is the last thing I see, then `si` stops working.  Not sure if `pabt`
> is a recognizable acronym to anyone more familiar with ARM?
>
> Happens regardless of your series, FWIW.
> --
> Thanks,
> ~Nick Desaulniers

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
  2020-09-17  6:01                 ` Ard Biesheuvel
@ 2020-09-18 20:03                   ` Nick Desaulniers
  -1 siblings, 0 replies; 62+ messages in thread
From: Nick Desaulniers @ 2020-09-18 20:03 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Linux ARM, Russell King, Linus Walleij, Nicolas Pitre,
	Stefan Agner, Peter Smith, Marc Zyngier, Will Deacon,
	clang-built-linux, Jian Cai, Nathan Chancellor, Masahiro Yamada

On Wed, Sep 16, 2020 at 11:01 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Thu, 17 Sep 2020 at 00:25, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to
> > boot for me in QEMU; so maybe not just a Clang bug (or maybe, more
> > than one bug).  (or maybe one of my command line params to QEMU is
> > wrong).
> >
>
> I understand that this is actually an existing issue in -next, but in
> general, why would you expect to be able to boot
> milbeaut_m10v_defconfig on anything other than a Milbeaut MV10
> machine?

We've been booting other configs in QEMU for a few years now, so I
don't see why yet another config would hurt.  Maybe there's some
hardware dependency, but I guess we'd find that out trying to boot it
in QEMU.  If it boots in QEMU, I guess not booting on metal wasn't so
bad?  Maybe this is considered an antipattern, but you can see how if
we've been getting away with it for years then that would lead to such
expectations.

> (whatever it is) Or does QEMU emulate a milbeaut machine?

$ qemu-system-arm -machine help

doesn't print anything that looks like it, on initial glance.  Looks
like a socionext part:
https://www.socionext.com/en/pr/sn_pr20170105_01e.pdf

> If
> not, better to stick with configs that are intended to boot on the
> QEMU machine emulation that you are using.

I can see in our CI that we've been building+boot testing
multi_v5_defconfig, aspeed_g5_defconfig, and multi_v7_defconfig for a
while now without specifying any machine.  Is there a preferred
machine we should be using for those?  (It looks like qemu supports
ast2500-evb and ast2600-evb; is ARM1176 and ARMv6 core? Is that what
aspeed_g5 uses? Why is virt versioned? Ahhhh!!!!)

> Also, while I see the point of regression testing of -next, using it
> as a base to test arbitrary series and then report failures against it
> produces a lot of noise. -next is *not* a good base for development,
> because you get everybody else's half baked crap as well.

Ack.

> When you
> test my stuff, please use a known good base so we're not off on a
> goose chase every time.

Goose Chase?! gOoSe ChAsE?! *gestures broadly at...everything*
Monsieur, here at the zoo, chasing the geese is not out of our
purview.  It's preferable to cleaning up after the elephants.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
@ 2020-09-18 20:03                   ` Nick Desaulniers
  0 siblings, 0 replies; 62+ messages in thread
From: Nick Desaulniers @ 2020-09-18 20:03 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Nicolas Pitre, Marc Zyngier, Linus Walleij,
	Russell King, Stefan Agner, Peter Smith, clang-built-linux,
	Nathan Chancellor, Jian Cai, Will Deacon, Masahiro Yamada,
	Linux ARM

On Wed, Sep 16, 2020 at 11:01 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Thu, 17 Sep 2020 at 00:25, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to
> > boot for me in QEMU; so maybe not just a Clang bug (or maybe, more
> > than one bug).  (or maybe one of my command line params to QEMU is
> > wrong).
> >
>
> I understand that this is actually an existing issue in -next, but in
> general, why would you expect to be able to boot
> milbeaut_m10v_defconfig on anything other than a Milbeaut MV10
> machine?

We've been booting other configs in QEMU for a few years now, so I
don't see why yet another config would hurt.  Maybe there's some
hardware dependency, but I guess we'd find that out trying to boot it
in QEMU.  If it boots in QEMU, I guess not booting on metal wasn't so
bad?  Maybe this is considered an antipattern, but you can see how if
we've been getting away with it for years then that would lead to such
expectations.

> (whatever it is) Or does QEMU emulate a milbeaut machine?

$ qemu-system-arm -machine help

doesn't print anything that looks like it, on initial glance.  Looks
like a socionext part:
https://www.socionext.com/en/pr/sn_pr20170105_01e.pdf

> If
> not, better to stick with configs that are intended to boot on the
> QEMU machine emulation that you are using.

I can see in our CI that we've been building+boot testing
multi_v5_defconfig, aspeed_g5_defconfig, and multi_v7_defconfig for a
while now without specifying any machine.  Is there a preferred
machine we should be using for those?  (It looks like qemu supports
ast2500-evb and ast2600-evb; is ARM1176 and ARMv6 core? Is that what
aspeed_g5 uses? Why is virt versioned? Ahhhh!!!!)

> Also, while I see the point of regression testing of -next, using it
> as a base to test arbitrary series and then report failures against it
> produces a lot of noise. -next is *not* a good base for development,
> because you get everybody else's half baked crap as well.

Ack.

> When you
> test my stuff, please use a known good base so we're not off on a
> goose chase every time.

Goose Chase?! gOoSe ChAsE?! *gestures broadly at...everything*
Monsieur, here at the zoo, chasing the geese is not out of our
purview.  It's preferable to cleaning up after the elephants.
-- 
Thanks,
~Nick Desaulniers

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
  2020-09-18 20:03                   ` Nick Desaulniers
@ 2020-09-18 20:44                     ` Ard Biesheuvel
  -1 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-18 20:44 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-efi, Linux ARM, Russell King, Linus Walleij, Nicolas Pitre,
	Stefan Agner, Peter Smith, Marc Zyngier, Will Deacon,
	clang-built-linux, Jian Cai, Nathan Chancellor, Masahiro Yamada

On Fri, 18 Sep 2020 at 22:03, Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Wed, Sep 16, 2020 at 11:01 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Thu, 17 Sep 2020 at 00:25, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > > Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to
> > > boot for me in QEMU; so maybe not just a Clang bug (or maybe, more
> > > than one bug).  (or maybe one of my command line params to QEMU is
> > > wrong).
> > >
> >
> > I understand that this is actually an existing issue in -next, but in
> > general, why would you expect to be able to boot
> > milbeaut_m10v_defconfig on anything other than a Milbeaut MV10
> > machine?
>
> We've been booting other configs in QEMU for a few years now, so I
> don't see why yet another config would hurt.  Maybe there's some
> hardware dependency, but I guess we'd find that out trying to boot it
> in QEMU.  If it boots in QEMU, I guess not booting on metal wasn't so
> bad?  Maybe this is considered an antipattern, but you can see how if
> we've been getting away with it for years then that would lead to such
> expectations.
>
> > (whatever it is) Or does QEMU emulate a milbeaut machine?
>
> $ qemu-system-arm -machine help
>
> doesn't print anything that looks like it, on initial glance.  Looks
> like a socionext part:
> https://www.socionext.com/en/pr/sn_pr20170105_01e.pdf
>
> > If
> > not, better to stick with configs that are intended to boot on the
> > QEMU machine emulation that you are using.
>
> I can see in our CI that we've been building+boot testing
> multi_v5_defconfig, aspeed_g5_defconfig, and multi_v7_defconfig for a
> while now without specifying any machine.  Is there a preferred
> machine we should be using for those?  (It looks like qemu supports
> ast2500-evb and ast2600-evb; is ARM1176 and ARMv6 core? Is that what
> aspeed_g5 uses? Why is virt versioned? Ahhhh!!!!)
>

Milbeaut's serial output is on a "socionext,milbeaut-usio-uart" UART,
and its defconfig does not include drivers for the PL011 or 8250/16550
UARTs that you typically find on other boards. So how on earth would
you expect to get any output at all if QEMU does not emulate this
exact machine?

In general, if you use QEMU/mach-virt, the only defconfigs you should
reasonably be testing are the ones that contain CONFIG_ARCH_VIRT=y.


> > Also, while I see the point of regression testing of -next, using it
> > as a base to test arbitrary series and then report failures against it
> > produces a lot of noise. -next is *not* a good base for development,
> > because you get everybody else's half baked crap as well.
>
> Ack.
>
> > When you
> > test my stuff, please use a known good base so we're not off on a
> > goose chase every time.
>
> Goose Chase?! gOoSe ChAsE?! *gestures broadly at...everything*
> Monsieur, here at the zoo, chasing the geese is not out of our
> purview.  It's preferable to cleaning up after the elephants.

:-)

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
@ 2020-09-18 20:44                     ` Ard Biesheuvel
  0 siblings, 0 replies; 62+ messages in thread
From: Ard Biesheuvel @ 2020-09-18 20:44 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-efi, Nicolas Pitre, Marc Zyngier, Linus Walleij,
	Russell King, Stefan Agner, Peter Smith, clang-built-linux,
	Nathan Chancellor, Jian Cai, Will Deacon, Masahiro Yamada,
	Linux ARM

On Fri, 18 Sep 2020 at 22:03, Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Wed, Sep 16, 2020 at 11:01 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Thu, 17 Sep 2020 at 00:25, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > > Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to
> > > boot for me in QEMU; so maybe not just a Clang bug (or maybe, more
> > > than one bug).  (or maybe one of my command line params to QEMU is
> > > wrong).
> > >
> >
> > I understand that this is actually an existing issue in -next, but in
> > general, why would you expect to be able to boot
> > milbeaut_m10v_defconfig on anything other than a Milbeaut MV10
> > machine?
>
> We've been booting other configs in QEMU for a few years now, so I
> don't see why yet another config would hurt.  Maybe there's some
> hardware dependency, but I guess we'd find that out trying to boot it
> in QEMU.  If it boots in QEMU, I guess not booting on metal wasn't so
> bad?  Maybe this is considered an antipattern, but you can see how if
> we've been getting away with it for years then that would lead to such
> expectations.
>
> > (whatever it is) Or does QEMU emulate a milbeaut machine?
>
> $ qemu-system-arm -machine help
>
> doesn't print anything that looks like it, on initial glance.  Looks
> like a socionext part:
> https://www.socionext.com/en/pr/sn_pr20170105_01e.pdf
>
> > If
> > not, better to stick with configs that are intended to boot on the
> > QEMU machine emulation that you are using.
>
> I can see in our CI that we've been building+boot testing
> multi_v5_defconfig, aspeed_g5_defconfig, and multi_v7_defconfig for a
> while now without specifying any machine.  Is there a preferred
> machine we should be using for those?  (It looks like qemu supports
> ast2500-evb and ast2600-evb; is ARM1176 and ARMv6 core? Is that what
> aspeed_g5 uses? Why is virt versioned? Ahhhh!!!!)
>

Milbeaut's serial output is on a "socionext,milbeaut-usio-uart" UART,
and its defconfig does not include drivers for the PL011 or 8250/16550
UARTs that you typically find on other boards. So how on earth would
you expect to get any output at all if QEMU does not emulate this
exact machine?

In general, if you use QEMU/mach-virt, the only defconfigs you should
reasonably be testing are the ones that contain CONFIG_ARCH_VIRT=y.


> > Also, while I see the point of regression testing of -next, using it
> > as a base to test arbitrary series and then report failures against it
> > produces a lot of noise. -next is *not* a good base for development,
> > because you get everybody else's half baked crap as well.
>
> Ack.
>
> > When you
> > test my stuff, please use a known good base so we're not off on a
> > goose chase every time.
>
> Goose Chase?! gOoSe ChAsE?! *gestures broadly at...everything*
> Monsieur, here at the zoo, chasing the geese is not out of our
> purview.  It's preferable to cleaning up after the elephants.

:-)

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
  2020-09-18 20:44                     ` Ard Biesheuvel
@ 2020-09-18 21:06                       ` Nick Desaulniers
  -1 siblings, 0 replies; 62+ messages in thread
From: Nick Desaulniers @ 2020-09-18 21:06 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Linux ARM, Russell King, Linus Walleij, Nicolas Pitre,
	Stefan Agner, Peter Smith, Marc Zyngier, Will Deacon,
	clang-built-linux, Jian Cai, Nathan Chancellor, Masahiro Yamada

On Fri, Sep 18, 2020 at 1:44 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Fri, 18 Sep 2020 at 22:03, Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > On Wed, Sep 16, 2020 at 11:01 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Thu, 17 Sep 2020 at 00:25, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > > > Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to
> > > > boot for me in QEMU; so maybe not just a Clang bug (or maybe, more
> > > > than one bug).  (or maybe one of my command line params to QEMU is
> > > > wrong).
> > > >
> > >
> > > I understand that this is actually an existing issue in -next, but in
> > > general, why would you expect to be able to boot
> > > milbeaut_m10v_defconfig on anything other than a Milbeaut MV10
> > > machine?
> >
> > We've been booting other configs in QEMU for a few years now, so I
> > don't see why yet another config would hurt.  Maybe there's some
> > hardware dependency, but I guess we'd find that out trying to boot it
> > in QEMU.  If it boots in QEMU, I guess not booting on metal wasn't so
> > bad?  Maybe this is considered an antipattern, but you can see how if
> > we've been getting away with it for years then that would lead to such
> > expectations.
> >
> > > (whatever it is) Or does QEMU emulate a milbeaut machine?
> >
> > $ qemu-system-arm -machine help
> >
> > doesn't print anything that looks like it, on initial glance.  Looks
> > like a socionext part:
> > https://www.socionext.com/en/pr/sn_pr20170105_01e.pdf
> >
> > > If
> > > not, better to stick with configs that are intended to boot on the
> > > QEMU machine emulation that you are using.
> >
> > I can see in our CI that we've been building+boot testing
> > multi_v5_defconfig, aspeed_g5_defconfig, and multi_v7_defconfig for a
> > while now without specifying any machine.  Is there a preferred
> > machine we should be using for those?  (It looks like qemu supports
> > ast2500-evb and ast2600-evb; is ARM1176 and ARMv6 core? Is that what
> > aspeed_g5 uses? Why is virt versioned? Ahhhh!!!!)
> >
>
> Milbeaut's serial output is on a "socionext,milbeaut-usio-uart" UART,
> and its defconfig does not include drivers for the PL011 or 8250/16550
> UARTs that you typically find on other boards. So how on earth would
> you expect to get any output at all if QEMU does not emulate this
> exact machine?

breakpoints in panic()/printk(), lx-dmesg in GDB via
CONFIG_GDB_SCRIPTS=y generally works. :^)

I guess one thing I don't understand is how to check what UART or what
the name of the tty device would be.  I can grep for
"socionext,milbeaut-usio-uart" and see where it's defined, but I never
would have/still don't know how to find that. Please teach me how to
fish.  I understand the point of DT, and see

arch/arm/boot/dts/milbeaut-m10v.dtsi
76:                     compatible = "socionext,milbeaut-usio-uart";

but the comment about ttyUSI0 is confusing to me; that identifier
doesn't appear anywhere else in the kernel sources.

I generally have the same problem trying to run Pixel kernels in QEMU;
I don't understand how to determine which serial driver is being used,
and what the tty device would be named.  So I just enable the PLO11
driver.

Only last week I found that trying to use a shell as init without any
serial output can result in the shell exiting, panicking the kernel,
which doesn't print over serial...goose chase...

> In general, if you use QEMU/mach-virt, the only defconfigs you should
> reasonably be testing are the ones that contain CONFIG_ARCH_VIRT=y.

Looks like then that would only be multi_v7_defconfig.  How do we
continue to verify we can boot ARMv6 or ARMv5 under virtualization?
There are such machines in QEMU, but then no defconfigs in the kernel.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references
@ 2020-09-18 21:06                       ` Nick Desaulniers
  0 siblings, 0 replies; 62+ messages in thread
From: Nick Desaulniers @ 2020-09-18 21:06 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, Nicolas Pitre, Marc Zyngier, Linus Walleij,
	Russell King, Stefan Agner, Peter Smith, clang-built-linux,
	Nathan Chancellor, Jian Cai, Will Deacon, Masahiro Yamada,
	Linux ARM

On Fri, Sep 18, 2020 at 1:44 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Fri, 18 Sep 2020 at 22:03, Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > On Wed, Sep 16, 2020 at 11:01 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Thu, 17 Sep 2020 at 00:25, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > > > Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to
> > > > boot for me in QEMU; so maybe not just a Clang bug (or maybe, more
> > > > than one bug).  (or maybe one of my command line params to QEMU is
> > > > wrong).
> > > >
> > >
> > > I understand that this is actually an existing issue in -next, but in
> > > general, why would you expect to be able to boot
> > > milbeaut_m10v_defconfig on anything other than a Milbeaut MV10
> > > machine?
> >
> > We've been booting other configs in QEMU for a few years now, so I
> > don't see why yet another config would hurt.  Maybe there's some
> > hardware dependency, but I guess we'd find that out trying to boot it
> > in QEMU.  If it boots in QEMU, I guess not booting on metal wasn't so
> > bad?  Maybe this is considered an antipattern, but you can see how if
> > we've been getting away with it for years then that would lead to such
> > expectations.
> >
> > > (whatever it is) Or does QEMU emulate a milbeaut machine?
> >
> > $ qemu-system-arm -machine help
> >
> > doesn't print anything that looks like it, on initial glance.  Looks
> > like a socionext part:
> > https://www.socionext.com/en/pr/sn_pr20170105_01e.pdf
> >
> > > If
> > > not, better to stick with configs that are intended to boot on the
> > > QEMU machine emulation that you are using.
> >
> > I can see in our CI that we've been building+boot testing
> > multi_v5_defconfig, aspeed_g5_defconfig, and multi_v7_defconfig for a
> > while now without specifying any machine.  Is there a preferred
> > machine we should be using for those?  (It looks like qemu supports
> > ast2500-evb and ast2600-evb; is ARM1176 and ARMv6 core? Is that what
> > aspeed_g5 uses? Why is virt versioned? Ahhhh!!!!)
> >
>
> Milbeaut's serial output is on a "socionext,milbeaut-usio-uart" UART,
> and its defconfig does not include drivers for the PL011 or 8250/16550
> UARTs that you typically find on other boards. So how on earth would
> you expect to get any output at all if QEMU does not emulate this
> exact machine?

breakpoints in panic()/printk(), lx-dmesg in GDB via
CONFIG_GDB_SCRIPTS=y generally works. :^)

I guess one thing I don't understand is how to check what UART or what
the name of the tty device would be.  I can grep for
"socionext,milbeaut-usio-uart" and see where it's defined, but I never
would have/still don't know how to find that. Please teach me how to
fish.  I understand the point of DT, and see

arch/arm/boot/dts/milbeaut-m10v.dtsi
76:                     compatible = "socionext,milbeaut-usio-uart";

but the comment about ttyUSI0 is confusing to me; that identifier
doesn't appear anywhere else in the kernel sources.

I generally have the same problem trying to run Pixel kernels in QEMU;
I don't understand how to determine which serial driver is being used,
and what the tty device would be named.  So I just enable the PLO11
driver.

Only last week I found that trying to use a shell as init without any
serial output can result in the shell exiting, panicking the kernel,
which doesn't print over serial...goose chase...

> In general, if you use QEMU/mach-virt, the only defconfigs you should
> reasonably be testing are the ones that contain CONFIG_ARCH_VIRT=y.

Looks like then that would only be multi_v7_defconfig.  How do we
continue to verify we can boot ARMv6 or ARMv5 under virtualization?
There are such machines in QEMU, but then no defconfigs in the kernel.
-- 
Thanks,
~Nick Desaulniers

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

end of thread, other threads:[~2020-09-18 21:08 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-14  9:56 [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references Ard Biesheuvel
2020-09-14  9:56 ` Ard Biesheuvel
2020-09-14  9:56 ` [PATCH 01/12] ARM: assembler: introduce adr_l, ldr_l and str_l macros Ard Biesheuvel
2020-09-14  9:56   ` Ard Biesheuvel
2020-09-15  7:35   ` Ard Biesheuvel
2020-09-15  7:35     ` Ard Biesheuvel
2020-09-15 17:51     ` Nick Desaulniers
2020-09-15 17:51       ` Nick Desaulniers
2020-09-14  9:56 ` [PATCH 02/12] ARM: efistub: replace adrl pseudo-op with adr_l macro invocation Ard Biesheuvel
2020-09-14  9:56   ` Ard Biesheuvel
2020-09-14  9:56 ` [PATCH 03/12] ARM: module: add support for place relative relocations Ard Biesheuvel
2020-09-14  9:56   ` Ard Biesheuvel
2020-09-14 13:35   ` Nicolas Pitre
2020-09-14 13:35     ` Nicolas Pitre
2020-09-14 16:11     ` Russell King - ARM Linux admin
2020-09-14 16:11       ` Russell King - ARM Linux admin
2020-09-14  9:56 ` [PATCH 04/12] ARM: head-common.S: use PC-relative insn sequence for __proc_info Ard Biesheuvel
2020-09-14  9:56   ` Ard Biesheuvel
2020-09-14  9:56 ` [PATCH 05/12] ARM: head-common.S: use PC-relative insn sequence for idmap creation Ard Biesheuvel
2020-09-14  9:56   ` Ard Biesheuvel
2020-09-14  9:57 ` [PATCH 06/12] ARM: head.S: use PC-relative insn sequence for secondary_data Ard Biesheuvel
2020-09-14  9:57   ` Ard Biesheuvel
2020-09-14  9:57 ` [PATCH 07/12] ARM: kernel: use relative references for UP/SMP alternatives Ard Biesheuvel
2020-09-14  9:57   ` Ard Biesheuvel
2020-09-14  9:57 ` [PATCH 08/12] ARM: head: use PC-relative insn sequence for __smp_alt Ard Biesheuvel
2020-09-14  9:57   ` Ard Biesheuvel
2020-09-14  9:57 ` [PATCH 09/12] ARM: sleep.S: use PC-relative insn sequence for sleep_save_sp/mpidr_hash Ard Biesheuvel
2020-09-14  9:57   ` Ard Biesheuvel
2020-09-14  9:57 ` [PATCH 10/12] ARM: head.S: use PC-relative insn sequences for __fixup_pv_table Ard Biesheuvel
2020-09-14  9:57   ` Ard Biesheuvel
2020-09-14  9:57 ` [PATCH 11/12] ARM: head.S: use PC relative insn sequence to calculate PHYS_OFFSET Ard Biesheuvel
2020-09-14  9:57   ` Ard Biesheuvel
2020-09-14  9:57 ` [PATCH 12/12] ARM: kvm: replace open coded VA->PA calculations with adr_l call Ard Biesheuvel
2020-09-14  9:57   ` Ard Biesheuvel
2020-09-14 14:06 ` [PATCH 00/12] ARM: use adr_l/ldr_l macros for PC-relative references Nicolas Pitre
2020-09-14 14:06   ` Nicolas Pitre
2020-09-15 19:32 ` Nick Desaulniers
2020-09-15 19:32   ` Nick Desaulniers
2020-09-15 21:29   ` Ard Biesheuvel
2020-09-15 21:29     ` Ard Biesheuvel
2020-09-15 23:31     ` Nick Desaulniers
2020-09-15 23:31       ` Nick Desaulniers
2020-09-16  5:54       ` Ard Biesheuvel
2020-09-16  5:54         ` Ard Biesheuvel
2020-09-16 19:53         ` Nick Desaulniers
2020-09-16 19:53           ` Nick Desaulniers
2020-09-16 20:45           ` Ard Biesheuvel
2020-09-16 20:45             ` Ard Biesheuvel
2020-09-16 21:25             ` Nick Desaulniers
2020-09-16 21:25               ` Nick Desaulniers
2020-09-17  0:16               ` Nick Desaulniers
2020-09-17  0:16                 ` Nick Desaulniers
2020-09-17  5:19                 ` Mike Rapoport
2020-09-17  5:19                   ` Mike Rapoport
2020-09-17  6:01               ` Ard Biesheuvel
2020-09-17  6:01                 ` Ard Biesheuvel
2020-09-18 20:03                 ` Nick Desaulniers
2020-09-18 20:03                   ` Nick Desaulniers
2020-09-18 20:44                   ` Ard Biesheuvel
2020-09-18 20:44                     ` Ard Biesheuvel
2020-09-18 21:06                     ` Nick Desaulniers
2020-09-18 21:06                       ` Nick Desaulniers

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.