* [PATCH v2 0/8] ARM kernel size fixes @ 2015-03-13 12:07 Ard Biesheuvel 2015-03-13 12:07 ` [PATCH v2 1/8] ARM: replace PROCINFO embedded branch with relative offset Ard Biesheuvel ` (8 more replies) 0 siblings, 9 replies; 22+ messages in thread From: Ard Biesheuvel @ 2015-03-13 12:07 UTC (permalink / raw) To: linux-arm-kernel This series is a suggested approach to preventing linker failures on large kernels. It is somewhat unpolished, and posted for comments/testing primarily. The issues were found and reported by Arnd Bergmann, and these patches are loosely based on his initial approach to work around them. Changes since v1: - Updated PROCINFO patch (#1) to refer to the base of the struct by name, and simplify the calling code (rmk) - Updated b_far/bl_far patch (#3) to remove ARM/THUMB alternatives and use a conditionally defined PC_BIAS instead. Also added b_abs/bl_abs versions, which can only be used for absolute branches but can be implemented in fewer instructions. Added conditional branch support as well. 
- introduce (#6) and use (#7) the .text.fixup input section which gets emitted after each .text section for each .o - added patch #8 that allows the kallsyms data to be moved to .data Ard Biesheuvel (8): ARM: replace PROCINFO embedded branch with relative offset ARM: move HYP text to end of .text section ARM: add macro to perform far branches (b/bl) ARM: use bl_far to call __hyp_stub_install_secondary from the .data section ARM: move the .idmap.text section closer to .head.text asm-generic: introduce .text.fixup input section ARM: keep .text and .fixup regions together kallsyms: allow kallsyms data to reside in the .data section arch/arm/Kconfig | 1 + arch/arm/include/asm/assembler.h | 83 +++++++++++++++++++++++++++++++++++ arch/arm/include/asm/futex.h | 2 +- arch/arm/include/asm/uaccess.h | 10 ++--- arch/arm/include/asm/word-at-a-time.h | 2 +- arch/arm/kernel/entry-armv.S | 2 +- arch/arm/kernel/head.S | 14 +++--- arch/arm/kernel/sleep.S | 2 +- arch/arm/kernel/swp_emulate.c | 2 +- arch/arm/kernel/vmlinux.lds.S | 15 ++++--- arch/arm/kvm/init.S | 5 +-- arch/arm/kvm/interrupts.S | 4 +- arch/arm/lib/clear_user.S | 2 +- arch/arm/lib/copy_to_user.S | 2 +- arch/arm/lib/csumpartialcopyuser.S | 2 +- arch/arm/mm/alignment.c | 6 +-- arch/arm/mm/proc-arm1020.S | 4 +- arch/arm/mm/proc-arm1020e.S | 4 +- arch/arm/mm/proc-arm1022.S | 4 +- arch/arm/mm/proc-arm1026.S | 4 +- arch/arm/mm/proc-arm720.S | 4 +- arch/arm/mm/proc-arm740.S | 4 +- arch/arm/mm/proc-arm7tdmi.S | 4 +- arch/arm/mm/proc-arm920.S | 4 +- arch/arm/mm/proc-arm922.S | 4 +- arch/arm/mm/proc-arm925.S | 4 +- arch/arm/mm/proc-arm926.S | 4 +- arch/arm/mm/proc-arm940.S | 4 +- arch/arm/mm/proc-arm946.S | 4 +- arch/arm/mm/proc-arm9tdmi.S | 4 +- arch/arm/mm/proc-fa526.S | 4 +- arch/arm/mm/proc-feroceon.S | 5 ++- arch/arm/mm/proc-macros.S | 4 ++ arch/arm/mm/proc-mohawk.S | 4 +- arch/arm/mm/proc-sa110.S | 4 +- arch/arm/mm/proc-sa1100.S | 4 +- arch/arm/mm/proc-v6.S | 4 +- arch/arm/mm/proc-v7.S | 28 ++++++------ 
arch/arm/mm/proc-v7m.S | 4 +- arch/arm/mm/proc-xsc3.S | 4 +- arch/arm/mm/proc-xscale.S | 4 +- arch/arm/nwfpe/entry.S | 2 +- include/asm-generic/vmlinux.lds.h | 14 +++++- init/Kconfig | 4 ++ scripts/kallsyms.c | 2 +- 45 files changed, 200 insertions(+), 101 deletions(-) -- 1.8.3.2 ^ permalink raw reply [flat|nested] 22+ messages in thread
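For context on the linker failures this series targets: an ARM-mode b/bl encodes a signed 24-bit word offset, so the target has to lie within roughly +/-32 MiB of the branch. The sketch below is an illustration only and is not taken from the patches. The function name and the ARM_PC_BIAS constant are names chosen here, and it models the ARM (A32) encoding only, not Thumb-2.

```python
# Illustrative model of the ARM (A32) B/BL reach; not code from the series.
ARM_PC_BIAS = 8    # in ARM state, PC reads as the instruction address + 8
IMM24_BITS = 24    # B/BL carry a signed 24-bit offset counted in words


def arm_branch_reachable(insn_addr: int, target: int) -> bool:
    """Return True if an A32 B/BL at insn_addr can encode a branch to target."""
    disp = target - (insn_addr + ARM_PC_BIAS)
    if disp % 4 != 0:                      # offset is stored in 4-byte units
        return False
    lowest = -(1 << (IMM24_BITS + 1))      # -2**25 bytes
    highest = (1 << (IMM24_BITS + 1)) - 4  # 2**25 - 4 bytes
    return lowest <= disp <= highest


if __name__ == "__main__":
    print(arm_branch_reachable(0x0, 0x01000000))  # target 16 MiB away
    print(arm_branch_reachable(0x0, 0x03000000))  # target 48 MiB away
```

A branch whose target falls outside this window cannot be encoded directly, which is the situation a data word holding an offset avoids.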
* [PATCH v2 1/8] ARM: replace PROCINFO embedded branch with relative offset 2015-03-13 12:07 [PATCH v2 0/8] ARM kernel size fixes Ard Biesheuvel @ 2015-03-13 12:07 ` Ard Biesheuvel 2015-04-19 16:59 ` Joachim Eastwood 2015-03-13 12:07 ` [PATCH v2 2/8] ARM: move HYP text to end of .text section Ard Biesheuvel ` (7 subsequent siblings) 8 siblings, 1 reply; 22+ messages in thread From: Ard Biesheuvel @ 2015-03-13 12:07 UTC (permalink / raw) To: linux-arm-kernel This patch replaces the 'branch to setup()' instructions embedded in the PROCINFO structs with the offset to that setup function relative to the base of the struct. This preserves the position independent nature of that field, but uses a data item rather than an instruction. This is mainly done to prevent linker failures on large kernels, where the setup function is out of reach for the branch. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- arch/arm/kernel/head.S | 14 +++++++------- arch/arm/mm/proc-arm1020.S | 4 ++-- arch/arm/mm/proc-arm1020e.S | 4 ++-- arch/arm/mm/proc-arm1022.S | 4 ++-- arch/arm/mm/proc-arm1026.S | 4 ++-- arch/arm/mm/proc-arm720.S | 4 ++-- arch/arm/mm/proc-arm740.S | 4 ++-- arch/arm/mm/proc-arm7tdmi.S | 4 ++-- arch/arm/mm/proc-arm920.S | 4 ++-- arch/arm/mm/proc-arm922.S | 4 ++-- arch/arm/mm/proc-arm925.S | 4 ++-- arch/arm/mm/proc-arm926.S | 4 ++-- arch/arm/mm/proc-arm940.S | 4 ++-- arch/arm/mm/proc-arm946.S | 4 ++-- arch/arm/mm/proc-arm9tdmi.S | 4 ++-- arch/arm/mm/proc-fa526.S | 4 ++-- arch/arm/mm/proc-feroceon.S | 5 +++-- arch/arm/mm/proc-macros.S | 4 ++++ arch/arm/mm/proc-mohawk.S | 4 ++-- arch/arm/mm/proc-sa110.S | 4 ++-- arch/arm/mm/proc-sa1100.S | 4 ++-- arch/arm/mm/proc-v6.S | 4 ++-- arch/arm/mm/proc-v7.S | 28 ++++++++++++++-------------- arch/arm/mm/proc-v7m.S | 4 ++-- arch/arm/mm/proc-xsc3.S | 4 ++-- arch/arm/mm/proc-xscale.S | 4 ++-- 26 files changed, 72 insertions(+), 67 deletions(-) diff --git a/arch/arm/kernel/head.S 
b/arch/arm/kernel/head.S index 01963273c07a..3637973a9708 100644 --- a/arch/arm/kernel/head.S +++ b/arch/arm/kernel/head.S @@ -138,9 +138,9 @@ ENTRY(stext) @ mmu has been enabled adr lr, BSYM(1f) @ return (PIC) address mov r8, r4 @ set TTBR1 to swapper_pg_dir - ARM( add pc, r10, #PROCINFO_INITFUNC ) - THUMB( add r12, r10, #PROCINFO_INITFUNC ) - THUMB( ret r12 ) + ldr r12, [r10, #PROCINFO_INITFUNC] + add r12, r12, r10 + ret r12 1: b __enable_mmu ENDPROC(stext) .ltorg @@ -386,10 +386,10 @@ ENTRY(secondary_startup) ldr r8, [r7, lr] @ get secondary_data.swapper_pg_dir adr lr, BSYM(__enable_mmu) @ return address mov r13, r12 @ __secondary_switched address - ARM( add pc, r10, #PROCINFO_INITFUNC ) @ initialise processor - @ (return control reg) - THUMB( add r12, r10, #PROCINFO_INITFUNC ) - THUMB( ret r12 ) + ldr r12, [r10, #PROCINFO_INITFUNC] + add r12, r12, r10 @ initialise processor + @ (return control reg) + ret r12 ENDPROC(secondary_startup) ENDPROC(secondary_startup_arm) diff --git a/arch/arm/mm/proc-arm1020.S b/arch/arm/mm/proc-arm1020.S index 86ee5d47ce3c..aa0519eed698 100644 --- a/arch/arm/mm/proc-arm1020.S +++ b/arch/arm/mm/proc-arm1020.S @@ -507,7 +507,7 @@ cpu_arm1020_name: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .type __arm1020_proc_info,#object __arm1020_proc_info: @@ -519,7 +519,7 @@ __arm1020_proc_info: .long PMD_TYPE_SECT | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b __arm1020_setup + initfn __arm1020_setup, __arm1020_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP | HWCAP_HALF | HWCAP_THUMB diff --git a/arch/arm/mm/proc-arm1020e.S b/arch/arm/mm/proc-arm1020e.S index a6331d78601f..bff4c7f70fd6 100644 --- a/arch/arm/mm/proc-arm1020e.S +++ b/arch/arm/mm/proc-arm1020e.S @@ -465,7 +465,7 @@ arm1020e_crval: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .type __arm1020e_proc_info,#object __arm1020e_proc_info: @@ -479,7 +479,7 @@ 
__arm1020e_proc_info: PMD_BIT4 | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b __arm1020e_setup + initfn __arm1020e_setup, __arm1020e_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP | HWCAP_HALF | HWCAP_THUMB | HWCAP_EDSP diff --git a/arch/arm/mm/proc-arm1022.S b/arch/arm/mm/proc-arm1022.S index a126b7a59928..dbb2413fe04d 100644 --- a/arch/arm/mm/proc-arm1022.S +++ b/arch/arm/mm/proc-arm1022.S @@ -448,7 +448,7 @@ arm1022_crval: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .type __arm1022_proc_info,#object __arm1022_proc_info: @@ -462,7 +462,7 @@ __arm1022_proc_info: PMD_BIT4 | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b __arm1022_setup + initfn __arm1022_setup, __arm1022_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP | HWCAP_HALF | HWCAP_THUMB | HWCAP_EDSP diff --git a/arch/arm/mm/proc-arm1026.S b/arch/arm/mm/proc-arm1026.S index fc294067e977..0b37b2cef9d3 100644 --- a/arch/arm/mm/proc-arm1026.S +++ b/arch/arm/mm/proc-arm1026.S @@ -442,7 +442,7 @@ arm1026_crval: string cpu_arm1026_name, "ARM1026EJ-S" .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .type __arm1026_proc_info,#object __arm1026_proc_info: @@ -456,7 +456,7 @@ __arm1026_proc_info: PMD_BIT4 | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b __arm1026_setup + initfn __arm1026_setup, __arm1026_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP|HWCAP_JAVA diff --git a/arch/arm/mm/proc-arm720.S b/arch/arm/mm/proc-arm720.S index 2baa66b3ac9b..3651cd70e418 100644 --- a/arch/arm/mm/proc-arm720.S +++ b/arch/arm/mm/proc-arm720.S @@ -186,7 +186,7 @@ arm720_crval: * See <asm/procinfo.h> for a definition of this structure. 
*/ - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .macro arm720_proc_info name:req, cpu_val:req, cpu_mask:req, cpu_name:req, cpu_flush:req .type __\name\()_proc_info,#object @@ -203,7 +203,7 @@ __\name\()_proc_info: PMD_BIT4 | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b \cpu_flush @ cpu_flush + initfn \cpu_flush, __\name\()_proc_info @ cpu_flush .long cpu_arch_name @ arch_name .long cpu_elf_name @ elf_name .long HWCAP_SWP | HWCAP_HALF | HWCAP_THUMB @ elf_hwcap diff --git a/arch/arm/mm/proc-arm740.S b/arch/arm/mm/proc-arm740.S index ac1ea6b3bce4..024fb7732407 100644 --- a/arch/arm/mm/proc-arm740.S +++ b/arch/arm/mm/proc-arm740.S @@ -132,14 +132,14 @@ __arm740_setup: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .type __arm740_proc_info,#object __arm740_proc_info: .long 0x41807400 .long 0xfffffff0 .long 0 .long 0 - b __arm740_setup + initfn __arm740_setup, __arm740_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP | HWCAP_HALF | HWCAP_THUMB | HWCAP_26BIT diff --git a/arch/arm/mm/proc-arm7tdmi.S b/arch/arm/mm/proc-arm7tdmi.S index bf6ba4bc30ff..25472d94426d 100644 --- a/arch/arm/mm/proc-arm7tdmi.S +++ b/arch/arm/mm/proc-arm7tdmi.S @@ -76,7 +76,7 @@ __arm7tdmi_setup: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .macro arm7tdmi_proc_info name:req, cpu_val:req, cpu_mask:req, cpu_name:req, \ extra_hwcaps=0 @@ -86,7 +86,7 @@ __\name\()_proc_info: .long \cpu_mask .long 0 .long 0 - b __arm7tdmi_setup + initfn __arm7tdmi_setup, __\name\()_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP | HWCAP_26BIT | ( \extra_hwcaps ) diff --git a/arch/arm/mm/proc-arm920.S b/arch/arm/mm/proc-arm920.S index 22bf8dde4f84..7a14bd4414c9 100644 --- a/arch/arm/mm/proc-arm920.S +++ b/arch/arm/mm/proc-arm920.S @@ -448,7 +448,7 @@ arm920_crval: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc 
.type __arm920_proc_info,#object __arm920_proc_info: @@ -464,7 +464,7 @@ __arm920_proc_info: PMD_BIT4 | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b __arm920_setup + initfn __arm920_setup, __arm920_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP | HWCAP_HALF | HWCAP_THUMB diff --git a/arch/arm/mm/proc-arm922.S b/arch/arm/mm/proc-arm922.S index 0c6d5ac5a6d4..edccfcdcd551 100644 --- a/arch/arm/mm/proc-arm922.S +++ b/arch/arm/mm/proc-arm922.S @@ -426,7 +426,7 @@ arm922_crval: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .type __arm922_proc_info,#object __arm922_proc_info: @@ -442,7 +442,7 @@ __arm922_proc_info: PMD_BIT4 | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b __arm922_setup + initfn __arm922_setup, __arm922_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP | HWCAP_HALF | HWCAP_THUMB diff --git a/arch/arm/mm/proc-arm925.S b/arch/arm/mm/proc-arm925.S index c32d073282ea..ede8c54ab4aa 100644 --- a/arch/arm/mm/proc-arm925.S +++ b/arch/arm/mm/proc-arm925.S @@ -494,7 +494,7 @@ arm925_crval: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .macro arm925_proc_info name:req, cpu_val:req, cpu_mask:req, cpu_name:req, cache .type __\name\()_proc_info,#object @@ -510,7 +510,7 @@ __\name\()_proc_info: PMD_BIT4 | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b __arm925_setup + initfn __arm925_setup, __\name\()_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP | HWCAP_HALF | HWCAP_THUMB diff --git a/arch/arm/mm/proc-arm926.S b/arch/arm/mm/proc-arm926.S index 252b2503038d..fb827c633693 100644 --- a/arch/arm/mm/proc-arm926.S +++ b/arch/arm/mm/proc-arm926.S @@ -474,7 +474,7 @@ arm926_crval: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .type __arm926_proc_info,#object __arm926_proc_info: @@ -490,7 +490,7 @@ __arm926_proc_info: PMD_BIT4 | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b __arm926_setup + 
initfn __arm926_setup, __arm926_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP|HWCAP_JAVA diff --git a/arch/arm/mm/proc-arm940.S b/arch/arm/mm/proc-arm940.S index e5212d489377..0a0b7a9167b6 100644 --- a/arch/arm/mm/proc-arm940.S +++ b/arch/arm/mm/proc-arm940.S @@ -354,14 +354,14 @@ __arm940_setup: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .type __arm940_proc_info,#object __arm940_proc_info: .long 0x41009400 .long 0xff00fff0 .long 0 - b __arm940_setup + initfn __arm940_setup, __arm940_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP | HWCAP_HALF | HWCAP_THUMB diff --git a/arch/arm/mm/proc-arm946.S b/arch/arm/mm/proc-arm946.S index b3dd9b2d0b8e..c85b40d2117e 100644 --- a/arch/arm/mm/proc-arm946.S +++ b/arch/arm/mm/proc-arm946.S @@ -409,14 +409,14 @@ __arm946_setup: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .type __arm946_proc_info,#object __arm946_proc_info: .long 0x41009460 .long 0xff00fff0 .long 0 .long 0 - b __arm946_setup + initfn __arm946_setup, __arm946_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP | HWCAP_HALF | HWCAP_THUMB diff --git a/arch/arm/mm/proc-arm9tdmi.S b/arch/arm/mm/proc-arm9tdmi.S index 8227322bbb8f..7fac8c612134 100644 --- a/arch/arm/mm/proc-arm9tdmi.S +++ b/arch/arm/mm/proc-arm9tdmi.S @@ -70,7 +70,7 @@ __arm9tdmi_setup: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .macro arm9tdmi_proc_info name:req, cpu_val:req, cpu_mask:req, cpu_name:req .type __\name\()_proc_info, #object @@ -79,7 +79,7 @@ __\name\()_proc_info: .long \cpu_mask .long 0 .long 0 - b __arm9tdmi_setup + initfn __arm9tdmi_setup, __\name\()_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP | HWCAP_THUMB | HWCAP_26BIT diff --git a/arch/arm/mm/proc-fa526.S b/arch/arm/mm/proc-fa526.S index c494886892ba..4001b73af4ee 100644 
--- a/arch/arm/mm/proc-fa526.S +++ b/arch/arm/mm/proc-fa526.S @@ -190,7 +190,7 @@ fa526_cr1_set: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .type __fa526_proc_info,#object __fa526_proc_info: @@ -206,7 +206,7 @@ __fa526_proc_info: PMD_BIT4 | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b __fa526_setup + initfn __fa526_setup, __fa526_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP | HWCAP_HALF diff --git a/arch/arm/mm/proc-feroceon.S b/arch/arm/mm/proc-feroceon.S index 03a1b75f2e16..e494d6d6acbe 100644 --- a/arch/arm/mm/proc-feroceon.S +++ b/arch/arm/mm/proc-feroceon.S @@ -584,7 +584,7 @@ feroceon_crval: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .macro feroceon_proc_info name:req, cpu_val:req, cpu_mask:req, cpu_name:req, cache:req .type __\name\()_proc_info,#object @@ -601,7 +601,8 @@ __\name\()_proc_info: PMD_BIT4 | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b __feroceon_setup + initfn __feroceon_setup, __\name\()_proc_info + .long __feroceon_setup .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S index 082b9f2f7e90..0f13b5f9281e 100644 --- a/arch/arm/mm/proc-macros.S +++ b/arch/arm/mm/proc-macros.S @@ -331,3 +331,7 @@ ENTRY(\name\()_tlb_fns) .globl \x .equ \x, \y .endm + +.macro initfn, func, base + .long \func - \base +.endm diff --git a/arch/arm/mm/proc-mohawk.S b/arch/arm/mm/proc-mohawk.S index 53d393455f13..d65edf717bf7 100644 --- a/arch/arm/mm/proc-mohawk.S +++ b/arch/arm/mm/proc-mohawk.S @@ -427,7 +427,7 @@ mohawk_crval: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .type __88sv331x_proc_info,#object __88sv331x_proc_info: @@ -443,7 +443,7 @@ __88sv331x_proc_info: PMD_BIT4 | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b __mohawk_setup + initfn __mohawk_setup, __88sv331x_proc_info 
.long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP diff --git a/arch/arm/mm/proc-sa110.S b/arch/arm/mm/proc-sa110.S index 8008a0461cf5..ee2ce496239f 100644 --- a/arch/arm/mm/proc-sa110.S +++ b/arch/arm/mm/proc-sa110.S @@ -199,7 +199,7 @@ sa110_crval: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .type __sa110_proc_info,#object __sa110_proc_info: @@ -213,7 +213,7 @@ __sa110_proc_info: .long PMD_TYPE_SECT | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b __sa110_setup + initfn __sa110_setup, __sa110_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP | HWCAP_HALF | HWCAP_26BIT | HWCAP_FAST_MULT diff --git a/arch/arm/mm/proc-sa1100.S b/arch/arm/mm/proc-sa1100.S index 89f97ac648a9..222d5836f666 100644 --- a/arch/arm/mm/proc-sa1100.S +++ b/arch/arm/mm/proc-sa1100.S @@ -242,7 +242,7 @@ sa1100_crval: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .macro sa1100_proc_info name:req, cpu_val:req, cpu_mask:req, cpu_name:req .type __\name\()_proc_info,#object @@ -257,7 +257,7 @@ __\name\()_proc_info: .long PMD_TYPE_SECT | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b __sa1100_setup + initfn __sa1100_setup, __\name\()_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP | HWCAP_HALF | HWCAP_26BIT | HWCAP_FAST_MULT diff --git a/arch/arm/mm/proc-v6.S b/arch/arm/mm/proc-v6.S index d0390f4b3f18..06d890a2342b 100644 --- a/arch/arm/mm/proc-v6.S +++ b/arch/arm/mm/proc-v6.S @@ -264,7 +264,7 @@ v6_crval: string cpu_elf_name, "v6" .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc /* * Match any ARMv6 processor core. 
@@ -287,7 +287,7 @@ __v6_proc_info: PMD_SECT_XN | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b __v6_setup + initfn __v6_setup, __v6_proc_info .long cpu_arch_name .long cpu_elf_name /* See also feat_v6_fixup() for HWCAP_TLS */ diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S index 8b4ee5e81c14..6bdaa4cc1784 100644 --- a/arch/arm/mm/proc-v7.S +++ b/arch/arm/mm/proc-v7.S @@ -462,19 +462,19 @@ __v7_setup_stack: string cpu_elf_name, "v7" .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc /* * Standard v7 proc info content */ -.macro __v7_proc initfunc, mm_mmuflags = 0, io_mmuflags = 0, hwcaps = 0, proc_fns = v7_processor_functions +.macro __v7_proc name, initfunc, mm_mmuflags = 0, io_mmuflags = 0, hwcaps = 0, proc_fns = v7_processor_functions ALT_SMP(.long PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_SECT_AP_READ | \ PMD_SECT_AF | PMD_FLAGS_SMP | \mm_mmuflags) ALT_UP(.long PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_SECT_AP_READ | \ PMD_SECT_AF | PMD_FLAGS_UP | \mm_mmuflags) .long PMD_TYPE_SECT | PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ | PMD_SECT_AF | \io_mmuflags - W(b) \initfunc + initfn \initfunc, \name .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP | HWCAP_HALF | HWCAP_THUMB | HWCAP_FAST_MULT | \ @@ -494,7 +494,7 @@ __v7_setup_stack: __v7_ca5mp_proc_info: .long 0x410fc050 .long 0xff0ffff0 - __v7_proc __v7_ca5mp_setup + __v7_proc __v7_ca5mp_proc_info, __v7_ca5mp_setup .size __v7_ca5mp_proc_info, . - __v7_ca5mp_proc_info /* @@ -504,7 +504,7 @@ __v7_ca5mp_proc_info: __v7_ca9mp_proc_info: .long 0x410fc090 .long 0xff0ffff0 - __v7_proc __v7_ca9mp_setup, proc_fns = ca9mp_processor_functions + __v7_proc __v7_ca9mp_proc_info, __v7_ca9mp_setup, proc_fns = ca9mp_processor_functions .size __v7_ca9mp_proc_info, . 
- __v7_ca9mp_proc_info #endif /* CONFIG_ARM_LPAE */ @@ -517,7 +517,7 @@ __v7_ca9mp_proc_info: __v7_pj4b_proc_info: .long 0x560f5800 .long 0xff0fff00 - __v7_proc __v7_pj4b_setup, proc_fns = pj4b_processor_functions + __v7_proc __v7_pj4b_proc_info, __v7_pj4b_setup, proc_fns = pj4b_processor_functions .size __v7_pj4b_proc_info, . - __v7_pj4b_proc_info #endif @@ -528,7 +528,7 @@ __v7_pj4b_proc_info: __v7_cr7mp_proc_info: .long 0x410fc170 .long 0xff0ffff0 - __v7_proc __v7_cr7mp_setup + __v7_proc __v7_cr7mp_proc_info, __v7_cr7mp_setup .size __v7_cr7mp_proc_info, . - __v7_cr7mp_proc_info /* @@ -538,7 +538,7 @@ __v7_cr7mp_proc_info: __v7_ca7mp_proc_info: .long 0x410fc070 .long 0xff0ffff0 - __v7_proc __v7_ca7mp_setup + __v7_proc __v7_ca7mp_proc_info, __v7_ca7mp_setup .size __v7_ca7mp_proc_info, . - __v7_ca7mp_proc_info /* @@ -548,7 +548,7 @@ __v7_ca7mp_proc_info: __v7_ca12mp_proc_info: .long 0x410fc0d0 .long 0xff0ffff0 - __v7_proc __v7_ca12mp_setup + __v7_proc __v7_ca12mp_proc_info, __v7_ca12mp_setup .size __v7_ca12mp_proc_info, . - __v7_ca12mp_proc_info /* @@ -558,7 +558,7 @@ __v7_ca12mp_proc_info: __v7_ca15mp_proc_info: .long 0x410fc0f0 .long 0xff0ffff0 - __v7_proc __v7_ca15mp_setup + __v7_proc __v7_ca15mp_proc_info, __v7_ca15mp_setup .size __v7_ca15mp_proc_info, . - __v7_ca15mp_proc_info /* @@ -568,7 +568,7 @@ __v7_ca15mp_proc_info: __v7_b15mp_proc_info: .long 0x420f00f0 .long 0xff0ffff0 - __v7_proc __v7_b15mp_setup + __v7_proc __v7_b15mp_proc_info, __v7_b15mp_setup .size __v7_b15mp_proc_info, . - __v7_b15mp_proc_info /* @@ -578,7 +578,7 @@ __v7_b15mp_proc_info: __v7_ca17mp_proc_info: .long 0x410fc0e0 .long 0xff0ffff0 - __v7_proc __v7_ca17mp_setup + __v7_proc __v7_ca17mp_proc_info, __v7_ca17mp_setup .size __v7_ca17mp_proc_info, . - __v7_ca17mp_proc_info /* @@ -594,7 +594,7 @@ __krait_proc_info: * do support them. They also don't indicate support for fused multiply * instructions even though they actually do support them. 
*/ - __v7_proc __v7_setup, hwcaps = HWCAP_IDIV | HWCAP_VFPv4 + __v7_proc __krait_proc_info, __v7_setup, hwcaps = HWCAP_IDIV | HWCAP_VFPv4 .size __krait_proc_info, . - __krait_proc_info /* @@ -604,5 +604,5 @@ __krait_proc_info: __v7_proc_info: .long 0x000f0000 @ Required ID value .long 0x000f0000 @ Mask for ID - __v7_proc __v7_setup + __v7_proc __v7_proc_info, __v7_setup .size __v7_proc_info, . - __v7_proc_info diff --git a/arch/arm/mm/proc-v7m.S b/arch/arm/mm/proc-v7m.S index d1e68b553d3b..e08e1f2bab76 100644 --- a/arch/arm/mm/proc-v7m.S +++ b/arch/arm/mm/proc-v7m.S @@ -135,7 +135,7 @@ __v7m_setup_stack_top: string cpu_elf_name "v7m" string cpu_v7m_name "ARMv7-M" - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc /* * Match any ARMv7-M processor core. @@ -146,7 +146,7 @@ __v7m_proc_info: .long 0x000f0000 @ Mask for ID .long 0 @ proc_info_list.__cpu_mm_mmu_flags .long 0 @ proc_info_list.__cpu_io_mmu_flags - b __v7m_setup @ proc_info_list.__cpu_flush + initfn __v7m_setup, __v7m_proc_info @ proc_info_list.__cpu_flush .long cpu_arch_name .long cpu_elf_name .long HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT diff --git a/arch/arm/mm/proc-xsc3.S b/arch/arm/mm/proc-xsc3.S index f8acdfece036..293dcc2c441f 100644 --- a/arch/arm/mm/proc-xsc3.S +++ b/arch/arm/mm/proc-xsc3.S @@ -499,7 +499,7 @@ xsc3_crval: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .macro xsc3_proc_info name:req, cpu_val:req, cpu_mask:req .type __\name\()_proc_info,#object @@ -514,7 +514,7 @@ __\name\()_proc_info: .long PMD_TYPE_SECT | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b __xsc3_setup + initfn __xsc3_setup, __\name\()_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP diff --git a/arch/arm/mm/proc-xscale.S b/arch/arm/mm/proc-xscale.S index afa2b3c4df4a..b6bbfdb6dfdc 100644 --- a/arch/arm/mm/proc-xscale.S +++ b/arch/arm/mm/proc-xscale.S @@ -612,7 +612,7 @@ 
xscale_crval: .align - .section ".proc.info.init", #alloc, #execinstr + .section ".proc.info.init", #alloc .macro xscale_proc_info name:req, cpu_val:req, cpu_mask:req, cpu_name:req, cache .type __\name\()_proc_info,#object @@ -627,7 +627,7 @@ __\name\()_proc_info: .long PMD_TYPE_SECT | \ PMD_SECT_AP_WRITE | \ PMD_SECT_AP_READ - b __xscale_setup + initfn __xscale_setup, __\name\()_proc_info .long cpu_arch_name .long cpu_elf_name .long HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP -- 1.8.3.2 ^ permalink raw reply related [flat|nested] 22+ messages in thread
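The effect of the initfn macro can be modelled outside the kernel. The sketch below is a hedged illustration in Python, not kernel code: emit_initfn and resolve_initfn are hypothetical helper names, the addresses are made up, and 32-bit wraparound is assumed. It shows that the stored word (func - base) still resolves correctly after the whole image is moved, which is the position-independence property the commit message describes.

```python
MASK32 = 0xFFFFFFFF


def emit_initfn(func: int, base: int) -> int:
    """Model of '.long func - base': the 32-bit word stored in the struct."""
    return (func - base) & MASK32


def resolve_initfn(base: int, stored: int) -> int:
    """Model of 'ldr r12, [r10, #PROCINFO_INITFUNC]' + 'add r12, r12, r10'."""
    return (base + stored) & MASK32


# Made-up link-time addresses for a proc_info struct and its setup function.
link_base, link_func = 0xC0800000, 0xC0100040
word = emit_initfn(link_func, link_base)

# Run the image elsewhere: both symbols shift by the same (hypothetical) delta.
delta = 0x60000000 - 0xC0000000
run_base = (link_base + delta) & MASK32
run_func = (link_func + delta) & MASK32

# The same stored word still leads from the struct to the setup function.
assert resolve_initfn(link_base, word) == link_func
assert resolve_initfn(run_base, word) == run_func
```

Because the word is plain data, its range is the full 32-bit address space rather than the reach of a branch instruction.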
* [PATCH v2 1/8] ARM: replace PROCINFO embedded branch with relative offset
  2015-03-13 12:07 ` [PATCH v2 1/8] ARM: replace PROCINFO embedded branch with relative offset Ard Biesheuvel
@ 2015-04-19 16:59   ` Joachim Eastwood
  2015-04-19 17:08     ` Russell King - ARM Linux
  0 siblings, 1 reply; 22+ messages in thread
From: Joachim Eastwood @ 2015-04-19 16:59 UTC
To: linux-arm-kernel

Hi Ard,

On 13 March 2015 at 13:07, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> This patch replaces the 'branch to setup()' instructions embedded
> in the PROCINFO structs with the offset to that setup function
> relative to the base of the struct. This preserves the position
> independent nature of that field, but uses a data item rather
> than an instruction.
>
> This is mainly done to prevent linker failures on large kernels,
> where the setup function is out of reach for the branch.

This commit (bf35706f3d09 in Linus master) breaks booting on ARMv7-M.

When I try to boot Linus master now on my NXP LPC4357 (Cortex-M4) dev
kit I get the following message from u-boot.
## Booting kernel from Legacy Image at 29000000 ...
   Image Name:   Linux
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    1412318 Bytes = 1.3 MB
   Load Address: 28008000
   Entry Point:  28008001
   Verifying Checksum ... OK
   Loading Kernel Image ... OK
OK

Starting kernel ...

UNHANDLED EXCEPTION: HARD FAULT
  R0  = ffffffff  R1  = 00001038
  R2  = 281d8711  R3  = 00000000
  R12 = 2822092c  LR  = 28008023
  PC  = 2822092e  PSR = 21000000

Reverting bf35706f3d09 (plus fixing a small conflict) makes Linus
master boot again.

I am using the following compiler:
gcc version 4.9.2 20140904 (prerelease) (crosstool-NG
linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09

The ARMv7-M machine that I am using is not upstream yet, but you can
find the patch set on the mailing list.

regards,
Joachim Eastwood
* [PATCH v2 1/8] ARM: replace PROCINFO embedded branch with relative offset
  2015-04-19 16:59   ` Joachim Eastwood
@ 2015-04-19 17:08     ` Russell King - ARM Linux
  2015-04-19 17:41       ` Ard Biesheuvel
  2015-04-19 19:24       ` Joachim Eastwood
  0 siblings, 2 replies; 22+ messages in thread
From: Russell King - ARM Linux @ 2015-04-19 17:08 UTC
To: linux-arm-kernel

On Sun, Apr 19, 2015 at 06:59:45PM +0200, Joachim Eastwood wrote:
> Hi Ard,
> On 13 March 2015 at 13:07, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> > This patch replaces the 'branch to setup()' instructions embedded
> > in the PROCINFO structs with the offset to that setup function
> > relative to the base of the struct. This preserves the position
> > independent nature of that field, but uses a data item rather
> > than an instruction.
> >
> > This is mainly done to prevent linker failures on large kernels,
> > where the setup function is out of reach for the branch.
>
> This commit (bf35706f3d09 in Linus master) breaks booting on ARMv7-M.
>
> When I try to boot Linus master now on my NXP LPC4357 (Cortex-M4) dev
> kit I get the following message from u-boot.
> ## Booting kernel from Legacy Image at 29000000 ...
>    Image Name:   Linux
>    Image Type:   ARM Linux Kernel Image (uncompressed)
>    Data Size:    1412318 Bytes = 1.3 MB
>    Load Address: 28008000
>    Entry Point:  28008001
>    Verifying Checksum ... OK
>    Loading Kernel Image ... OK
> OK
>
> Starting kernel ...
>
> UNHANDLED EXCEPTION: HARD FAULT
>   R0  = ffffffff  R1  = 00001038
>   R2  = 281d8711  R3  = 00000000
>   R12 = 2822092c  LR  = 28008023
>   PC  = 2822092e  PSR = 21000000
>
> Reverting bf35706f3d09 (plus fixing a small conflict) makes Linus
> master boot again.
>
> I am using the following compiler:
> gcc version 4.9.2 20140904 (prerelease) (crosstool-NG
> linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09
>
> The ARMv7-M machine that I am using is not upstream yet, but you can
> find the patch set on the mailing list.

Interesting... it works here with stock gcc 4.9.2. Maybe it's a bug in
the Linaro gcc?

Could you mail me (privately) your vmlinux file (the one in the root
directory) for analysis please?

Thanks.

--
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
* [PATCH v2 1/8] ARM: replace PROCINFO embedded branch with relative offset
  2015-04-19 17:08     ` Russell King - ARM Linux
@ 2015-04-19 17:41       ` Ard Biesheuvel
  2015-04-19 19:28         ` Russell King - ARM Linux
  2015-04-19 19:24       ` Joachim Eastwood
  1 sibling, 1 reply; 22+ messages in thread
From: Ard Biesheuvel @ 2015-04-19 17:41 UTC
To: linux-arm-kernel

> On 19 Apr 2015, at 19:08, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
>
>> On Sun, Apr 19, 2015 at 06:59:45PM +0200, Joachim Eastwood wrote:
>> Hi Ard,
>>> On 13 March 2015 at 13:07, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>>> This patch replaces the 'branch to setup()' instructions embedded
>>> in the PROCINFO structs with the offset to that setup function
>>> relative to the base of the struct. This preserves the position
>>> independent nature of that field, but uses a data item rather
>>> than an instruction.
>>>
>>> This is mainly done to prevent linker failures on large kernels,
>>> where the setup function is out of reach for the branch.
>>
>> This commit (bf35706f3d09 in Linus master) breaks booting on ARMv7-M.
>>
>> When I try to boot Linus master now on my NXP LPC4357 (Cortex-M4) dev
>> kit I get the following message from u-boot.
>> ## Booting kernel from Legacy Image at 29000000 ...
>>    Image Name:   Linux
>>    Image Type:   ARM Linux Kernel Image (uncompressed)
>>    Data Size:    1412318 Bytes = 1.3 MB
>>    Load Address: 28008000
>>    Entry Point:  28008001
>>    Verifying Checksum ... OK
>>    Loading Kernel Image ... OK
>> OK
>>
>> Starting kernel ...
>>
>> UNHANDLED EXCEPTION: HARD FAULT
>>   R0  = ffffffff  R1  = 00001038
>>   R2  = 281d8711  R3  = 00000000
>>   R12 = 2822092c  LR  = 28008023
>>   PC  = 2822092e  PSR = 21000000
>>
>> Reverting bf35706f3d09 (plus fixing a small conflict) makes Linus
>> master boot again.
>>
>> I am using the following compiler:
>> gcc version 4.9.2 20140904 (prerelease) (crosstool-NG
>> linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09
>>
>> The ARMv7-M machine that I am using is not upstream yet, but you can
>> find the patch set on the mailing list.
>
> Interesting... it works here with stock gcc 4.9.2. Maybe it's a bug in
> the Linaro gcc?
>
> Could you mail me (privately) your vmlinux file (the one in the root
> directory) for analysis please?

I am away from my work PC, so I can't check, but I wonder if all setup
functions are correctly annotated as Thumb2 when built in Thumb2 mode.
If not, it would explain why a plain branch works but doing arithmetic
on the address doesn't.

Perhaps a BSYM() around the setup function in question is sufficient to
solve this?

Ard.
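The hypothesis in this message can be illustrated with a small model. On Thumb-capable cores the low bit of an indirect branch target selects the instruction set, and a core that executes Thumb only, such as ARMv7-M, faults when that bit is clear. The sketch below assumes, for illustration, that the Thumb bit ends up in the stored offset only when the toolchain knows the symbol is a Thumb function. The helper names and addresses are hypothetical, and this is not a diagnosis of the reported fault.

```python
MASK32 = 0xFFFFFFFF


def resolve(base: int, stored: int) -> int:
    """Model of 'ldr r12, [base, #off]' + 'add r12, r12, base' (32-bit wrap)."""
    return (base + stored) & MASK32


def selects_thumb(target: int) -> bool:
    """An indirect branch enters Thumb state only if bit 0 of the target is set."""
    return bool(target & 1)


# Hypothetical addresses: a proc_info struct and a Thumb setup routine.
base = 0x20001000
func = 0x20000400          # the code itself always sits at an even address

# Symbol known to be a Thumb function: its value carries bit 0.
annotated_word = ((func | 1) - base) & MASK32
# Symbol not typed as a Thumb function: bit 0 is lost in the subtraction.
unannotated_word = (func - base) & MASK32

assert selects_thumb(resolve(base, annotated_word))
assert not selects_thumb(resolve(base, unannotated_word))
```

A direct branch instruction does not depend on this bit, which is why a missing annotation could go unnoticed until the address is computed arithmetically.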
* [PATCH v2 1/8] ARM: replace PROCINFO embedded branch with relative offset 2015-04-19 17:41 ` Ard Biesheuvel @ 2015-04-19 19:28 ` Russell King - ARM Linux 2015-04-19 19:45 ` Joachim Eastwood 2015-04-19 21:52 ` Ard Biesheuvel 0 siblings, 2 replies; 22+ messages in thread From: Russell King - ARM Linux @ 2015-04-19 19:28 UTC (permalink / raw) To: linux-arm-kernel On Sun, Apr 19, 2015 at 07:41:08PM +0200, Ard Biesheuvel wrote: > I am away from my work pc so i can't check but i wonder if all setup > functions are correctly annotated as thumb2 when built in thumb2 mode. > If not, it would explain why a plain branch works but doing arithmetic > on the address doesn't. Yes, it's a Thumb2 kernel, but more importantly, it's a nommu kernel, and the nommu code wasn't touched. So, the entry code looks like this: 28008000: f8df 9024 ldr.w r9, [pc, #36] ; 28008028 <__after_proc_init+0x4> 28008004: f8d9 9000 ldr.w r9, [r9] 28008008: f001 f926 bl 28009258 <__lookup_processor_type> 2800800c: ea5f 0a05 movs.w sl, r5 28008010: f001 8164 beq.w 280092dc <__error_p> 28008014: f8df d014 ldr.w sp, [pc, #20] ; 2800802c <__after_proc_init+0x8> 28008018: f20f 0e07 addw lr, pc, #7 2800801c: f10a 0c10 add.w ip, sl, #16 28008020: 46e7 mov pc, ip 28008022: e7ff b.n 28008024 <__after_proc_init> which results in us jumping to: 2822091c <__proc_info_begin>: 2822091c: 000f0000 andeq r0, pc, r0 28220920: 000f0000 andeq r0, pc, r0 ... 2822092c: fff5ce6d ; <UNDEFINED> instruction: 0xfff5ce6d ^^^ here. 
That's an offset from the beginning of the structure, which gives us an address of 0x2817d789, which would be correct: 2817d788 <__v7m_setup>: 2817d788: 4829 ldr r0, [pc, #164] ; (2817d830 <v7m_processor_functions+0x30>) 2817d78a: f8df c0a8 ldr.w ip, [pc, #168] ; 2817d834 <v7m_processor_functions+0x34> 2817d78e: f8c0 c008 str.w ip, [r0, #8] The patch below should resolve it - Joachim, please confirm: diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S index 455033110078..5925449f6f04 100644 --- a/arch/arm/kernel/head-nommu.S +++ b/arch/arm/kernel/head-nommu.S @@ -80,9 +80,9 @@ ENTRY(stext) ldr r13, =__mmap_switched @ address to jump to after @ initialising sctlr adr lr, BSYM(1f) @ return (PIC) address - ARM( add pc, r10, #PROCINFO_INITFUNC ) - THUMB( add r12, r10, #PROCINFO_INITFUNC ) - THUMB( ret r12 ) + ldr r12, [r10, #PROCINFO_INITFUNC] + add r12, r12, r10 + ret r12 1: b __after_proc_init ENDPROC(stext) @@ -117,9 +117,9 @@ ENTRY(secondary_startup) adr lr, BSYM(__after_proc_init) @ return address mov r13, r12 @ __secondary_switched address - ARM( add pc, r10, #PROCINFO_INITFUNC ) - THUMB( add r12, r10, #PROCINFO_INITFUNC ) - THUMB( ret r12 ) + ldr r12, [r10, #PROCINFO_INITFUNC] + add r12, r12, r10 + ret r12 ENDPROC(secondary_startup) ENTRY(__secondary_switched) -- FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up according to speedtest.net. ^ permalink raw reply related [flat|nested] 22+ messages in thread
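The address arithmetic in the message above can be checked with a minimal Python sketch. It is not taken from the kernel sources: the base address, the stored word and the #16 field offset are read off the quoted disassembly, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # ARM registers are 32 bits wide, so additions wrap

PROCINFO_INITFUNC = 16  # field offset, as seen in 'add.w ip, sl, #16'


def old_nommu_target(procinfo_base: int) -> int:
    """Old head-nommu.S behaviour: branch to the field itself."""
    return (procinfo_base + PROCINFO_INITFUNC) & MASK32


def new_target(procinfo_base: int, stored_offset: int) -> int:
    """Fixed behaviour: load the offset word, then add the struct base."""
    return (procinfo_base + stored_offset) & MASK32


base = 0x2822091C    # __proc_info_begin in the dump
offset = 0xFFF5CE6D  # word found at base + 16

# The old code lands on the data word, which matches R12 in the fault dump.
print(hex(old_nommu_target(base)))
# The fixed code lands on __v7m_setup (0x2817d788) with the Thumb bit set.
print(hex(new_target(base, offset)))
```

The second result is odd because the stored offset already carries the Thumb bit of __v7m_setup, which is what a Thumb-only core such as the Cortex-M4 needs.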
* [PATCH v2 1/8] ARM: replace PROCINFO embedded branch with relative offset 2015-04-19 19:28 ` Russell King - ARM Linux @ 2015-04-19 19:45 ` Joachim Eastwood 2015-04-19 21:52 ` Ard Biesheuvel 1 sibling, 0 replies; 22+ messages in thread From: Joachim Eastwood @ 2015-04-19 19:45 UTC (permalink / raw) To: linux-arm-kernel On 19 April 2015 at 21:28, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Sun, Apr 19, 2015 at 07:41:08PM +0200, Ard Biesheuvel wrote: >> I am away from my work pc so i can't check but i wonder if all setup >> functions are correctly annotated as thumb2 when built in thumb2 mode. >> If not, it would explain why a plain branch works but doing arithmetic >> on the address doesn't. > > Yes, it's a Thumb2 kernel, but more importantly, it's a nommu kernel, > and the nommu code wasn't touched. > > So, the entry code looks like this: > > 28008000: f8df 9024 ldr.w r9, [pc, #36] ; 28008028 <__after_proc_init+0x4> > 28008004: f8d9 9000 ldr.w r9, [r9] > 28008008: f001 f926 bl 28009258 <__lookup_processor_type> > 2800800c: ea5f 0a05 movs.w sl, r5 > 28008010: f001 8164 beq.w 280092dc <__error_p> > 28008014: f8df d014 ldr.w sp, [pc, #20] ; 2800802c <__after_proc_init+0x8> > 28008018: f20f 0e07 addw lr, pc, #7 > 2800801c: f10a 0c10 add.w ip, sl, #16 > 28008020: 46e7 mov pc, ip > 28008022: e7ff b.n 28008024 <__after_proc_init> > > which results in us jumping to: > > 2822091c <__proc_info_begin>: > 2822091c: 000f0000 andeq r0, pc, r0 > 28220920: 000f0000 andeq r0, pc, r0 > ... > 2822092c: fff5ce6d ; <UNDEFINED> instruction: 0xfff5ce6d > > ^^^ here. 
That's an offset from the beginning of the structure, which > gives us an address of 0x2817d789, which would be correct: > > 2817d788 <__v7m_setup>: > 2817d788: 4829 ldr r0, [pc, #164] ; (2817d830 <v7m_processor_functions+0x30>) > 2817d78a: f8df c0a8 ldr.w ip, [pc, #168] ; 2817d834 <v7m_processor_functions+0x34> > 2817d78e: f8c0 c008 str.w ip, [r0, #8] > > The patch below should resolve it - Joachim, please confirm: Yep, patch below makes Linus master boot again on my Cortex-M4 board. Tested-by: Joachim Eastwood <manabian@gmail.com> Thanks for debugging and fixing the problem Russell. regards, Joachim Eastwood > diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S > index 455033110078..5925449f6f04 100644 > --- a/arch/arm/kernel/head-nommu.S > +++ b/arch/arm/kernel/head-nommu.S > @@ -80,9 +80,9 @@ ENTRY(stext) > ldr r13, =__mmap_switched @ address to jump to after > @ initialising sctlr > adr lr, BSYM(1f) @ return (PIC) address > - ARM( add pc, r10, #PROCINFO_INITFUNC ) > - THUMB( add r12, r10, #PROCINFO_INITFUNC ) > - THUMB( ret r12 ) > + ldr r12, [r10, #PROCINFO_INITFUNC] > + add r12, r12, r10 > + ret r12 > 1: b __after_proc_init > ENDPROC(stext) > > @@ -117,9 +117,9 @@ ENTRY(secondary_startup) > > adr lr, BSYM(__after_proc_init) @ return address > mov r13, r12 @ __secondary_switched address > - ARM( add pc, r10, #PROCINFO_INITFUNC ) > - THUMB( add r12, r10, #PROCINFO_INITFUNC ) > - THUMB( ret r12 ) > + ldr r12, [r10, #PROCINFO_INITFUNC] > + add r12, r12, r10 > + ret r12 > ENDPROC(secondary_startup) > > ENTRY(__secondary_switched) > > > -- > FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up > according to speedtest.net. ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v2 1/8] ARM: replace PROCINFO embedded branch with relative offset 2015-04-19 19:28 ` Russell King - ARM Linux 2015-04-19 19:45 ` Joachim Eastwood @ 2015-04-19 21:52 ` Ard Biesheuvel 1 sibling, 0 replies; 22+ messages in thread From: Ard Biesheuvel @ 2015-04-19 21:52 UTC (permalink / raw) To: linux-arm-kernel On 19 April 2015 at 21:28, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Sun, Apr 19, 2015 at 07:41:08PM +0200, Ard Biesheuvel wrote: >> I am away from my work pc so i can't check but i wonder if all setup >> functions are correctly annotated as thumb2 when built in thumb2 mode. >> If not, it would explain why a plain branch works but doing arithmetic >> on the address doesn't. > > Yes, it's a Thumb2 kernel, but more importantly, it's a nommu kernel, > and the nommu code wasn't touched. > Ah, my bad. I had no idea that code was duplicated elsewhere, but I guess grepping for PROCINFO_INITFUNC would have given me a strong hint. > So, the entry code looks like this: > > 28008000: f8df 9024 ldr.w r9, [pc, #36] ; 28008028 <__after_proc_init+0x4> > 28008004: f8d9 9000 ldr.w r9, [r9] > 28008008: f001 f926 bl 28009258 <__lookup_processor_type> > 2800800c: ea5f 0a05 movs.w sl, r5 > 28008010: f001 8164 beq.w 280092dc <__error_p> > 28008014: f8df d014 ldr.w sp, [pc, #20] ; 2800802c <__after_proc_init+0x8> > 28008018: f20f 0e07 addw lr, pc, #7 > 2800801c: f10a 0c10 add.w ip, sl, #16 > 28008020: 46e7 mov pc, ip OK, there's still a dodgy bit here. The issue I pointed out in my previous email actually does exist, i.e., the setup functions are not always annotated as thumb2 so the offset from the base of the struct may lack the thumb bit even if the function is coded in thumb2. This is caused by the fact that local labels lack this annotation, even if the function is emitted into a separate section in the same object file and references to it are resolved by the linker through relocations. 
Looking at a couple of procinfo entries from proc-v7.S, it turns out that the offset field (the 1st word on the 2nd line) indeed contains even values in some cases in a Thumb2 kernel c0771364 <__v7_ca9mp_proc_info>: c0771364: 410fc090 ff0ffff0 00011c0e 00000c02 ...A............ c0771374: ffaab634 c0773934 c077393a 00008097 4...49w.:9w..... ... c0771398 <__v7_ca8_proc_info>: c0771398: 410fc080 ff0ffff0 00011c0e 00000c02 ...A............ c07713a8: ffaab667 c0773934 c077393a 00008097 g...49w.:9w..... ... c07713cc <__v7_pj4b_proc_info>: c07713cc: 560f5800 ff0fff00 00011c0e 00000c02 .X.V............ c07713dc: ffaab5ee c0773934 c077393a 00008097 ....49w.:9w..... ... c0771400 <__v7_cr7mp_proc_info>: c0771400: 410fc170 ff0ffff0 00011c0e 00000c02 p..A............ c0771410: ffaab598 c0773934 c077393a 00008097 ....49w.:9w..... ... c0771434 <__v7_ca7mp_proc_info>: c0771434: 410fc070 ff0ffff0 00011c0e 00000c02 p..A............ c0771444: ffaab56a c0773934 c077393a 00008097 j...49w.:9w..... ... but we are getting lucky because the 'ret r12' instruction from head{-nommu}.S is emitted as 'mov pc, ip', which is a [for v7] deprecated method of performing a branch-to-register which doesn't incur a mode switch. In other words, if we'd use the architecturally correct 'bx ip' here, the code breaks. As far as I can tell, there are no such setup functions that could run on a Thumb2 capable CPU but are emitted in ARM code explicitly, so I think the fix could be as simple as diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S index c671f345266a..a4f6d74e9e21 100644 --- a/arch/arm/mm/proc-macros.S +++ b/arch/arm/mm/proc-macros.S @@ -333,7 +333,7 @@ ENTRY(\name\()_tlb_fns) .endm .macro initfn, func, base - .long \func - \base + .long BSYM(\func) - \base .endm /* ^ permalink raw reply related [flat|nested] 22+ messages in thread
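The Thumb-bit problem described above can be illustrated with a small Python sketch. It assumes that BSYM() adds 1 to a symbol in a Thumb2 kernel and leaves it alone otherwise; the helper names are hypothetical, and the base and offset values are the __v7_ca9mp_proc_info entry from the dump.

```python
MASK32 = 0xFFFFFFFF


def bsym(addr: int, thumb2_kernel: bool = True) -> int:
    """Assumed model of BSYM(): sym + 1 in a Thumb2 kernel, sym otherwise."""
    return addr + 1 if thumb2_kernel else addr


def initfn_offset(func_addr: int, base: int) -> int:
    """What '.long func - base' stores in the procinfo struct."""
    return (func_addr - base) & MASK32


def resolve(base: int, stored_offset: int) -> int:
    """What the boot code computes before branching."""
    return (base + stored_offset) & MASK32


def state_after_bx(addr: int) -> str:
    """An interworking branch (bx) picks the state from bit 0."""
    return "Thumb" if addr & 1 else "ARM"


base = 0xC0771364    # __v7_ca9mp_proc_info
stored = 0xFFAAB634  # even offset word from the dump

func = resolve(base, stored)
print(hex(func), state_after_bx(func))    # even address: bx would select ARM

fixed = initfn_offset(bsym(func), base)   # offset as emitted with BSYM()
print(hex(fixed), state_after_bx(resolve(base, fixed)))
```

With 'mov pc, ip' bit 0 is not used to select the instruction set, which is why the even offsets happened to work.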
* [PATCH v2 1/8] ARM: replace PROCINFO embedded branch with relative offset 2015-04-19 17:08 ` Russell King - ARM Linux 2015-04-19 17:41 ` Ard Biesheuvel @ 2015-04-19 19:24 ` Joachim Eastwood 1 sibling, 0 replies; 22+ messages in thread From: Joachim Eastwood @ 2015-04-19 19:24 UTC (permalink / raw) To: linux-arm-kernel On 19 April 2015 at 19:08, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Sun, Apr 19, 2015 at 06:59:45PM +0200, Joachim Eastwood wrote: >> Hi Ard, >> On 13 March 2015 at 13:07, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: >> > This patch replaces the 'branch to setup()' instructions embedded >> > in the PROCINFO structs with the offset to that setup function >> > relative to the base of the struct. This preserves the position >> > independent nature of that field, but uses a data item rather >> > than an instruction. >> > >> > This is mainly done to prevent linker failures on large kernels, >> > where the setup function is out of reach for the branch. >> >> This commit (bf35706f3d09 in Linus master) breaks booting on ARMv7-M. >> >> When I try to boot Linus master now on my NXP LPC4357 (Cortex-M4) dev >> kit I get the following message from u-boot. >> ## Booting kernel from Legacy Image at 29000000 ... >> Image Name: Linux >> Image Type: ARM Linux Kernel Image (uncompressed) >> Data Size: 1412318 Bytes = 1.3 MB >> Load Address: 28008000 >> Entry Point: 28008001 >> Verifying Checksum ... OK >> Loading Kernel Image ... OK >> OK >> >> Starting kernel ... >> >> UNHANDLED EXCEPTION: HARD FAULT >> R0 = ffffffff R1 = 00001038 >> R2 = 281d8711 R3 = 00000000 >> R12 = 2822092c LR = 28008023 >> PC = 2822092e PSR = 21000000 >> >> Reverting bf35706f3d09 (plus fixing a small conflict) makes Linus >> master boot again. 
>> >> I am using the following compiler: >> gcc version 4.9.2 20140904 (prerelease) (crosstool-NG >> linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09 >> >> The ARMv7-M machine that I am using is not upstream yet, but you can >> find the patch set on the mailing list. > > Interesting... it works here with stock gcc 4.9.2. Maybe it's a bug in > the Linaro gcc? I tried the ARM crosscompiler from https://www.kernel.org/pub/tools/crosstool/files/bin/x86_64/4.9.0/ and it gives me the same result as the Linaro one. gcc version 4.9.0 (GCC) > Could you mail me (privately) your vmlinux file (the one in the root > directory) for analysis please? Sure (mail already sent). regards, Joachim Eastwood ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v2 2/8] ARM: move HYP text to end of .text section 2015-03-13 12:07 [PATCH v2 0/8] ARM kernel size fixes Ard Biesheuvel 2015-03-13 12:07 ` [PATCH v2 1/8] ARM: replace PROCINFO embedded branch with relative offset Ard Biesheuvel @ 2015-03-13 12:07 ` Ard Biesheuvel 2015-03-13 12:07 ` [PATCH v2 3/8] ARM: add macro to perform far branches (b/bl) Ard Biesheuvel ` (6 subsequent siblings) 8 siblings, 0 replies; 22+ messages in thread From: Ard Biesheuvel @ 2015-03-13 12:07 UTC (permalink / raw) To: linux-arm-kernel The HYP text is essentially a separate binary from the kernel proper, so it can be moved away from the rest of the kernel. This helps prevent link failures due to branch relocations exceeding their range. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- arch/arm/kernel/vmlinux.lds.S | 8 ++++++-- arch/arm/kvm/init.S | 5 +---- arch/arm/kvm/interrupts.S | 4 +--- 3 files changed, 8 insertions(+), 9 deletions(-) diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S index b31aa73e8076..e3b9403bd2d6 100644 --- a/arch/arm/kernel/vmlinux.lds.S +++ b/arch/arm/kernel/vmlinux.lds.S @@ -22,11 +22,14 @@ ALIGN_FUNCTION(); \ VMLINUX_SYMBOL(__idmap_text_start) = .; \ *(.idmap.text) \ - VMLINUX_SYMBOL(__idmap_text_end) = .; \ + VMLINUX_SYMBOL(__idmap_text_end) = .; + +#define HYP_TEXT \ . = ALIGN(32); \ VMLINUX_SYMBOL(__hyp_idmap_text_start) = .; \ *(.hyp.idmap.text) \ - VMLINUX_SYMBOL(__hyp_idmap_text_end) = .; + VMLINUX_SYMBOL(__hyp_idmap_text_end) = .; \ + *(.hyp.text) #ifdef CONFIG_HOTPLUG_CPU #define ARM_CPU_DISCARD(x) @@ -118,6 +121,7 @@ SECTIONS . = ALIGN(4); *(.got) /* Global offset table */ ARM_CPU_KEEP(PROC_INFO) + HYP_TEXT } #ifdef CONFIG_DEBUG_RODATA diff --git a/arch/arm/kvm/init.S b/arch/arm/kvm/init.S index 3988e72d16ff..7a377d36de5d 100644 --- a/arch/arm/kvm/init.S +++ b/arch/arm/kvm/init.S @@ -51,8 +51,7 @@ * Switches to the runtime PGD, set stack and vectors. 
*/ - .text - .pushsection .hyp.idmap.text,"ax" + .section ".hyp.idmap.text", #alloc .align 5 __kvm_hyp_init: .globl __kvm_hyp_init @@ -155,5 +154,3 @@ target: @ We're now in the trampoline code, switch page tables .globl __kvm_hyp_init_end __kvm_hyp_init_end: - - .popsection diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S index 79caf79b304a..db22e9bedfcd 100644 --- a/arch/arm/kvm/interrupts.S +++ b/arch/arm/kvm/interrupts.S @@ -27,7 +27,7 @@ #include <asm/vfpmacros.h> #include "interrupts_head.S" - .text + .section ".hyp.text", #alloc __kvm_hyp_code_start: .globl __kvm_hyp_code_start @@ -316,8 +316,6 @@ THUMB( orr r2, r2, #PSR_T_BIT ) eret .endm - .text - .align 5 __kvm_hyp_vector: .globl __kvm_hyp_vector -- 1.8.3.2 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH v2 3/8] ARM: add macro to perform far branches (b/bl) 2015-03-13 12:07 [PATCH v2 0/8] ARM kernel size fixes Ard Biesheuvel 2015-03-13 12:07 ` [PATCH v2 1/8] ARM: replace PROCINFO embedded branch with relative offset Ard Biesheuvel 2015-03-13 12:07 ` [PATCH v2 2/8] ARM: move HYP text to end of .text section Ard Biesheuvel @ 2015-03-13 12:07 ` Ard Biesheuvel 2015-03-13 16:40 ` Russell King - ARM Linux 2015-03-18 10:07 ` Ard Biesheuvel 2015-03-13 12:07 ` [PATCH v2 4/8] ARM: use bl_far to call __hyp_stub_install_secondary from the .data section Ard Biesheuvel ` (5 subsequent siblings) 8 siblings, 2 replies; 22+ messages in thread From: Ard Biesheuvel @ 2015-03-13 12:07 UTC (permalink / raw) To: linux-arm-kernel These macros execute PC-relative branches, but with a larger reach than the 24 bits that are available in the b and bl opcodes. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- arch/arm/include/asm/assembler.h | 83 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 83 insertions(+) diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h index f67fd3afebdf..2e7f55194782 100644 --- a/arch/arm/include/asm/assembler.h +++ b/arch/arm/include/asm/assembler.h @@ -88,6 +88,17 @@ #endif /* + * The program counter is always ahead of the address of the currently + * executing instruction by PC_BIAS bytes, whose value differs depending + * on the execution mode. + */ +#ifdef CONFIG_THUMB2_KERNEL +#define PC_BIAS 4 +#else +#define PC_BIAS 8 +#endif + +/* * Enable and disable interrupts */ #if __LINUX_ARM_ARCH__ >= 6 @@ -108,6 +119,78 @@ .endm #endif + /* + * Macros to emit relative conditional branches that may exceed the + * range of the 24-bit immediate of the ordinary b/bl instructions. 
+ * NOTE: this doesn't work with locally defined symbols, as they + * lack the ARM/Thumb annotation (even if they are annotated as + * functions) + */ + .macro b_far, target, tmpreg, c= +#if defined(CONFIG_CPU_32v7) || defined(CONFIG_CPU_32v7M) + movt\c \tmpreg, #:upper16:(\target - (8888f + PC_BIAS)) + movw\c \tmpreg, #:lower16:(\target - (8888f + PC_BIAS)) +8888: add\c pc, pc, \tmpreg +#else + ldr\c \tmpreg, 8889f +8888: add\c pc, pc, \tmpreg + .ifnb \c + b 8890f + .endif +8889: .long \target - (8888b + PC_BIAS) +8890: +#endif + .endm + + .macro bl_far, target, c= +#if defined(CONFIG_CPU_32v7) || defined(CONFIG_CPU_32v7M) + movt\c ip, #:upper16:(\target - (8887f + PC_BIAS)) + movw\c ip, #:lower16:(\target - (8887f + PC_BIAS)) +8887: add\c ip, ip, pc + blx\c ip +#else + adr\c lr, 8887f + b_far \target, ip, \c +8887: +#endif + .endm + + /* + * Macros to emit absolute conditional branches: these are preferred + * over the far variants above because they use fewer instructions + * and/or use implicit literals that the assembler can group together + * to optimize cache utilization. However, they can only be used to + * call functions at their link time address, which rules out early boot + * code that executes with the MMU off. 
+ * The v7 variant uses a movt/movw pair to prevent potential D-cache + * stalls on the literal, so using these macros is preferred over using + * 'ldr pc, =XXX' directly (unless no scratch register is available) + * NOTE: this doesn't work with locally defined symbols, as they + * lack the ARM/Thumb annotation (even if they are annotated as + * functions) + */ + .macro b_abs, target, tmpreg, c= +#if defined(CONFIG_CPU_32v7) || defined(CONFIG_CPU_32v7M) + movt\c \tmpreg, #:upper16:\target + movw\c \tmpreg, #:lower16:\target + bx\c \tmpreg +#else + ldr\c pc, =\target +#endif + .endm + + .macro bl_abs, target, c= +#if defined(CONFIG_CPU_32v7) || defined(CONFIG_CPU_32v7M) + movt\c lr, #:upper16:\target + movw\c lr, #:lower16:\target + blx\c lr +#else + adr\c lr, BSYM(8886f) + ldr\c pc, =\target +8886: +#endif + .endm + .macro asm_trace_hardirqs_off #if defined(CONFIG_TRACE_IRQFLAGS) stmdb sp!, {r0-r3, ip, lr} -- 1.8.3.2 ^ permalink raw reply related [flat|nested] 22+ messages in thread
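The role of PC_BIAS in the v7 variant of b_far can be checked with a short Python model. This is a sketch under stated assumptions, not the macro itself: the addresses are hypothetical, and the model applies movw before movt because movw writes a zero-extended 16-bit immediate and therefore clears the upper halfword.

```python
MASK32 = 0xFFFFFFFF


def pc_bias(thumb2_kernel: bool) -> int:
    """PC reads as the instruction address plus 4 (Thumb) or 8 (ARM)."""
    return 4 if thumb2_kernel else 8


def link_time_offset(target: int, label: int, thumb2_kernel: bool) -> int:
    """Value of '\\target - (8888f + PC_BIAS)' as resolved at link time."""
    return (target - (label + pc_bias(thumb2_kernel))) & MASK32


def movw(imm16: int) -> int:
    """movw: load the low halfword and zero the high halfword."""
    return imm16 & 0xFFFF


def movt(reg: int, imm16: int) -> int:
    """movt: replace the high halfword and keep the low halfword."""
    return ((imm16 & 0xFFFF) << 16) | (reg & 0xFFFF)


def b_far(target: int, label: int, thumb2_kernel: bool) -> int:
    """Return the address that 'add pc, pc, tmp' at `label` branches to."""
    off = link_time_offset(target, label, thumb2_kernel)
    tmp = movw(off & 0xFFFF)
    tmp = movt(tmp, off >> 16)
    pc_as_read = label + pc_bias(thumb2_kernel)
    return (pc_as_read + tmp) & MASK32


label = 0xC0008100   # hypothetical address of the 8888 label
target = 0xC2345678  # hypothetical target, beyond the 32 MB reach of b/bl

print(hex(b_far(target, label, thumb2_kernel=False)))
print(hex(b_far(target, label, thumb2_kernel=True)))
```

Both calls return the target because the bias subtracted at link time is exactly the bias the CPU adds when the PC is read.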
* [PATCH v2 3/8] ARM: add macro to perform far branches (b/bl)
  2015-03-13 12:07 ` [PATCH v2 3/8] ARM: add macro to perform far branches (b/bl) Ard Biesheuvel
@ 2015-03-13 16:40   ` Russell King - ARM Linux
  2015-03-17 20:35     ` Ard Biesheuvel
  2015-03-18 10:07   ` Ard Biesheuvel
  1 sibling, 1 reply; 22+ messages in thread
From: Russell King - ARM Linux @ 2015-03-13 16:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Mar 13, 2015 at 01:07:27PM +0100, Ard Biesheuvel wrote:
> +	.macro	bl_abs, target, c=
> +#if defined(CONFIG_CPU_32v7) || defined(CONFIG_CPU_32v7M)
> +	movt\c	lr, #:upper16:\target
> +	movw\c	lr, #:lower16:\target
> +	blx\c	lr

So I've looked this up, and it's valid, which is surprising because BLX
itself writes to LR - the read from LR must happen before BLX itself
writes to LR. Thankfully, because of the pipelining, this is probably
guaranteed.

I wonder whether there will be any errata on this... maybe on non-ARM
CPUs? It'll be interesting to find out what happens once we merge
this... :)

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 22+ messages in thread
* [PATCH v2 3/8] ARM: add macro to perform far branches (b/bl)
  2015-03-13 16:40   ` Russell King - ARM Linux
@ 2015-03-17 20:35     ` Ard Biesheuvel
  0 siblings, 0 replies; 22+ messages in thread
From: Ard Biesheuvel @ 2015-03-17 20:35 UTC (permalink / raw)
  To: linux-arm-kernel

On 13 March 2015 at 17:40, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Mar 13, 2015 at 01:07:27PM +0100, Ard Biesheuvel wrote:
>> +	.macro	bl_abs, target, c=
>> +#if defined(CONFIG_CPU_32v7) || defined(CONFIG_CPU_32v7M)
>> +	movt\c	lr, #:upper16:\target
>> +	movw\c	lr, #:lower16:\target
>> +	blx\c	lr
>
> So I've looked this up, and it's valid, which is surprising because BLX
> itself writes to LR - the read from LR must happen before BLX itself
> writes to LR. Thankfully, because of the pipelining, this is probably
> guaranteed.
>

I hadn't given it another thought, to be honest, as arithmetic
instructions can also use the same register as input and output. But I
suppose branch instructions don't go through all the ordinary pipeline
stages.

> I wonder whether there will be any errata on this... maybe on non-ARM
> CPUs? It'll be interesting to find out what happens once we merge
> this... :)
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 3/8] ARM: add macro to perform far branches (b/bl) 2015-03-13 12:07 ` [PATCH v2 3/8] ARM: add macro to perform far branches (b/bl) Ard Biesheuvel 2015-03-13 16:40 ` Russell King - ARM Linux @ 2015-03-18 10:07 ` Ard Biesheuvel 2015-03-19 9:01 ` [PATCH] " Ard Biesheuvel 1 sibling, 1 reply; 22+ messages in thread From: Ard Biesheuvel @ 2015-03-18 10:07 UTC (permalink / raw) To: linux-arm-kernel On 13 March 2015 at 13:07, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > These macros execute PC-relative branches, but with a larger > reach than the 24 bits that are available in the b and bl opcodes. > > Acked-by: Nicolas Pitre <nico@linaro.org> > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> > --- > arch/arm/include/asm/assembler.h | 83 ++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 83 insertions(+) > > diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h > index f67fd3afebdf..2e7f55194782 100644 > --- a/arch/arm/include/asm/assembler.h > +++ b/arch/arm/include/asm/assembler.h > @@ -88,6 +88,17 @@ > #endif > > /* > + * The program counter is always ahead of the address of the currently > + * executing instruction by PC_BIAS bytes, whose value differs depending > + * on the execution mode. > + */ > +#ifdef CONFIG_THUMB2_KERNEL > +#define PC_BIAS 4 > +#else > +#define PC_BIAS 8 > +#endif > + > +/* > * Enable and disable interrupts > */ > #if __LINUX_ARM_ARCH__ >= 6 > @@ -108,6 +119,78 @@ > .endm > #endif > > + /* > + * Macros to emit relative conditional branches that may exceed the > + * range of the 24-bit immediate of the ordinary b/bl instructions. 
> + * NOTE: this doesn't work with locally defined symbols, as they > + * lack the ARM/Thumb annotation (even if they are annotated as > + * functions) > + */ > + .macro b_far, target, tmpreg, c= > +#if defined(CONFIG_CPU_32v7) || defined(CONFIG_CPU_32v7M) > + movt\c \tmpreg, #:upper16:(\target - (8888f + PC_BIAS)) > + movw\c \tmpreg, #:lower16:(\target - (8888f + PC_BIAS)) > +8888: add\c pc, pc, \tmpreg > +#else > + ldr\c \tmpreg, 8889f > +8888: add\c pc, pc, \tmpreg > + .ifnb \c > + b 8890f > + .endif > +8889: .long \target - (8888b + PC_BIAS) > +8890: > +#endif > + .endm Actually, I have found something better: add\c \tmpreg, pc, #:pc_g0_nc:\target - PC_BIAS add\c \tmpreg, \tmpreg, #:pc_g1_nc:\target - PC_BIAS + 4 add\c pc, \tmpreg, #:pc_g2:\target - PC_BIAS + 8 This uses a PC-relative group relocation to split the offset into 12-bit chunks and poke them into the add instructions This way, we don't need the literal at all. Note that add with pc as destination is ARM-only, so we should probably retain the v7 movw/movt regardless > + > + .macro bl_far, target, c= > +#if defined(CONFIG_CPU_32v7) || defined(CONFIG_CPU_32v7M) > + movt\c ip, #:upper16:(\target - (8887f + PC_BIAS)) > + movw\c ip, #:lower16:(\target - (8887f + PC_BIAS)) > +8887: add\c ip, ip, pc > + blx\c ip > +#else > + adr\c lr, 8887f > + b_far \target, ip, \c > +8887: > +#endif > + .endm > + > + /* > + * Macros to emit absolute conditional branches: these are preferred > + * over the far variants above because they use fewer instructions > + * and/or use implicit literals that the assembler can group together > + * to optimize cache utilization. However, they can only be used to > + * call functions at their link time address, which rules out early boot > + * code that executes with the MMU off. 
> + * The v7 variant uses a movt/movw pair to prevent potential D-cache > + * stalls on the literal, so using these macros is preferred over using > + * 'ldr pc, =XXX' directly (unless no scratch register is available) > + * NOTE: this doesn't work with locally defined symbols, as they > + * lack the ARM/Thumb annotation (even if they are annotated as > + * functions) > + */ > + .macro b_abs, target, tmpreg, c= > +#if defined(CONFIG_CPU_32v7) || defined(CONFIG_CPU_32v7M) > + movt\c \tmpreg, #:upper16:\target > + movw\c \tmpreg, #:lower16:\target > + bx\c \tmpreg > +#else > + ldr\c pc, =\target > +#endif > + .endm > + > + .macro bl_abs, target, c= > +#if defined(CONFIG_CPU_32v7) || defined(CONFIG_CPU_32v7M) > + movt\c lr, #:upper16:\target > + movw\c lr, #:lower16:\target > + blx\c lr > +#else > + adr\c lr, BSYM(8886f) > + ldr\c pc, =\target > +8886: > +#endif > + .endm > + > .macro asm_trace_hardirqs_off > #if defined(CONFIG_TRACE_IRQFLAGS) > stmdb sp!, {r0-r3, ip, lr} > -- > 1.8.3.2 > ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH] ARM: add macro to perform far branches (b/bl) 2015-03-18 10:07 ` Ard Biesheuvel @ 2015-03-19 9:01 ` Ard Biesheuvel 0 siblings, 0 replies; 22+ messages in thread From: Ard Biesheuvel @ 2015-03-19 9:01 UTC (permalink / raw) To: linux-arm-kernel OK, so this is what I came up with in the end. I dropped b_abs/bl_abs as they are not needed anymore, now that b_far/bl_far are emitted without any explicit or implicit literals. I updated the ARCH check so that movw/movt/ really only gets used on v7 targeted builds. I also updated the v7 variant to use bx instead of adding with the PC as destination register, as this is deprecated by the ARM ARM. --------------------8<----------------------- These macros execute PC-relative branches, but with a larger reach than the 24 bits that are available in the b and bl opcodes. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- arch/arm/include/asm/assembler.h | 44 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h index f67fd3afebdf..1b9a630f93e0 100644 --- a/arch/arm/include/asm/assembler.h +++ b/arch/arm/include/asm/assembler.h @@ -88,6 +88,17 @@ #endif /* + * The program counter is always ahead of the address of the currently + * executing instruction by PC_BIAS bytes, whose value differs depending + * on the execution mode. + */ +#ifdef CONFIG_THUMB2_KERNEL +#define PC_BIAS 4 +#else +#define PC_BIAS 8 +#endif + +/* * Enable and disable interrupts */ #if __LINUX_ARM_ARCH__ >= 6 @@ -108,6 +119,39 @@ .endm #endif + /* + * Macros to emit relative conditional branches that may exceed the + * range of the 24-bit immediate of the ordinary b/bl instructions. 
+ * NOTE: this doesn't work with locally defined symbols, as they + * lack the ARM/Thumb annotation (even if they are annotated as + * functions) + */ + .macro b_far, target, r, c=, b=bx +#if __LINUX_ARM_ARCH__ >= 7 + movt\c \r, #:upper16:(\target - (8888f + PC_BIAS)) + movw\c \r, #:lower16:(\target - (8888f + PC_BIAS)) +8888: add\c \r, \r, pc + \b\c \r +#else + /* + * Compute the PC-relative offset of \target. We need to correct for + * the bias when reading the PC at label 8888, and for the offset + * between the place of the read and the place of the relocation. + */ +8888: add\c \r, pc, #:pc_g0_nc:(\target - PC_BIAS + (. - 8888b)) + add\c \r, \r, #:pc_g1_nc:(\target - PC_BIAS + (. - 8888b)) + add\c pc, \r, #:pc_g2:(\target - PC_BIAS + (. - 8888b)) +#endif + .endm + + .macro bl_far, target, c= +#if __LINUX_ARM_ARCH__ < 7 + adr\c lr, 8887f +#endif + b_far \target, ip, \c, blx +8887: + .endm + .macro asm_trace_hardirqs_off #if defined(CONFIG_TRACE_IRQFLAGS) stmdb sp!, {r0-r3, ip, lr} -- 1.8.3.2 ^ permalink raw reply related [flat|nested] 22+ messages in thread
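How the linker could split an offset across the three add instructions of the pre-v7 variant can be sketched in Python. The splitting rule below follows my understanding of the ARM ELF ALU group relocations (each chunk is an 8-bit field at an even bit position, taken from the most significant set bits downwards); treat it as an illustration rather than a reference implementation. The function name and sample values are hypothetical.

```python
def alu_groups(value: int, count: int = 3):
    """Split |value| into `count` chunks, each encodable as an ARM immediate.

    Returns (chunks, residual). A non-zero residual means the offset does
    not fit, which the checked pc_g2 relocation would report as an error.
    Negative offsets are handled by magnitude here; the linker would turn
    the add instructions into sub instructions in that case.
    """
    residual = abs(value)
    chunks = []
    for _ in range(count):
        if residual == 0:
            chunks.append(0)
            continue
        msb = residual.bit_length() - 1
        # 8-bit window at an even shift that still covers the top set bit.
        shift = max(0, (msb | 1) - 7)
        chunk = residual & (0xFF << shift)
        chunks.append(chunk)
        residual &= ~chunk
    return chunks, residual


chunks, residual = alu_groups(0x123456)
print([hex(c) for c in chunks], hex(residual))

# An offset with too many significant bits leaves bits unaccounted for.
too_wide_chunks, too_wide_residual = alu_groups(0x12345678)
print([hex(c) for c in too_wide_chunks], hex(too_wide_residual))
```

The three chunks sum to the original offset when the residual is zero, so the add sequence needs no literal pool entry.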
* [PATCH v2 4/8] ARM: use bl_far to call __hyp_stub_install_secondary from the .data section
  2015-03-13 12:07 [PATCH v2 0/8] ARM kernel size fixes Ard Biesheuvel
                   ` (2 preceding siblings ...)
  2015-03-13 12:07 ` [PATCH v2 3/8] ARM: add macro to perform far branches (b/bl) Ard Biesheuvel
@ 2015-03-13 12:07 ` Ard Biesheuvel
  2015-03-13 12:07 ` [PATCH v2 5/8] ARM: move the .idmap.text section closer to .head.text Ard Biesheuvel
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Ard Biesheuvel @ 2015-03-13 12:07 UTC (permalink / raw)
  To: linux-arm-kernel

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm/kernel/sleep.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/kernel/sleep.S b/arch/arm/kernel/sleep.S
index e1e60e5a7a27..0ea3813fedce 100644
--- a/arch/arm/kernel/sleep.S
+++ b/arch/arm/kernel/sleep.S
@@ -128,7 +128,7 @@ ENDPROC(cpu_resume_after_mmu)
 ENTRY(cpu_resume)
 ARM_BE8(setend be)			@ ensure we are in BE mode
 #ifdef CONFIG_ARM_VIRT_EXT
-	bl	__hyp_stub_install_secondary
+	bl_far	__hyp_stub_install_secondary
 #endif
 	safe_svcmode_maskall	r1
 	mov	r1, #0
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 22+ messages in thread
* [PATCH v2 5/8] ARM: move the .idmap.text section closer to .head.text
  2015-03-13 12:07 [PATCH v2 0/8] ARM kernel size fixes Ard Biesheuvel
                   ` (3 preceding siblings ...)
  2015-03-13 12:07 ` [PATCH v2 4/8] ARM: use bl_far to call __hyp_stub_install_secondary from the .data section Ard Biesheuvel
@ 2015-03-13 12:07 ` Ard Biesheuvel
  2015-03-13 12:07 ` [PATCH v2 6/8] asm-generic: introduce .text.fixup input section Ard Biesheuvel
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Ard Biesheuvel @ 2015-03-13 12:07 UTC (permalink / raw)
  To: linux-arm-kernel

This moves the .idmap.text section closer to .head.text, so that
relative branches are less likely to go out of range if the kernel
text gets bigger.

Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm/kernel/vmlinux.lds.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index e3b9403bd2d6..2e7b2220ef5f 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -103,6 +103,7 @@ SECTIONS
 	.text : {			/* Real text segment		*/
 		_stext = .;		/* Text and read-only data	*/
+			IDMAP_TEXT
 			__exception_text_start = .;
 			*(.exception.text)
 			__exception_text_end = .;
@@ -111,7 +112,6 @@ SECTIONS
 			SCHED_TEXT
 			LOCK_TEXT
 			KPROBES_TEXT
-			IDMAP_TEXT
 #ifdef CONFIG_MMU
 			*(.fixup)
 #endif
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 22+ messages in thread
* [PATCH v2 6/8] asm-generic: introduce .text.fixup input section
  2015-03-13 12:07 [PATCH v2 0/8] ARM kernel size fixes Ard Biesheuvel
                   ` (4 preceding siblings ...)
  2015-03-13 12:07 ` [PATCH v2 5/8] ARM: move the .idmap.text section closer to .head.text Ard Biesheuvel
@ 2015-03-13 12:07 ` Ard Biesheuvel
  2015-03-18 18:58   ` Arnd Bergmann
  2015-03-13 12:07 ` [PATCH v2 7/8] ARM: keep .text and .fixup regions together Ard Biesheuvel
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Ard Biesheuvel @ 2015-03-13 12:07 UTC (permalink / raw)
  To: linux-arm-kernel

This introduces a new .text.fixup input section that gets emitted
together with the .text section for each input object file.

Note that

	*(.text)
	*(.text.fixup)

is not the same as

	*(.text .text.fixup)

and we are looking for the latter, to ensure that fixup snippets that
are assembled into a separate section in the object file do not end
up out of range for the relative branch instructions it contains if
the .text section itself grows very large.

This helps prevent linker failures on large ARM kernels.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 include/asm-generic/vmlinux.lds.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index ac78910d7416..463231d5bfc7 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -401,7 +401,7 @@
 #define TEXT_TEXT							\
 		ALIGN_FUNCTION();					\
 		*(.text.hot)						\
-		*(.text)						\
+		*(.text .text.fixup)					\
 		*(.ref.text)						\
 	MEM_KEEP(init.text)						\
 	MEM_KEEP(exit.text)						\
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 22+ messages in thread
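The difference between the two linker script spellings in the message above can be modelled with a few lines of Python. This is a deliberately simplified model of how GNU ld orders input sections (one pass over the input files per statement); the object names are hypothetical.

```python
def layout(objects, statements):
    """Return the output order of (object, section) pairs.

    objects:    list of (name, [sections in file order]) in link order
    statements: list of sets, one set per '*( ... )' statement
    """
    placed = []
    for wanted in statements:            # statements are processed in turn
        for name, sections in objects:   # each one walks all input files
            for section in sections:
                if section in wanted:
                    placed.append((name, section))
    return placed


objects = [
    ("a.o", [".text", ".text.fixup"]),
    ("b.o", [".text", ".text.fixup"]),
]

# *(.text) followed by *(.text.fixup): all fixups end up after all text.
separate = layout(objects, [{".text"}, {".text.fixup"}])

# *(.text .text.fixup): each object's fixups stay next to its own text.
combined = layout(objects, [{".text", ".text.fixup"}])

print(separate)
print(combined)
```

In the combined form the distance from a fixup snippet to the code it branches back to is bounded by the size of one object file rather than by the size of the whole .text output section.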
* [PATCH v2 6/8] asm-generic: introduce .text.fixup input section 2015-03-13 12:07 ` [PATCH v2 6/8] asm-generic: introduce .text.fixup input section Ard Biesheuvel @ 2015-03-18 18:58 ` Arnd Bergmann 0 siblings, 0 replies; 22+ messages in thread From: Arnd Bergmann @ 2015-03-18 18:58 UTC (permalink / raw) To: linux-arm-kernel On Friday 13 March 2015, Ard Biesheuvel wrote: > This introduces a new .text.fixup input section that gets emitted > together with the .text section for each input object file. > > Note that > > *(.text) > *(.text.fixup) > > is not the same as > > *(.text .text.fixup) > > and we are looking for the latter, to ensure that fixup snippets that > are assembled into a separate section in the object file do not end > up out of range for the relative branch instructions it contains if > the .text section itself grows very large. > > This helps prevent linker failures on large ARM kernels. > > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Let's merge this together with the other patches rather than using the asm-generic git. > --- > include/asm-generic/vmlinux.lds.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h > index ac78910d7416..463231d5bfc7 100644 > --- a/include/asm-generic/vmlinux.lds.h > +++ b/include/asm-generic/vmlinux.lds.h > @@ -401,7 +401,7 @@ > #define TEXT_TEXT \ > ALIGN_FUNCTION(); \ > *(.text.hot) \ > - *(.text) \ > + *(.text .text.fixup) \ > *(.ref.text) \ > MEM_KEEP(init.text) \ > MEM_KEEP(exit.text) \ > -- > 1.8.3.2 > > ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v2 7/8] ARM: keep .text and .fixup regions together 2015-03-13 12:07 [PATCH v2 0/8] ARM kernel size fixes Ard Biesheuvel ` (5 preceding siblings ...) 2015-03-13 12:07 ` [PATCH v2 6/8] asm-generic: introduce .text.fixup input section Ard Biesheuvel @ 2015-03-13 12:07 ` Ard Biesheuvel 2015-03-13 12:07 ` [PATCH v2 8/8] kallsyms: allow kallsyms data to reside in the .data section Ard Biesheuvel 2015-03-18 7:54 ` [PATCH v2 0/8] ARM kernel size fixes Ard Biesheuvel 8 siblings, 0 replies; 22+ messages in thread From: Ard Biesheuvel @ 2015-03-13 12:07 UTC (permalink / raw) To: linux-arm-kernel This moves all fixup snippets to the .text.fixup section, which is a special section that gets emitted along with the .text section for each input object file, i.e., the snippets are kept much closer to the code they refer to, which helps prevent linker failure on large kernels. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- arch/arm/include/asm/futex.h | 2 +- arch/arm/include/asm/uaccess.h | 10 +++++----- arch/arm/include/asm/word-at-a-time.h | 2 +- arch/arm/kernel/entry-armv.S | 2 +- arch/arm/kernel/swp_emulate.c | 2 +- arch/arm/kernel/vmlinux.lds.S | 5 +---- arch/arm/lib/clear_user.S | 2 +- arch/arm/lib/copy_to_user.S | 2 +- arch/arm/lib/csumpartialcopyuser.S | 2 +- arch/arm/mm/alignment.c | 6 +++--- arch/arm/nwfpe/entry.S | 2 +- 11 files changed, 17 insertions(+), 20 deletions(-) diff --git a/arch/arm/include/asm/futex.h b/arch/arm/include/asm/futex.h index 53e69dae796f..4e78065a16aa 100644 --- a/arch/arm/include/asm/futex.h +++ b/arch/arm/include/asm/futex.h @@ -13,7 +13,7 @@ " .align 3\n" \ " .long 1b, 4f, 2b, 4f\n" \ " .popsection\n" \ - " .pushsection .fixup,\"ax\"\n" \ + " .pushsection .text.fixup,\"ax\"\n" \ " .align 2\n" \ "4: mov %0, " err_reg "\n" \ " b 3b\n" \ diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h index ce0786efd26c..74b17d09ef7a 100644 --- 
a/arch/arm/include/asm/uaccess.h +++ b/arch/arm/include/asm/uaccess.h @@ -315,7 +315,7 @@ do { \ __asm__ __volatile__( \ "1: " TUSER(ldrb) " %1,[%2],#0\n" \ "2:\n" \ - " .pushsection .fixup,\"ax\"\n" \ + " .pushsection .text.fixup,\"ax\"\n" \ " .align 2\n" \ "3: mov %0, %3\n" \ " mov %1, #0\n" \ @@ -351,7 +351,7 @@ do { \ __asm__ __volatile__( \ "1: " TUSER(ldr) " %1,[%2],#0\n" \ "2:\n" \ - " .pushsection .fixup,\"ax\"\n" \ + " .pushsection .text.fixup,\"ax\"\n" \ " .align 2\n" \ "3: mov %0, %3\n" \ " mov %1, #0\n" \ @@ -397,7 +397,7 @@ do { \ __asm__ __volatile__( \ "1: " TUSER(strb) " %1,[%2],#0\n" \ "2:\n" \ - " .pushsection .fixup,\"ax\"\n" \ + " .pushsection .text.fixup,\"ax\"\n" \ " .align 2\n" \ "3: mov %0, %3\n" \ " b 2b\n" \ @@ -430,7 +430,7 @@ do { \ __asm__ __volatile__( \ "1: " TUSER(str) " %1,[%2],#0\n" \ "2:\n" \ - " .pushsection .fixup,\"ax\"\n" \ + " .pushsection .text.fixup,\"ax\"\n" \ " .align 2\n" \ "3: mov %0, %3\n" \ " b 2b\n" \ @@ -458,7 +458,7 @@ do { \ THUMB( "1: " TUSER(str) " " __reg_oper1 ", [%1]\n" ) \ THUMB( "2: " TUSER(str) " " __reg_oper0 ", [%1, #4]\n" ) \ "3:\n" \ - " .pushsection .fixup,\"ax\"\n" \ + " .pushsection .text.fixup,\"ax\"\n" \ " .align 2\n" \ "4: mov %0, %3\n" \ " b 3b\n" \ diff --git a/arch/arm/include/asm/word-at-a-time.h b/arch/arm/include/asm/word-at-a-time.h index a6d0a29861e7..5831dce4b51c 100644 --- a/arch/arm/include/asm/word-at-a-time.h +++ b/arch/arm/include/asm/word-at-a-time.h @@ -71,7 +71,7 @@ static inline unsigned long load_unaligned_zeropad(const void *addr) asm( "1: ldr %0, [%2]\n" "2:\n" - " .pushsection .fixup,\"ax\"\n" + " .pushsection .text.fixup,\"ax\"\n" " .align 2\n" "3: and %1, %2, #0x3\n" " bic %2, %2, #0x3\n" diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index 672b21942fff..570306c49406 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -545,7 +545,7 @@ ENDPROC(__und_usr) /* * The out of line fixup for the ldrt instructions above. 
*/ - .pushsection .fixup, "ax" + .pushsection .text.fixup, "ax" .align 2 4: str r4, [sp, #S_PC] @ retry current instruction ret r9 diff --git a/arch/arm/kernel/swp_emulate.c b/arch/arm/kernel/swp_emulate.c index afdd51e30bec..1361756782c7 100644 --- a/arch/arm/kernel/swp_emulate.c +++ b/arch/arm/kernel/swp_emulate.c @@ -42,7 +42,7 @@ " cmp %0, #0\n" \ " movne %0, %4\n" \ "2:\n" \ - " .section .fixup,\"ax\"\n" \ + " .section .text.fixup,\"ax\"\n" \ " .align 2\n" \ "3: mov %0, %5\n" \ " b 2b\n" \ diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S index 2e7b2220ef5f..82846f60e31e 100644 --- a/arch/arm/kernel/vmlinux.lds.S +++ b/arch/arm/kernel/vmlinux.lds.S @@ -77,7 +77,7 @@ SECTIONS ARM_EXIT_DISCARD(EXIT_DATA) EXIT_CALL #ifndef CONFIG_MMU - *(.fixup) + *(.text.fixup) *(__ex_table) #endif #ifndef CONFIG_SMP_ON_UP @@ -112,9 +112,6 @@ SECTIONS SCHED_TEXT LOCK_TEXT KPROBES_TEXT -#ifdef CONFIG_MMU - *(.fixup) -#endif *(.gnu.warning) *(.glue_7) *(.glue_7t) diff --git a/arch/arm/lib/clear_user.S b/arch/arm/lib/clear_user.S index 14a0d988c82c..1710fd7db2d5 100644 --- a/arch/arm/lib/clear_user.S +++ b/arch/arm/lib/clear_user.S @@ -47,7 +47,7 @@ USER( strnebt r2, [r0]) ENDPROC(__clear_user) ENDPROC(__clear_user_std) - .pushsection .fixup,"ax" + .pushsection .text.fixup,"ax" .align 0 9001: ldmfd sp!, {r0, pc} .popsection diff --git a/arch/arm/lib/copy_to_user.S b/arch/arm/lib/copy_to_user.S index a9d3db16ecb5..9648b0675a3e 100644 --- a/arch/arm/lib/copy_to_user.S +++ b/arch/arm/lib/copy_to_user.S @@ -100,7 +100,7 @@ WEAK(__copy_to_user) ENDPROC(__copy_to_user) ENDPROC(__copy_to_user_std) - .pushsection .fixup,"ax" + .pushsection .text.fixup,"ax" .align 0 copy_abort_preamble ldmfd sp!, {r1, r2, r3} diff --git a/arch/arm/lib/csumpartialcopyuser.S b/arch/arm/lib/csumpartialcopyuser.S index 7d08b43d2c0e..1d0957e61f89 100644 --- a/arch/arm/lib/csumpartialcopyuser.S +++ b/arch/arm/lib/csumpartialcopyuser.S @@ -68,7 +68,7 @@ * so properly, we would have to add 
in whatever registers were loaded before * the fault, which, with the current asm above is not predictable. */ - .pushsection .fixup,"ax" + .pushsection .text.fixup,"ax" .align 4 9001: mov r4, #-EFAULT ldr r5, [sp, #8*4] @ *err_ptr diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c index 2c0c541c60ca..9769f1eefe3b 100644 --- a/arch/arm/mm/alignment.c +++ b/arch/arm/mm/alignment.c @@ -201,7 +201,7 @@ union offset_union { THUMB( "1: "ins" %1, [%2]\n" ) \ THUMB( " add %2, %2, #1\n" ) \ "2:\n" \ - " .pushsection .fixup,\"ax\"\n" \ + " .pushsection .text.fixup,\"ax\"\n" \ " .align 2\n" \ "3: mov %0, #1\n" \ " b 2b\n" \ @@ -261,7 +261,7 @@ union offset_union { " mov %1, %1, "NEXT_BYTE"\n" \ "2: "ins" %1, [%2]\n" \ "3:\n" \ - " .pushsection .fixup,\"ax\"\n" \ + " .pushsection .text.fixup,\"ax\"\n" \ " .align 2\n" \ "4: mov %0, #1\n" \ " b 3b\n" \ @@ -301,7 +301,7 @@ union offset_union { " mov %1, %1, "NEXT_BYTE"\n" \ "4: "ins" %1, [%2]\n" \ "5:\n" \ - " .pushsection .fixup,\"ax\"\n" \ + " .pushsection .text.fixup,\"ax\"\n" \ " .align 2\n" \ "6: mov %0, #1\n" \ " b 5b\n" \ diff --git a/arch/arm/nwfpe/entry.S b/arch/arm/nwfpe/entry.S index 5d65be1f1e8a..71df43547659 100644 --- a/arch/arm/nwfpe/entry.S +++ b/arch/arm/nwfpe/entry.S @@ -113,7 +113,7 @@ next: @ to fault. Emit the appropriate exception gunk to fix things up. @ ??? For some reason, faults can happen at .Lx2 even with a @ plain LDR instruction. Weird, but it seems harmless. - .pushsection .fixup,"ax" + .pushsection .text.fixup,"ax" .align 2 .Lfix: ret r9 @ let the user eat segfaults .popsection -- 1.8.3.2 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH v2 8/8] kallsyms: allow kallsyms data to reside in the .data section 2015-03-13 12:07 [PATCH v2 0/8] ARM kernel size fixes Ard Biesheuvel ` (6 preceding siblings ...) 2015-03-13 12:07 ` [PATCH v2 7/8] ARM: keep .text and .fixup regions together Ard Biesheuvel @ 2015-03-13 12:07 ` Ard Biesheuvel 2015-03-18 7:54 ` [PATCH v2 0/8] ARM kernel size fixes Ard Biesheuvel 8 siblings, 0 replies; 22+ messages in thread From: Ard Biesheuvel @ 2015-03-13 12:07 UTC (permalink / raw) To: linux-arm-kernel On architectures such as ARM, the default location of the kallsyms data in the rodata section may be problematic, as it then sits right between the .text and .init.text/.exit.text sections. This is usually not a problem, but as soon as the code size exceeds a certain threshold, the linker will start adding trampolines to ensure the two code regions can reach each other through ordinary relative branches. This causes inconsistencies between subsequent versions of the kallsyms data, causing the build to fail. This adds a Kconfig symbol that, when set, causes the kallsyms data regions to be moved to the .data section instead, which works around this problem. 
Cc: linux-arch at vger.kernel.org Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- arch/arm/Kconfig | 1 + include/asm-generic/vmlinux.lds.h | 12 +++++++++++- init/Kconfig | 4 ++++ scripts/kallsyms.c | 2 +- 4 files changed, 17 insertions(+), 2 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 9f1f09a2bc9b..639e215bd9a1 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -11,6 +11,7 @@ config ARM select ARCH_USE_BUILTIN_BSWAP select ARCH_USE_CMPXCHG_LOCKREF select ARCH_WANT_IPC_PARSE_VERSION + select ARCH_HAVE_KALLSYMS_IN_DATA_SECTION select BUILDTIME_EXTABLE_SORT if MMU select CLONE_BACKWARDS select CPU_PM if (SUSPEND || CPU_IDLE) diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index 463231d5bfc7..09f93bfaad0e 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -150,6 +150,14 @@ #define TRACE_SYSCALLS() #endif +#ifdef CONFIG_ARCH_HAVE_KALLSYMS_IN_DATA_SECTION +#define KALLSYMS_RODATA +#define KALLSYMS_DATA *(.kallsyms_data) +#else +#define KALLSYMS_RODATA *(.kallsyms_data) +#define KALLSYMS_DATA +#endif + #define ___OF_TABLE(cfg, name) _OF_TABLE_##cfg(name) #define __OF_TABLE(cfg, name) ___OF_TABLE(cfg, name) @@ -197,7 +205,8 @@ LIKELY_PROFILE() \ BRANCH_PROFILE() \ TRACE_PRINTKS() \ - TRACEPOINT_STR() + TRACEPOINT_STR() \ + KALLSYMS_DATA /* * Data section helpers @@ -234,6 +243,7 @@ .rodata : AT(ADDR(.rodata) - LOAD_OFFSET) { \ VMLINUX_SYMBOL(__start_rodata) = .; \ *(.rodata) *(.rodata.*) \ + KALLSYMS_RODATA \ *(__vermagic) /* Kernel version magic */ \ . = ALIGN(8); \ VMLINUX_SYMBOL(__start___tracepoints_ptrs) = .; \ diff --git a/init/Kconfig b/init/Kconfig index 058e3671fa11..d6f4920f3487 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1410,6 +1410,10 @@ config KALLSYMS_ALL Say N unless you really need all symbols. 
+config ARCH_HAVE_KALLSYMS_IN_DATA_SECTION + bool + depends on KALLSYMS + config PRINTK default y bool "Enable support for printk" if EXPERT diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c index c6d33bd15b04..b23682a967e0 100644 --- a/scripts/kallsyms.c +++ b/scripts/kallsyms.c @@ -333,7 +333,7 @@ static void write_src(void) printf("#define ALGN .align 4\n"); printf("#endif\n"); - printf("\t.section .rodata, \"a\"\n"); + printf("\t.section .kallsyms_data, \"a\"\n"); /* Provide proper symbols relocatability by their '_text' * relativeness. The symbol names cannot be used to construct -- 1.8.3.2 ^ permalink raw reply related [flat|nested] 22+ messages in thread
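The macro selection added to include/asm-generic/vmlinux.lds.h above can be tried outside the kernel build with the C preprocessor alone. The sketch below repeats the #ifdef block from the patch and stringifies both macros to show which one carries the *(.kallsyms_data) pattern. The local #define standing in for the Kconfig symbol, the STR helpers, and the two accessor functions are assumptions made for the illustration and are not part of the patch.

```c
#include <string.h>

/* Stand-in for the Kconfig-generated symbol; delete it to take the #else arm. */
#define CONFIG_ARCH_HAVE_KALLSYMS_IN_DATA_SECTION 1

#ifdef CONFIG_ARCH_HAVE_KALLSYMS_IN_DATA_SECTION
#define KALLSYMS_RODATA
#define KALLSYMS_DATA   *(.kallsyms_data)
#else
#define KALLSYMS_RODATA *(.kallsyms_data)
#define KALLSYMS_DATA
#endif

/* Two-level stringification so the argument is macro-expanded first. */
#define STR_(...) #__VA_ARGS__
#define STR(...)  STR_(__VA_ARGS__)

/* What the .rodata output section would receive from KALLSYMS_RODATA. */
static const char *kallsyms_in_rodata(void)
{
    return STR(KALLSYMS_RODATA);
}

/* What the data section helper would receive from KALLSYMS_DATA. */
static const char *kallsyms_in_data(void)
{
    return STR(KALLSYMS_DATA);
}
```

With the symbol defined, the .rodata side expands to nothing and the data side picks up the input-section pattern; without it the roles swap, which matches the default behaviour before the patch.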
* [PATCH v2 0/8] ARM kernel size fixes 2015-03-13 12:07 [PATCH v2 0/8] ARM kernel size fixes Ard Biesheuvel ` (7 preceding siblings ...) 2015-03-13 12:07 ` [PATCH v2 8/8] kallsyms: allow kallsyms data to reside in the .data section Ard Biesheuvel @ 2015-03-18 7:54 ` Ard Biesheuvel 8 siblings, 0 replies; 22+ messages in thread From: Ard Biesheuvel @ 2015-03-18 7:54 UTC (permalink / raw) To: linux-arm-kernel On 13 March 2015 at 13:07, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > This series is a suggested approach to preventing linker failures on large > kernels. It is somewhat unpolished, and posted for comments/testing primarily. > > The issues were found and reported by Arnd Bergmann, and these patches are > loosely based on his initial approach to work around them. > > Changes since v1: > - Updated PROCINFO patch (#1) to refer to the base of the struct by name, and > simplify the calling code (rmk) > - Updated b_far/bl_far patch (#3) to remove ARM/THUMB alternatives and use a > conditionally defined PC_BIAS instead. Also added b_abs/bl_abs versions, > which can only be used for absolute branches but can be implemented in fewer > instructions. Added conditional branch support as well. > - introduce (#6) and use (#7) the .text.fixup input section which gets emitted > after each .text section for each .o > - added patch #8 that allows the kallsyms data to be moved to .data > I put the following ones in the patch tracker: > ARM: replace PROCINFO embedded branch with relative offset > ARM: add macro to perform far branches (b/bl) > ARM: use bl_far to call __hyp_stub_install_secondary from the .data > section > ARM: move the .idmap.text section closer to .head.text Arnd, may I have your ack on these if you think this approach is ok? > asm-generic: introduce .text.fixup input section > ARM: keep .text and .fixup regions together This one can be dropped and/or deferred. 
I don't need it to build Arnd's dotconfig-from-hell successfully, and there is a KVM patch under review that touches the same part of the linker script. > ARM: move HYP text to end of .text section This one has not been discussed at all, so let's defer for now > kallsyms: allow kallsyms data to reside in the .data section > Regards, Ard. > arch/arm/Kconfig | 1 + > arch/arm/include/asm/assembler.h | 83 +++++++++++++++++++++++++++++++++++ > arch/arm/include/asm/futex.h | 2 +- > arch/arm/include/asm/uaccess.h | 10 ++--- > arch/arm/include/asm/word-at-a-time.h | 2 +- > arch/arm/kernel/entry-armv.S | 2 +- > arch/arm/kernel/head.S | 14 +++--- > arch/arm/kernel/sleep.S | 2 +- > arch/arm/kernel/swp_emulate.c | 2 +- > arch/arm/kernel/vmlinux.lds.S | 15 ++++--- > arch/arm/kvm/init.S | 5 +-- > arch/arm/kvm/interrupts.S | 4 +- > arch/arm/lib/clear_user.S | 2 +- > arch/arm/lib/copy_to_user.S | 2 +- > arch/arm/lib/csumpartialcopyuser.S | 2 +- > arch/arm/mm/alignment.c | 6 +-- > arch/arm/mm/proc-arm1020.S | 4 +- > arch/arm/mm/proc-arm1020e.S | 4 +- > arch/arm/mm/proc-arm1022.S | 4 +- > arch/arm/mm/proc-arm1026.S | 4 +- > arch/arm/mm/proc-arm720.S | 4 +- > arch/arm/mm/proc-arm740.S | 4 +- > arch/arm/mm/proc-arm7tdmi.S | 4 +- > arch/arm/mm/proc-arm920.S | 4 +- > arch/arm/mm/proc-arm922.S | 4 +- > arch/arm/mm/proc-arm925.S | 4 +- > arch/arm/mm/proc-arm926.S | 4 +- > arch/arm/mm/proc-arm940.S | 4 +- > arch/arm/mm/proc-arm946.S | 4 +- > arch/arm/mm/proc-arm9tdmi.S | 4 +- > arch/arm/mm/proc-fa526.S | 4 +- > arch/arm/mm/proc-feroceon.S | 5 ++- > arch/arm/mm/proc-macros.S | 4 ++ > arch/arm/mm/proc-mohawk.S | 4 +- > arch/arm/mm/proc-sa110.S | 4 +- > arch/arm/mm/proc-sa1100.S | 4 +- > arch/arm/mm/proc-v6.S | 4 +- > arch/arm/mm/proc-v7.S | 28 ++++++------ > arch/arm/mm/proc-v7m.S | 4 +- > arch/arm/mm/proc-xsc3.S | 4 +- > arch/arm/mm/proc-xscale.S | 4 +- > arch/arm/nwfpe/entry.S | 2 +- > include/asm-generic/vmlinux.lds.h | 14 +++++- > init/Kconfig | 4 ++ > scripts/kallsyms.c | 2 +- > 45 
files changed, 200 insertions(+), 101 deletions(-) > > -- > 1.8.3.2 > ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread (newest message: 2015-04-19 21:52 UTC) Thread overview: 22+ messages 2015-03-13 12:07 [PATCH v2 0/8] ARM kernel size fixes Ard Biesheuvel 2015-03-13 12:07 ` [PATCH v2 1/8] ARM: replace PROCINFO embedded branch with relative offset Ard Biesheuvel 2015-04-19 16:59 ` Joachim Eastwood 2015-04-19 17:08 ` Russell King - ARM Linux 2015-04-19 17:41 ` Ard Biesheuvel 2015-04-19 19:28 ` Russell King - ARM Linux 2015-04-19 19:45 ` Joachim Eastwood 2015-04-19 21:52 ` Ard Biesheuvel 2015-04-19 19:24 ` Joachim Eastwood 2015-03-13 12:07 ` [PATCH v2 2/8] ARM: move HYP text to end of .text section Ard Biesheuvel 2015-03-13 12:07 ` [PATCH v2 3/8] ARM: add macro to perform far branches (b/bl) Ard Biesheuvel 2015-03-13 16:40 ` Russell King - ARM Linux 2015-03-17 20:35 ` Ard Biesheuvel 2015-03-18 10:07 ` Ard Biesheuvel 2015-03-19 9:01 ` [PATCH] " Ard Biesheuvel 2015-03-13 12:07 ` [PATCH v2 4/8] ARM: use bl_far to call __hyp_stub_install_secondary from the .data section Ard Biesheuvel 2015-03-13 12:07 ` [PATCH v2 5/8] ARM: move the .idmap.text section closer to .head.text Ard Biesheuvel 2015-03-13 12:07 ` [PATCH v2 6/8] asm-generic: introduce .text.fixup input section Ard Biesheuvel 2015-03-18 18:58 ` Arnd Bergmann 2015-03-13 12:07 ` [PATCH v2 7/8] ARM: keep .text and .fixup regions together Ard Biesheuvel 2015-03-13 12:07 ` [PATCH v2 8/8] kallsyms: allow kallsyms data to reside in the .data section Ard Biesheuvel 2015-03-18 7:54 ` [PATCH v2 0/8] ARM kernel size fixes Ard Biesheuvel