* [PATCH v4 00/22] arm64: implement support for KASLR
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
This series implements KASLR for arm64, by building the kernel as a PIE
executable that can relocate itself at runtime, and moving it to a random
offset in the vmalloc area. v2 and up also implement physical randomization,
i.e., it allows the kernel to deal with being loaded at any physical offset
(modulo the required alignment), and invokes the EFI_RNG_PROTOCOL from the
UEFI stub to obtain random bits and perform the actual randomization of the
physical load address.
Changes since v3:
- Implemented base relative kallsyms address tables. This saves 250 KB of
permanent .rodata (for a defconfig build), but more importantly, saves about
1.4 MB of __init data in the dynamic relocation table. This patch has been
picked up by akpm in the meantime, but it is reproduced here for completeness.
- Reimplemented the KASLR init code in C. This has a couple of benefits:
we can now parse the 'nokaslr' command line option in a timely fashion, and we
can pass the KASLR random seed via /chosen/kaslr-seed rather than via x1,
which means fewer changes to bootloaders. This is implemented using a two-pass
approach, i.e., the kernel is booted without KASLR to a state where it can
invoke ordinary C code, and it re-enters the early boot code to recreate
the kernel mappings at an offset if it finds the prerequisite data in the FDT.
Note that this requires some mild refactoring to ensure that early_fixmap_init
and fixmap_remap_fdt can cope with being called twice.
- Added a patch to enable the inappropriately named huge-vmap feature for arm64,
which allows ioremap() (but not vmap or vmalloc) to use block mappings. This
should be an improvement in itself, but the significance for this series is
that it also allows the __init region to be unmapped entirely via
unmap_kernel_range() [which complains about block mappings without this
feature enabled]
- Split the implementation of CONFIG_RELOCATABLE and CONFIG_RANDOMIZE_BASE into
separate patches.
- Randomize the module region independently from the core kernel. It is chosen
such that it covers the [_stext, _etext] interval of the core kernel to avoid
using PLT entries unless we really have to (a short sketch follows this list).
- Update the module PLT patch to replace the O(n^2) searches with sorting, and
use a single .plt section for __init and ordinary code.
- Added panic notifiers to report the KASLR offset, and whether a mem= limit is
in effect.
- Replaced Mark Rutland's asm/elf.h split-off patch with one that puts the C
declarations between #ifndef __ASSEMBLY__/#endif.
- Incorporated feedback (and tags) from Mark Rutland and Matt Fleming
- Minor tweaks and fixes.
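To illustrate the module region randomization described above, here is a
minimal sketch, assuming a fixed-size module window (MODULES_VSIZE) and the
series' symbol names; the actual code in kaslr.c may differ in detail:

/*
 * Sketch: choose a random module base such that the module window still
 * covers [_stext, _etext], so branches into the core kernel never need
 * a PLT. 'seed' is the random input; MODULES_VSIZE is an assumption.
 */
u64 module_range = MODULES_VSIZE - (u64)(_etext - _stext);
module_alloc_base = (u64)_etext + (seed % module_range) - MODULES_VSIZE;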
Changes since v2:
- Incorporated feedback from Marc Zyngier into the KVM patch (#5)
- Dropped the pgdir section and the patch that memblock_reserve()'s the kernel
sections at a smaller granularity. This is no longer necessary with the pgdir
section gone. This also fixes an issue spotted by James Morse where the fixmap
page tables are not zeroed correctly; these have been moved back to the .bss
section.
- Got rid of all ifdef'ery regarding the number of translation levels in the
changed .c files, by introducing new definitions in pgtable.h (#3, #6)
- Fixed KASAN support, which was broken in all earlier versions of this series.
- Moved module region along with the virtually randomized kernel, so that module
addresses become unpredictable as well, and we only have to rely on veneers in
the PLTs when the module region is exhausted (which is somewhat more likely
since the module region is now shared with other uses of the vmalloc area)
- Added support for the 'nokaslr' command line option. This affects the
randomization performed by the stub, and results in a warning if passed while
the bootloader also presented a random seed for virtual KASLR in register x1.
- The .text/.rodata sections of the kernel are no longer aliased in the linear
region with a writable mapping.
- Added a separate image header flag for kernel images that may be loaded at any
2 MB aligned offset (+ TEXT_OFFSET)
- The KASLR displacement is now corrected if it results in the kernel image
intersecting a PUD/PMD boundary (4k and 16k/64k granule kernels, respectively)
- Split out UEFI stub random routines into separate patches.
- Implemented a weight based EFI random allocation routine so that each suitable
offset in available memory is equally likely to be selected (as suggested by
Kees Cook)
- Reused CONFIG_RELOCATABLE and CONFIG_RANDOMIZE_BASE instead of introducing
new Kconfig symbols to describe the same functionality.
- Reimplemented mem= logic so memory is clipped from the top first.
Changes since v1/RFC:
- This series now implements fully independent virtual and physical address
randomization at load time. I have recycled some patches from this series:
http://thread.gmane.org/gmane.linux.ports.arm.kernel/455151, and updated the
final UEFI stub patch to randomize the physical address as well.
- Added a patch to deal with the way KVM on arm64 makes assumptions about the
relation between kernel symbols and the linear mapping (on which the HYP
mapping is based), as these assumptions cease to be valid once we move the
kernel Image out of the linear mapping.
- Updated the module PLT patch so it works on BE kernels as well.
- Moved the constant Image header values to head.S, and updated the linker
script to provide the kernel size using R_AARCH64_ABS32 relocation rather
than a R_AARCH64_ABS64 relocation, since those are always resolved at build
time. This allows me to get rid of the post-build perl script to swab header
values on BE kernels.
- Minor style tweaks.
Notes:
- These patches apply on top of Mark Rutland's pagetable rework series:
http://thread.gmane.org/gmane.linux.ports.arm.kernel/462438
- The arm64 Image is uncompressed by default, and the Elf64_Rela format uses
24 bytes per relocation entry. This results in considerable bloat (i.e., a
couple of MBs worth of relocation data in an .init section; see the
Elf64_Rela layout after these notes). However, no build-time postprocessing
is required; we rely fully on the toolchain to produce the image.
- We have to rely on the bootloader to supply some randomness in
/chosen/kaslr-seed upon kernel entry. Since we have no decompressor, it is
simply not feasible to collect randomness in the head.S code path before
mapping the kernel and enabling the MMU.
- The EFI_RNG_PROTOCOL that is invoked in patch #22 to supply randomness on
UEFI systems is not universally available. A QEMU/KVM firmware image that
implements a pseudo-random version is available here:
http://people.linaro.org/~ard.biesheuvel/QEMU_EFI.fd.aarch64-rng.bz2
(requires access to PMCCNTR_EL0 and support for AES instructions)
See below for instructions on how to run the pseudo-random version on real
hardware.
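For reference, this is the standard Elf64_Rela entry layout that accounts for
the 24 bytes per relocation mentioned in the notes above:

typedef struct {
        Elf64_Addr   r_offset;  /* virtual address the relocation applies to */
        Elf64_Xword  r_info;    /* relocation type and symbol index */
        Elf64_Sxword r_addend;  /* constant addend */
} Elf64_Rela;                   /* 3 x 8 bytes = 24 bytes per entry */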
Code can be found here:
git://git.linaro.org/people/ard.biesheuvel/linux-arm.git arm64-kaslr-v4a
https://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-kaslr-v4a
Patch #1 updates the OF code to allow the minimum memblock physical address to
be overridden by the arch.
Patch #2 introduces KIMAGE_VADDR as the base of the kernel virtual region.
Patch #3 introduces pte_offset_kimg(), pmd_offset_kimg() and pud_offset_kimg()
that allow statically allocated page tables (i.e., by fixmap and kasan) to be
traversed before the linear mapping is installed.
Patch #4 rewrites early_fixmap_init() so it does not rely on the linear mapping
(i.e., the use of phys_to_virt() is avoided).
Patch #5 updates KVM on arm64 so it can deal with kernel symbols whose addresses
are not covered by the linear mapping.
Patch #6 wires up the huge-vmap generic feature for arm64.
Patch #7 moves the kernel virtual mapping to the vmalloc area, along with the
module region which is kept right below it, as before.
Patch #8 adds support for PLTs in modules so that relative branches can be
resolved via a PLT if the target is out of range. This is required for KASLR,
since modules may be loaded far away from the core kernel.
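Conceptually, a module PLT entry is a small veneer that materializes the full
64-bit target address in a scratch register and branches to it. A sketch of
what such an entry looks like on AArch64 (the exact encoding in the series'
module-plts.c may differ):

struct plt_entry {
        __le32  mov0;   /* movn x16, #imm16               */
        __le32  mov1;   /* movk x16, #imm16, lsl #16      */
        __le32  mov2;   /* movk x16, #imm16, lsl #32      */
        __le32  br;     /* br   x16                       */
};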
Patches #9 and #10 move arm64 to a new generic relative version of the extable
implementation so that it no longer contains absolute addresses that require
fixing up at relocation time, but uses relative offsets instead.
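The gist of the relative extable scheme: each entry stores 32-bit offsets
relative to its own location instead of 64-bit absolute addresses, so the
table shrinks to half its size and needs no dynamic relocations. A sketch of
the generic accessors:

struct exception_table_entry {
        int insn, fixup;        /* offsets relative to the entry itself */
};

static inline unsigned long ex_to_insn(const struct exception_table_entry *x)
{
        return (unsigned long)&x->insn + x->insn;
}

static inline unsigned long ex_to_fixup(const struct exception_table_entry *x)
{
        return (unsigned long)&x->fixup + x->fixup;
}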
Patch #11 reverts some changes to the Image header population code so we no
longer depend on the linker to populate the header fields. This is necessary
since the R_AARCH64_ABS64 relocations that are emitted for these fields are not
resolved at build time for PIE executables.
Patch #12 updates the code in head.S that needs to execute before relocation to
avoid the use of values that are subject to dynamic relocation. These values
will not be populated in PIE executables.
Patch #13 allows the kernel Image to be loaded anywhere in physical memory, by
decoupling PHYS_OFFSET from the base of the kernel image.
Patch #14 wraps C declarations in asm/elf.h inside #ifndef __ASSEMBLY__ so we
can include it in head.S.
Patch #15 updates scripts/sortextable.c so it accepts ET_DYN (relocatable)
executables as well as ET_EXEC (static) executables.
Patch #16 implements base-relative kallsyms address tables.
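A sketch of the lookup this enables: each table entry is a 32-bit offset from
a single base symbol, so only kallsyms_relative_base itself requires a dynamic
relocation (the real code also handles signed offsets for the absolute-percpu
case):

extern const unsigned long kallsyms_relative_base;
extern const int kallsyms_offsets[];

static unsigned long kallsyms_sym_address(int idx)
{
        /* one 32-bit offset per symbol instead of one 64-bit address */
        return kallsyms_relative_base + (u32)kallsyms_offsets[idx];
}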
Patch #17 implements CONFIG_RELOCATABLE, i.e., building vmlinux as a PIE
executable that relocates itself at early boot time.
Patch #18 implements the core KASLR, by taking randomness supplied in
/chosen/kaslr-seed and using it to move the kernel inside the vmalloc area,
and randomize the module region around it.
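A sketch of the FDT side of this, assuming libfdt (the helper name is
illustrative; the series' kaslr.c may differ in detail):

static u64 get_kaslr_seed(void *fdt)
{
        int node = fdt_path_offset(fdt, "/chosen");
        const fdt64_t *prop;
        int len;

        if (node < 0)
                return 0;

        prop = fdt_getprop(fdt, node, "kaslr-seed", &len);
        if (!prop || len != sizeof(*prop))
                return 0;

        /* the real code also wipes the property so the seed cannot leak */
        return fdt64_to_cpu(*prop);
}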
Patch #19 implements efi_get_random_bytes() based on the EFI_RNG_PROTOCOL.
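This boils down to a protocol lookup plus a single call; a sketch along the
lines of the EFI_RNG_PROTOCOL interface (exact stub helper names are
assumptions):

efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table,
                                  unsigned long size, u8 *out)
{
        efi_guid_t rng_proto = EFI_RNG_PROTOCOL_GUID;
        struct efi_rng_protocol *rng;
        efi_status_t status;

        status = efi_call_early(locate_protocol, &rng_proto, NULL,
                                (void **)&rng);
        if (status != EFI_SUCCESS)
                return status;  /* no RNG protocol: caller degrades gracefully */

        /* NULL algorithm GUID: let the firmware pick its default */
        return rng->get_rng(rng, NULL, size, out);
}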
Patch #20 implements efi_random_alloc().
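The weight-based selection mentioned under the v3 changes works in two passes:
count every suitable slot across the memory map, draw one slot uniformly, then
walk the map again to locate it. A self-contained sketch (the region type and
helpers are illustrative, not the stub's actual memory-map API):

struct region { unsigned long start, size; };

static unsigned long slots_in(const struct region *r,
                              unsigned long size, unsigned long align)
{
        /* number of aligned placements of 'size' bytes inside *r
         * (assumes r->start is already suitably aligned) */
        return r->size < size ? 0 : (r->size - size) / align + 1;
}

static unsigned long random_alloc(const struct region *regions, int nr,
                                  unsigned long size, unsigned long align,
                                  unsigned long random)
{
        unsigned long total = 0, target;
        int i;

        for (i = 0; i < nr; i++)        /* pass 1: weigh the regions */
                total += slots_in(&regions[i], size, align);
        if (!total)
                return 0;

        target = random % total;        /* every slot equally likely */

        for (i = 0; i < nr; i++) {      /* pass 2: find the chosen slot */
                unsigned long s = slots_in(&regions[i], size, align);
                if (target < s)
                        return regions[i].start + target * align;
                target -= s;
        }
        return 0;
}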
Patch #21 moves the allocation for the converted command line (UTF-16 to ASCII)
away from the base of memory. This is necessary because the stub must parse
the command line (e.g., for 'nokaslr') before deciding where to place the
kernel, and the kernel is normally placed at the base of memory.
Patch #22 implements the actual KASLR, by randomizing the kernel physical
address, and passing entropy in /chosen/kaslr-seed so that the kernel proper can
relocate itself virtually.
Ard Biesheuvel (22):
of/fdt: make memblock minimum physical address arch configurable
arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
arm64: pgtable: implement static [pte|pmd|pud]_offset variants
arm64: decouple early fixmap init from linear mapping
arm64: kvm: deal with kernel symbols outside of linear mapping
arm64: add support for ioremap() block mappings
arm64: move kernel image to base of vmalloc area
arm64: add support for module PLTs
extable: add support for relative extables to search and sort routines
arm64: switch to relative exception tables
arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
arm64: avoid dynamic relocations in early boot code
arm64: allow kernel Image to be loaded anywhere in physical memory
arm64: make asm/elf.h available to asm files
scripts/sortextable: add support for ET_DYN binaries
kallsyms: add support for relative offsets in kallsyms address table
arm64: add support for building the kernel as a relocatable PIE binary
arm64: add support for kernel ASLR
efi: stub: implement efi_get_random_bytes() based on EFI_RNG_PROTOCOL
efi: stub: add implementation of efi_random_alloc()
efi: stub: use high allocation for converted command line
arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
Documentation/arm64/booting.txt | 20 +-
Documentation/features/vm/huge-vmap/arch-support.txt | 2 +-
arch/arm/include/asm/kvm_asm.h | 2 +
arch/arm/kvm/arm.c | 8 +-
arch/arm64/Kconfig | 41 ++++
arch/arm64/Makefile | 10 +-
arch/arm64/include/asm/assembler.h | 26 ++-
arch/arm64/include/asm/boot.h | 6 +
arch/arm64/include/asm/elf.h | 24 ++-
arch/arm64/include/asm/futex.h | 12 +-
arch/arm64/include/asm/kasan.h | 2 +-
arch/arm64/include/asm/kernel-pgtable.h | 11 ++
arch/arm64/include/asm/kvm_asm.h | 2 +
arch/arm64/include/asm/kvm_host.h | 8 +-
arch/arm64/include/asm/memory.h | 47 +++--
arch/arm64/include/asm/module.h | 11 ++
arch/arm64/include/asm/pgtable.h | 23 ++-
arch/arm64/include/asm/uaccess.h | 30 +--
arch/arm64/include/asm/word-at-a-time.h | 7 +-
arch/arm64/kernel/Makefile | 2 +
arch/arm64/kernel/armv8_deprecated.c | 7 +-
arch/arm64/kernel/efi-entry.S | 2 +-
arch/arm64/kernel/head.S | 134 +++++++++++--
arch/arm64/kernel/image.h | 45 +++--
arch/arm64/kernel/kaslr.c | 169 ++++++++++++++++
arch/arm64/kernel/module-plts.c | 201 ++++++++++++++++++++
arch/arm64/kernel/module.c | 20 +-
arch/arm64/kernel/module.lds | 3 +
arch/arm64/kernel/setup.c | 29 +++
arch/arm64/kernel/vmlinux.lds.S | 20 +-
arch/arm64/kvm/hyp.S | 6 +-
arch/arm64/mm/dump.c | 12 +-
arch/arm64/mm/extable.c | 2 +-
arch/arm64/mm/init.c | 119 ++++++++++--
arch/arm64/mm/kasan_init.c | 18 +-
arch/arm64/mm/mmu.c | 186 +++++++++++++-----
arch/x86/include/asm/efi.h | 2 +
drivers/firmware/efi/libstub/Makefile | 2 +-
drivers/firmware/efi/libstub/arm-stub.c | 40 ++--
drivers/firmware/efi/libstub/arm64-stub.c | 78 +++++---
drivers/firmware/efi/libstub/efi-stub-helper.c | 7 +-
drivers/firmware/efi/libstub/efistub.h | 7 +
drivers/firmware/efi/libstub/fdt.c | 9 +
drivers/firmware/efi/libstub/random.c | 135 +++++++++++++
drivers/of/fdt.c | 5 +-
include/linux/efi.h | 5 +-
init/Kconfig | 16 ++
kernel/kallsyms.c | 38 +++-
lib/extable.c | 50 ++++-
scripts/kallsyms.c | 88 ++++++++-
scripts/link-vmlinux.sh | 4 +
scripts/namespace.pl | 2 +
scripts/sortextable.c | 10 +-
53 files changed, 1496 insertions(+), 269 deletions(-)
create mode 100644 arch/arm64/kernel/kaslr.c
create mode 100644 arch/arm64/kernel/module-plts.c
create mode 100644 arch/arm64/kernel/module.lds
create mode 100644 drivers/firmware/efi/libstub/random.c
EFI_RNG_PROTOCOL on real hardware
=================================
To test whether your UEFI implements the EFI_RNG_PROTOCOL, download the
following executable and run it from the UEFI Shell:
http://people.linaro.org/~ard.biesheuvel/RngTest.efi
FS0:\> rngtest
UEFI RNG Protocol Testing :
----------------------------
-- Locate UEFI RNG Protocol : [Fail - Status = Not Found]
If your UEFI does not implement the EFI_RNG_PROTOCOL, you can download and
install the pseudo-random version that uses the generic timer and PMCCNTR_EL0
values and permutes them using a couple of rounds of AES.
http://people.linaro.org/~ard.biesheuvel/RngDxe.efi
NOTE: not for production!! This is a quick and dirty hack to test the KASLR
code, and is not suitable for anything else.
FS0:\> rngdxe
FS0:\> rngtest
UEFI RNG Protocol Testing :
----------------------------
-- Locate UEFI RNG Protocol : [Pass]
-- Call RNG->GetInfo() interface :
>> Supported RNG Algorithm (Count = 2) :
0) 44F0DE6E-4D8C-4045-A8C7-4DD168856B9E
1) E43176D7-B6E8-4827-B784-7FFDC4B68561
-- Call RNG->GetRNG() interface :
>> RNG with default algorithm : [Pass]
>> RNG with SP800-90-HMAC-256 : [Fail - Status = Unsupported]
>> RNG with SP800-90-Hash-256 : [Fail - Status = Unsupported]
>> RNG with SP800-90-CTR-256 : [Pass]
>> RNG with X9.31-3DES : [Fail - Status = Unsupported]
>> RNG with X9.31-AES : [Fail - Status = Unsupported]
>> RNG with RAW Entropy : [Pass]
-- Random Number Generation Test with default RNG Algorithm (20 Rounds):
01) - 27
02) - 61E8
03) - 496FD8
04) - DDD793BF
05) - B6C37C8E23
06) - 4D183C604A96
07) - 9363311DB61298
08) - 5715A7294F4E436E
09) - F0D4D7BAA0DD52318E
10) - C88C6EBCF4C0474D87C3
11) - B5594602B482A643932172
12) - CA7573F704B2089B726B9CF1
13) - A93E9451CB533DCFBA87B97C33
14) - 45AA7B83DB6044F7BBAB031F0D24
15) - 3DD7A4D61F34ADCB400B5976730DCF
16) - 4DD168D21FAB8F59708330D6A9BEB021
17) - 4BBB225E61C465F174254159467E65939F
18) - 030A156C9616337A20070941E702827DA8E1
19) - AB0FC11C9A4E225011382A9D164D9D55CA2B64
20) - 72B9B4735DC445E5DA6AF88DE965B7E87CB9A23C
* [PATCH v4 01/22] of/fdt: make memblock minimum physical address arch configurable
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
By default, early_init_dt_add_memory_arch() ignores memory below
the base of the kernel image since it won't be addressable via the
linear mapping. However, this is not appropriate anymore once we
decouple the kernel text mapping from the linear mapping, so archs
may want to drop the low limit entirely. So allow the minimum to be
overridden by setting MIN_MEMBLOCK_ADDR.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
drivers/of/fdt.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 655f79db7899..1f98156f8996 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -976,13 +976,16 @@ int __init early_init_dt_scan_chosen(unsigned long node, const char *uname,
}
#ifdef CONFIG_HAVE_MEMBLOCK
+#ifndef MIN_MEMBLOCK_ADDR
+#define MIN_MEMBLOCK_ADDR __pa(PAGE_OFFSET)
+#endif
#ifndef MAX_MEMBLOCK_ADDR
#define MAX_MEMBLOCK_ADDR ((phys_addr_t)~0)
#endif
void __init __weak early_init_dt_add_memory_arch(u64 base, u64 size)
{
- const u64 phys_offset = __pa(PAGE_OFFSET);
+ const u64 phys_offset = MIN_MEMBLOCK_ADDR;
if (!PAGE_ALIGNED(base)) {
if (size < PAGE_SIZE - (base & ~PAGE_MASK)) {
--
2.5.0
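For context, an architecture opts in simply by defining MIN_MEMBLOCK_ADDR in a
header visible to drivers/of/fdt.c before the #ifndef above is evaluated; a
hypothetical override:

/* hypothetical arch header: accept all of memory below the kernel image */
#define MIN_MEMBLOCK_ADDR       0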
* [PATCH v4 02/22] arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
This introduces the preprocessor symbol KIMAGE_VADDR which will serve as
the symbolic virtual base of the kernel region, i.e., the kernel's virtual
offset will be KIMAGE_VADDR + TEXT_OFFSET. For now, we define it as being
equal to PAGE_OFFSET, but in the future, it will be moved below it once
we move the kernel virtual mapping out of the linear mapping.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/include/asm/memory.h | 10 ++++++++--
arch/arm64/kernel/head.S | 2 +-
arch/arm64/kernel/vmlinux.lds.S | 4 ++--
3 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 853953cd1f08..bea9631b34a8 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -51,7 +51,8 @@
#define VA_BITS (CONFIG_ARM64_VA_BITS)
#define VA_START (UL(0xffffffffffffffff) << VA_BITS)
#define PAGE_OFFSET (UL(0xffffffffffffffff) << (VA_BITS - 1))
-#define MODULES_END (PAGE_OFFSET)
+#define KIMAGE_VADDR (PAGE_OFFSET)
+#define MODULES_END (KIMAGE_VADDR)
#define MODULES_VADDR (MODULES_END - SZ_64M)
#define PCI_IO_END (MODULES_VADDR - SZ_2M)
#define PCI_IO_START (PCI_IO_END - PCI_IO_SIZE)
@@ -75,8 +76,13 @@
* private definitions which should NOT be used outside memory.h
* files. Use virt_to_phys/phys_to_virt/__pa/__va instead.
*/
-#define __virt_to_phys(x) (((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
+#define __virt_to_phys(x) ({ \
+ phys_addr_t __x = (phys_addr_t)(x); \
+ __x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) : \
+ (__x - KIMAGE_VADDR + PHYS_OFFSET); })
+
#define __phys_to_virt(x) ((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
+#define __phys_to_kimg(x) ((unsigned long)((x) - PHYS_OFFSET + KIMAGE_VADDR))
/*
* Convert a page to/from a physical address
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 51370f8808bc..85181cb60f46 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -389,7 +389,7 @@ __create_page_tables:
* Map the kernel image (starting with PHYS_OFFSET).
*/
mov x0, x26 // swapper_pg_dir
- mov x5, #PAGE_OFFSET
+ ldr x5, =KIMAGE_VADDR
create_pgd_entry x0, x5, x3, x6
ldr x6, =KERNEL_END // __va(KERNEL_END)
mov x3, x24 // phys offset
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index b78a3c772294..282e3e64a17e 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -89,7 +89,7 @@ SECTIONS
*(.discard.*)
}
- . = PAGE_OFFSET + TEXT_OFFSET;
+ . = KIMAGE_VADDR + TEXT_OFFSET;
.head.text : {
_text = .;
@@ -186,4 +186,4 @@ ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
/*
* If padding is applied before .head.text, virt<->phys conversions will fail.
*/
-ASSERT(_text == (PAGE_OFFSET + TEXT_OFFSET), "HEAD is misaligned")
+ASSERT(_text == (KIMAGE_VADDR + TEXT_OFFSET), "HEAD is misaligned")
--
2.5.0
* [PATCH v4 03/22] arm64: pgtable: implement static [pte|pmd|pud]_offset variants
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
The page table accessors pte_offset(), pud_offset() and pmd_offset()
rely on __va translations, so they can only be used after the linear
mapping has been installed. For the early fixmap and kasan init routines,
whose page tables are allocated statically in the kernel image, these
functions will return bogus values. So implement pte_offset_kimg(),
pmd_offset_kimg() and pud_offset_kimg(), which can be used instead
before any page tables have been allocated dynamically.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/include/asm/pgtable.h | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 2fb94ef881f5..e9eaf6e9262d 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -446,6 +446,9 @@ static inline phys_addr_t pmd_page_paddr(pmd_t pmd)
#define pmd_page(pmd) pfn_to_page(__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))
+/* use ONLY for statically allocated translation tables */
+#define pte_offset_kimg(dir,addr) ((pte_t *)__phys_to_kimg(pte_offset_phys((dir), (addr))))
+
/*
* Conversion functions: convert a page and protection to a page entry,
* and a page entry and page directory to the page they refer to.
@@ -489,6 +492,9 @@ static inline phys_addr_t pud_page_paddr(pud_t pud)
#define pud_page(pud) pfn_to_page(__phys_to_pfn(pud_val(pud) & PHYS_MASK))
+/* use ONLY for statically allocated translation tables */
+#define pmd_offset_kimg(dir,addr) ((pmd_t *)__phys_to_kimg(pmd_offset_phys((dir), (addr))))
+
#else
#define pud_page_paddr(pud) ({ BUILD_BUG(); 0; })
@@ -498,6 +504,8 @@ static inline phys_addr_t pud_page_paddr(pud_t pud)
#define pmd_set_fixmap_offset(pudp, addr) ((pmd_t *)pudp)
#define pmd_clear_fixmap()
+#define pmd_offset_kimg(dir,addr) ((pmd_t *)dir)
+
#endif /* CONFIG_PGTABLE_LEVELS > 2 */
#if CONFIG_PGTABLE_LEVELS > 3
@@ -536,6 +544,9 @@ static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
#define pgd_page(pgd) pfn_to_page(__phys_to_pfn(pgd_val(pgd) & PHYS_MASK))
+/* use ONLY for statically allocated translation tables */
+#define pud_offset_kimg(dir,addr) ((pud_t *)__phys_to_kimg(pud_offset_phys((dir), (addr))))
+
#else
#define pgd_page_paddr(pgd) ({ BUILD_BUG(); 0;})
@@ -545,6 +556,8 @@ static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
#define pud_set_fixmap_offset(pgdp, addr) ((pud_t *)pgdp)
#define pud_clear_fixmap()
+#define pud_offset_kimg(dir,addr) ((pud_t *)dir)
+
#endif /* CONFIG_PGTABLE_LEVELS > 3 */
#define pgd_ERROR(pgd) __pgd_error(__FILE__, __LINE__, pgd_val(pgd))
--
2.5.0
* [PATCH v4 04/22] arm64: decouple early fixmap init from linear mapping
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
Since the early fixmap page tables are populated using pages that are
part of the static footprint of the kernel, they are covered by the
initial kernel mapping, and we can refer to them without using __va/__pa
translations, which are tied to the linear mapping.
Since the fixmap page tables are disjoint from the kernel mapping up
to the top level pgd entry, we can refer to bm_pte[] directly, and there
is no need to walk the page tables and perform __pa()/__va() translations
at each step.
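For reference, a sketch of the generic helper involved, reproduced here
only to show that no address translation takes place:

    /* index of addr within a single pte table -- pure bit arithmetic */
    #define pte_index(addr)  (((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))

so &bm_pte[pte_index(addr)] is an ordinary array reference into a table
that lives in the kernel image, needing neither __pa() nor __va().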
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/mm/mmu.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 7711554a94f4..cb3a7bdb4e23 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -583,7 +583,7 @@ static inline pud_t * fixmap_pud(unsigned long addr)
BUG_ON(pgd_none(*pgd) || pgd_bad(*pgd));
- return pud_offset(pgd, addr);
+ return pud_offset_kimg(pgd, addr);
}
static inline pmd_t * fixmap_pmd(unsigned long addr)
@@ -592,16 +592,12 @@ static inline pmd_t * fixmap_pmd(unsigned long addr)
BUG_ON(pud_none(*pud) || pud_bad(*pud));
- return pmd_offset(pud, addr);
+ return pmd_offset_kimg(pud, addr);
}
static inline pte_t * fixmap_pte(unsigned long addr)
{
- pmd_t *pmd = fixmap_pmd(addr);
-
- BUG_ON(pmd_none(*pmd) || pmd_bad(*pmd));
-
- return pte_offset_kernel(pmd, addr);
+ return &bm_pte[pte_index(addr)];
}
void __init early_fixmap_init(void)
@@ -613,14 +609,14 @@ void __init early_fixmap_init(void)
pgd = pgd_offset_k(addr);
pgd_populate(&init_mm, pgd, bm_pud);
- pud = pud_offset(pgd, addr);
+ pud = fixmap_pud(addr);
pud_populate(&init_mm, pud, bm_pmd);
- pmd = pmd_offset(pud, addr);
+ pmd = fixmap_pmd(addr);
pmd_populate_kernel(&init_mm, pmd, bm_pte);
/*
* The boot-ioremap range spans multiple pmds, for which
- * we are not preparted:
+ * we are not prepared:
*/
BUILD_BUG_ON((__fix_to_virt(FIX_BTMAP_BEGIN) >> PMD_SHIFT)
!= (__fix_to_virt(FIX_BTMAP_END) >> PMD_SHIFT));
--
2.5.0
^ permalink raw reply related [flat|nested] 93+ messages in thread
* [PATCH v4 05/22] arm64: kvm: deal with kernel symbols outside of linear mapping
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
the HYP mapping at EL2. Before we can move the kernel virtual mapping
out of the linear mapping, we have to make sure that references to kernel
symbols that are accessed via the HYP mapping are translated to their
linear equivalent.
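To illustrate the translation that kvm_ksym_ref() performs on arm64 (a
sketch only, assuming the image is mapped at the default offset):

    image VA of sym   =  KIMAGE_VADDR + offset
    kvm_ksym_ref(sym) =  PAGE_OFFSET  + offset   /* linear alias */

i.e. the symbol's offset into the kernel image is rebased onto the
linear mapping, which is what the fixed EL1/EL2 offset is defined
against. The kvm_call_hyp() wrapper introduced below applies the same
translation to the function pointer it hands to __kvm_call_hyp().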
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm/include/asm/kvm_asm.h | 2 ++
arch/arm/kvm/arm.c | 8 +++++---
arch/arm64/include/asm/kvm_asm.h | 2 ++
arch/arm64/include/asm/kvm_host.h | 8 +++++---
arch/arm64/kvm/hyp.S | 6 +++---
5 files changed, 17 insertions(+), 9 deletions(-)
diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index 194c91b610ff..c35c349da069 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -79,6 +79,8 @@
#define rr_lo_hi(a1, a2) a1, a2
#endif
+#define kvm_ksym_ref(kva) (kva)
+
#ifndef __ASSEMBLY__
struct kvm;
struct kvm_vcpu;
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index dda1959f0dde..975da6cfbf59 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -982,7 +982,7 @@ static void cpu_init_hyp_mode(void *dummy)
pgd_ptr = kvm_mmu_get_httbr();
stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
hyp_stack_ptr = stack_page + PAGE_SIZE;
- vector_ptr = (unsigned long)__kvm_hyp_vector;
+ vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
@@ -1074,13 +1074,15 @@ static int init_hyp_mode(void)
/*
* Map the Hyp-code called directly from the host
*/
- err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
+ err = create_hyp_mappings(kvm_ksym_ref(__kvm_hyp_code_start),
+ kvm_ksym_ref(__kvm_hyp_code_end));
if (err) {
kvm_err("Cannot map world-switch code\n");
goto out_free_mappings;
}
- err = create_hyp_mappings(__start_rodata, __end_rodata);
+ err = create_hyp_mappings(kvm_ksym_ref(__start_rodata),
+ kvm_ksym_ref(__end_rodata));
if (err) {
kvm_err("Cannot map rodata section\n");
goto out_free_mappings;
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 52b777b7d407..30e0e04c9f16 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -26,6 +26,8 @@
#define KVM_ARM64_DEBUG_DIRTY_SHIFT 0
#define KVM_ARM64_DEBUG_DIRTY (1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
+#define kvm_ksym_ref(sym) ((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
+
#ifndef __ASSEMBLY__
struct kvm;
struct kvm_vcpu;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 689d4c95e12f..e3d67ff8798b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -307,7 +307,7 @@ static inline void kvm_arch_mmu_notifier_invalidate_page(struct kvm *kvm,
struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
-u64 kvm_call_hyp(void *hypfn, ...);
+u64 __kvm_call_hyp(void *hypfn, ...);
void force_vm_exit(const cpumask_t *mask);
void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
@@ -328,8 +328,8 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
* Call initialization code, and switch to the full blown
* HYP code.
*/
- kvm_call_hyp((void *)boot_pgd_ptr, pgd_ptr,
- hyp_stack_ptr, vector_ptr);
+ __kvm_call_hyp((void *)boot_pgd_ptr, pgd_ptr,
+ hyp_stack_ptr, vector_ptr);
}
static inline void kvm_arch_hardware_disable(void) {}
@@ -343,4 +343,6 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
+#define kvm_call_hyp(f, ...) __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__)
+
#endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 0ccdcbbef3c2..870578f84b1c 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -20,7 +20,7 @@
#include <asm/assembler.h>
/*
- * u64 kvm_call_hyp(void *hypfn, ...);
+ * u64 __kvm_call_hyp(void *hypfn, ...);
*
* This is not really a variadic function in the classic C-way and care must
* be taken when calling this to ensure parameters are passed in registers
@@ -37,7 +37,7 @@
* used to implement __hyp_get_vectors in the same way as in
* arch/arm64/kernel/hyp_stub.S.
*/
-ENTRY(kvm_call_hyp)
+ENTRY(__kvm_call_hyp)
hvc #0
ret
-ENDPROC(kvm_call_hyp)
+ENDPROC(__kvm_call_hyp)
--
2.5.0
^ permalink raw reply related [flat|nested] 93+ messages in thread
* Re: [PATCH v4 05/22] arm64: kvm: deal with kernel symbols outside of linear mapping
@ 2016-01-26 17:45 ` Marc Zyngier
0 siblings, 0 replies; 93+ messages in thread
From: Marc Zyngier @ 2016-01-26 17:45 UTC (permalink / raw)
To: Ard Biesheuvel, linux-arm-kernel, kernel-hardening, will.deacon,
catalin.marinas, mark.rutland, leif.lindholm, keescook,
linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, christoffer.dall, labbott, matt
On 26/01/16 17:10, Ard Biesheuvel wrote:
> KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
> the HYP mapping at EL2. Before we can move the kernel virtual mapping
> out of the linear mapping, we have to make sure that references to kernel
> symbols that are accessed via the HYP mapping are translated to their
> linear equivalent.
>
> Reviewed-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply [flat|nested] 93+ messages in thread
* [PATCH v4 06/22] arm64: add support for ioremap() block mappings
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
This wires up the existing generic huge-vmap feature, which allows
ioremap() to use PMD- or PUD-sized block mappings.
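As an aside (illustrative only, 4k pages assumed; 'phys' stands in for
a suitably aligned MMIO region), the practical effect is that a mapping
like

    /* a 2 MB aligned, 2 MB sized window can now be covered by one
     * PMD block entry instead of a full table of 512 pte entries */
    void __iomem *regs = ioremap(phys, SZ_2M);

can be satisfied at the PMD level, or even at the PUD level for 1 GB
regions.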
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
Documentation/features/vm/huge-vmap/arch-support.txt | 2 +-
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/memory.h | 6 +++
arch/arm64/mm/mmu.c | 41 ++++++++++++++++++++
4 files changed, 49 insertions(+), 1 deletion(-)
diff --git a/Documentation/features/vm/huge-vmap/arch-support.txt b/Documentation/features/vm/huge-vmap/arch-support.txt
index af6816bccb43..df1d1f3c9af2 100644
--- a/Documentation/features/vm/huge-vmap/arch-support.txt
+++ b/Documentation/features/vm/huge-vmap/arch-support.txt
@@ -9,7 +9,7 @@
| alpha: | TODO |
| arc: | TODO |
| arm: | TODO |
- | arm64: | TODO |
+ | arm64: | ok |
| avr32: | TODO |
| blackfin: | TODO |
| c6x: | TODO |
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 8cc62289a63e..cd767fa3037a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -49,6 +49,7 @@ config ARM64
select HAVE_ALIGNED_STRUCT_PAGE if SLUB
select HAVE_ARCH_AUDITSYSCALL
select HAVE_ARCH_BITREVERSE
+ select HAVE_ARCH_HUGE_VMAP
select HAVE_ARCH_JUMP_LABEL
select HAVE_ARCH_KASAN if SPARSEMEM_VMEMMAP && !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
select HAVE_ARCH_KGDB
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index bea9631b34a8..aebc739f5a11 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -106,6 +106,12 @@
#define MT_S2_NORMAL 0xf
#define MT_S2_DEVICE_nGnRE 0x1
+#ifdef CONFIG_ARM64_4K_PAGES
+#define IOREMAP_MAX_ORDER (PUD_SHIFT)
+#else
+#define IOREMAP_MAX_ORDER (PMD_SHIFT)
+#endif
+
#ifndef __ASSEMBLY__
extern phys_addr_t memstart_addr;
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index cb3a7bdb4e23..b84915723ea0 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -710,3 +710,44 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
return dt_virt;
}
+
+int __init arch_ioremap_pud_supported(void)
+{
+ /* only 4k granule supports level 1 block mappings */
+ return IS_ENABLED(CONFIG_ARM64_4K_PAGES);
+}
+
+int __init arch_ioremap_pmd_supported(void)
+{
+ return 1;
+}
+
+int pud_set_huge(pud_t *pud, phys_addr_t phys, pgprot_t prot)
+{
+ BUG_ON(phys & ~PUD_MASK);
+ set_pud(pud, __pud(phys | PUD_TYPE_SECT | pgprot_val(mk_sect_prot(prot))));
+ return 1;
+}
+
+int pmd_set_huge(pmd_t *pmd, phys_addr_t phys, pgprot_t prot)
+{
+ BUG_ON(phys & ~PMD_MASK);
+ set_pmd(pmd, __pmd(phys | PMD_TYPE_SECT | pgprot_val(mk_sect_prot(prot))));
+ return 1;
+}
+
+int pud_clear_huge(pud_t *pud)
+{
+ if (!pud_sect(*pud))
+ return 0;
+ pud_clear(pud);
+ return 1;
+}
+
+int pmd_clear_huge(pmd_t *pmd)
+{
+ if (!pmd_sect(*pmd))
+ return 0;
+ pmd_clear(pmd);
+ return 1;
+}
--
2.5.0
^ permalink raw reply related [flat|nested] 93+ messages in thread
* [PATCH v4 07/22] arm64: move kernel image to base of vmalloc area
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
This moves the module area to right before the vmalloc area, and
moves the kernel image to the base of the vmalloc area. This is
an intermediate step towards implementing KASLR, which allows the
kernel image to be located anywhere in the vmalloc area.
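For a concrete feel of the resulting layout (a sketch for VA_BITS=48
with KASAN disabled, following the macro definitions in the memory.h
hunk below):

    VA_START      = 0xffff000000000000
    MODULES_VADDR = VA_START + KASAN_SHADOW_SIZE  = 0xffff000000000000
    MODULES_END   = MODULES_VADDR + SZ_64M        = 0xffff000004000000
    KIMAGE_VADDR  = VMALLOC_START = MODULES_END   = 0xffff000004000000

i.e. a 64 MB module window sits at the bottom of the kernel VA space,
with the kernel image at the base of the vmalloc area directly above it.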
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/include/asm/kasan.h | 2 +-
arch/arm64/include/asm/memory.h | 21 +++--
arch/arm64/include/asm/pgtable.h | 10 +-
arch/arm64/mm/dump.c | 12 +--
arch/arm64/mm/init.c | 23 ++---
arch/arm64/mm/kasan_init.c | 18 +++-
arch/arm64/mm/mmu.c | 97 +++++++++++++-------
7 files changed, 116 insertions(+), 67 deletions(-)
diff --git a/arch/arm64/include/asm/kasan.h b/arch/arm64/include/asm/kasan.h
index de0d21211c34..71ad0f93eb71 100644
--- a/arch/arm64/include/asm/kasan.h
+++ b/arch/arm64/include/asm/kasan.h
@@ -14,7 +14,7 @@
* KASAN_SHADOW_END: KASAN_SHADOW_START + 1/8 of kernel virtual addresses.
*/
#define KASAN_SHADOW_START (VA_START)
-#define KASAN_SHADOW_END (KASAN_SHADOW_START + (1UL << (VA_BITS - 3)))
+#define KASAN_SHADOW_END (KASAN_SHADOW_START + KASAN_SHADOW_SIZE)
/*
* This value is used to map an address to the corresponding shadow
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index aebc739f5a11..4388651d1f0d 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -45,16 +45,15 @@
* VA_START - the first kernel virtual address.
* TASK_SIZE - the maximum size of a user space task.
* TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area.
- * The module space lives between the addresses given by TASK_SIZE
- * and PAGE_OFFSET - it must be within 128MB of the kernel text.
*/
#define VA_BITS (CONFIG_ARM64_VA_BITS)
#define VA_START (UL(0xffffffffffffffff) << VA_BITS)
#define PAGE_OFFSET (UL(0xffffffffffffffff) << (VA_BITS - 1))
-#define KIMAGE_VADDR (PAGE_OFFSET)
-#define MODULES_END (KIMAGE_VADDR)
-#define MODULES_VADDR (MODULES_END - SZ_64M)
-#define PCI_IO_END (MODULES_VADDR - SZ_2M)
+#define KIMAGE_VADDR (MODULES_END)
+#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
+#define MODULES_VADDR (VA_START + KASAN_SHADOW_SIZE)
+#define MODULES_VSIZE (SZ_64M)
+#define PCI_IO_END (PAGE_OFFSET - SZ_2M)
#define PCI_IO_START (PCI_IO_END - PCI_IO_SIZE)
#define FIXADDR_TOP (PCI_IO_START - SZ_2M)
#define TASK_SIZE_64 (UL(1) << VA_BITS)
@@ -72,6 +71,16 @@
#define TASK_UNMAPPED_BASE (PAGE_ALIGN(TASK_SIZE / 4))
/*
+ * The size of the KASAN shadow region. This should be 1/8th of the
+ * size of the entire kernel virtual address space.
+ */
+#ifdef CONFIG_KASAN
+#define KASAN_SHADOW_SIZE (UL(1) << (VA_BITS - 3))
+#else
+#define KASAN_SHADOW_SIZE (0)
+#endif
+
+/*
* Physical vs virtual RAM address space conversion. These are
* private definitions which should NOT be used outside memory.h
* files. Use virt_to_phys/phys_to_virt/__pa/__va instead.
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index e9eaf6e9262d..550d21574d0d 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -36,19 +36,13 @@
*
* VMEMAP_SIZE: allows the whole VA space to be covered by a struct page array
* (rounded up to PUD_SIZE).
- * VMALLOC_START: beginning of the kernel VA space
+ * VMALLOC_START: beginning of the kernel vmalloc space
* VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space,
* fixed mappings and modules
*/
#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
-#ifndef CONFIG_KASAN
-#define VMALLOC_START (VA_START)
-#else
-#include <asm/kasan.h>
-#define VMALLOC_START (KASAN_SHADOW_END + SZ_64K)
-#endif
-
+#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
#define vmemmap ((struct page *)(VMALLOC_END + SZ_64K))
diff --git a/arch/arm64/mm/dump.c b/arch/arm64/mm/dump.c
index 5a22a119a74c..e83ffb00560c 100644
--- a/arch/arm64/mm/dump.c
+++ b/arch/arm64/mm/dump.c
@@ -35,7 +35,9 @@ struct addr_marker {
};
enum address_markers_idx {
- VMALLOC_START_NR = 0,
+ MODULES_START_NR = 0,
+ MODULES_END_NR,
+ VMALLOC_START_NR,
VMALLOC_END_NR,
#ifdef CONFIG_SPARSEMEM_VMEMMAP
VMEMMAP_START_NR,
@@ -45,12 +47,12 @@ enum address_markers_idx {
FIXADDR_END_NR,
PCI_START_NR,
PCI_END_NR,
- MODULES_START_NR,
- MODUELS_END_NR,
KERNEL_SPACE_NR,
};
static struct addr_marker address_markers[] = {
+ { MODULES_VADDR, "Modules start" },
+ { MODULES_END, "Modules end" },
{ VMALLOC_START, "vmalloc() Area" },
{ VMALLOC_END, "vmalloc() End" },
#ifdef CONFIG_SPARSEMEM_VMEMMAP
@@ -61,9 +63,7 @@ static struct addr_marker address_markers[] = {
{ FIXADDR_TOP, "Fixmap end" },
{ PCI_IO_START, "PCI I/O start" },
{ PCI_IO_END, "PCI I/O end" },
- { MODULES_VADDR, "Modules start" },
- { MODULES_END, "Modules end" },
- { PAGE_OFFSET, "Kernel Mapping" },
+ { PAGE_OFFSET, "Linear Mapping" },
{ -1, NULL },
};
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f3b061e67bfe..1d627cd8121c 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -36,6 +36,7 @@
#include <linux/swiotlb.h>
#include <asm/fixmap.h>
+#include <asm/kasan.h>
#include <asm/memory.h>
#include <asm/sections.h>
#include <asm/setup.h>
@@ -302,22 +303,26 @@ void __init mem_init(void)
#ifdef CONFIG_KASAN
" kasan : 0x%16lx - 0x%16lx (%6ld GB)\n"
#endif
+ " modules : 0x%16lx - 0x%16lx (%6ld MB)\n"
" vmalloc : 0x%16lx - 0x%16lx (%6ld GB)\n"
+ " .init : 0x%p" " - 0x%p" " (%6ld KB)\n"
+ " .text : 0x%p" " - 0x%p" " (%6ld KB)\n"
+ " .data : 0x%p" " - 0x%p" " (%6ld KB)\n"
#ifdef CONFIG_SPARSEMEM_VMEMMAP
" vmemmap : 0x%16lx - 0x%16lx (%6ld GB maximum)\n"
" 0x%16lx - 0x%16lx (%6ld MB actual)\n"
#endif
" fixed : 0x%16lx - 0x%16lx (%6ld KB)\n"
" PCI I/O : 0x%16lx - 0x%16lx (%6ld MB)\n"
- " modules : 0x%16lx - 0x%16lx (%6ld MB)\n"
- " memory : 0x%16lx - 0x%16lx (%6ld MB)\n"
- " .init : 0x%p" " - 0x%p" " (%6ld KB)\n"
- " .text : 0x%p" " - 0x%p" " (%6ld KB)\n"
- " .data : 0x%p" " - 0x%p" " (%6ld KB)\n",
+ " memory : 0x%16lx - 0x%16lx (%6ld MB)\n",
#ifdef CONFIG_KASAN
MLG(KASAN_SHADOW_START, KASAN_SHADOW_END),
#endif
+ MLM(MODULES_VADDR, MODULES_END),
MLG(VMALLOC_START, VMALLOC_END),
+ MLK_ROUNDUP(__init_begin, __init_end),
+ MLK_ROUNDUP(_text, _etext),
+ MLK_ROUNDUP(_sdata, _edata),
#ifdef CONFIG_SPARSEMEM_VMEMMAP
MLG((unsigned long)vmemmap,
(unsigned long)vmemmap + VMEMMAP_SIZE),
@@ -326,11 +331,7 @@ void __init mem_init(void)
#endif
MLK(FIXADDR_START, FIXADDR_TOP),
MLM(PCI_IO_START, PCI_IO_END),
- MLM(MODULES_VADDR, MODULES_END),
- MLM(PAGE_OFFSET, (unsigned long)high_memory),
- MLK_ROUNDUP(__init_begin, __init_end),
- MLK_ROUNDUP(_text, _etext),
- MLK_ROUNDUP(_sdata, _edata));
+ MLM(PAGE_OFFSET, (unsigned long)high_memory));
#undef MLK
#undef MLM
@@ -358,8 +359,8 @@ void __init mem_init(void)
void free_initmem(void)
{
- fixup_init();
free_initmem_default(0);
+ fixup_init();
}
#ifdef CONFIG_BLK_DEV_INITRD
diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index 0ca411fc5ea3..4d0623bfb781 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -17,9 +17,11 @@
#include <linux/start_kernel.h>
#include <asm/mmu_context.h>
+#include <asm/kernel-pgtable.h>
#include <asm/page.h>
#include <asm/pgalloc.h>
#include <asm/pgtable.h>
+#include <asm/sections.h>
#include <asm/tlbflush.h>
static pgd_t tmp_pg_dir[PTRS_PER_PGD] __initdata __aligned(PGD_SIZE);
@@ -33,7 +35,7 @@ static void __init kasan_early_pte_populate(pmd_t *pmd, unsigned long addr,
if (pmd_none(*pmd))
pmd_populate_kernel(&init_mm, pmd, kasan_zero_pte);
- pte = pte_offset_kernel(pmd, addr);
+ pte = pte_offset_kimg(pmd, addr);
do {
next = addr + PAGE_SIZE;
set_pte(pte, pfn_pte(virt_to_pfn(kasan_zero_page),
@@ -51,7 +53,7 @@ static void __init kasan_early_pmd_populate(pud_t *pud,
if (pud_none(*pud))
pud_populate(&init_mm, pud, kasan_zero_pmd);
- pmd = pmd_offset(pud, addr);
+ pmd = pmd_offset_kimg(pud, addr);
do {
next = pmd_addr_end(addr, end);
kasan_early_pte_populate(pmd, addr, next);
@@ -68,7 +70,7 @@ static void __init kasan_early_pud_populate(pgd_t *pgd,
if (pgd_none(*pgd))
pgd_populate(&init_mm, pgd, kasan_zero_pud);
- pud = pud_offset(pgd, addr);
+ pud = pud_offset_kimg(pgd, addr);
do {
next = pud_addr_end(addr, end);
kasan_early_pmd_populate(pud, addr, next);
@@ -126,8 +128,12 @@ static void __init clear_pgds(unsigned long start,
void __init kasan_init(void)
{
+ u64 kimg_shadow_start, kimg_shadow_end;
struct memblock_region *reg;
+ kimg_shadow_start = (u64)kasan_mem_to_shadow(_text);
+ kimg_shadow_end = (u64)kasan_mem_to_shadow(_end);
+
/*
* We are going to perform proper setup of shadow memory.
* At first we should unmap early shadow (clear_pgds() call bellow).
@@ -141,8 +147,12 @@ void __init kasan_init(void)
clear_pgds(KASAN_SHADOW_START, KASAN_SHADOW_END);
+ vmemmap_populate(kimg_shadow_start, kimg_shadow_end, NUMA_NO_NODE);
+
kasan_populate_zero_shadow((void *)KASAN_SHADOW_START,
- kasan_mem_to_shadow((void *)MODULES_VADDR));
+ (void *)kimg_shadow_start);
+ kasan_populate_zero_shadow((void *)kimg_shadow_end,
+ kasan_mem_to_shadow((void *)PAGE_OFFSET));
for_each_memblock(memory, reg) {
void *start = (void *)__phys_to_virt(reg->base);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index b84915723ea0..4c4b15932963 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -53,6 +53,10 @@ u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)] __page_aligned_bss;
EXPORT_SYMBOL(empty_zero_page);
+static pte_t bm_pte[PTRS_PER_PTE] __page_aligned_bss;
+static pmd_t bm_pmd[PTRS_PER_PMD] __page_aligned_bss;
+static pud_t bm_pud[PTRS_PER_PUD] __page_aligned_bss;
+
pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
unsigned long size, pgprot_t vma_prot)
{
@@ -349,14 +353,14 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
{
unsigned long kernel_start = __pa(_stext);
- unsigned long kernel_end = __pa(_end);
+ unsigned long kernel_end = __pa(_etext);
/*
- * The kernel itself is mapped at page granularity. Map all other
- * memory, making sure we don't overwrite the existing kernel mappings.
+ * Take care not to create a writable alias for the
+ * read-only text and rodata sections of the kernel image.
*/
- /* No overlap with the kernel. */
+ /* No overlap with the kernel text */
if (end < kernel_start || start >= kernel_end) {
__create_pgd_mapping(pgd, start, __phys_to_virt(start),
end - start, PAGE_KERNEL,
@@ -365,7 +369,7 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
}
/*
- * This block overlaps the kernel mapping. Map the portion(s) which
+ * This block overlaps the kernel text mapping. Map the portion(s) which
* don't overlap.
*/
if (start < kernel_start)
@@ -398,25 +402,28 @@ static void __init map_mem(pgd_t *pgd)
}
}
-#ifdef CONFIG_DEBUG_RODATA
void mark_rodata_ro(void)
{
+ if (!IS_ENABLED(CONFIG_DEBUG_RODATA))
+ return;
+
create_mapping_late(__pa(_stext), (unsigned long)_stext,
(unsigned long)_etext - (unsigned long)_stext,
PAGE_KERNEL_ROX);
-
}
-#endif
void fixup_init(void)
{
- create_mapping_late(__pa(__init_begin), (unsigned long)__init_begin,
- (unsigned long)__init_end - (unsigned long)__init_begin,
- PAGE_KERNEL);
+ /*
+ * Unmap the __init region but leave the VM area in place. This
+ * prevents the region from being reused for kernel modules, which
+ * is not supported by kallsyms.
+ */
+ unmap_kernel_range((u64)__init_begin, (u64)(__init_end - __init_begin));
}
static void __init map_kernel_chunk(pgd_t *pgd, void *va_start, void *va_end,
- pgprot_t prot)
+ pgprot_t prot, struct vm_struct *vma)
{
phys_addr_t pa_start = __pa(va_start);
unsigned long size = va_end - va_start;
@@ -426,6 +433,14 @@ static void __init map_kernel_chunk(pgd_t *pgd, void *va_start, void *va_end,
__create_pgd_mapping(pgd, pa_start, (unsigned long)va_start, size, prot,
early_pgtable_alloc);
+
+ vma->addr = va_start;
+ vma->phys_addr = pa_start;
+ vma->size = size;
+ vma->flags = VM_MAP;
+ vma->caller = map_kernel_chunk;
+
+ vm_area_add_early(vma);
}
/*
@@ -433,17 +448,35 @@ static void __init map_kernel_chunk(pgd_t *pgd, void *va_start, void *va_end,
*/
static void __init map_kernel(pgd_t *pgd)
{
+ static struct vm_struct vmlinux_text, vmlinux_init, vmlinux_data;
- map_kernel_chunk(pgd, _stext, _etext, PAGE_KERNEL_EXEC);
- map_kernel_chunk(pgd, __init_begin, __init_end, PAGE_KERNEL_EXEC);
- map_kernel_chunk(pgd, _data, _end, PAGE_KERNEL);
+ map_kernel_chunk(pgd, _stext, _etext, PAGE_KERNEL_EXEC, &vmlinux_text);
+ map_kernel_chunk(pgd, __init_begin, __init_end, PAGE_KERNEL_EXEC,
+ &vmlinux_init);
+ map_kernel_chunk(pgd, _data, _end, PAGE_KERNEL, &vmlinux_data);
- /*
- * The fixmap falls in a separate pgd to the kernel, and doesn't live
- * in the carveout for the swapper_pg_dir. We can simply re-use the
- * existing dir for the fixmap.
- */
- set_pgd(pgd_offset_raw(pgd, FIXADDR_START), *pgd_offset_k(FIXADDR_START));
+ if (!pgd_val(*pgd_offset_raw(pgd, FIXADDR_START))) {
+ /*
+ * The fixmap falls in a separate pgd to the kernel, and doesn't
+ * live in the carveout for the swapper_pg_dir. We can simply
+ * re-use the existing dir for the fixmap.
+ */
+ set_pgd(pgd_offset_raw(pgd, FIXADDR_START),
+ *pgd_offset_k(FIXADDR_START));
+ } else if (CONFIG_PGTABLE_LEVELS > 3) {
+ /*
+ * The fixmap shares its top level pgd entry with the kernel
+ * mapping. This can really only occur when we are running
+ * with 16k/4 levels, so we can simply reuse the pud level
+ * entry instead.
+ */
+ BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
+ set_pud(pud_set_fixmap_offset(pgd, FIXADDR_START),
+ __pud(__pa(bm_pmd) | PUD_TYPE_TABLE));
+ pud_clear_fixmap();
+ } else {
+ BUG();
+ }
kasan_copy_shadow(pgd);
}
@@ -569,14 +602,6 @@ void vmemmap_free(unsigned long start, unsigned long end)
}
#endif /* CONFIG_SPARSEMEM_VMEMMAP */
-static pte_t bm_pte[PTRS_PER_PTE] __page_aligned_bss;
-#if CONFIG_PGTABLE_LEVELS > 2
-static pmd_t bm_pmd[PTRS_PER_PMD] __page_aligned_bss;
-#endif
-#if CONFIG_PGTABLE_LEVELS > 3
-static pud_t bm_pud[PTRS_PER_PUD] __page_aligned_bss;
-#endif
-
static inline pud_t * fixmap_pud(unsigned long addr)
{
pgd_t *pgd = pgd_offset_k(addr);
@@ -608,8 +633,18 @@ void __init early_fixmap_init(void)
unsigned long addr = FIXADDR_START;
pgd = pgd_offset_k(addr);
- pgd_populate(&init_mm, pgd, bm_pud);
- pud = fixmap_pud(addr);
+ if (CONFIG_PGTABLE_LEVELS > 3 && !pgd_none(*pgd)) {
+ /*
+ * We only end up here if the kernel mapping and the fixmap
+ * share the top level pgd entry, which should only happen on
+ * 16k/4 levels configurations.
+ */
+ BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
+ pud = pud_offset_kimg(pgd, addr);
+ } else {
+ pgd_populate(&init_mm, pgd, bm_pud);
+ pud = fixmap_pud(addr);
+ }
pud_populate(&init_mm, pud, bm_pmd);
pmd = fixmap_pmd(addr);
pmd_populate_kernel(&init_mm, pmd, bm_pte);
--
2.5.0
^ permalink raw reply related [flat|nested] 93+ messages in thread
+ /* No overlap with the kernel text */
if (end < kernel_start || start >= kernel_end) {
__create_pgd_mapping(pgd, start, __phys_to_virt(start),
end - start, PAGE_KERNEL,
@@ -365,7 +369,7 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
}
/*
- * This block overlaps the kernel mapping. Map the portion(s) which
+ * This block overlaps the kernel text mapping. Map the portion(s) which
* don't overlap.
*/
if (start < kernel_start)
@@ -398,25 +402,28 @@ static void __init map_mem(pgd_t *pgd)
}
}
-#ifdef CONFIG_DEBUG_RODATA
void mark_rodata_ro(void)
{
+ if (!IS_ENABLED(CONFIG_DEBUG_RODATA))
+ return;
+
create_mapping_late(__pa(_stext), (unsigned long)_stext,
(unsigned long)_etext - (unsigned long)_stext,
PAGE_KERNEL_ROX);
-
}
-#endif
void fixup_init(void)
{
- create_mapping_late(__pa(__init_begin), (unsigned long)__init_begin,
- (unsigned long)__init_end - (unsigned long)__init_begin,
- PAGE_KERNEL);
+ /*
+ * Unmap the __init region but leave the VM area in place. This
+ * prevents the region from being reused for kernel modules, which
+ * is not supported by kallsyms.
+ */
+ unmap_kernel_range((u64)__init_begin, (u64)(__init_end - __init_begin));
}
static void __init map_kernel_chunk(pgd_t *pgd, void *va_start, void *va_end,
- pgprot_t prot)
+ pgprot_t prot, struct vm_struct *vma)
{
phys_addr_t pa_start = __pa(va_start);
unsigned long size = va_end - va_start;
@@ -426,6 +433,14 @@ static void __init map_kernel_chunk(pgd_t *pgd, void *va_start, void *va_end,
__create_pgd_mapping(pgd, pa_start, (unsigned long)va_start, size, prot,
early_pgtable_alloc);
+
+ vma->addr = va_start;
+ vma->phys_addr = pa_start;
+ vma->size = size;
+ vma->flags = VM_MAP;
+ vma->caller = map_kernel_chunk;
+
+ vm_area_add_early(vma);
}
/*
@@ -433,17 +448,35 @@ static void __init map_kernel_chunk(pgd_t *pgd, void *va_start, void *va_end,
*/
static void __init map_kernel(pgd_t *pgd)
{
+ static struct vm_struct vmlinux_text, vmlinux_init, vmlinux_data;
- map_kernel_chunk(pgd, _stext, _etext, PAGE_KERNEL_EXEC);
- map_kernel_chunk(pgd, __init_begin, __init_end, PAGE_KERNEL_EXEC);
- map_kernel_chunk(pgd, _data, _end, PAGE_KERNEL);
+ map_kernel_chunk(pgd, _stext, _etext, PAGE_KERNEL_EXEC, &vmlinux_text);
+ map_kernel_chunk(pgd, __init_begin, __init_end, PAGE_KERNEL_EXEC,
+ &vmlinux_init);
+ map_kernel_chunk(pgd, _data, _end, PAGE_KERNEL, &vmlinux_data);
- /*
- * The fixmap falls in a separate pgd to the kernel, and doesn't live
- * in the carveout for the swapper_pg_dir. We can simply re-use the
- * existing dir for the fixmap.
- */
- set_pgd(pgd_offset_raw(pgd, FIXADDR_START), *pgd_offset_k(FIXADDR_START));
+ if (!pgd_val(*pgd_offset_raw(pgd, FIXADDR_START))) {
+ /*
+ * The fixmap falls in a separate pgd to the kernel, and doesn't
+ * live in the carveout for the swapper_pg_dir. We can simply
+ * re-use the existing dir for the fixmap.
+ */
+ set_pgd(pgd_offset_raw(pgd, FIXADDR_START),
+ *pgd_offset_k(FIXADDR_START));
+ } else if (CONFIG_PGTABLE_LEVELS > 3) {
+ /*
+ * The fixmap shares its top level pgd entry with the kernel
+ * mapping. This can really only occur when we are running
+ * with 16k/4 levels, so we can simply reuse the pud level
+ * entry instead.
+ */
+ BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
+ set_pud(pud_set_fixmap_offset(pgd, FIXADDR_START),
+ __pud(__pa(bm_pmd) | PUD_TYPE_TABLE));
+ pud_clear_fixmap();
+ } else {
+ BUG();
+ }
kasan_copy_shadow(pgd);
}
@@ -569,14 +602,6 @@ void vmemmap_free(unsigned long start, unsigned long end)
}
#endif /* CONFIG_SPARSEMEM_VMEMMAP */
-static pte_t bm_pte[PTRS_PER_PTE] __page_aligned_bss;
-#if CONFIG_PGTABLE_LEVELS > 2
-static pmd_t bm_pmd[PTRS_PER_PMD] __page_aligned_bss;
-#endif
-#if CONFIG_PGTABLE_LEVELS > 3
-static pud_t bm_pud[PTRS_PER_PUD] __page_aligned_bss;
-#endif
-
static inline pud_t * fixmap_pud(unsigned long addr)
{
pgd_t *pgd = pgd_offset_k(addr);
@@ -608,8 +633,18 @@ void __init early_fixmap_init(void)
unsigned long addr = FIXADDR_START;
pgd = pgd_offset_k(addr);
- pgd_populate(&init_mm, pgd, bm_pud);
- pud = fixmap_pud(addr);
+ if (CONFIG_PGTABLE_LEVELS > 3 && !pgd_none(*pgd)) {
+ /*
+ * We only end up here if the kernel mapping and the fixmap
+ * share the top level pgd entry, which should only happen on
+ * 16k/4 levels configurations.
+ */
+ BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
+ pud = pud_offset_kimg(pgd, addr);
+ } else {
+ pgd_populate(&init_mm, pgd, bm_pud);
+ pud = fixmap_pud(addr);
+ }
pud_populate(&init_mm, pud, bm_pmd);
pmd = fixmap_pmd(addr);
pmd_populate_kernel(&init_mm, pmd, bm_pte);
--
2.5.0
* [PATCH v4 08/22] arm64: add support for module PLTs
@ 2016-01-26 17:10 ` Ard Biesheuvel
-1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
This adds support for emitting PLTs at module load time for relative
branches that are out of range. This is a prerequisite for KASLR, which
may place the kernel and the modules anywhere in the vmalloc area,
making it more likely that branch target offsets exceed the maximum
range of +/- 128 MB.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
In this version, I removed the distinction between relocations against
.init executable sections and ordinary executable sections: it is hardly
worth the trouble, given that .init.text usually does not contain many
far branches. This version also reserves PLT entry space only for jump
and call relocations against undefined symbols, since symbols defined in
the same module can be assumed to be within +/- 128 MB.
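(Aside: the +/- 128 MB figure follows directly from the instruction
encoding: B and BL carry a 26-bit signed immediate counted in 4-byte
units, so the reachable range is +/- 2^25 * 4 bytes = +/- 128 MB.)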
For example, the mac80211.ko module (which is fairly sizable at ~400 KB)
built with -mcmodel=large gives the following relocation counts:
                  relocs  branches  unique  !local
  .text             3925      3347     518     219
  .init.text          11         8       7       1
  .exit.text           4         4       4       1
  .text.unlikely      81        67      36      17
('unique' means branches to unique type/symbol/addend combos, of which
!local is the subset referring to undefined symbols)
IOW, we are only emitting a single PLT entry for the .init sections, and
we are better off just adding it to the core PLT section instead.
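For reviewers who want to poke at the veneer encoding without building a
module: below is a stand-alone user-space sketch (not part of the patch)
that reproduces the movn/movk/movk/br sequence emitted by
module_emit_plt_entry(). The opcode masks are the ones used in the patch;
the target address is a hypothetical kernel VA chosen for illustration
only.

#include <stdint.h>
#include <stdio.h>

/*
 * Encode a PLT veneer that loads 'val' into x16 and branches to it.
 * Starting with MOVN leaves bits [63:48] of x16 all ones, which is
 * exactly what kernel virtual addresses look like (VA_BITS <= 48),
 * so three move instructions suffice for a full 64-bit address.
 */
static void encode_veneer(uint64_t val, uint32_t insn[4])
{
	insn[0] = 0x92800010 | ((~val & 0xffff) << 5);		/* movn x16, #m0 */
	insn[1] = 0xf2a00010 | (((val >> 16) & 0xffff) << 5);	/* movk x16, #m1, lsl #16 */
	insn[2] = 0xf2c00010 | (((val >> 32) & 0xffff) << 5);	/* movk x16, #m2, lsl #32 */
	insn[3] = 0xd61f0200;					/* br x16 */
}

int main(void)
{
	/* made-up example address in the kernel VA range */
	uint64_t val = 0xffffffc000123456ULL;
	uint32_t insn[4];
	int i;

	encode_veneer(val, insn);
	for (i = 0; i < 4; i++)
		printf("0x%08x\n", insn[i]);
	return 0;
}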
---
arch/arm64/Kconfig | 9 +
arch/arm64/Makefile | 6 +-
arch/arm64/include/asm/module.h | 11 ++
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/module-plts.c | 201 ++++++++++++++++++++
arch/arm64/kernel/module.c | 12 ++
arch/arm64/kernel/module.lds | 3 +
7 files changed, 242 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index cd767fa3037a..141f65ab0ed5 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -394,6 +394,7 @@ config ARM64_ERRATUM_843419
bool "Cortex-A53: 843419: A load or store might access an incorrect address"
depends on MODULES
default y
+ select ARM64_MODULE_CMODEL_LARGE
help
This option builds kernel modules using the large memory model in
order to avoid the use of the ADRP instruction, which can cause
@@ -753,6 +754,14 @@ config ARM64_LSE_ATOMICS
endmenu
+config ARM64_MODULE_CMODEL_LARGE
+ bool
+
+config ARM64_MODULE_PLTS
+ bool
+ select ARM64_MODULE_CMODEL_LARGE
+ select HAVE_MOD_ARCH_SPECIFIC
+
endmenu
menu "Boot options"
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index cd822d8454c0..db462980c6be 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -41,10 +41,14 @@ endif
CHECKFLAGS += -D__aarch64__
-ifeq ($(CONFIG_ARM64_ERRATUM_843419), y)
+ifeq ($(CONFIG_ARM64_MODULE_CMODEL_LARGE), y)
KBUILD_CFLAGS_MODULE += -mcmodel=large
endif
+ifeq ($(CONFIG_ARM64_MODULE_PLTS),y)
+KBUILD_LDFLAGS_MODULE += -T $(srctree)/arch/arm64/kernel/module.lds
+endif
+
# Default value
head-y := arch/arm64/kernel/head.o
diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
index e80e232b730e..8652fb613304 100644
--- a/arch/arm64/include/asm/module.h
+++ b/arch/arm64/include/asm/module.h
@@ -20,4 +20,15 @@
#define MODULE_ARCH_VERMAGIC "aarch64"
+#ifdef CONFIG_ARM64_MODULE_PLTS
+struct mod_arch_specific {
+ struct elf64_shdr *plt;
+ int plt_num_entries;
+ int plt_max_entries;
+};
+#endif
+
+u64 module_emit_plt_entry(struct module *mod, const Elf64_Rela *rela,
+ Elf64_Sym *sym);
+
#endif /* __ASM_MODULE_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 83cd7e68e83b..e2f0a755beaa 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -30,6 +30,7 @@ arm64-obj-$(CONFIG_COMPAT) += sys32.o kuser32.o signal32.o \
../../arm/kernel/opcodes.o
arm64-obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o
arm64-obj-$(CONFIG_MODULES) += arm64ksyms.o module.o
+arm64-obj-$(CONFIG_ARM64_MODULE_PLTS) += module-plts.o
arm64-obj-$(CONFIG_PERF_EVENTS) += perf_regs.o perf_callchain.o
arm64-obj-$(CONFIG_HW_PERF_EVENTS) += perf_event.o
arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
new file mode 100644
index 000000000000..1ce90d8450ae
--- /dev/null
+++ b/arch/arm64/kernel/module-plts.c
@@ -0,0 +1,201 @@
+/*
+ * Copyright (C) 2014-2016 Linaro Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/elf.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/sort.h>
+
+struct plt_entry {
+ /*
+ * A program that conforms to the AArch64 Procedure Call Standard
+ * (AAPCS64) must assume that a veneer that alters IP0 (x16) and/or
+ * IP1 (x17) may be inserted at any branch instruction that is
+ * exposed to a relocation that supports long branches. Since that
+ * is exactly what we are dealing with here, we are free to use x16
+ * as a scratch register in the PLT veneers.
+ */
+ __le32 mov0; /* movn x16, #0x.... */
+ __le32 mov1; /* movk x16, #0x...., lsl #16 */
+ __le32 mov2; /* movk x16, #0x...., lsl #32 */
+ __le32 br; /* br x16 */
+};
+
+u64 module_emit_plt_entry(struct module *mod, const Elf64_Rela *rela,
+ Elf64_Sym *sym)
+{
+ struct plt_entry *plt = (struct plt_entry *)mod->arch.plt->sh_addr;
+ int i = mod->arch.plt_num_entries;
+ u64 val = sym->st_value + rela->r_addend;
+
+ /*
+ * We only emit PLT entries against undefined (SHN_UNDEF) symbols,
+ * which are listed in the ELF symtab section, but without a type
+ * or a size.
+ * So, similar to how the module loader uses the Elf64_Sym::st_value
+ * field to store the resolved addresses of undefined symbols, let's
+ * borrow the Elf64_Sym::st_size field (whose value is never used by
+ * the module loader, even for symbols that are defined) to record
+ * the address of a symbol's associated PLT entry as we emit it for a
+ * zero addend relocation (which is the only kind we have to deal with
+ * in practice). This allows us to find duplicates without having to
+ * go through the table every time.
+ */
+ if (rela->r_addend == 0 && sym->st_size != 0) {
+ BUG_ON(sym->st_size < (u64)plt || sym->st_size >= (u64)&plt[i]);
+ return sym->st_size;
+ }
+
+ mod->arch.plt_num_entries++;
+ BUG_ON(mod->arch.plt_num_entries > mod->arch.plt_max_entries);
+
+ /*
+ * MOVK/MOVN/MOVZ opcode:
+ * +--------+------------+--------+-----------+-------------+---------+
+ * | sf[31] | opc[30:29] | 100101 | hw[22:21] | imm16[20:5] | Rd[4:0] |
+ * +--------+------------+--------+-----------+-------------+---------+
+ *
+ * Rd := 0x10 (x16)
+ * hw := 0b00 (no shift), 0b01 (lsl #16), 0b10 (lsl #32)
+ * opc := 0b11 (MOVK), 0b00 (MOVN), 0b10 (MOVZ)
+ * sf := 1 (64-bit variant)
+ */
+ plt[i] = (struct plt_entry){
+ cpu_to_le32(0x92800010 | (((~val ) & 0xffff)) << 5),
+ cpu_to_le32(0xf2a00010 | ((( val >> 16) & 0xffff)) << 5),
+ cpu_to_le32(0xf2c00010 | ((( val >> 32) & 0xffff)) << 5),
+ cpu_to_le32(0xd61f0200)
+ };
+
+ if (rela->r_addend == 0)
+ sym->st_size = (u64)&plt[i];
+
+ return (u64)&plt[i];
+}
+
+#define cmp_3way(a,b) ((a) < (b) ? -1 : (a) > (b))
+
+static int cmp_rela(const void *a, const void *b)
+{
+ const Elf64_Rela *x = a, *y = b;
+ int i;
+
+ /* sort by type, symbol index and addend */
+ i = cmp_3way(ELF64_R_TYPE(x->r_info), ELF64_R_TYPE(y->r_info));
+ if (i == 0)
+ i = cmp_3way(ELF64_R_SYM(x->r_info), ELF64_R_SYM(y->r_info));
+ if (i == 0)
+ i = cmp_3way(x->r_addend, y->r_addend);
+ return i;
+}
+
+static bool duplicate_rel(const Elf64_Rela *rela, int num)
+{
+ /*
+ * Entries are sorted by type, symbol index and addend. That means
+ * that, if a duplicate entry exists, it must be in the preceding
+ * slot.
+ */
+ return num > 0 && cmp_rela(rela + num, rela + num - 1) == 0;
+}
+
+static unsigned int count_plts(Elf64_Sym *syms, Elf64_Rela *rela, int num)
+{
+ unsigned int ret = 0;
+ Elf64_Sym *s;
+ int i;
+
+ for (i = 0; i < num; i++) {
+ switch (ELF64_R_TYPE(rela[i].r_info)) {
+ case R_AARCH64_JUMP26:
+ case R_AARCH64_CALL26:
+ /*
+ * We only have to consider branch targets that resolve
+ * to undefined symbols. This is not simply a heuristic,
+ * it is a fundamental limitation, since the PLT itself
+ * is part of the module, and needs to be within 128 MB
+ * as well, so modules can never grow beyond that limit.
+ */
+ s = syms + ELF64_R_SYM(rela[i].r_info);
+ if (s->st_shndx != SHN_UNDEF)
+ break;
+
+ /*
+ * Jump relocations with non-zero addends against
+ * undefined symbols are supported by the ELF spec, but
+ * do not occur in practice (e.g., 'jump n bytes past
+ * the entry point of undefined function symbol f').
+ * So we need to support them, but there is no need to
+ * take them into consideration when trying to optimize
+ * this code. So let's only check for duplicates when
+ * the addend is zero: this allows us to record the PLT
+ * entry address in the symbol table itself, rather than
+ * having to search the list for duplicates each time we
+ * emit one.
+ */
+ if (rela[i].r_addend != 0 || !duplicate_rel(rela, i))
+ ret++;
+ break;
+ }
+ }
+ return ret;
+}
+
+int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
+ char *secstrings, struct module *mod)
+{
+ unsigned long plt_max_entries = 0;
+ Elf64_Sym *syms = NULL;
+ int i;
+
+ /*
+ * Find the empty .plt section so we can expand it to store the PLT
+ * entries. Record the symtab address as well.
+ */
+ for (i = 0; i < ehdr->e_shnum; i++) {
+ if (strcmp(".plt", secstrings + sechdrs[i].sh_name) == 0)
+ mod->arch.plt = sechdrs + i;
+ else if (sechdrs[i].sh_type == SHT_SYMTAB)
+ syms = (Elf64_Sym *)sechdrs[i].sh_addr;
+ }
+
+ if (!mod->arch.plt) {
+ pr_err("%s: module PLT section missing\n", mod->name);
+ return -ENOEXEC;
+ }
+ if (!syms) {
+ pr_err("%s: module symtab section missing\n", mod->name);
+ return -ENOEXEC;
+ }
+
+ for (i = 0; i < ehdr->e_shnum; i++) {
+ Elf64_Rela *rels = (void *)ehdr + sechdrs[i].sh_offset;
+ int numrels = sechdrs[i].sh_size / sizeof(Elf64_Rela);
+ Elf64_Shdr *dstsec = sechdrs + sechdrs[i].sh_info;
+
+ if (sechdrs[i].sh_type != SHT_RELA)
+ continue;
+
+ /* ignore relocations that operate on non-exec sections */
+ if (!(dstsec->sh_flags & SHF_EXECINSTR))
+ continue;
+
+ /* sort by type, symbol index and addend */
+ sort(rels, numrels, sizeof(Elf64_Rela), cmp_rela, NULL);
+
+ plt_max_entries += count_plts(syms, rels, numrels);
+ }
+
+ mod->arch.plt->sh_type = SHT_NOBITS;
+ mod->arch.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+ mod->arch.plt->sh_addralign = L1_CACHE_BYTES;
+ mod->arch.plt->sh_size = plt_max_entries * sizeof(struct plt_entry);
+ mod->arch.plt_num_entries = 0;
+ mod->arch.plt_max_entries = plt_max_entries;
+ return 0;
+}
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 93e970231ca9..84113d3e1df1 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -38,6 +38,11 @@ void *module_alloc(unsigned long size)
GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
NUMA_NO_NODE, __builtin_return_address(0));
+ if (!p && IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
+ p = __vmalloc_node_range(size, MODULE_ALIGN, VMALLOC_START,
+ VMALLOC_END, GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
+ NUMA_NO_NODE, __builtin_return_address(0));
+
if (p && (kasan_module_alloc(p, size) < 0)) {
vfree(p);
return NULL;
@@ -361,6 +366,13 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
case R_AARCH64_CALL26:
ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 26,
AARCH64_INSN_IMM_26);
+
+ if (IS_ENABLED(CONFIG_ARM64_MODULE_PLTS) &&
+ ovf == -ERANGE) {
+ val = module_emit_plt_entry(me, &rel[i], sym);
+ ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2,
+ 26, AARCH64_INSN_IMM_26);
+ }
break;
default:
diff --git a/arch/arm64/kernel/module.lds b/arch/arm64/kernel/module.lds
new file mode 100644
index 000000000000..8949f6c6f729
--- /dev/null
+++ b/arch/arm64/kernel/module.lds
@@ -0,0 +1,3 @@
+SECTIONS {
+ .plt (NOLOAD) : { BYTE(0) }
+}
--
2.5.0
* [PATCH v4 08/22] arm64: add support for module PLTs
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel
This adds support for emitting PLTs at module load time for relative
branches that are out of range. This is a prerequisite for KASLR, which
may place the kernel and the modules anywhere in the vmalloc area,
making it more likely that branch target offsets exceed the maximum
range of +/- 128 MB.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
In this version, I removed the distinction between relocations against
.init executable sections and ordinary executable sections. The reason
is that it is hardly worth the trouble, given that .init.text usually
does not contain that many far branches, and this version now only
reserves PLT entry space for jump and call relocations against undefined
symbols (since symbols defined in the same module can be assumed to be
within +/- 128 MB)
For example, the mac80211.ko module (which is fairly sizable at ~400 KB)
built with -mcmodel=large gives the following relocation counts:
relocs branches unique !local
.text 3925 3347 518 219
.init.text 11 8 7 1
.exit.text 4 4 4 1
.text.unlikely 81 67 36 17
('unique' means branches to unique type/symbol/addend combos, of which
!local is the subset referring to undefined symbols)
IOW, we are only emitting a single PLT entry for the .init sections, and
we are better off just adding it to the core PLT section instead.
---
arch/arm64/Kconfig | 9 +
arch/arm64/Makefile | 6 +-
arch/arm64/include/asm/module.h | 11 ++
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/module-plts.c | 201 ++++++++++++++++++++
arch/arm64/kernel/module.c | 12 ++
arch/arm64/kernel/module.lds | 3 +
7 files changed, 242 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index cd767fa3037a..141f65ab0ed5 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -394,6 +394,7 @@ config ARM64_ERRATUM_843419
bool "Cortex-A53: 843419: A load or store might access an incorrect address"
depends on MODULES
default y
+ select ARM64_MODULE_CMODEL_LARGE
help
This option builds kernel modules using the large memory model in
order to avoid the use of the ADRP instruction, which can cause
@@ -753,6 +754,14 @@ config ARM64_LSE_ATOMICS
endmenu
+config ARM64_MODULE_CMODEL_LARGE
+ bool
+
+config ARM64_MODULE_PLTS
+ bool
+ select ARM64_MODULE_CMODEL_LARGE
+ select HAVE_MOD_ARCH_SPECIFIC
+
endmenu
menu "Boot options"
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index cd822d8454c0..db462980c6be 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -41,10 +41,14 @@ endif
CHECKFLAGS += -D__aarch64__
-ifeq ($(CONFIG_ARM64_ERRATUM_843419), y)
+ifeq ($(CONFIG_ARM64_MODULE_CMODEL_LARGE), y)
KBUILD_CFLAGS_MODULE += -mcmodel=large
endif
+ifeq ($(CONFIG_ARM64_MODULE_PLTS),y)
+KBUILD_LDFLAGS_MODULE += -T $(srctree)/arch/arm64/kernel/module.lds
+endif
+
# Default value
head-y := arch/arm64/kernel/head.o
diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
index e80e232b730e..8652fb613304 100644
--- a/arch/arm64/include/asm/module.h
+++ b/arch/arm64/include/asm/module.h
@@ -20,4 +20,15 @@
#define MODULE_ARCH_VERMAGIC "aarch64"
+#ifdef CONFIG_ARM64_MODULE_PLTS
+struct mod_arch_specific {
+ struct elf64_shdr *plt;
+ int plt_num_entries;
+ int plt_max_entries;
+};
+#endif
+
+u64 module_emit_plt_entry(struct module *mod, const Elf64_Rela *rela,
+ Elf64_Sym *sym);
+
#endif /* __ASM_MODULE_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 83cd7e68e83b..e2f0a755beaa 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -30,6 +30,7 @@ arm64-obj-$(CONFIG_COMPAT) += sys32.o kuser32.o signal32.o \
../../arm/kernel/opcodes.o
arm64-obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o
arm64-obj-$(CONFIG_MODULES) += arm64ksyms.o module.o
+arm64-obj-$(CONFIG_ARM64_MODULE_PLTS) += module-plts.o
arm64-obj-$(CONFIG_PERF_EVENTS) += perf_regs.o perf_callchain.o
arm64-obj-$(CONFIG_HW_PERF_EVENTS) += perf_event.o
arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
new file mode 100644
index 000000000000..1ce90d8450ae
--- /dev/null
+++ b/arch/arm64/kernel/module-plts.c
@@ -0,0 +1,201 @@
+/*
+ * Copyright (C) 2014-2016 Linaro Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/elf.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/sort.h>
+
+struct plt_entry {
+ /*
+ * A program that conforms to the AArch64 Procedure Call Standard
+ * (AAPCS64) must assume that a veneer that alters IP0 (x16) and/or
+ * IP1 (x17) may be inserted at any branch instruction that is
+ * exposed to a relocation that supports long branches. Since that
+ * is exactly what we are dealing with here, we are free to use x16
+ * as a scratch register in the PLT veneers.
+ */
+ __le32 mov0; /* movn x16, #0x.... */
+ __le32 mov1; /* movk x16, #0x...., lsl #16 */
+ __le32 mov2; /* movk x16, #0x...., lsl #32 */
+ __le32 br; /* br x16 */
+};
+
+u64 module_emit_plt_entry(struct module *mod, const Elf64_Rela *rela,
+ Elf64_Sym *sym)
+{
+ struct plt_entry *plt = (struct plt_entry *)mod->arch.plt->sh_addr;
+ int i = mod->arch.plt_num_entries;
+ u64 val = sym->st_value + rela->r_addend;
+
+ /*
+ * We only emit PLT entries against undefined (SHN_UNDEF) symbols,
+ * which are listed in the ELF symtab section, but without a type
+ * or a size.
+ * So, similar to how the module loader uses the Elf64_Sym::st_value
+ * field to store the resolved addresses of undefined symbols, let's
+ * borrow the Elf64_Sym::st_size field (whose value is never used by
+ * the module loader, even for symbols that are defined) to record
+ * the address of a symbol's associated PLT entry as we emit it for a
+ * zero addend relocation (which is the only kind we have to deal with
+ * in practice). This allows us to find duplicates without having to
+ * go through the table every time.
+ */
+ if (rela->r_addend == 0 && sym->st_size != 0) {
+ BUG_ON(sym->st_size < (u64)plt || sym->st_size >= (u64)&plt[i]);
+ return sym->st_size;
+ }
+
+ mod->arch.plt_num_entries++;
+ BUG_ON(mod->arch.plt_num_entries > mod->arch.plt_max_entries);
+
+ /*
+ * MOVK/MOVN/MOVZ opcode:
+ * +--------+------------+--------+-----------+-------------+---------+
+ * | sf[31] | opc[30:29] | 100101 | hw[22:21] | imm16[20:5] | Rd[4:0] |
+ * +--------+------------+--------+-----------+-------------+---------+
+ *
+ * Rd := 0x10 (x16)
+ * hw := 0b00 (no shift), 0b01 (lsl #16), 0b10 (lsl #32)
+ * opc := 0b11 (MOVK), 0b00 (MOVN), 0b10 (MOVZ)
+ * sf := 1 (64-bit variant)
+ */
+ plt[i] = (struct plt_entry){
+ cpu_to_le32(0x92800010 | (((~val ) & 0xffff)) << 5),
+ cpu_to_le32(0xf2a00010 | ((( val >> 16) & 0xffff)) << 5),
+ cpu_to_le32(0xf2c00010 | ((( val >> 32) & 0xffff)) << 5),
+ cpu_to_le32(0xd61f0200)
+ };
+
+ if (rela->r_addend == 0)
+ sym->st_size = (u64)&plt[i];
+
+ return (u64)&plt[i];
+}
+
+#define cmp_3way(a,b) ((a) < (b) ? -1 : (a) > (b))
+
+static int cmp_rela(const void *a, const void *b)
+{
+ const Elf64_Rela *x = a, *y = b;
+ int i;
+
+ /* sort by type, symbol index and addend */
+ i = cmp_3way(ELF64_R_TYPE(x->r_info), ELF64_R_TYPE(y->r_info));
+ if (i == 0)
+ i = cmp_3way(ELF64_R_SYM(x->r_info), ELF64_R_SYM(y->r_info));
+ if (i == 0)
+ i = cmp_3way(x->r_addend, y->r_addend);
+ return i;
+}
+
+static bool duplicate_rel(const Elf64_Rela *rela, int num)
+{
+ /*
+ * Entries are sorted by type, symbol index and addend. That means
+ * that, if a duplicate entry exists, it must be in the preceding
+ * slot.
+ */
+ return num > 0 && cmp_rela(rela + num, rela + num - 1) == 0;
+}
+
+static unsigned int count_plts(Elf64_Sym *syms, Elf64_Rela *rela, int num)
+{
+ unsigned int ret = 0;
+ Elf64_Sym *s;
+ int i;
+
+ for (i = 0; i < num; i++) {
+ switch (ELF64_R_TYPE(rela[i].r_info)) {
+ case R_AARCH64_JUMP26:
+ case R_AARCH64_CALL26:
+ /*
+ * We only have to consider branch targets that resolve
+ * to undefined symbols. This is not simply a heuristic,
+ * it is a fundamental limitation, since the PLT itself
+ * is part of the module, and needs to be within 128 MB
+ * as well, so modules can never grow beyond that limit.
+ */
+ s = syms + ELF64_R_SYM(rela[i].r_info);
+ if (s->st_shndx != SHN_UNDEF)
+ break;
+
+ /*
+ * Jump relocations with non-zero addends against
+ * undefined symbols are supported by the ELF spec, but
+ * do not occur in practice (e.g., 'jump n bytes past
+ * the entry point of undefined function symbol f').
+ * So we need to support them, but there is no need to
+ * take them into consideration when trying to optimize
+ * this code. So let's only check for duplicates when
+ * the addend is zero: this allows us to record the PLT
+ * entry address in the symbol table itself, rather than
+ * having to search the list for duplicates each time we
+ * emit one.
+ */
+ if (rela[i].r_addend != 0 || !duplicate_rel(rela, i))
+ ret++;
+ break;
+ }
+ }
+ return ret;
+}
+
+int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
+ char *secstrings, struct module *mod)
+{
+ unsigned long plt_max_entries = 0;
+ Elf64_Sym *syms = NULL;
+ int i;
+
+ /*
+ * Find the empty .plt section so we can expand it to store the PLT
+ * entries. Record the symtab address as well.
+ */
+ for (i = 0; i < ehdr->e_shnum; i++) {
+ if (strcmp(".plt", secstrings + sechdrs[i].sh_name) == 0)
+ mod->arch.plt = sechdrs + i;
+ else if (sechdrs[i].sh_type == SHT_SYMTAB)
+ syms = (Elf64_Sym *)sechdrs[i].sh_addr;
+ }
+
+ if (!mod->arch.plt) {
+ pr_err("%s: module PLT section missing\n", mod->name);
+ return -ENOEXEC;
+ }
+ if (!syms) {
+ pr_err("%s: module symtab section missing\n", mod->name);
+ return -ENOEXEC;
+ }
+
+ for (i = 0; i < ehdr->e_shnum; i++) {
+ Elf64_Rela *rels = (void *)ehdr + sechdrs[i].sh_offset;
+ int numrels = sechdrs[i].sh_size / sizeof(Elf64_Rela);
+ Elf64_Shdr *dstsec = sechdrs + sechdrs[i].sh_info;
+
+ if (sechdrs[i].sh_type != SHT_RELA)
+ continue;
+
+ /* ignore relocations that operate on non-exec sections */
+ if (!(dstsec->sh_flags & SHF_EXECINSTR))
+ continue;
+
+ /* sort by type, symbol index and addend */
+ sort(rels, numrels, sizeof(Elf64_Rela), cmp_rela, NULL);
+
+ plt_max_entries += count_plts(syms, rels, numrels);
+ }
+
+ mod->arch.plt->sh_type = SHT_NOBITS;
+ mod->arch.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+ mod->arch.plt->sh_addralign = L1_CACHE_BYTES;
+ mod->arch.plt->sh_size = plt_max_entries * sizeof(struct plt_entry);
+ mod->arch.plt_num_entries = 0;
+ mod->arch.plt_max_entries = plt_max_entries;
+ return 0;
+}
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 93e970231ca9..84113d3e1df1 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -38,6 +38,11 @@ void *module_alloc(unsigned long size)
GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
NUMA_NO_NODE, __builtin_return_address(0));
+ if (!p && IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
+ p = __vmalloc_node_range(size, MODULE_ALIGN, VMALLOC_START,
+ VMALLOC_END, GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
+ NUMA_NO_NODE, __builtin_return_address(0));
+
if (p && (kasan_module_alloc(p, size) < 0)) {
vfree(p);
return NULL;
@@ -361,6 +366,13 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
case R_AARCH64_CALL26:
ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 26,
AARCH64_INSN_IMM_26);
+
+ if (IS_ENABLED(CONFIG_ARM64_MODULE_PLTS) &&
+ ovf == -ERANGE) {
+ val = module_emit_plt_entry(me, &rel[i], sym);
+ ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2,
+ 26, AARCH64_INSN_IMM_26);
+ }
break;
default:
diff --git a/arch/arm64/kernel/module.lds b/arch/arm64/kernel/module.lds
new file mode 100644
index 000000000000..8949f6c6f729
--- /dev/null
+++ b/arch/arm64/kernel/module.lds
@@ -0,0 +1,3 @@
+SECTIONS {
+ .plt (NOLOAD) : { BYTE(0) }
+}
--
2.5.0
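To see what module_emit_plt_entry() actually emits, the MOVN/MOVK/MOVK immediates can be reproduced with a minimal standalone sketch (user space C, not part of the series; the target value is made up). Note that MOVN sets all remaining bits to one, so the sequence relies on the branch target living in a VA range whose top 16 bits are all ones, which holds for arm64 module addresses:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* made-up target in a VA range whose top 16 bits are all ones */
	uint64_t val = 0xffffffc000123456ULL;
	uint32_t insn[4];

	insn[0] = 0x92800010 | (uint32_t)((~val & 0xffff) << 5);        /* movn x16, #lo16 inverted */
	insn[1] = 0xf2a00010 | (uint32_t)(((val >> 16) & 0xffff) << 5); /* movk x16, #..., lsl #16 */
	insn[2] = 0xf2c00010 | (uint32_t)(((val >> 32) & 0xffff) << 5); /* movk x16, #..., lsl #32 */
	insn[3] = 0xd61f0200;                                           /* br x16 */

	printf("%08x %08x %08x %08x\n", insn[0], insn[1], insn[2], insn[3]);
	return 0;
}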
* [PATCH v4 09/22] extable: add support for relative extables to search and sort routines
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
This adds support to the generic search_extable() and sort_extable()
implementations for dealing with exception table entries whose fields
contain relative offsets rather than absolute addresses.
Acked-by: Helge Deller <deller@gmx.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
lib/extable.c | 50 ++++++++++++++++----
1 file changed, 41 insertions(+), 9 deletions(-)
diff --git a/lib/extable.c b/lib/extable.c
index 4cac81ec225e..0be02ad561e9 100644
--- a/lib/extable.c
+++ b/lib/extable.c
@@ -14,7 +14,37 @@
#include <linux/sort.h>
#include <asm/uaccess.h>
+#ifndef ARCH_HAS_RELATIVE_EXTABLE
+#define ex_to_insn(x) ((x)->insn)
+#else
+static inline unsigned long ex_to_insn(const struct exception_table_entry *x)
+{
+ return (unsigned long)&x->insn + x->insn;
+}
+#endif
+
#ifndef ARCH_HAS_SORT_EXTABLE
+#ifndef ARCH_HAS_RELATIVE_EXTABLE
+#define swap_ex NULL
+#else
+static void swap_ex(void *a, void *b, int size)
+{
+ struct exception_table_entry *x = a, *y = b, tmp;
+ int delta = b - a;
+
+ tmp = *x;
+ x->insn = y->insn + delta;
+ y->insn = tmp.insn - delta;
+
+#ifdef swap_ex_entry_fixup
+ swap_ex_entry_fixup(x, y, tmp, delta);
+#else
+ x->fixup = y->fixup + delta;
+ y->fixup = tmp.fixup - delta;
+#endif
+}
+#endif /* ARCH_HAS_RELATIVE_EXTABLE */
+
/*
* The exception table needs to be sorted so that the binary
* search that we use to find entries in it works properly.
@@ -26,9 +56,9 @@ static int cmp_ex(const void *a, const void *b)
const struct exception_table_entry *x = a, *y = b;
/* avoid overflow */
- if (x->insn > y->insn)
+ if (ex_to_insn(x) > ex_to_insn(y))
return 1;
- if (x->insn < y->insn)
+ if (ex_to_insn(x) < ex_to_insn(y))
return -1;
return 0;
}
@@ -37,7 +67,7 @@ void sort_extable(struct exception_table_entry *start,
struct exception_table_entry *finish)
{
sort(start, finish - start, sizeof(struct exception_table_entry),
- cmp_ex, NULL);
+ cmp_ex, swap_ex);
}
#ifdef CONFIG_MODULES
@@ -48,13 +78,15 @@ void sort_extable(struct exception_table_entry *start,
void trim_init_extable(struct module *m)
{
/*trim the beginning*/
- while (m->num_exentries && within_module_init(m->extable[0].insn, m)) {
+ while (m->num_exentries &&
+ within_module_init(ex_to_insn(&m->extable[0]), m)) {
m->extable++;
m->num_exentries--;
}
/*trim the end*/
while (m->num_exentries &&
- within_module_init(m->extable[m->num_exentries-1].insn, m))
+ within_module_init(ex_to_insn(&m->extable[m->num_exentries - 1]),
+ m))
m->num_exentries--;
}
#endif /* CONFIG_MODULES */
@@ -81,13 +113,13 @@ search_extable(const struct exception_table_entry *first,
* careful, the distance between value and insn
* can be larger than MAX_LONG:
*/
- if (mid->insn < value)
+ if (ex_to_insn(mid) < value)
first = mid + 1;
- else if (mid->insn > value)
+ else if (ex_to_insn(mid) > value)
last = mid - 1;
else
return mid;
- }
- return NULL;
+ }
+ return NULL;
}
#endif
--
2.5.0
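The reason sort_extable() needs the custom swap routine is that each field encodes "target minus address of the field itself", so an entry that moves during sorting must have its offsets rebased by the distance it moved. A minimal user space sketch of that invariant (stand-in types, not the kernel's; mirrors the swap_ex() logic above):

#include <stdio.h>

struct rel_entry { int insn, fixup; };

static struct {
	char code[64];                  /* stands in for .text */
	struct rel_entry tab[2];        /* stands in for __ex_table */
} s;

static char *to_addr(const int *field)
{
	return (char *)field + *field;
}

static void swap_rel(struct rel_entry *x, struct rel_entry *y)
{
	int delta = (int)((char *)y - (char *)x);
	struct rel_entry tmp = *x;

	/* y's payload rebased to x's slot, and vice versa */
	x->insn  = y->insn  + delta;
	x->fixup = y->fixup + delta;
	y->insn  = tmp.insn  - delta;
	y->fixup = tmp.fixup - delta;
}

int main(void)
{
	s.tab[0].insn  = (int)(&s.code[32] - (char *)&s.tab[0].insn);
	s.tab[0].fixup = (int)(&s.code[48] - (char *)&s.tab[0].fixup);
	s.tab[1].insn  = (int)(&s.code[0]  - (char *)&s.tab[1].insn);
	s.tab[1].fixup = (int)(&s.code[16] - (char *)&s.tab[1].fixup);

	swap_rel(&s.tab[0], &s.tab[1]);

	/* decoding after the swap still yields the original targets */
	printf("%d %d\n",
	       (int)(to_addr(&s.tab[0].insn) - s.code),   /* 0 */
	       (int)(to_addr(&s.tab[1].insn) - s.code));  /* 32 */
	return 0;
}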
* [PATCH v4 10/22] arm64: switch to relative exception tables
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
Instead of using absolute addresses for both the exception location
and the fixup, use offsets relative to the exception table entry values.
Not only does this cut the size of the exception table in half, it is
also a prerequisite for KASLR, since absolute exception table entries
are subject to dynamic relocation, which is incompatible with the sorting
of the exception table that occurs at build time.
This patch also introduces the _ASM_EXTABLE preprocessor macro (which
exists on x86 as well) and its _asm_extable assembly counterpart, as
shorthands to emit exception table entries.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/include/asm/assembler.h | 15 +++++++---
arch/arm64/include/asm/futex.h | 12 +++-----
arch/arm64/include/asm/uaccess.h | 30 +++++++++++---------
arch/arm64/include/asm/word-at-a-time.h | 7 ++---
arch/arm64/kernel/armv8_deprecated.c | 7 ++---
arch/arm64/mm/extable.c | 2 +-
scripts/sortextable.c | 2 +-
7 files changed, 38 insertions(+), 37 deletions(-)
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index bb7b72734c24..d8bfcc1ce923 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -94,12 +94,19 @@
dmb \opt
.endm
+/*
+ * Emit an entry into the exception table
+ */
+ .macro _asm_extable, from, to
+ .pushsection __ex_table, "a"
+ .align 3
+ .long (\from - .), (\to - .)
+ .popsection
+ .endm
+
#define USER(l, x...) \
9999: x; \
- .section __ex_table,"a"; \
- .align 3; \
- .quad 9999b,l; \
- .previous
+ _asm_extable 9999b, l
/*
* Register aliases.
diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h
index 007a69fc4f40..1ab15a3b5a0e 100644
--- a/arch/arm64/include/asm/futex.h
+++ b/arch/arm64/include/asm/futex.h
@@ -42,10 +42,8 @@
"4: mov %w0, %w5\n" \
" b 3b\n" \
" .popsection\n" \
-" .pushsection __ex_table,\"a\"\n" \
-" .align 3\n" \
-" .quad 1b, 4b, 2b, 4b\n" \
-" .popsection\n" \
+ _ASM_EXTABLE(1b, 4b) \
+ _ASM_EXTABLE(2b, 4b) \
ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN, \
CONFIG_ARM64_PAN) \
: "=&r" (ret), "=&r" (oldval), "+Q" (*uaddr), "=&r" (tmp) \
@@ -133,10 +131,8 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
"4: mov %w0, %w6\n"
" b 3b\n"
" .popsection\n"
-" .pushsection __ex_table,\"a\"\n"
-" .align 3\n"
-" .quad 1b, 4b, 2b, 4b\n"
-" .popsection\n"
+ _ASM_EXTABLE(1b, 4b)
+ _ASM_EXTABLE(2b, 4b)
: "+r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp)
: "r" (oldval), "r" (newval), "Ir" (-EFAULT)
: "memory");
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index b2ede967fe7d..dc11577fab7e 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -36,11 +36,11 @@
#define VERIFY_WRITE 1
/*
- * The exception table consists of pairs of addresses: the first is the
- * address of an instruction that is allowed to fault, and the second is
- * the address at which the program should continue. No registers are
- * modified, so it is entirely up to the continuation code to figure out
- * what to do.
+ * The exception table consists of pairs of relative offsets: the first
+ * is the relative offset to an instruction that is allowed to fault,
+ * and the second is the relative offset at which the program should
+ * continue. No registers are modified, so it is entirely up to the
+ * continuation code to figure out what to do.
*
* All the routines below use bits of fixup code that are out of line
* with the main instruction path. This means when everything is well,
@@ -50,9 +50,11 @@
struct exception_table_entry
{
- unsigned long insn, fixup;
+ int insn, fixup;
};
+#define ARCH_HAS_RELATIVE_EXTABLE
+
extern int fixup_exception(struct pt_regs *regs);
#define KERNEL_DS (-1UL)
@@ -105,6 +107,12 @@ static inline void set_fs(mm_segment_t fs)
#define access_ok(type, addr, size) __range_ok(addr, size)
#define user_addr_max get_fs
+#define _ASM_EXTABLE(from, to) \
+ " .pushsection __ex_table, \"a\"\n" \
+ " .align 3\n" \
+ " .long (" #from " - .), (" #to " - .)\n" \
+ " .popsection\n"
+
/*
* The "__xxx" versions of the user access functions do not verify the address
* space - it must have been done previously with a separate "access_ok()"
@@ -123,10 +131,7 @@ static inline void set_fs(mm_segment_t fs)
" mov %1, #0\n" \
" b 2b\n" \
" .previous\n" \
- " .section __ex_table,\"a\"\n" \
- " .align 3\n" \
- " .quad 1b, 3b\n" \
- " .previous" \
+ _ASM_EXTABLE(1b, 3b) \
: "+r" (err), "=&r" (x) \
: "r" (addr), "i" (-EFAULT))
@@ -190,10 +195,7 @@ do { \
"3: mov %w0, %3\n" \
" b 2b\n" \
" .previous\n" \
- " .section __ex_table,\"a\"\n" \
- " .align 3\n" \
- " .quad 1b, 3b\n" \
- " .previous" \
+ _ASM_EXTABLE(1b, 3b) \
: "+r" (err) \
: "r" (x), "r" (addr), "i" (-EFAULT))
diff --git a/arch/arm64/include/asm/word-at-a-time.h b/arch/arm64/include/asm/word-at-a-time.h
index aab5bf09e9d9..2b79b8a89457 100644
--- a/arch/arm64/include/asm/word-at-a-time.h
+++ b/arch/arm64/include/asm/word-at-a-time.h
@@ -16,6 +16,8 @@
#ifndef __ASM_WORD_AT_A_TIME_H
#define __ASM_WORD_AT_A_TIME_H
+#include <asm/uaccess.h>
+
#ifndef __AARCH64EB__
#include <linux/kernel.h>
@@ -81,10 +83,7 @@ static inline unsigned long load_unaligned_zeropad(const void *addr)
#endif
" b 2b\n"
" .popsection\n"
- " .pushsection __ex_table,\"a\"\n"
- " .align 3\n"
- " .quad 1b, 3b\n"
- " .popsection"
+ _ASM_EXTABLE(1b, 3b)
: "=&r" (ret), "=&r" (offset)
: "r" (addr), "Q" (*(unsigned long *)addr));
diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c
index 3e01207917b1..c37202c0c838 100644
--- a/arch/arm64/kernel/armv8_deprecated.c
+++ b/arch/arm64/kernel/armv8_deprecated.c
@@ -297,11 +297,8 @@ static void __init register_insn_emulation_sysctl(struct ctl_table *table)
"4: mov %w0, %w5\n" \
" b 3b\n" \
" .popsection" \
- " .pushsection __ex_table,\"a\"\n" \
- " .align 3\n" \
- " .quad 0b, 4b\n" \
- " .quad 1b, 4b\n" \
- " .popsection\n" \
+ _ASM_EXTABLE(0b, 4b) \
+ _ASM_EXTABLE(1b, 4b) \
ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN, \
CONFIG_ARM64_PAN) \
: "=&r" (res), "+r" (data), "=&r" (temp) \
diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
index 79444279ba8c..81acd4706878 100644
--- a/arch/arm64/mm/extable.c
+++ b/arch/arm64/mm/extable.c
@@ -11,7 +11,7 @@ int fixup_exception(struct pt_regs *regs)
fixup = search_exception_tables(instruction_pointer(regs));
if (fixup)
- regs->pc = fixup->fixup;
+ regs->pc = (unsigned long)&fixup->fixup + fixup->fixup;
return fixup != NULL;
}
diff --git a/scripts/sortextable.c b/scripts/sortextable.c
index c2423d913b46..af247c70fb66 100644
--- a/scripts/sortextable.c
+++ b/scripts/sortextable.c
@@ -282,12 +282,12 @@ do_file(char const *const fname)
case EM_386:
case EM_X86_64:
case EM_S390:
+ case EM_AARCH64:
custom_sort = sort_relative_table;
break;
case EM_ARCOMPACT:
case EM_ARCV2:
case EM_ARM:
- case EM_AARCH64:
case EM_MICROBLAZE:
case EM_MIPS:
case EM_XTENSA:
--
2.5.0
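The KASLR angle of the relative format deserves a one-line illustration: when the whole image (code plus table) is displaced by the same amount, "target minus entry address" is unchanged, so the entries need no dynamic relocations and the build-time sort stays valid. A minimal sketch under that assumption (hypothetical layout, user space C):

#include <stdio.h>
#include <string.h>

struct image {
	char text[32];          /* stands in for .text */
	int ex_insn;            /* offset of faulting insn, relative to this field */
};

int main(void)
{
	static struct image a, b;

	a.ex_insn = (int)(&a.text[12] - (char *)&a.ex_insn);

	/* "relocate" the image wholesale to a different base address */
	memcpy(&b, &a, sizeof(b));

	/* decoding against the entry's new location finds the moved insn */
	char *insn = (char *)&b.ex_insn + b.ex_insn;
	printf("%s\n", insn == &b.text[12] ? "still valid" : "broken");
	return 0;
}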
* [PATCH v4 11/22] arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
Unfortunately, the current way of using the linker to emit build time
constants into the Image header will no longer work once we switch to
the use of PIE executables. The reason is that such constants are emitted
into the binary using R_AARCH64_ABS64 relocations, which are resolved at
runtime, not at build time, and the places targeted by those relocations
will contain zeroes before that.
So refactor the endian swapping linker script constant generation code so
that it emits the upper and lower 32-bit words separately.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/include/asm/assembler.h | 11 +++++++
arch/arm64/kernel/head.S | 6 ++--
arch/arm64/kernel/image.h | 32 ++++++++++++--------
3 files changed, 33 insertions(+), 16 deletions(-)
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index d8bfcc1ce923..70f7b9e04598 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -222,4 +222,15 @@ lr .req x30 // link register
.size __pi_##x, . - x; \
ENDPROC(x)
+ /*
+ * Emit a 64-bit absolute little endian symbol reference in a way that
+ * ensures that it will be resolved at build time, even when building a
+ * PIE binary. This requires cooperation from the linker script, which
+ * must emit the lo32/hi32 halves individually.
+ */
+ .macro le64sym, sym
+ .long \sym\()_lo32
+ .long \sym\()_hi32
+ .endm
+
#endif /* __ASM_ASSEMBLER_H */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 85181cb60f46..2a39d4ab02bf 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -83,9 +83,9 @@ efi_head:
b stext // branch to kernel start, magic
.long 0 // reserved
#endif
- .quad _kernel_offset_le // Image load offset from start of RAM, little-endian
- .quad _kernel_size_le // Effective size of kernel image, little-endian
- .quad _kernel_flags_le // Informative flags, little-endian
+ le64sym _kernel_offset_le // Image load offset from start of RAM, little-endian
+ le64sym _kernel_size_le // Effective size of kernel image, little-endian
+ le64sym _kernel_flags_le // Informative flags, little-endian
.quad 0 // reserved
.quad 0 // reserved
.quad 0 // reserved
diff --git a/arch/arm64/kernel/image.h b/arch/arm64/kernel/image.h
index bc2abb8b1599..b64c9b0a4492 100644
--- a/arch/arm64/kernel/image.h
+++ b/arch/arm64/kernel/image.h
@@ -26,21 +26,27 @@
* There aren't any ELF relocations we can use to endian-swap values known only
* at link time (e.g. the subtraction of two symbol addresses), so we must get
* the linker to endian-swap certain values before emitting them.
+ *
+ * Note that, in order for this to work when building the ELF64 PIE executable
+ * (for KASLR), these values should not be referenced via R_AARCH64_ABS64
+ * relocations, since these are fixed up at runtime rather than at build time
+ * when PIE is in effect. So we need to split them up in 32-bit high and low
+ * words.
*/
#ifdef CONFIG_CPU_BIG_ENDIAN
-#define DATA_LE64(data) \
- ((((data) & 0x00000000000000ff) << 56) | \
- (((data) & 0x000000000000ff00) << 40) | \
- (((data) & 0x0000000000ff0000) << 24) | \
- (((data) & 0x00000000ff000000) << 8) | \
- (((data) & 0x000000ff00000000) >> 8) | \
- (((data) & 0x0000ff0000000000) >> 24) | \
- (((data) & 0x00ff000000000000) >> 40) | \
- (((data) & 0xff00000000000000) >> 56))
+#define DATA_LE32(data) \
+ ((((data) & 0x000000ff) << 24) | \
+ (((data) & 0x0000ff00) << 8) | \
+ (((data) & 0x00ff0000) >> 8) | \
+ (((data) & 0xff000000) >> 24))
#else
-#define DATA_LE64(data) ((data) & 0xffffffffffffffff)
+#define DATA_LE32(data) ((data) & 0xffffffff)
#endif
+#define DEFINE_IMAGE_LE64(sym, data) \
+ sym##_lo32 = DATA_LE32((data) & 0xffffffff); \
+ sym##_hi32 = DATA_LE32((data) >> 32)
+
#ifdef CONFIG_CPU_BIG_ENDIAN
#define __HEAD_FLAG_BE 1
#else
@@ -58,9 +64,9 @@
* endian swapped in head.S, all are done here for consistency.
*/
#define HEAD_SYMBOLS \
- _kernel_size_le = DATA_LE64(_end - _text); \
- _kernel_offset_le = DATA_LE64(TEXT_OFFSET); \
- _kernel_flags_le = DATA_LE64(__HEAD_FLAGS);
+ DEFINE_IMAGE_LE64(_kernel_size_le, _end - _text); \
+ DEFINE_IMAGE_LE64(_kernel_offset_le, TEXT_OFFSET); \
+ DEFINE_IMAGE_LE64(_kernel_flags_le, __HEAD_FLAGS);
#ifdef CONFIG_EFI
--
2.5.0
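A quick way to sanity-check the DATA_LE32/DEFINE_IMAGE_LE64 scheme is to reproduce it in user space: the 64-bit constant is split into lo32/hi32 words that are byte-swapped individually, so only 32-bit assemble-time values ever reach the object file and no R_AARCH64_ABS64 relocation is involved. The sketch below (made-up size value) exercises the big-endian swap path unconditionally:

#include <stdint.h>
#include <stdio.h>

#define DATA_LE32(data) \
	((((data) & 0x000000ff) << 24) | (((data) & 0x0000ff00) << 8) | \
	 (((data) & 0x00ff0000) >> 8)  | (((data) & 0xff000000) >> 24))

int main(void)
{
	uint64_t size = 0x0000000001234567ULL;  /* e.g. _end - _text */
	uint32_t lo32 = (uint32_t)DATA_LE32(size & 0xffffffff);
	uint32_t hi32 = (uint32_t)DATA_LE32(size >> 32);

	/* le64sym would then emit .long lo32; .long hi32, in that order */
	printf("lo32=%08x hi32=%08x\n", lo32, hi32);
	return 0;
}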
* [PATCH v4 12/22] arm64: avoid dynamic relocations in early boot code
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
Before implementing KASLR for arm64 by building a self-relocating PIE
executable, we have to ensure that values we use before the relocation
routine is executed are not subject to dynamic relocation themselves.
This applies not only to virtual addresses, but also to values that are
supplied by the linker at build time and relocated using R_AARCH64_ABS64
relocations.
So instead, use assemble time constants, or force the use of static
relocations by folding the constants into the instructions.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/kernel/efi-entry.S | 2 +-
arch/arm64/kernel/head.S | 39 +++++++++++++-------
2 files changed, 27 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
index a773db92908b..f82036e02485 100644
--- a/arch/arm64/kernel/efi-entry.S
+++ b/arch/arm64/kernel/efi-entry.S
@@ -61,7 +61,7 @@ ENTRY(entry)
*/
mov x20, x0 // DTB address
ldr x0, [sp, #16] // relocated _text address
- ldr x21, =stext_offset
+ movz x21, #:abs_g0:stext_offset
add x21, x0, x21
/*
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 2a39d4ab02bf..ab4c8e93e31c 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -67,12 +67,11 @@
* in the entry routines.
*/
__HEAD
-
+_head:
/*
* DO NOT MODIFY. Image header expected by Linux boot-loaders.
*/
#ifdef CONFIG_EFI
-efi_head:
/*
* This add instruction has no meaningful effect except that
* its opcode forms the magic "MZ" signature required by UEFI.
@@ -94,14 +93,14 @@ efi_head:
.byte 0x4d
.byte 0x64
#ifdef CONFIG_EFI
- .long pe_header - efi_head // Offset to the PE header.
+ .long pe_header - _head // Offset to the PE header.
#else
.word 0 // reserved
#endif
#ifdef CONFIG_EFI
.globl __efistub_stext_offset
- .set __efistub_stext_offset, stext - efi_head
+ .set __efistub_stext_offset, stext - _head
.align 3
pe_header:
.ascii "PE"
@@ -124,7 +123,7 @@ optional_header:
.long _end - stext // SizeOfCode
.long 0 // SizeOfInitializedData
.long 0 // SizeOfUninitializedData
- .long __efistub_entry - efi_head // AddressOfEntryPoint
+ .long __efistub_entry - _head // AddressOfEntryPoint
.long __efistub_stext_offset // BaseOfCode
extra_header_fields:
@@ -139,7 +138,7 @@ extra_header_fields:
.short 0 // MinorSubsystemVersion
.long 0 // Win32VersionValue
- .long _end - efi_head // SizeOfImage
+ .long _end - _head // SizeOfImage
// Everything before the kernel image is considered part of the header
.long __efistub_stext_offset // SizeOfHeaders
@@ -219,11 +218,13 @@ ENTRY(stext)
* On return, the CPU will be ready for the MMU to be turned on and
* the TCR will have been set.
*/
- ldr x27, =__mmap_switched // address to jump to after
+ ldr x27, 0f // address to jump to after
// MMU has been enabled
adr_l lr, __enable_mmu // return (PIC) address
b __cpu_setup // initialise processor
ENDPROC(stext)
+ .align 3
+0: .quad __mmap_switched - (_head - TEXT_OFFSET) + KIMAGE_VADDR
/*
* Preserve the arguments passed by the bootloader in x0 .. x3
@@ -391,7 +392,8 @@ __create_page_tables:
mov x0, x26 // swapper_pg_dir
ldr x5, =KIMAGE_VADDR
create_pgd_entry x0, x5, x3, x6
- ldr x6, =KERNEL_END // __va(KERNEL_END)
+ ldr w6, kernel_img_size
+ add x6, x6, x5
mov x3, x24 // phys offset
create_block_map x0, x7, x3, x5, x6
@@ -408,6 +410,9 @@ __create_page_tables:
mov lr, x27
ret
ENDPROC(__create_page_tables)
+
+kernel_img_size:
+ .long _end - (_head - TEXT_OFFSET)
.ltorg
/*
@@ -415,6 +420,10 @@ ENDPROC(__create_page_tables)
*/
.set initial_sp, init_thread_union + THREAD_START_SP
__mmap_switched:
+ adr_l x8, vectors // load VBAR_EL1 with virtual
+ msr vbar_el1, x8 // vector table address
+ isb
+
// Clear BSS
adr_l x0, __bss_start
mov x1, xzr
@@ -601,13 +610,19 @@ ENTRY(secondary_startup)
adrp x26, swapper_pg_dir
bl __cpu_setup // initialise processor
- ldr x21, =secondary_data
- ldr x27, =__secondary_switched // address to jump to after enabling the MMU
+ ldr x8, =KIMAGE_VADDR
+ ldr w9, 0f
+ sub x27, x8, w9, sxtw // address to jump to after enabling the MMU
b __enable_mmu
ENDPROC(secondary_startup)
+0: .long (_text - TEXT_OFFSET) - __secondary_switched
ENTRY(__secondary_switched)
- ldr x0, [x21] // get secondary_data.stack
+ adr_l x5, vectors
+ msr vbar_el1, x5
+ isb
+
+ ldr_l x0, secondary_data // get secondary_data.stack
mov sp, x0
and x0, x0, #~(THREAD_SIZE - 1)
msr sp_el0, x0 // save thread_info
@@ -632,8 +647,6 @@ __enable_mmu:
ubfx x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
cmp x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
b.ne __no_granule_support
- ldr x5, =vectors
- msr vbar_el1, x5
msr ttbr0_el1, x25 // load TTBR0
msr ttbr1_el1, x26 // load TTBR1
isb
--
2.5.0
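The literal emitted after ENDPROC(stext) above is worth unpacking: it expresses the jump target as an image-relative offset (a symbol difference, which the toolchain resolves at build time) rebased onto the assemble-time constant KIMAGE_VADDR, which is what keeps it free of R_AARCH64_ABS64 relocations. A minimal sketch of the arithmetic with made-up addresses (the link base deliberately cancels out):

#include <stdint.h>
#include <stdio.h>

#define TEXT_OFFSET	0x80000ULL
#define KIMAGE_VADDR	0xffffff8008000000ULL   /* assumed value */

int main(void)
{
	/* hypothetical link-time addresses; the base is irrelevant */
	uint64_t head = 0x40080000ULL;          /* _head */
	uint64_t mmap_switched = head + 0x1234; /* __mmap_switched */

	/* image-relative offset of the symbol, including TEXT_OFFSET */
	uint64_t offset = mmap_switched - (head - TEXT_OFFSET);

	/* the literal: that offset rebased onto the virtual kernel base */
	uint64_t literal = offset + KIMAGE_VADDR;

	printf("offset=%llx va=%llx\n",
	       (unsigned long long)offset, (unsigned long long)literal);
	return 0;
}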
^ permalink raw reply related [flat|nested] 93+ messages in thread
* [PATCH v4 13/22] arm64: allow kernel Image to be loaded anywhere in physical memory
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
This relaxes the kernel Image placement requirements, so that it
may be placed at any 2 MB aligned offset in physical memory.
This is accomplished by ignoring PHYS_OFFSET when installing
memblocks, and accounting for the apparent virtual offset of
the kernel Image. As a result, virtual address references
below PAGE_OFFSET are correctly mapped onto physical references
into the kernel Image regardless of where it sits in memory.
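[Editor's note: a minimal user-space sketch of the resulting split address
arithmetic. Linear-map addresses keep translating via PHYS_OFFSET, while
kernel-image addresses translate via the new kimage_voffset, so the image
no longer has to sit at the base of DRAM. All constants below are made up
for illustration:

    #include <stdio.h>
    #include <stdint.h>

    #define PAGE_OFFSET  0xffff800000000000ULL /* example 48-bit VA split */
    #define KIMAGE_VADDR 0xffffff8000000000ULL /* hypothetical image base */

    int main(void)
    {
    	uint64_t phys_offset = 0x80000000ULL; /* base of DRAM (example)  */
    	uint64_t kernel_base = 0xa0000000ULL; /* randomized, 2 MB aligned */
    	uint64_t kimage_voffset = KIMAGE_VADDR - kernel_base;

    	/* linear-map addresses still translate via PHYS_OFFSET ... */
    	uint64_t lin = PAGE_OFFSET + 0x1000;
    	printf("lin  %#llx -> phys %#llx\n", (unsigned long long)lin,
    	       (unsigned long long)(lin - PAGE_OFFSET + phys_offset));

    	/* ... while kernel-image addresses translate via kimage_voffset,
    	 * independently of where the image landed in DRAM */
    	uint64_t kimg = KIMAGE_VADDR + 0x81000;
    	printf("kimg %#llx -> phys %#llx\n", (unsigned long long)kimg,
    	       (unsigned long long)(kimg - kimage_voffset));
    	return 0;
    }
]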
Note that limiting memory using mem= becomes ambiguous after this change:
the kernel may sit at the top of physical memory, in which case clipping
from the bottom rather than the top would discard any 32-bit DMA
addressable memory first. To deal with this, the handling of mem= is
reimplemented to clip top down, while taking special care not to clip
memory that covers the kernel image.
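[Editor's note: a toy user-space model of the clipping order described
above, not the kernel implementation. It assumes a single hypothetical
8 GB bank at physical address 0, mem=2G, and the kernel loaded at the
6 GB mark:

    #include <stdio.h>
    #include <stdint.h>

    #define NR_MAX	8

    struct region { uint64_t base, size; };

    static struct region mem[NR_MAX] = { { 0, 8ULL << 30 } }; /* 8 GB bank */
    static int nr = 1;

    static uint64_t total(void)
    {
    	uint64_t s = 0;

    	for (int i = 0; i < nr; i++)
    		s += mem[i].size;
    	return s;
    }

    /* Remove [start, start+size) from the region list, splitting if needed. */
    static void remove_range(uint64_t start, uint64_t size)
    {
    	uint64_t end = start + size;

    	for (int i = 0; i < nr; i++) {
    		uint64_t rs = mem[i].base, re = rs + mem[i].size;

    		if (end <= rs || start >= re)
    			continue;
    		if (start <= rs && end >= re) {		/* covers region  */
    			mem[i--] = mem[--nr];
    		} else if (start > rs && end < re) {	/* punches a hole */
    			mem[nr].base = end;
    			mem[nr++].size = re - end;
    			mem[i].size = start - rs;
    		} else if (start <= rs) {		/* clips the head */
    			mem[i].base = end;
    			mem[i].size = re - end;
    		} else {				/* clips the tail */
    			mem[i].size = start - rs;
    		}
    	}
    }

    /* Drop memory inside [min, max), highest first, until 'limit' is met. */
    static void clip(uint64_t min, uint64_t max, uint64_t limit)
    {
    	while (total() > limit) {
    		uint64_t excess = total() - limit, best_s = 0, best_e = 0;

    		for (int i = 0; i < nr; i++) {
    			uint64_t s = mem[i].base > min ? mem[i].base : min;
    			uint64_t e = mem[i].base + mem[i].size < max ?
    				     mem[i].base + mem[i].size : max;

    			if (e > s && e > best_e) {
    				best_s = s;
    				best_e = e;
    			}
    		}
    		if (best_e == 0)
    			return;		/* nothing left in this window */
    		uint64_t sz = best_e - best_s < excess ?
    			      best_e - best_s : excess;
    		remove_range(best_e - sz, sz);	/* clip from the top */
    	}
    }

    int main(void)
    {
    	uint64_t sz_4g = 4ULL << 30, limit = 2ULL << 30;	/* mem=2G */
    	uint64_t kbase = 6ULL << 30, kend = kbase + (32ULL << 20);

    	/* same preference order as arm64_memblock_init() below */
    	clip(kend > sz_4g ? kend : sz_4g, UINT64_MAX, limit);
    	clip(sz_4g, kbase, limit);
    	clip(kend, sz_4g, limit);
    	clip(0, kbase < sz_4g ? kbase : sz_4g, limit);

    	for (int i = 0; i < nr; i++)
    		printf("mem: [%#llx-%#llx)\n",
    		       (unsigned long long)mem[i].base,
    		       (unsigned long long)(mem[i].base + mem[i].size));
    	return 0;
    }

Running it leaves [0, 0x7e000000) plus the 32 MB covering the kernel
image: the 32-bit DMA addressable memory at the bottom survives and the
image itself is never clipped.]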
Since mem= should not be considered a production feature, a panic notifier
handler is installed that dumps the memory limit at panic time if one was
set.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
Documentation/arm64/booting.txt | 20 ++--
arch/arm64/include/asm/boot.h | 6 ++
arch/arm64/include/asm/kernel-pgtable.h | 11 +++
arch/arm64/include/asm/kvm_asm.h | 2 +-
arch/arm64/include/asm/memory.h | 15 +--
arch/arm64/kernel/head.S | 6 +-
arch/arm64/kernel/image.h | 13 ++-
arch/arm64/mm/init.c | 96 +++++++++++++++++++-
arch/arm64/mm/mmu.c | 3 +
9 files changed, 150 insertions(+), 22 deletions(-)
diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 701d39d3171a..67484067ce4f 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -109,7 +109,13 @@ Header notes:
1 - 4K
2 - 16K
3 - 64K
- Bits 3-63: Reserved.
+ Bit 3: Kernel physical placement
+ 0 - 2MB aligned base should be as close as possible
+ to the base of DRAM, since memory below it is not
+ accessible
+ 1 - 2MB aligned base may be anywhere in physical
+ memory
+ Bits 4-63: Reserved.
- When image_size is zero, a bootloader should attempt to keep as much
memory as possible free for use by the kernel immediately after the
@@ -117,14 +123,14 @@ Header notes:
depending on selected features, and is effectively unbounded.
The Image must be placed text_offset bytes from a 2MB aligned base
-address near the start of usable system RAM and called there. Memory
-below that base address is currently unusable by Linux, and therefore it
-is strongly recommended that this location is the start of system RAM.
-The region between the 2 MB aligned base address and the start of the
-image has no special significance to the kernel, and may be used for
-other purposes.
+address anywhere in usable system RAM and called there. The region
+between the 2 MB aligned base address and the start of the image has no
+special significance to the kernel, and may be used for other purposes.
At least image_size bytes from the start of the image must be free for
use by the kernel.
+NOTE: versions prior to v4.6 cannot make use of memory below the
+physical offset of the Image, so it is recommended that the Image be
+placed as close as possible to the start of system RAM.
Any memory described to the kernel (even that below the start of the
image) which is not marked as reserved from the kernel (e.g., with a
diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
index 81151b67b26b..ebf2481889c3 100644
--- a/arch/arm64/include/asm/boot.h
+++ b/arch/arm64/include/asm/boot.h
@@ -11,4 +11,10 @@
#define MIN_FDT_ALIGN 8
#define MAX_FDT_SIZE SZ_2M
+/*
+ * arm64 requires the kernel image to be placed
+ * TEXT_OFFSET bytes beyond a 2 MB aligned base
+ */
+#define MIN_KIMG_ALIGN SZ_2M
+
#endif
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index a459714ee29e..33bd22956cba 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -79,5 +79,16 @@
#define SWAPPER_MM_MMUFLAGS (PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS)
#endif
+/*
+ * To make optimal use of block mappings when laying out the linear mapping,
+ * round down the base of physical memory to a size that can be mapped
+ * efficiently, i.e., either PUD_SIZE (4k granule) or PMD_SIZE (64k granule),
+ * or a multiple that can be mapped using contiguous bits in the page tables:
+ * 32 * PMD_SIZE (16k granule)
+ */
+#ifdef CONFIG_ARM64_64K_PAGES
+#define ARM64_MEMSTART_ALIGN SZ_512M
+#else
+#define ARM64_MEMSTART_ALIGN SZ_1G
+#endif
#endif /* __ASM_KERNEL_PGTABLE_H */
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 30e0e04c9f16..054ac25e7c2e 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -26,7 +26,7 @@
#define KVM_ARM64_DEBUG_DIRTY_SHIFT 0
#define KVM_ARM64_DEBUG_DIRTY (1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
-#define kvm_ksym_ref(sym) ((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
+#define kvm_ksym_ref(sym) phys_to_virt((u64)&sym - kimage_voffset)
#ifndef __ASSEMBLY__
struct kvm;
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 4388651d1f0d..61005e7dd6cb 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -88,10 +88,10 @@
#define __virt_to_phys(x) ({ \
phys_addr_t __x = (phys_addr_t)(x); \
__x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) : \
- (__x - KIMAGE_VADDR + PHYS_OFFSET); })
+ (__x - kimage_voffset); })
#define __phys_to_virt(x) ((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
-#define __phys_to_kimg(x) ((unsigned long)((x) - PHYS_OFFSET + KIMAGE_VADDR))
+#define __phys_to_kimg(x) ((unsigned long)((x) + kimage_voffset))
/*
* Convert a page to/from a physical address
@@ -127,13 +127,14 @@ extern phys_addr_t memstart_addr;
/* PHYS_OFFSET - the physical address of the start of memory. */
#define PHYS_OFFSET ({ memstart_addr; })
+/* the offset between the kernel virtual and physical mappings */
+extern u64 kimage_voffset;
+
/*
- * The maximum physical address that the linear direct mapping
- * of system RAM can cover. (PAGE_OFFSET can be interpreted as
- * a 2's complement signed quantity and negated to derive the
- * maximum size of the linear mapping.)
+ * Allow all memory at the discovery stage. We will clip it later.
*/
-#define MAX_MEMBLOCK_ADDR ({ memstart_addr - PAGE_OFFSET - 1; })
+#define MIN_MEMBLOCK_ADDR 0
+#define MAX_MEMBLOCK_ADDR U64_MAX
/*
* PFNs are used to describe any physical page; this means
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index ab4c8e93e31c..b4be53923942 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -437,7 +437,11 @@ __mmap_switched:
and x4, x4, #~(THREAD_SIZE - 1)
msr sp_el0, x4 // Save thread_info
str_l x21, __fdt_pointer, x5 // Save FDT pointer
- str_l x24, memstart_addr, x6 // Save PHYS_OFFSET
+
+ ldr x4, =KIMAGE_VADDR // Save the offset between
+ sub x4, x4, x24 // the kernel virtual and
+ str_l x4, kimage_voffset, x5 // physical mappings
+
mov x29, #0
#ifdef CONFIG_KASAN
bl kasan_early_init
diff --git a/arch/arm64/kernel/image.h b/arch/arm64/kernel/image.h
index b64c9b0a4492..ede0c32e38b3 100644
--- a/arch/arm64/kernel/image.h
+++ b/arch/arm64/kernel/image.h
@@ -48,15 +48,18 @@
sym##_hi32 = DATA_LE32((data) >> 32)
#ifdef CONFIG_CPU_BIG_ENDIAN
-#define __HEAD_FLAG_BE 1
+#define __HEAD_FLAG_BE 1
#else
-#define __HEAD_FLAG_BE 0
+#define __HEAD_FLAG_BE 0
#endif
-#define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2)
+#define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2)
-#define __HEAD_FLAGS ((__HEAD_FLAG_BE << 0) | \
- (__HEAD_FLAG_PAGE_SIZE << 1))
+#define __HEAD_FLAG_PHYS_BASE 1
+
+#define __HEAD_FLAGS ((__HEAD_FLAG_BE << 0) | \
+ (__HEAD_FLAG_PAGE_SIZE << 1) | \
+ (__HEAD_FLAG_PHYS_BASE << 3))
/*
* These will output as part of the Image header, which should be little-endian
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 1d627cd8121c..02a9cfa84692 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -35,8 +35,10 @@
#include <linux/efi.h>
#include <linux/swiotlb.h>
+#include <asm/boot.h>
#include <asm/fixmap.h>
#include <asm/kasan.h>
+#include <asm/kernel-pgtable.h>
#include <asm/memory.h>
#include <asm/sections.h>
#include <asm/setup.h>
@@ -158,9 +160,76 @@ static int __init early_mem(char *p)
}
early_param("mem", early_mem);
+/*
+ * clip_mem_range() - remove memblock memory between @min and @max until
+ * we meet the limit in 'memory_limit'.
+ */
+static void __init clip_mem_range(u64 min, u64 max)
+{
+ u64 mem_size, to_remove;
+ int i;
+
+again:
+ mem_size = memblock_phys_mem_size();
+ if (mem_size <= memory_limit || max <= min)
+ return;
+
+ to_remove = mem_size - memory_limit;
+
+ for (i = memblock.memory.cnt - 1; i >= 0; i--) {
+ struct memblock_region *r = memblock.memory.regions + i;
+ u64 start = max(min, r->base);
+ u64 end = min(max, r->base + r->size);
+
+ if (start >= max || end <= min)
+ continue;
+
+ if (end > min) {
+ u64 size = min(to_remove, end - max(start, min));
+
+ memblock_remove(end - size, size);
+ } else {
+ memblock_remove(start, min(max - start, to_remove));
+ }
+ goto again;
+ }
+}
+
void __init arm64_memblock_init(void)
{
- memblock_enforce_memory_limit(memory_limit);
+ const s64 linear_region_size = -(s64)PAGE_OFFSET;
+
+ /*
+ * Select a suitable value for the base of physical memory.
+ */
+ memstart_addr = round_down(memblock_start_of_DRAM(),
+ ARM64_MEMSTART_ALIGN);
+
+ /*
+ * Remove the memory that we will not be able to cover
+ * with the linear mapping.
+ */
+ memblock_remove(memstart_addr + linear_region_size, ULLONG_MAX);
+
+ if (memory_limit != (phys_addr_t)ULLONG_MAX) {
+ u64 kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
+ u64 kend = PAGE_ALIGN(__pa(_end));
+ u64 const sz_4g = 0x100000000UL;
+
+ /*
+ * Clip memory in order of preference:
+ * - above the kernel and above 4 GB
+ * - between 4 GB and the start of the kernel (if the kernel
+ * is loaded high in memory)
+ * - between the kernel and 4 GB (if the kernel is loaded
+ * low in memory)
+ * - below 4 GB
+ */
+ clip_mem_range(max(sz_4g, kend), ULLONG_MAX);
+ clip_mem_range(sz_4g, kbase);
+ clip_mem_range(kend, sz_4g);
+ clip_mem_range(0, min(kbase, sz_4g));
+ }
/*
* Register the kernel text, kernel data, initrd, and initial
@@ -381,3 +450,28 @@ static int __init keepinitrd_setup(char *__unused)
__setup("keepinitrd", keepinitrd_setup);
#endif
+
+/*
+ * Dump out memory limit information on panic.
+ */
+static int dump_mem_limit(struct notifier_block *self, unsigned long v, void *p)
+{
+ if (memory_limit != (phys_addr_t)ULLONG_MAX) {
+ pr_emerg("Memory Limit: %llu MB\n", memory_limit >> 20);
+ } else {
+ pr_emerg("Memory Limit: none\n");
+ }
+ return 0;
+}
+
+static struct notifier_block mem_limit_notifier = {
+ .notifier_call = dump_mem_limit,
+};
+
+static int __init register_mem_limit_dumper(void)
+{
+ atomic_notifier_chain_register(&panic_notifier_list,
+ &mem_limit_notifier);
+ return 0;
+}
+__initcall(register_mem_limit_dumper);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 4c4b15932963..8dda38378959 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -46,6 +46,9 @@
u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
+u64 kimage_voffset __read_mostly;
+EXPORT_SYMBOL(kimage_voffset);
+
/*
* Empty_zero_page is a special page that is used for zero-initialized data
* and COW.
--
2.5.0
^ permalink raw reply related [flat|nested] 93+ messages in thread
* [PATCH v4 14/22] arm64: make asm/elf.h available to asm files
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
This reshuffles some code in asm/elf.h and wraps its C definitions in an
#ifndef __ASSEMBLY__ guard so that the CPP defines can be used in asm
source files as well.
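[Editor's note: the pattern in miniature, as a compilable sketch. The
kernel builds .S files with -D__ASSEMBLY__, so everything outside the
guard must be plain preprocessor material; the EXAMPLE_* names are made
up for illustration:

    /*
     * Constants stay outside the guard so assembly sources (built with
     * -D__ASSEMBLY__) can use them; C-only constructs go inside it.
     */
    #define EXAMPLE_DYN_BASE	(2 * 4096)

    #ifndef __ASSEMBLY__	/* everything below is invisible to gas */
    #include <stdio.h>

    typedef unsigned long example_greg_t;

    int main(void)
    {
    	example_greg_t v = EXAMPLE_DYN_BASE;

    	printf("EXAMPLE_DYN_BASE = %lu\n", v);
    	return 0;
    }
    #endif /* !__ASSEMBLY__ */
]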
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/include/asm/elf.h | 22 ++++++++++++--------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index faad6df49e5b..435f55952e1f 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -24,15 +24,6 @@
#include <asm/ptrace.h>
#include <asm/user.h>
-typedef unsigned long elf_greg_t;
-
-#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
-#define ELF_CORE_COPY_REGS(dest, regs) \
- *(struct user_pt_regs *)&(dest) = (regs)->user_regs;
-
-typedef elf_greg_t elf_gregset_t[ELF_NGREG];
-typedef struct user_fpsimd_state elf_fpregset_t;
-
/*
* AArch64 static relocation types.
*/
@@ -127,6 +118,17 @@ typedef struct user_fpsimd_state elf_fpregset_t;
*/
#define ELF_ET_DYN_BASE (2 * TASK_SIZE_64 / 3)
+#ifndef __ASSEMBLY__
+
+typedef unsigned long elf_greg_t;
+
+#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
+#define ELF_CORE_COPY_REGS(dest, regs) \
+ *(struct user_pt_regs *)&(dest) = (regs)->user_regs;
+
+typedef elf_greg_t elf_gregset_t[ELF_NGREG];
+typedef struct user_fpsimd_state elf_fpregset_t;
+
/*
* When the program starts, a1 contains a pointer to a function to be
* registered with atexit, as per the SVR4 ABI. A value of 0 means we have no
@@ -186,4 +188,6 @@ extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
#endif /* CONFIG_COMPAT */
+#endif /* !__ASSEMBLY__ */
+
#endif
--
2.5.0
^ permalink raw reply related [flat|nested] 93+ messages in thread
* Re: [PATCH v4 14/22] arm64: make asm/elf.h available to asm files
@ 2016-01-26 17:42 ` Mark Rutland
0 siblings, 0 replies; 93+ messages in thread
From: Mark Rutland @ 2016-01-26 17:42 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
leif.lindholm, keescook, linux-kernel, stuart.yoder,
bhupesh.sharma, arnd, marc.zyngier, christoffer.dall, labbott,
matt
On Tue, Jan 26, 2016 at 06:10:41PM +0100, Ard Biesheuvel wrote:
> This reshuffles some code in asm/elf.h and puts a #ifndef __ASSEMBLY__
> around its C definitions so that the CPP defines can be used in asm
> source files as well.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Mark.
> ---
> arch/arm64/include/asm/elf.h | 22 ++++++++++++--------
> 1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
> index faad6df49e5b..435f55952e1f 100644
> --- a/arch/arm64/include/asm/elf.h
> +++ b/arch/arm64/include/asm/elf.h
> @@ -24,15 +24,6 @@
> #include <asm/ptrace.h>
> #include <asm/user.h>
>
> -typedef unsigned long elf_greg_t;
> -
> -#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
> -#define ELF_CORE_COPY_REGS(dest, regs) \
> - *(struct user_pt_regs *)&(dest) = (regs)->user_regs;
> -
> -typedef elf_greg_t elf_gregset_t[ELF_NGREG];
> -typedef struct user_fpsimd_state elf_fpregset_t;
> -
> /*
> * AArch64 static relocation types.
> */
> @@ -127,6 +118,17 @@ typedef struct user_fpsimd_state elf_fpregset_t;
> */
> #define ELF_ET_DYN_BASE (2 * TASK_SIZE_64 / 3)
>
> +#ifndef __ASSEMBLY__
> +
> +typedef unsigned long elf_greg_t;
> +
> +#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
> +#define ELF_CORE_COPY_REGS(dest, regs) \
> + *(struct user_pt_regs *)&(dest) = (regs)->user_regs;
> +
> +typedef elf_greg_t elf_gregset_t[ELF_NGREG];
> +typedef struct user_fpsimd_state elf_fpregset_t;
> +
> /*
> * When the program starts, a1 contains a pointer to a function to be
> * registered with atexit, as per the SVR4 ABI. A value of 0 means we have no
> @@ -186,4 +188,6 @@ extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
>
> #endif /* CONFIG_COMPAT */
>
> +#endif /* !__ASSEMBLY__ */
> +
> #endif
> --
> 2.5.0
>
^ permalink raw reply [flat|nested] 93+ messages in thread
* [PATCH v4 15/22] scripts/sortextable: add support for ET_DYN binaries
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
Add support to scripts/sortextable for handling relocatable (PIE)
executables, whose ELF type is ET_DYN, not ET_EXEC. Other than adding
support for the new type, no changes are needed.
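For reference, the relaxed check boils down to the sketch below
(simplified: the real tool reads e_type through its endian-aware r2()
accessor, and the helper name here is made up):

    #include <elf.h>

    static int type_ok(const Elf64_Ehdr *ehdr)
    {
    	/* accept fixed-address (ET_EXEC) and PIE (ET_DYN) images */
    	return ehdr->e_type == ET_EXEC || ehdr->e_type == ET_DYN;
    }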
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
scripts/sortextable.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/scripts/sortextable.c b/scripts/sortextable.c
index af247c70fb66..19d83647846c 100644
--- a/scripts/sortextable.c
+++ b/scripts/sortextable.c
@@ -266,9 +266,9 @@ do_file(char const *const fname)
break;
} /* end switch */
if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0
- || r2(&ehdr->e_type) != ET_EXEC
+ || (r2(&ehdr->e_type) != ET_EXEC && r2(&ehdr->e_type) != ET_DYN)
|| ehdr->e_ident[EI_VERSION] != EV_CURRENT) {
- fprintf(stderr, "unrecognized ET_EXEC file %s\n", fname);
+ fprintf(stderr, "unrecognized ET_EXEC/ET_DYN file %s\n", fname);
fail_file();
}
@@ -304,7 +304,7 @@ do_file(char const *const fname)
if (r2(&ehdr->e_ehsize) != sizeof(Elf32_Ehdr)
|| r2(&ehdr->e_shentsize) != sizeof(Elf32_Shdr)) {
fprintf(stderr,
- "unrecognized ET_EXEC file: %s\n", fname);
+ "unrecognized ET_EXEC/ET_DYN file: %s\n", fname);
fail_file();
}
do32(ehdr, fname, custom_sort);
@@ -314,7 +314,7 @@ do_file(char const *const fname)
if (r2(&ghdr->e_ehsize) != sizeof(Elf64_Ehdr)
|| r2(&ghdr->e_shentsize) != sizeof(Elf64_Shdr)) {
fprintf(stderr,
- "unrecognized ET_EXEC file: %s\n", fname);
+ "unrecognized ET_EXEC/ET_DYN file: %s\n", fname);
fail_file();
}
do64(ghdr, fname, custom_sort);
--
2.5.0
^ permalink raw reply related [flat|nested] 93+ messages in thread
* Re: [PATCH v4 15/22] scripts/sortextable: add support for ET_DYN binaries
@ 2016-01-26 23:25 ` Kees Cook
0 siblings, 0 replies; 93+ messages in thread
From: Kees Cook @ 2016-01-26 23:25 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
Mark Rutland, Leif Lindholm, LKML, stuart.yoder, bhupesh.sharma,
Arnd Bergmann, Marc Zyngier, Christoffer Dall, Laura Abbott,
Matt Fleming
On Tue, Jan 26, 2016 at 9:10 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> Add support to scripts/sortextable for handling relocatable (PIE)
> executables, whose ELF type is ET_DYN, not ET_EXEC. Other than adding
> support for the new type, no changes are needed.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Something I can actually test! :)
-Kees
> ---
> scripts/sortextable.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/scripts/sortextable.c b/scripts/sortextable.c
> index af247c70fb66..19d83647846c 100644
> --- a/scripts/sortextable.c
> +++ b/scripts/sortextable.c
> @@ -266,9 +266,9 @@ do_file(char const *const fname)
> break;
> } /* end switch */
> if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0
> - || r2(&ehdr->e_type) != ET_EXEC
> + || (r2(&ehdr->e_type) != ET_EXEC && r2(&ehdr->e_type) != ET_DYN)
> || ehdr->e_ident[EI_VERSION] != EV_CURRENT) {
> - fprintf(stderr, "unrecognized ET_EXEC file %s\n", fname);
> + fprintf(stderr, "unrecognized ET_EXEC/ET_DYN file %s\n", fname);
> fail_file();
> }
>
> @@ -304,7 +304,7 @@ do_file(char const *const fname)
> if (r2(&ehdr->e_ehsize) != sizeof(Elf32_Ehdr)
> || r2(&ehdr->e_shentsize) != sizeof(Elf32_Shdr)) {
> fprintf(stderr,
> - "unrecognized ET_EXEC file: %s\n", fname);
> + "unrecognized ET_EXEC/ET_DYN file: %s\n", fname);
> fail_file();
> }
> do32(ehdr, fname, custom_sort);
> @@ -314,7 +314,7 @@ do_file(char const *const fname)
> if (r2(&ghdr->e_ehsize) != sizeof(Elf64_Ehdr)
> || r2(&ghdr->e_shentsize) != sizeof(Elf64_Shdr)) {
> fprintf(stderr,
> - "unrecognized ET_EXEC file: %s\n", fname);
> + "unrecognized ET_EXEC/ET_DYN file: %s\n", fname);
> fail_file();
> }
> do64(ghdr, fname, custom_sort);
> --
> 2.5.0
>
--
Kees Cook
Chrome OS & Brillo Security
^ permalink raw reply [flat|nested] 93+ messages in thread
* [PATCH v4 16/22] kallsyms: add support for relative offsets in kallsyms address table
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel, Heiko Carstens,
Michael Ellerman, Ingo Molnar, H. Peter Anvin,
Benjamin Herrenschmidt, Michal Marek, Rusty Russell,
Andrew Morton
Similar to how relative extables are implemented, it is possible to emit
the kallsyms table in such a way that it contains offsets relative to some
anchor point in the kernel image rather than absolute addresses.
On 64-bit architectures, this cuts the size of the kallsyms address table
in half, since offsets between kernel symbols can typically be expressed
in 32 bits. This saves several hundred kilobytes of permanent .rodata on
average. In addition, the kallsyms address table is no longer subject to
dynamic relocation when CONFIG_RELOCATABLE is in effect, so the relocation
work done after decompression no longer needs to update all these values.
This saves up to 24 bytes (i.e., the size of an ELF64 RELA relocation
table entry) per value, which easily adds up to a couple of megabytes of
uncompressed __init data on ppc64 or arm64. Even though these relocation
entries typically compress well, the combined size reduction of 2.8 MB
uncompressed for a ppc64_defconfig build (of which 2.4 MB is __init data)
results in a ~500 KB space saving in the compressed image.
Since it is useful for some architectures (like x86) to retain the ability
to emit absolute values as well, this patch adds support for both, by
emitting absolute addresses as positive 32-bit values, and addresses
relative to the lowest encountered relative symbol as negative values,
which are subtracted from the runtime address of this base symbol to
produce the actual address.
Support for the above is enabled by default for all architectures except
IA-64, whose symbols are too far apart to capture in this manner.
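Concretely, the scheme amounts to the sketch below (the runtime half is
what kallsyms_sym_address() in the patch implements; the helper names
are made up):

    /* build time, in scripts/kallsyms: */
    long long encode(unsigned long long addr, int absolute)
    {
    	return absolute ? addr				/* in [0, INT_MAX] */
    			: relative_base - addr - 1;	/* always negative */
    }

    /* runtime, in kernel/kallsyms.c: */
    unsigned long decode(int offset)
    {
    	return offset >= 0 ? offset
    			   : kallsyms_relative_base - 1 - offset;
    }

The extra '- 1' bias ensures that even the base symbol itself
(addr == relative_base) encodes to -1, so non-negative values remain
unambiguously absolute.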
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
init/Kconfig | 16 ++++
kernel/kallsyms.c | 38 +++++++--
scripts/kallsyms.c | 88 +++++++++++++++++---
scripts/link-vmlinux.sh | 4 +
scripts/namespace.pl | 2 +
5 files changed, 129 insertions(+), 19 deletions(-)
diff --git a/init/Kconfig b/init/Kconfig
index 22320804fbaf..1cc72a068afc 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1420,6 +1420,22 @@ config KALLSYMS_ALL
Say N unless you really need all symbols.
+config KALLSYMS_BASE_RELATIVE
+ bool
+ depends on KALLSYMS
+ default !IA64
+ help
+ Instead of emitting them as absolute values in the native word size,
+ emit the symbol references in the kallsyms table as 32-bit entries,
+ each containing either an absolute value in the range [0, S32_MAX] or
+ a relative value in the range [base, base + S32_MAX], where base is
+ the lowest relative symbol address encountered in the image.
+
+ On 64-bit builds, this reduces the size of the address table by 50%,
+ but more importantly, it results in entries whose values are build
+ time constants, and no relocation pass is required at runtime to fix
+ up the entries based on the runtime load address of the kernel.
+
config PRINTK
default y
bool "Enable support for printk" if EXPERT
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 5c5987f10819..10a8af9d5744 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -38,6 +38,7 @@
* during the second link stage.
*/
extern const unsigned long kallsyms_addresses[] __weak;
+extern const int kallsyms_offsets[] __weak;
extern const u8 kallsyms_names[] __weak;
/*
@@ -47,6 +48,9 @@ extern const u8 kallsyms_names[] __weak;
extern const unsigned long kallsyms_num_syms
__attribute__((weak, section(".rodata")));
+extern const unsigned long kallsyms_relative_base
+__attribute__((weak, section(".rodata")));
+
extern const u8 kallsyms_token_table[] __weak;
extern const u16 kallsyms_token_index[] __weak;
@@ -176,6 +180,19 @@ static unsigned int get_symbol_offset(unsigned long pos)
return name - kallsyms_names;
}
+static unsigned long kallsyms_sym_address(int idx)
+{
+ if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
+ return kallsyms_addresses[idx];
+
+ /* positive offsets are absolute values */
+ if (kallsyms_offsets[idx] >= 0)
+ return kallsyms_offsets[idx];
+
+ /* negative offsets are relative to kallsyms_relative_base - 1 */
+ return kallsyms_relative_base - 1 - kallsyms_offsets[idx];
+}
+
/* Lookup the address for this symbol. Returns 0 if not found. */
unsigned long kallsyms_lookup_name(const char *name)
{
@@ -187,7 +204,7 @@ unsigned long kallsyms_lookup_name(const char *name)
off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
if (strcmp(namebuf, name) == 0)
- return kallsyms_addresses[i];
+ return kallsyms_sym_address(i);
}
return module_kallsyms_lookup_name(name);
}
@@ -204,7 +221,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
- ret = fn(data, namebuf, NULL, kallsyms_addresses[i]);
+ ret = fn(data, namebuf, NULL, kallsyms_sym_address(i));
if (ret != 0)
return ret;
}
@@ -220,7 +237,10 @@ static unsigned long get_symbol_pos(unsigned long addr,
unsigned long i, low, high, mid;
/* This kernel should never had been booted. */
- BUG_ON(!kallsyms_addresses);
+ if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
+ BUG_ON(!kallsyms_addresses);
+ else
+ BUG_ON(!kallsyms_offsets);
/* Do a binary search on the sorted kallsyms_addresses array. */
low = 0;
@@ -228,7 +248,7 @@ static unsigned long get_symbol_pos(unsigned long addr,
while (high - low > 1) {
mid = low + (high - low) / 2;
- if (kallsyms_addresses[mid] <= addr)
+ if (kallsyms_sym_address(mid) <= addr)
low = mid;
else
high = mid;
@@ -238,15 +258,15 @@ static unsigned long get_symbol_pos(unsigned long addr,
* Search for the first aliased symbol. Aliased
* symbols are symbols with the same address.
*/
- while (low && kallsyms_addresses[low-1] == kallsyms_addresses[low])
+ while (low && kallsyms_sym_address(low-1) == kallsyms_sym_address(low))
--low;
- symbol_start = kallsyms_addresses[low];
+ symbol_start = kallsyms_sym_address(low);
/* Search for next non-aliased symbol. */
for (i = low + 1; i < kallsyms_num_syms; i++) {
- if (kallsyms_addresses[i] > symbol_start) {
- symbol_end = kallsyms_addresses[i];
+ if (kallsyms_sym_address(i) > symbol_start) {
+ symbol_end = kallsyms_sym_address(i);
break;
}
}
@@ -470,7 +490,7 @@ static unsigned long get_ksymbol_core(struct kallsym_iter *iter)
unsigned off = iter->nameoff;
iter->module_name[0] = '\0';
- iter->value = kallsyms_addresses[iter->pos];
+ iter->value = kallsyms_sym_address(iter->pos);
iter->type = kallsyms_get_symbol_type(off);
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index d39a1eeb080e..02473b71643b 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -22,6 +22,7 @@
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
+#include <limits.h>
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof(arr[0]))
@@ -43,6 +44,7 @@ struct addr_range {
};
static unsigned long long _text;
+static unsigned long long relative_base;
static struct addr_range text_ranges[] = {
{ "_stext", "_etext" },
{ "_sinittext", "_einittext" },
@@ -62,6 +64,7 @@ static int all_symbols = 0;
static int absolute_percpu = 0;
static char symbol_prefix_char = '\0';
static unsigned long long kernel_start_addr = 0;
+static int base_relative = 0;
int token_profit[0x10000];
@@ -75,7 +78,7 @@ static void usage(void)
fprintf(stderr, "Usage: kallsyms [--all-symbols] "
"[--symbol-prefix=<prefix char>] "
"[--page-offset=<CONFIG_PAGE_OFFSET>] "
- "< in.map > out.S\n");
+ "[--base-relative] < in.map > out.S\n");
exit(1);
}
@@ -205,6 +208,8 @@ static int symbol_valid(struct sym_entry *s)
*/
static char *special_symbols[] = {
"kallsyms_addresses",
+ "kallsyms_offsets",
+ "kallsyms_relative_base",
"kallsyms_num_syms",
"kallsyms_names",
"kallsyms_markers",
@@ -349,16 +354,47 @@ static void write_src(void)
printf("\t.section .rodata, \"a\"\n");
- /* Provide proper symbols relocatability by their '_text'
- * relativeness. The symbol names cannot be used to construct
- * normal symbol references as the list of symbols contains
- * symbols that are declared static and are private to their
- * .o files. This prevents .tmp_kallsyms.o or any other
- * object from referencing them.
+ /* Provide proper symbols relocatability by their relativeness
+ * to a fixed anchor point in the runtime image, either '_text'
+ * for absolute address tables, in which case the linker will
+ * emit the final addresses at build time. Otherwise, use the
+ * offset relative to the lowest value encountered of all relative
+ * symbols, and emit non-relocatable fixed offsets that will be fixed
+ * up at runtime.
+ *
+ * The symbol names cannot be used to construct normal symbol
+ * references as the list of symbols contains symbols that are
+ * declared static and are private to their .o files. This prevents
+ * .tmp_kallsyms.o or any other object from referencing them.
*/
- output_label("kallsyms_addresses");
+ if (!base_relative)
+ output_label("kallsyms_addresses");
+ else
+ output_label("kallsyms_offsets");
+
for (i = 0; i < table_cnt; i++) {
- if (!symbol_absolute(&table[i])) {
+ if (base_relative) {
+ long long offset;
+
+ if (symbol_absolute(&table[i])) {
+ offset = table[i].addr;
+ if (offset < 0 || offset > INT_MAX) {
+ fprintf(stderr, "kallsyms failure: "
+ "absolute symbol value %#llx out of range in relative mode\n",
+ table[i].addr);
+ exit(EXIT_FAILURE);
+ }
+ } else {
+ offset = relative_base - table[i].addr - 1;
+ if (offset < INT_MIN || offset >= 0) {
+ fprintf(stderr, "kallsyms failure: "
+ "relative symbol value %#llx out of range in relative mode\n",
+ table[i].addr);
+ exit(EXIT_FAILURE);
+ }
+ }
+ printf("\t.long\t%#x\n", (int)offset);
+ } else if (!symbol_absolute(&table[i])) {
if (_text <= table[i].addr)
printf("\tPTR\t_text + %#llx\n",
table[i].addr - _text);
@@ -371,6 +407,12 @@ static void write_src(void)
}
printf("\n");
+ if (base_relative) {
+ output_label("kallsyms_relative_base");
+ printf("\tPTR\t_text - %#llx\n", _text - relative_base);
+ printf("\n");
+ }
+
output_label("kallsyms_num_syms");
printf("\tPTR\t%d\n", table_cnt);
printf("\n");
@@ -695,6 +737,28 @@ static void make_percpus_absolute(void)
}
}
+/* find the minimum non-absolute symbol address */
+static void record_relative_base(void)
+{
+ unsigned int i;
+
+ if (kernel_start_addr > 0) {
+ /*
+ * If the kernel start address was specified, use that as
+ * the relative base rather than going through the table,
+ * since it should be a reasonable default, and values below
+ * it will be ignored anyway.
+ */
+ relative_base = kernel_start_addr;
+ } else {
+ relative_base = -1ULL;
+ for (i = 0; i < table_cnt; i++)
+ if (!symbol_absolute(&table[i]) &&
+ table[i].addr < relative_base)
+ relative_base = table[i].addr;
+ }
+}
+
int main(int argc, char **argv)
{
if (argc >= 2) {
@@ -713,7 +777,9 @@ int main(int argc, char **argv)
} else if (strncmp(argv[i], "--page-offset=", 14) == 0) {
const char *p = &argv[i][14];
kernel_start_addr = strtoull(p, NULL, 16);
- } else
+ } else if (strcmp(argv[i], "--base-relative") == 0)
+ base_relative = 1;
+ else
usage();
}
} else if (argc != 1)
@@ -722,6 +788,8 @@ int main(int argc, char **argv)
read_map(stdin);
if (absolute_percpu)
make_percpus_absolute();
+ if (base_relative)
+ record_relative_base();
sort_symbols();
optimize_token_table();
write_src();
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index ba6c34ea5429..b58bf908b153 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -90,6 +90,10 @@ kallsyms()
kallsymopt="${kallsymopt} --absolute-percpu"
fi
+ if [ -n "${CONFIG_KALLSYMS_BASE_RELATIVE}" ]; then
+ kallsymopt="${kallsymopt} --base-relative"
+ fi
+
local aflags="${KBUILD_AFLAGS} ${KBUILD_AFLAGS_KERNEL} \
${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS}"
diff --git a/scripts/namespace.pl b/scripts/namespace.pl
index a71be6b7cdec..9f3c9d47a4a5 100755
--- a/scripts/namespace.pl
+++ b/scripts/namespace.pl
@@ -117,6 +117,8 @@ my %nameexception = (
'kallsyms_names' => 1,
'kallsyms_num_syms' => 1,
'kallsyms_addresses'=> 1,
+ 'kallsyms_offsets' => 1,
+ 'kallsyms_relative_base'=> 1,
'__this_module' => 1,
'_etext' => 1,
'_edata' => 1,
--
2.5.0
^ permalink raw reply related [flat|nested] 93+ messages in thread
* [PATCH v4 17/22] arm64: add support for building the kernel as a relocatable PIE binary
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
This implements CONFIG_RELOCATABLE, which links the final vmlinux image
with a dynamic relocation section, allowing the early boot code to
relocate the image to a different virtual address at runtime.
This is a prerequisite for KASLR (CONFIG_RANDOMIZE_BASE).
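In C terms, the loop added to head.S below is roughly equivalent to the
sketch that follows (the function name is made up, and a zero relocation
delta is assumed, as in this patch; the random KASLR offset only comes
into play in a later patch):

    #include <elf.h>

    static void apply_relocs(Elf64_Rela *rela, Elf64_Rela *end,
    			 const Elf64_Sym *dynsym)
    {
    	for (; rela < end; rela++) {
    		Elf64_Addr *place = (Elf64_Addr *)rela->r_offset;

    		switch (ELF64_R_TYPE(rela->r_info)) {
    		case R_AARCH64_RELATIVE:	/* str x13, [x11] */
    			*place = rela->r_addend;
    			break;
    		case R_AARCH64_ABS64:		/* st_value + addend */
    			*place = dynsym[ELF64_R_SYM(rela->r_info)].st_value
    				 + rela->r_addend;
    			break;
    		}
    	}
    }

All other relocation types are skipped, just as the assembly loop
branches back without acting on them.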
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/Kconfig | 12 ++++++++
arch/arm64/Makefile | 4 +++
arch/arm64/include/asm/elf.h | 2 ++
arch/arm64/kernel/head.S | 32 ++++++++++++++++++++
arch/arm64/kernel/vmlinux.lds.S | 16 ++++++++++
5 files changed, 66 insertions(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 141f65ab0ed5..6aa86f86fd10 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -84,6 +84,7 @@ config ARM64
select IOMMU_DMA if IOMMU_SUPPORT
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
+ select KALLSYMS_BASE_RELATIVE
select MODULES_USE_ELF_RELA
select NO_BOOTMEM
select OF
@@ -762,6 +763,17 @@ config ARM64_MODULE_PLTS
select ARM64_MODULE_CMODEL_LARGE
select HAVE_MOD_ARCH_SPECIFIC
+config RELOCATABLE
+ bool
+ help
+ This builds the kernel as a Position Independent Executable (PIE),
+ which retains all relocation metadata required to relocate the
+ kernel binary at runtime to a different virtual address than the
+ address it was linked at.
+ Since AArch64 uses the RELA relocation format, this requires a
+ relocation pass at runtime even if the kernel is loaded at the
+ same address it was linked at.
+
endmenu
menu "Boot options"
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index db462980c6be..c3eaa03f9020 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -15,6 +15,10 @@ CPPFLAGS_vmlinux.lds = -DTEXT_OFFSET=$(TEXT_OFFSET)
OBJCOPYFLAGS :=-O binary -R .note -R .note.gnu.build-id -R .comment -S
GZFLAGS :=-9
+ifneq ($(CONFIG_RELOCATABLE),)
+LDFLAGS_vmlinux += -pie
+endif
+
KBUILD_DEFCONFIG := defconfig
# Check for binutils support for specific extensions
diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index 435f55952e1f..24ed037f09fd 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -77,6 +77,8 @@
#define R_AARCH64_MOVW_PREL_G2_NC 292
#define R_AARCH64_MOVW_PREL_G3 293
+#define R_AARCH64_RELATIVE 1027
+
/*
* These are used to set parameters in the core dumps.
*/
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index b4be53923942..92f9c26632f3 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -29,6 +29,7 @@
#include <asm/asm-offsets.h>
#include <asm/cache.h>
#include <asm/cputype.h>
+#include <asm/elf.h>
#include <asm/kernel-pgtable.h>
#include <asm/memory.h>
#include <asm/pgtable-hwdef.h>
@@ -432,6 +433,37 @@ __mmap_switched:
bl __pi_memset
dsb ishst // Make zero page visible to PTW
+#ifdef CONFIG_RELOCATABLE
+
+ /*
+ * Iterate over each entry in the relocation table, and apply the
+ * relocations in place.
+ */
+ adr_l x8, __dynsym_start // start of symbol table
+ adr_l x9, __reloc_start // start of reloc table
+ adr_l x10, __reloc_end // end of reloc table
+
+0: cmp x9, x10
+ b.hs 2f
+ ldp x11, x12, [x9], #24
+ ldr x13, [x9, #-8]
+ cmp w12, #R_AARCH64_RELATIVE
+ b.ne 1f
+ str x13, [x11]
+ b 0b
+
+1: cmp w12, #R_AARCH64_ABS64
+ b.ne 0b
+ add x12, x12, x12, lsl #1 // symtab offset: 24x top word
+ add x12, x8, x12, lsr #(32 - 3) // ... shifted into bottom word
+ ldr x15, [x12, #8] // Elf64_Sym::st_value
+ add x15, x13, x15
+ str x15, [x11]
+ b 0b
+
+2:
+#endif
+
adr_l sp, initial_sp, x4
mov x4, sp
and x4, x4, #~(THREAD_SIZE - 1)
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 282e3e64a17e..e3f6cd740ea3 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -87,6 +87,7 @@ SECTIONS
EXIT_CALL
*(.discard)
*(.discard.*)
+ *(.interp .dynamic)
}
. = KIMAGE_VADDR + TEXT_OFFSET;
@@ -149,6 +150,21 @@ SECTIONS
.altinstr_replacement : {
*(.altinstr_replacement)
}
+ .rela : ALIGN(8) {
+ __reloc_start = .;
+ *(.rela .rela*)
+ __reloc_end = .;
+ }
+ .dynsym : ALIGN(8) {
+ __dynsym_start = .;
+ *(.dynsym)
+ }
+ .dynstr : {
+ *(.dynstr)
+ }
+ .hash : {
+ *(.hash)
+ }
. = ALIGN(PAGE_SIZE);
__init_end = .;
--
2.5.0
^ permalink raw reply related [flat|nested] 93+ messages in thread
* [PATCH v4 18/22] arm64: add support for kernel ASLR
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
This adds support for KASLR, based on entropy provided by the bootloader
in the /chosen/kaslr-seed DT property. Depending on the size of the address
space (VA_BITS) and the page size, the entropy in the virtual displacement
ranges from 13 bits (16k granule with 2 translation levels) up to 25 bits
(all 4 levels), with the caveat that displacements that would leave the
kernel image straddling a 1GB/32MB/512MB alignment boundary (for
4KB/16KB/64KB granule kernels, respectively) are not allowed, and are
rounded up to an acceptable value.
The module region is randomized by choosing a page-aligned 128 MB region
inside the interval [_etext - 128 MB, _stext + 128 MB). This gives between
10 and 14 bits of entropy (depending on page size), independently of the
kernel randomization, but still guarantees that modules are within the
range of relative branch and jump instructions (with the caveat that, since
the module region is shared with other uses of the vmalloc area, modules
may need to be loaded further away if the module region is exhausted).
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
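To make the entropy figures above concrete: kaslr_early_init() below
derives the offset as

        mask = ((1UL << (VA_BITS - 2)) - 1) & ~(SZ_2M - 1);
        offset = seed & mask;

so the displacement occupies bits [21, VA_BITS - 2) of the address, i.e.,
VA_BITS - 23 bits of entropy: 36 - 23 = 13 bits for a 16k granule with 2
translation levels, and 48 - 23 = 25 bits with all 4 levels.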
arch/arm64/Kconfig | 14 ++
arch/arm64/include/asm/memory.h | 5 +-
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/head.S | 59 ++++++-
arch/arm64/kernel/kaslr.c | 169 ++++++++++++++++++++
arch/arm64/kernel/module.c | 8 +-
arch/arm64/kernel/setup.c | 29 ++++
arch/arm64/mm/mmu.c | 33 ++--
8 files changed, 298 insertions(+), 20 deletions(-)
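Since the only contract with firmware is the /chosen/kaslr-seed property, a
bootloader could populate it along the following lines. This is a minimal
sketch using libfdt; get_random_u64() stands in for whatever entropy source
the platform provides and is not a real API here:

#include <libfdt.h>
#include <stdint.h>

extern uint64_t get_random_u64(void); /* platform entropy source (assumed) */

static int set_kaslr_seed(void *fdt)
{
        int node = fdt_path_offset(fdt, "/chosen");

        if (node < 0)
                node = fdt_add_subnode(fdt, 0, "chosen"); /* 0 == root node */
        if (node < 0)
                return node;

        /* FDT properties are big-endian; fdt_setprop_u64() handles that */
        return fdt_setprop_u64(fdt, node, "kaslr-seed", get_random_u64());
}

The kernel reads and then wipes this property in get_kaslr_seed() below, so
the seed does not remain visible in the DT after boot.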
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6aa86f86fd10..d7e31454d421 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -774,6 +774,20 @@ config RELOCATABLE
relocation pass at runtime even if the kernel is loaded at the
same address it was linked at.
+config RANDOMIZE_BASE
+ bool "Randomize the address of the kernel image"
+ select ARM64_MODULE_PLTS
+ select RELOCATABLE
+ help
+ Randomizes the virtual address at which the kernel image is
+ loaded, as a security feature that deters exploit attempts
+ relying on knowledge of the location of kernel internals.
+
+ It is the bootloader's job to provide entropy, by passing a
+ random u64 value in /chosen/kaslr-seed at kernel entry.
+
+ If unsure, say N.
+
endmenu
menu "Boot options"
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 61005e7dd6cb..083361531a61 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -52,7 +52,7 @@
#define KIMAGE_VADDR (MODULES_END)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define MODULES_VADDR (VA_START + KASAN_SHADOW_SIZE)
-#define MODULES_VSIZE (SZ_64M)
+#define MODULES_VSIZE (SZ_128M)
#define PCI_IO_END (PAGE_OFFSET - SZ_2M)
#define PCI_IO_START (PCI_IO_END - PCI_IO_SIZE)
#define FIXADDR_TOP (PCI_IO_START - SZ_2M)
@@ -127,6 +127,9 @@ extern phys_addr_t memstart_addr;
/* PHYS_OFFSET - the physical address of the start of memory. */
#define PHYS_OFFSET ({ memstart_addr; })
+/* the virtual base of the kernel image (minus TEXT_OFFSET) */
+extern u64 kimage_vaddr;
+
/* the offset between the kernel virtual and physical mappings */
extern u64 kimage_voffset;
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index e2f0a755beaa..c9aaecddb941 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -43,6 +43,7 @@ arm64-obj-$(CONFIG_PCI) += pci.o
arm64-obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o
arm64-obj-$(CONFIG_ACPI) += acpi.o
arm64-obj-$(CONFIG_PARAVIRT) += paravirt.o
+arm64-obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
obj-y += $(arm64-obj-y) vdso/
obj-m += $(arm64-obj-m)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 92f9c26632f3..8712a38c3de7 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -210,6 +210,7 @@ section_table:
ENTRY(stext)
bl preserve_boot_args
bl el2_setup // Drop to EL1, w20=cpu_boot_mode
+ mov x23, xzr // KASLR offset, defaults to 0
adrp x24, __PHYS_OFFSET
bl set_cpu_boot_mode_flag
bl __create_page_tables // x25=TTBR0, x26=TTBR1
@@ -313,7 +314,7 @@ ENDPROC(preserve_boot_args)
__create_page_tables:
adrp x25, idmap_pg_dir
adrp x26, swapper_pg_dir
- mov x27, lr
+ mov x28, lr
/*
* Invalidate the idmap and swapper page tables to avoid potential
@@ -392,6 +393,7 @@ __create_page_tables:
*/
mov x0, x26 // swapper_pg_dir
ldr x5, =KIMAGE_VADDR
+ add x5, x5, x23 // add KASLR displacement
create_pgd_entry x0, x5, x3, x6
ldr w6, kernel_img_size
add x6, x6, x5
@@ -408,8 +410,7 @@ __create_page_tables:
dmb sy
bl __inval_cache_range
- mov lr, x27
- ret
+ ret x28
ENDPROC(__create_page_tables)
kernel_img_size:
@@ -421,6 +422,7 @@ kernel_img_size:
*/
.set initial_sp, init_thread_union + THREAD_START_SP
__mmap_switched:
+ mov x28, lr // preserve LR
adr_l x8, vectors // load VBAR_EL1 with virtual
msr vbar_el1, x8 // vector table address
isb
@@ -449,19 +451,26 @@ __mmap_switched:
ldr x13, [x9, #-8]
cmp w12, #R_AARCH64_RELATIVE
b.ne 1f
- str x13, [x11]
+ add x13, x13, x23 // relocate
+ str x13, [x11, x23]
b 0b
1: cmp w12, #R_AARCH64_ABS64
b.ne 0b
add x12, x12, x12, lsl #1 // symtab offset: 24x top word
add x12, x8, x12, lsr #(32 - 3) // ... shifted into bottom word
+ ldrsh w14, [x12, #6] // Elf64_Sym::st_shndx
ldr x15, [x12, #8] // Elf64_Sym::st_value
+ cmp w14, #-0xf // SHN_ABS (0xfff1) ?
+ add x14, x15, x23 // relocate
+ csel x15, x14, x15, ne
add x15, x13, x15
- str x15, [x11]
+ str x15, [x11, x23]
b 0b
-2:
+2: adr_l x8, kimage_vaddr // make relocated kimage_vaddr
+ dc cvac, x8 // value visible to secondaries
+ dsb sy // with MMU off
#endif
adr_l sp, initial_sp, x4
@@ -470,7 +479,7 @@ __mmap_switched:
msr sp_el0, x4 // Save thread_info
str_l x21, __fdt_pointer, x5 // Save FDT pointer
- ldr x4, =KIMAGE_VADDR // Save the offset between
+ ldr_l x4, kimage_vaddr // Save the offset between
sub x4, x4, x24 // the kernel virtual and
str_l x4, kimage_voffset, x5 // physical mappings
@@ -478,6 +487,15 @@ __mmap_switched:
#ifdef CONFIG_KASAN
bl kasan_early_init
#endif
+#ifdef CONFIG_RANDOMIZE_BASE
+ cbnz x23, 0f // already running randomized?
+ mov x0, x21 // pass FDT address in x0
+ bl kaslr_early_init // parse FDT for KASLR options
+ cbz x0, 0f // KASLR disabled? just proceed
+ ret x28 // we must enable KASLR, return
+ // to __enable_mmu()
+0:
+#endif
b start_kernel
ENDPROC(__mmap_switched)
@@ -486,6 +504,10 @@ ENDPROC(__mmap_switched)
* hotplug and needs to have the same protections as the text region
*/
.section ".text","ax"
+
+ENTRY(kimage_vaddr)
+ .quad _text - TEXT_OFFSET
+
/*
* If we're fortunate enough to boot at EL2, ensure that the world is
* sane before dropping to EL1.
@@ -646,7 +668,7 @@ ENTRY(secondary_startup)
adrp x26, swapper_pg_dir
bl __cpu_setup // initialise processor
- ldr x8, =KIMAGE_VADDR
+ ldr x8, kimage_vaddr
ldr w9, 0f
sub x27, x8, w9, sxtw // address to jump to after enabling the MMU
b __enable_mmu
@@ -679,6 +701,7 @@ ENDPROC(__secondary_switched)
*/
.section ".idmap.text", "ax"
__enable_mmu:
+ mrs x18, sctlr_el1 // preserve old SCTLR_EL1 value
mrs x1, ID_AA64MMFR0_EL1
ubfx x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
cmp x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
@@ -696,6 +719,26 @@ __enable_mmu:
ic iallu
dsb nsh
isb
+#ifdef CONFIG_RANDOMIZE_BASE
+ mov x19, x0 // preserve new SCTLR_EL1 value
+ blr x27
+
+ /*
+ * If we return here, we have a KASLR displacement in x0 which we need
+ * to record and take into account by discarding the current kernel
+ * mapping and creating a new one.
+ */
+ mov x23, x0 // record the KASLR offset
+ msr sctlr_el1, x18 // disable the MMU
+ isb
+ bl __create_page_tables // recreate kernel mapping
+
+ msr sctlr_el1, x19 // re-enable the MMU
+ isb
+ ic ialluis // flush instructions fetched
+ isb // via old mapping
+ add x27, x27, x23 // relocated __mmap_switched
+#endif
br x27
ENDPROC(__enable_mmu)
diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c
new file mode 100644
index 000000000000..9ddb01f65a1a
--- /dev/null
+++ b/arch/arm64/kernel/kaslr.c
@@ -0,0 +1,169 @@
+/*
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/crc32.h>
+#include <linux/init.h>
+#include <linux/libfdt.h>
+#include <linux/mm_types.h>
+#include <linux/sched.h>
+#include <linux/types.h>
+
+#include <asm/fixmap.h>
+#include <asm/kernel-pgtable.h>
+#include <asm/memory.h>
+#include <asm/mmu.h>
+#include <asm/pgtable.h>
+#include <asm/sections.h>
+
+u32 module_load_offset;
+
+static __init u64 get_kaslr_seed(void *fdt)
+{
+ int node, len;
+ u64 *prop;
+ u64 ret;
+
+ node = fdt_path_offset(fdt, "/chosen");
+ if (node < 0)
+ return 0;
+
+ prop = fdt_getprop_w(fdt, node, "kaslr-seed", &len);
+ if (!prop || len != sizeof(u64))
+ return 0;
+
+ ret = fdt64_to_cpu(*prop);
+ *prop = 0;
+ return ret;
+}
+
+static __init const u8 *get_cmdline(void *fdt)
+{
+ static const u8 default_cmdline[] = CONFIG_CMDLINE;
+
+ if (!IS_ENABLED(CONFIG_CMDLINE_FORCE)) {
+ int node;
+ const u8 *prop;
+
+ node = fdt_path_offset(fdt, "/chosen");
+ if (node < 0)
+ goto out;
+
+ prop = fdt_getprop(fdt, node, "bootargs", NULL);
+ if (!prop)
+ goto out;
+ return prop;
+ }
+out:
+ return default_cmdline;
+}
+
+static u32 get_kernel_crc(void)
+{
+ u64 stack_start = (u64)&init_thread_union.stack;
+ u64 stack_end = stack_start + sizeof(init_thread_union.stack);
+ u32 crc;
+
+ crc = crc32_le(~0, _text, stack_start - (u64)_text);
+ crc = crc32_le(crc, (void *)stack_end, (u64)_edata - stack_end);
+
+ return crc;
+}
+
+extern void *__init __fixmap_remap_fdt(phys_addr_t dt_phys, int *size,
+ pgprot_t prot);
+
+/*
+ * This routine will be executed with the kernel mapped at its default virtual
+ * address, and if it returns successfully, the kernel will be remapped, and
+ * start_kernel() will be executed from a randomized virtual offset. The
+ * relocation will cause all absolute references (e.g., static variables
+ * containing function pointers) to be reinitialized, and zero-initialized
+ * .bss variables will be reset to 0. However, other .data manipulations will
+ * persist across the change from the default mapping to the randomized mapping,
+ * and thus should not be performed before we have moved the kernel to its final
+ * address. This will be caught by the CRC check, and KASLR will be disabled if
+ * we catch any inadvertent modifications.
+ */
+u64 __init kaslr_early_init(u64 dt_phys)
+{
+ void *fdt;
+ u64 seed, offset, mask, module_range;
+ const u8 *cmdline, *str;
+ int size;
+ u32 crc;
+
+ /*
+ * Record the CRC of the entire [_text, _edata] interval, except the
+ * region we are using for the stack. If we detect any changes made
+ * during the course of this function, we bail. This may seem a bit
+ * drastic, but in this case, we have no way of guaranteeing we won't
+ * corrupt anything by moving the kernel image before reentering it.
+ */
+ crc = get_kernel_crc();
+
+ /*
+ * Try to map the FDT early. If this fails, we simply bail,
+ * and proceed with KASLR disabled. We will make another
+ * attempt at mapping the FDT in setup_machine()
+ */
+ early_fixmap_init();
+ fdt = __fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL);
+ if (!fdt)
+ return 0;
+
+ /*
+ * Retrieve (and wipe) the seed from the FDT
+ */
+ seed = get_kaslr_seed(fdt);
+ if (!seed)
+ return 0;
+
+ /*
+ * Check if 'nokaslr' appears on the command line, and
+ * return 0 if that is the case.
+ */
+ cmdline = get_cmdline(fdt);
+ str = strstr(cmdline, "nokaslr");
+ if (str == cmdline || (str > cmdline && *(str - 1) == ' '))
+ return 0;
+
+ /* check if we made any inadvertent changes to the kernel text */
+ if (crc != get_kernel_crc())
+ return 0;
+
+ /*
+ * OK, so we are proceeding with KASLR enabled. Calculate a suitable
+ * kernel image offset from the seed. Let's place the kernel in the
+ * lower half of the VMALLOC area (VA_BITS - 2).
+ * Even if we could randomize at page granularity for 16k and 64k pages,
+ * let's always round to 2 MB so we don't interfere with the ability to
+ * map using contiguous PTEs
+ */
+ mask = ((1UL << (VA_BITS - 2)) - 1) & ~(SZ_2M - 1);
+ offset = seed & mask;
+
+ /*
+ * The kernel Image should not extend across a 1GB/32MB/512MB alignment
+ * boundary (for 4KB/16KB/64KB granule kernels, respectively). If this
+ * happens, increase the KASLR offset by the size of the kernel image.
+ */
+ if ((((u64)_text + offset) >> SWAPPER_TABLE_SHIFT) !=
+ (((u64)_end + offset) >> SWAPPER_TABLE_SHIFT))
+ offset = (offset + (u64)(_end - _text)) & mask;
+
+ /*
+ * Randomize the module region, by setting module_load_offset to
+ * a PAGE_SIZE multiple in the interval [0, module_range). This
+ * ensures that the resulting region still covers [_stext, _etext],
+ * and that all relative branches can be resolved without veneers.
+ */
+ module_range = MODULES_VSIZE - (u64)(_etext - _stext);
+ module_load_offset = ((module_range * (u16)seed) >> 16) & PAGE_MASK;
+
+ return offset;
+}
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 84113d3e1df1..54702d456680 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -33,8 +33,14 @@
void *module_alloc(unsigned long size)
{
void *p;
+ u64 base = (u64)_etext - MODULES_VSIZE;
- p = __vmalloc_node_range(size, MODULE_ALIGN, MODULES_VADDR, MODULES_END,
+ if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+ extern u32 module_load_offset;
+ base += module_load_offset;
+ }
+
+ p = __vmalloc_node_range(size, MODULE_ALIGN, base, base + MODULES_VSIZE,
GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
NUMA_NO_NODE, __builtin_return_address(0));
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index cfed56f0ad26..42371f69def3 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -388,3 +388,32 @@ static int __init topology_init(void)
return 0;
}
subsys_initcall(topology_init);
+
+/*
+ * Dump out kernel offset information on panic.
+ */
+static int dump_kernel_offset(struct notifier_block *self, unsigned long v,
+ void *p)
+{
+ u64 const kaslr_offset = kimage_vaddr - KIMAGE_VADDR;
+
+ if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_offset > 0) {
+ pr_emerg("Kernel Offset: 0x%llx from 0x%lx\n",
+ kaslr_offset, KIMAGE_VADDR);
+ } else {
+ pr_emerg("Kernel Offset: disabled\n");
+ }
+ return 0;
+}
+
+static struct notifier_block kernel_offset_notifier = {
+ .notifier_call = dump_kernel_offset
+};
+
+static int __init register_kernel_offset_dumper(void)
+{
+ atomic_notifier_chain_register(&panic_notifier_list,
+ &kernel_offset_notifier);
+ return 0;
+}
+__initcall(register_kernel_offset_dumper);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 8dda38378959..5d7e0b801ab7 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -636,7 +636,8 @@ void __init early_fixmap_init(void)
unsigned long addr = FIXADDR_START;
pgd = pgd_offset_k(addr);
- if (CONFIG_PGTABLE_LEVELS > 3 && !pgd_none(*pgd)) {
+ if (CONFIG_PGTABLE_LEVELS > 3 &&
+ !(pgd_none(*pgd) || pgd_page_paddr(*pgd) == __pa(bm_pud))) {
/*
* We only end up here if the kernel mapping and the fixmap
* share the top level pgd entry, which should only happen on
@@ -693,11 +694,10 @@ void __set_fixmap(enum fixed_addresses idx,
}
}
-void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
+void *__init __fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot)
{
const u64 dt_virt_base = __fix_to_virt(FIX_FDT);
- pgprot_t prot = PAGE_KERNEL_RO;
- int size, offset;
+ int offset;
void *dt_virt;
/*
@@ -736,16 +736,29 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
if (fdt_check_header(dt_virt) != 0)
return NULL;
- size = fdt_totalsize(dt_virt);
- if (size > MAX_FDT_SIZE)
+ *size = fdt_totalsize(dt_virt);
+ if (*size > MAX_FDT_SIZE)
return NULL;
- if (offset + size > SWAPPER_BLOCK_SIZE)
- create_mapping(round_down(dt_phys, SWAPPER_BLOCK_SIZE), dt_virt_base,
- round_up(offset + size, SWAPPER_BLOCK_SIZE), prot);
+ if (offset + *size > SWAPPER_BLOCK_SIZE)
+ create_mapping(round_down(dt_phys, SWAPPER_BLOCK_SIZE),
+ dt_virt_base,
+ round_up(offset + *size, SWAPPER_BLOCK_SIZE),
+ prot);
- memblock_reserve(dt_phys, size);
+ return dt_virt;
+}
+void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
+{
+ void *dt_virt;
+ int size;
+
+ dt_virt = __fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL_RO);
+ if (!dt_virt)
+ return NULL;
+
+ memblock_reserve(dt_phys, size);
return dt_virt;
}
--
2.5.0
^ permalink raw reply related [flat|nested] 93+ messages in thread
* [kernel-hardening] [PATCH v4 18/22] arm64: add support for kernel ASLR
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
This adds support for KASLR is implemented, based on entropy provided by
the bootloader in the /chosen/kaslr-seed DT property. Depending on the size
of the address space (VA_BITS) and the page size, the entropy in the
virtual displacement is up to 13 bits (16k/2 levels) and up to 25 bits (all
4 levels), with the sidenote that displacements that result in the kernel
image straddling a 1GB/32MB/512MB alignment boundary (for 4KB/16KB/64KB
granule kernels, respectively) are not allowed, and will be rounded up to
an acceptable value.
The module region is randomized by choosing a page aligned 128 MB region
inside the interval [_etext - 128 MB, _stext + 128 MB). This gives between
10 and 14 bits of entropy (depending on page size), independently of the
kernel randomization, but still guarantees that modules are within the
range of relative branch and jump instructions (with the caveat that, since
the module region is shared with other uses of the vmalloc area, modules
may need to be loaded further away if the module region is exhausted)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/Kconfig | 14 ++
arch/arm64/include/asm/memory.h | 5 +-
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/head.S | 59 ++++++-
arch/arm64/kernel/kaslr.c | 169 ++++++++++++++++++++
arch/arm64/kernel/module.c | 8 +-
arch/arm64/kernel/setup.c | 29 ++++
arch/arm64/mm/mmu.c | 33 ++--
8 files changed, 298 insertions(+), 20 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6aa86f86fd10..d7e31454d421 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -774,6 +774,20 @@ config RELOCATABLE
relocation pass at runtime even if the kernel is loaded at the
same address it was linked at.
+config RANDOMIZE_BASE
+ bool "Randomize the address of the kernel image"
+ select ARM64_MODULE_PLTS
+ select RELOCATABLE
+ help
+ Randomizes the virtual address at which the kernel image is
+ loaded, as a security feature that deters exploit attempts
+ relying on knowledge of the location of kernel internals.
+
+ It is the bootloader's job to provide entropy, by passing a
+ random u64 value in /chosen/kaslr-seed at kernel entry.
+
+ If unsure, say N.
+
endmenu
menu "Boot options"
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 61005e7dd6cb..083361531a61 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -52,7 +52,7 @@
#define KIMAGE_VADDR (MODULES_END)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define MODULES_VADDR (VA_START + KASAN_SHADOW_SIZE)
-#define MODULES_VSIZE (SZ_64M)
+#define MODULES_VSIZE (SZ_128M)
#define PCI_IO_END (PAGE_OFFSET - SZ_2M)
#define PCI_IO_START (PCI_IO_END - PCI_IO_SIZE)
#define FIXADDR_TOP (PCI_IO_START - SZ_2M)
@@ -127,6 +127,9 @@ extern phys_addr_t memstart_addr;
/* PHYS_OFFSET - the physical address of the start of memory. */
#define PHYS_OFFSET ({ memstart_addr; })
+/* the virtual base of the kernel image (minus TEXT_OFFSET) */
+extern u64 kimage_vaddr;
+
/* the offset between the kernel virtual and physical mappings */
extern u64 kimage_voffset;
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index e2f0a755beaa..c9aaecddb941 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -43,6 +43,7 @@ arm64-obj-$(CONFIG_PCI) += pci.o
arm64-obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o
arm64-obj-$(CONFIG_ACPI) += acpi.o
arm64-obj-$(CONFIG_PARAVIRT) += paravirt.o
+arm64-obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
obj-y += $(arm64-obj-y) vdso/
obj-m += $(arm64-obj-m)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 92f9c26632f3..8712a38c3de7 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -210,6 +210,7 @@ section_table:
ENTRY(stext)
bl preserve_boot_args
bl el2_setup // Drop to EL1, w20=cpu_boot_mode
+ mov x23, xzr // KASLR offset, defaults to 0
adrp x24, __PHYS_OFFSET
bl set_cpu_boot_mode_flag
bl __create_page_tables // x25=TTBR0, x26=TTBR1
@@ -313,7 +314,7 @@ ENDPROC(preserve_boot_args)
__create_page_tables:
adrp x25, idmap_pg_dir
adrp x26, swapper_pg_dir
- mov x27, lr
+ mov x28, lr
/*
* Invalidate the idmap and swapper page tables to avoid potential
@@ -392,6 +393,7 @@ __create_page_tables:
*/
mov x0, x26 // swapper_pg_dir
ldr x5, =KIMAGE_VADDR
+ add x5, x5, x23 // add KASLR displacement
create_pgd_entry x0, x5, x3, x6
ldr w6, kernel_img_size
add x6, x6, x5
@@ -408,8 +410,7 @@ __create_page_tables:
dmb sy
bl __inval_cache_range
- mov lr, x27
- ret
+ ret x28
ENDPROC(__create_page_tables)
kernel_img_size:
@@ -421,6 +422,7 @@ kernel_img_size:
*/
.set initial_sp, init_thread_union + THREAD_START_SP
__mmap_switched:
+ mov x28, lr // preserve LR
adr_l x8, vectors // load VBAR_EL1 with virtual
msr vbar_el1, x8 // vector table address
isb
@@ -449,19 +451,26 @@ __mmap_switched:
ldr x13, [x9, #-8]
cmp w12, #R_AARCH64_RELATIVE
b.ne 1f
- str x13, [x11]
+ add x13, x13, x23 // relocate
+ str x13, [x11, x23]
b 0b
1: cmp w12, #R_AARCH64_ABS64
b.ne 0b
add x12, x12, x12, lsl #1 // symtab offset: 24x top word
add x12, x8, x12, lsr #(32 - 3) // ... shifted into bottom word
+ ldrsh w14, [x12, #6] // Elf64_Sym::st_shndx
ldr x15, [x12, #8] // Elf64_Sym::st_value
+ cmp w14, #-0xf // SHN_ABS (0xfff1) ?
+ add x14, x15, x23 // relocate
+ csel x15, x14, x15, ne
add x15, x13, x15
- str x15, [x11]
+ str x15, [x11, x23]
b 0b
-2:
+2: adr_l x8, kimage_vaddr // make relocated kimage_vaddr
+ dc cvac, x8 // value visible to secondaries
+ dsb sy // with MMU off
#endif
adr_l sp, initial_sp, x4
@@ -470,7 +479,7 @@ __mmap_switched:
msr sp_el0, x4 // Save thread_info
str_l x21, __fdt_pointer, x5 // Save FDT pointer
- ldr x4, =KIMAGE_VADDR // Save the offset between
+ ldr_l x4, kimage_vaddr // Save the offset between
sub x4, x4, x24 // the kernel virtual and
str_l x4, kimage_voffset, x5 // physical mappings
@@ -478,6 +487,15 @@ __mmap_switched:
#ifdef CONFIG_KASAN
bl kasan_early_init
#endif
+#ifdef CONFIG_RANDOMIZE_BASE
+ cbnz x23, 0f // already running randomized?
+ mov x0, x21 // pass FDT address in x0
+ bl kaslr_early_init // parse FDT for KASLR options
+ cbz x0, 0f // KASLR disabled? just proceed
+ ret x28 // we must enable KASLR, return
+ // to __enable_mmu()
+0:
+#endif
b start_kernel
ENDPROC(__mmap_switched)
@@ -486,6 +504,10 @@ ENDPROC(__mmap_switched)
* hotplug and needs to have the same protections as the text region
*/
.section ".text","ax"
+
+ENTRY(kimage_vaddr)
+ .quad _text - TEXT_OFFSET
+
/*
* If we're fortunate enough to boot at EL2, ensure that the world is
* sane before dropping to EL1.
@@ -646,7 +668,7 @@ ENTRY(secondary_startup)
adrp x26, swapper_pg_dir
bl __cpu_setup // initialise processor
- ldr x8, =KIMAGE_VADDR
+ ldr x8, kimage_vaddr
ldr w9, 0f
sub x27, x8, w9, sxtw // address to jump to after enabling the MMU
b __enable_mmu
@@ -679,6 +701,7 @@ ENDPROC(__secondary_switched)
*/
.section ".idmap.text", "ax"
__enable_mmu:
+ mrs x18, sctlr_el1 // preserve old SCTLR_EL1 value
mrs x1, ID_AA64MMFR0_EL1
ubfx x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
cmp x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
@@ -696,6 +719,26 @@ __enable_mmu:
ic iallu
dsb nsh
isb
+#ifdef CONFIG_RANDOMIZE_BASE
+ mov x19, x0 // preserve new SCTLR_EL1 value
+ blr x27
+
+ /*
+ * If we return here, we have a KASLR displacement in x0 which we need
+ * to record and take into account by discarding the current kernel
+ * mapping and creating a new one.
+ */
+ mov x23, x0 // record the KASLR offset
+ msr sctlr_el1, x18 // disable the MMU
+ isb
+ bl __create_page_tables // recreate kernel mapping
+
+ msr sctlr_el1, x19 // re-enable the MMU
+ isb
+ ic ialluis // flush instructions fetched
+ isb // via old mapping
+ add x27, x27, x23 // relocated __mmap_switched
+#endif
br x27
ENDPROC(__enable_mmu)
diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c
new file mode 100644
index 000000000000..9ddb01f65a1a
--- /dev/null
+++ b/arch/arm64/kernel/kaslr.c
@@ -0,0 +1,169 @@
+/*
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/crc32.h>
+#include <linux/init.h>
+#include <linux/libfdt.h>
+#include <linux/mm_types.h>
+#include <linux/sched.h>
+#include <linux/types.h>
+
+#include <asm/fixmap.h>
+#include <asm/kernel-pgtable.h>
+#include <asm/memory.h>
+#include <asm/mmu.h>
+#include <asm/pgtable.h>
+#include <asm/sections.h>
+
+u32 module_load_offset;
+
+static __init u64 get_kaslr_seed(void *fdt)
+{
+ int node, len;
+ u64 *prop;
+ u64 ret;
+
+ node = fdt_path_offset(fdt, "/chosen");
+ if (node < 0)
+ return 0;
+
+ prop = fdt_getprop_w(fdt, node, "kaslr-seed", &len);
+ if (!prop || len != sizeof(u64))
+ return 0;
+
+ ret = fdt64_to_cpu(*prop);
+ *prop = 0;
+ return ret;
+}
+
+static __init const u8 *get_cmdline(void *fdt)
+{
+ static const u8 default_cmdline[] = CONFIG_CMDLINE;
+
+ if (!IS_ENABLED(CONFIG_CMDLINE_FORCE)) {
+ int node;
+ const u8 *prop;
+
+ node = fdt_path_offset(fdt, "/chosen");
+ if (node < 0)
+ goto out;
+
+ prop = fdt_getprop(fdt, node, "bootargs", NULL);
+ if (!prop)
+ goto out;
+ return prop;
+ }
+out:
+ return default_cmdline;
+}
+
+static u32 get_kernel_crc(void)
+{
+ u64 stack_start = (u64)&init_thread_union.stack;
+ u64 stack_end = stack_start + sizeof(init_thread_union.stack);
+ u32 crc;
+
+ crc = crc32_le(~0, _text, stack_start - (u64)_text);
+ crc = crc32_le(crc, (void *)stack_end, (u64)_edata - stack_end);
+
+ return crc;
+}
+
+extern void *__init __fixmap_remap_fdt(phys_addr_t dt_phys, int *size,
+ pgprot_t prot);
+
+/*
+ * This routine will be executed with the kernel mapped at its default virtual
+ * address, and if it returns successfully, the kernel will be remapped, and
+ * start_kernel() will be executed from a randomized virtual offset. The
+ * relocation will result in all absolute references (e.g., static variables
+ * containing function pointers) to be reinitialized, and zero-initialized
+ * .bss variables will be reset to 0. However, other .data manipulations will
+ * persist across the change from the default mapping to the randomized mapping,
+ * and thus should not be performed before we have moved the kernel to its final
+ * address. This will be caught by the CRC check, and KASLR will be disabled if
+ * we catch any inadvertent modifications.
+ */
+u64 __init kaslr_early_init(u64 dt_phys)
+{
+ void *fdt;
+ u64 seed, offset, mask, module_range;
+ const u8 *cmdline, *str;
+ int size;
+ u32 crc;
+
+ /*
+ * Record the CRC of the entire [_text, _edata] interval, except the
+ * region we are using for the stack. If we detect any changes made
+ * during the course of this function, we bail. This may seem a bit
+ * drastic, but in this case , we have no way of guaranteeing we won't
+ * corrupt anything by moving the kernel image before reentering it.
+ */
+ crc = get_kernel_crc();
+
+ /*
+ * Try to map the FDT early. If this fails, we simply bail,
+ * and proceed with KASLR disabled. We will make another
+ * attempt at mapping the FDT in setup_machine()
+ */
+ early_fixmap_init();
+ fdt = __fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL);
+ if (!fdt)
+ return 0;
+
+ /*
+ * Retrieve (and wipe) the seed from the FDT
+ */
+ seed = get_kaslr_seed(fdt);
+ if (!seed)
+ return 0;
+
+ /*
+ * Check if 'nokaslr' appears on the command line, and
+ * return 0 if that is the case.
+ */
+ cmdline = get_cmdline(fdt);
+ str = strstr(cmdline, "nokaslr");
+ if (str == cmdline || (str > cmdline && *(str - 1) == ' '))
+ return 0;
+
+ /* check if we made any inadvertent changes to the kernel text */
+ if (crc != get_kernel_crc())
+ return 0;
+
+ /*
+ * OK, so we are proceeding with KASLR enabled. Calculate a suitable
+ * kernel image offset from the seed. Let's place the kernel in the
+ * lower half of the VMALLOC area (VA_BITS - 2).
+ * Even if we could randomize at page granularity for 16k and 64k pages,
+ * let's always round to 2 MB so we don't interfere with the ability to
+ * map using contiguous PTEs
+ */
+ mask = ((1UL << (VA_BITS - 2)) - 1) & ~(SZ_2M - 1);
+ offset = seed & mask;
+
+ /*
+ * The kernel Image should not extend across a 1GB/32MB/512MB alignment
+ * boundary (for 4KB/16KB/64KB granule kernels, respectively). If this
+ * happens, increase the KASLR offset by the size of the kernel image.
+ */
+ if ((((u64)_text + offset) >> SWAPPER_TABLE_SHIFT) !=
+ (((u64)_end + offset) >> SWAPPER_TABLE_SHIFT))
+ offset = (offset + (u64)(_end - _text)) & mask;
+
+ /*
+ * Randomize the module region, by setting module_load_offset to
+ * a PAGE_SIZE multiple in the interval [0, module_range). This
+ * ensures that the resulting region still covers [_stext, _etext],
+ * and that all relative branches can be resolved without veneers.
+ */
+ module_range = MODULES_VSIZE - (u64)(_etext - _stext);
+ module_load_offset = ((module_range * (u16)seed) >> 16) & PAGE_MASK;
+
+ return offset;
+}
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 84113d3e1df1..54702d456680 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -33,8 +33,14 @@
void *module_alloc(unsigned long size)
{
void *p;
+ u64 base = (u64)_etext - MODULES_VSIZE;
- p = __vmalloc_node_range(size, MODULE_ALIGN, MODULES_VADDR, MODULES_END,
+ if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+ extern u32 module_load_offset;
+ base += module_load_offset;
+ }
+
+ p = __vmalloc_node_range(size, MODULE_ALIGN, base, base + MODULES_VSIZE,
GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
NUMA_NO_NODE, __builtin_return_address(0));
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index cfed56f0ad26..42371f69def3 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -388,3 +388,32 @@ static int __init topology_init(void)
return 0;
}
subsys_initcall(topology_init);
+
+/*
+ * Dump out kernel offset information on panic.
+ */
+static int dump_kernel_offset(struct notifier_block *self, unsigned long v,
+ void *p)
+{
+ u64 const kaslr_offset = kimage_vaddr - KIMAGE_VADDR;
+
+ if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_offset > 0) {
+ pr_emerg("Kernel Offset: 0x%llx from 0x%lx\n",
+ kaslr_offset, KIMAGE_VADDR);
+ } else {
+ pr_emerg("Kernel Offset: disabled\n");
+ }
+ return 0;
+}
+
+static struct notifier_block kernel_offset_notifier = {
+ .notifier_call = dump_kernel_offset
+};
+
+static int __init register_kernel_offset_dumper(void)
+{
+ atomic_notifier_chain_register(&panic_notifier_list,
+ &kernel_offset_notifier);
+ return 0;
+}
+__initcall(register_kernel_offset_dumper);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 8dda38378959..5d7e0b801ab7 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -636,7 +636,8 @@ void __init early_fixmap_init(void)
unsigned long addr = FIXADDR_START;
pgd = pgd_offset_k(addr);
- if (CONFIG_PGTABLE_LEVELS > 3 && !pgd_none(*pgd)) {
+ if (CONFIG_PGTABLE_LEVELS > 3 &&
+ !(pgd_none(*pgd) || pgd_page_paddr(*pgd) == __pa(bm_pud))) {
/*
* We only end up here if the kernel mapping and the fixmap
* share the top level pgd entry, which should only happen on
@@ -693,11 +694,10 @@ void __set_fixmap(enum fixed_addresses idx,
}
}
-void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
+void *__init __fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot)
{
const u64 dt_virt_base = __fix_to_virt(FIX_FDT);
- pgprot_t prot = PAGE_KERNEL_RO;
- int size, offset;
+ int offset;
void *dt_virt;
/*
@@ -736,16 +736,29 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
if (fdt_check_header(dt_virt) != 0)
return NULL;
- size = fdt_totalsize(dt_virt);
- if (size > MAX_FDT_SIZE)
+ *size = fdt_totalsize(dt_virt);
+ if (*size > MAX_FDT_SIZE)
return NULL;
- if (offset + size > SWAPPER_BLOCK_SIZE)
- create_mapping(round_down(dt_phys, SWAPPER_BLOCK_SIZE), dt_virt_base,
- round_up(offset + size, SWAPPER_BLOCK_SIZE), prot);
+ if (offset + *size > SWAPPER_BLOCK_SIZE)
+ create_mapping(round_down(dt_phys, SWAPPER_BLOCK_SIZE),
+ dt_virt_base,
+ round_up(offset + *size, SWAPPER_BLOCK_SIZE),
+ prot);
- memblock_reserve(dt_phys, size);
+ return dt_virt;
+}
+void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
+{
+ void *dt_virt;
+ int size;
+
+ dt_virt = __fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL_RO);
+ if (!dt_virt)
+ return NULL;
+
+ memblock_reserve(dt_phys, size);
return dt_virt;
}
--
2.5.0
^ permalink raw reply related [flat|nested] 93+ messages in thread
* [PATCH v4 18/22] arm64: add support for kernel ASLR
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel
This adds support for KASLR is implemented, based on entropy provided by
the bootloader in the /chosen/kaslr-seed DT property. Depending on the size
of the address space (VA_BITS) and the page size, the entropy in the
virtual displacement is up to 13 bits (16k/2 levels) and up to 25 bits (all
4 levels), with the sidenote that displacements that result in the kernel
image straddling a 1GB/32MB/512MB alignment boundary (for 4KB/16KB/64KB
granule kernels, respectively) are not allowed, and will be rounded up to
an acceptable value.
The module region is randomized by choosing a page aligned 128 MB region
inside the interval [_etext - 128 MB, _stext + 128 MB). This gives between
10 and 14 bits of entropy (depending on page size), independently of the
kernel randomization, but still guarantees that modules are within the
range of relative branch and jump instructions (with the caveat that, since
the module region is shared with other uses of the vmalloc area, modules
may need to be loaded further away if the module region is exhausted)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/Kconfig | 14 ++
arch/arm64/include/asm/memory.h | 5 +-
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/head.S | 59 ++++++-
arch/arm64/kernel/kaslr.c | 169 ++++++++++++++++++++
arch/arm64/kernel/module.c | 8 +-
arch/arm64/kernel/setup.c | 29 ++++
arch/arm64/mm/mmu.c | 33 ++--
8 files changed, 298 insertions(+), 20 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6aa86f86fd10..d7e31454d421 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -774,6 +774,20 @@ config RELOCATABLE
relocation pass at runtime even if the kernel is loaded at the
same address it was linked at.
+config RANDOMIZE_BASE
+ bool "Randomize the address of the kernel image"
+ select ARM64_MODULE_PLTS
+ select RELOCATABLE
+ help
+ Randomizes the virtual address at which the kernel image is
+ loaded, as a security feature that deters exploit attempts
+ relying on knowledge of the location of kernel internals.
+
+ It is the bootloader's job to provide entropy, by passing a
+ random u64 value in /chosen/kaslr-seed at kernel entry.
+
+ If unsure, say N.
+
endmenu
menu "Boot options"
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 61005e7dd6cb..083361531a61 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -52,7 +52,7 @@
#define KIMAGE_VADDR (MODULES_END)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
#define MODULES_VADDR (VA_START + KASAN_SHADOW_SIZE)
-#define MODULES_VSIZE (SZ_64M)
+#define MODULES_VSIZE (SZ_128M)
#define PCI_IO_END (PAGE_OFFSET - SZ_2M)
#define PCI_IO_START (PCI_IO_END - PCI_IO_SIZE)
#define FIXADDR_TOP (PCI_IO_START - SZ_2M)
@@ -127,6 +127,9 @@ extern phys_addr_t memstart_addr;
/* PHYS_OFFSET - the physical address of the start of memory. */
#define PHYS_OFFSET ({ memstart_addr; })
+/* the virtual base of the kernel image (minus TEXT_OFFSET) */
+extern u64 kimage_vaddr;
+
/* the offset between the kernel virtual and physical mappings */
extern u64 kimage_voffset;
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index e2f0a755beaa..c9aaecddb941 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -43,6 +43,7 @@ arm64-obj-$(CONFIG_PCI) += pci.o
arm64-obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o
arm64-obj-$(CONFIG_ACPI) += acpi.o
arm64-obj-$(CONFIG_PARAVIRT) += paravirt.o
+arm64-obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
obj-y += $(arm64-obj-y) vdso/
obj-m += $(arm64-obj-m)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 92f9c26632f3..8712a38c3de7 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -210,6 +210,7 @@ section_table:
ENTRY(stext)
bl preserve_boot_args
bl el2_setup // Drop to EL1, w20=cpu_boot_mode
+ mov x23, xzr // KASLR offset, defaults to 0
adrp x24, __PHYS_OFFSET
bl set_cpu_boot_mode_flag
bl __create_page_tables // x25=TTBR0, x26=TTBR1
@@ -313,7 +314,7 @@ ENDPROC(preserve_boot_args)
__create_page_tables:
adrp x25, idmap_pg_dir
adrp x26, swapper_pg_dir
- mov x27, lr
+ mov x28, lr
/*
* Invalidate the idmap and swapper page tables to avoid potential
@@ -392,6 +393,7 @@ __create_page_tables:
*/
mov x0, x26 // swapper_pg_dir
ldr x5, =KIMAGE_VADDR
+ add x5, x5, x23 // add KASLR displacement
create_pgd_entry x0, x5, x3, x6
ldr w6, kernel_img_size
add x6, x6, x5
@@ -408,8 +410,7 @@ __create_page_tables:
dmb sy
bl __inval_cache_range
- mov lr, x27
- ret
+ ret x28
ENDPROC(__create_page_tables)
kernel_img_size:
@@ -421,6 +422,7 @@ kernel_img_size:
*/
.set initial_sp, init_thread_union + THREAD_START_SP
__mmap_switched:
+ mov x28, lr // preserve LR
adr_l x8, vectors // load VBAR_EL1 with virtual
msr vbar_el1, x8 // vector table address
isb
@@ -449,19 +451,26 @@ __mmap_switched:
ldr x13, [x9, #-8]
cmp w12, #R_AARCH64_RELATIVE
b.ne 1f
- str x13, [x11]
+ add x13, x13, x23 // relocate
+ str x13, [x11, x23]
b 0b
1: cmp w12, #R_AARCH64_ABS64
b.ne 0b
add x12, x12, x12, lsl #1 // symtab offset: 24x top word
add x12, x8, x12, lsr #(32 - 3) // ... shifted into bottom word
+ ldrsh w14, [x12, #6] // Elf64_Sym::st_shndx
ldr x15, [x12, #8] // Elf64_Sym::st_value
+ cmp w14, #-0xf // SHN_ABS (0xfff1) ?
+ add x14, x15, x23 // relocate
+ csel x15, x14, x15, ne
add x15, x13, x15
- str x15, [x11]
+ str x15, [x11, x23]
b 0b
-2:
+2: adr_l x8, kimage_vaddr // make relocated kimage_vaddr
+ dc cvac, x8 // value visible to secondaries
+ dsb sy // with MMU off
#endif
adr_l sp, initial_sp, x4
@@ -470,7 +479,7 @@ __mmap_switched:
msr sp_el0, x4 // Save thread_info
str_l x21, __fdt_pointer, x5 // Save FDT pointer
- ldr x4, =KIMAGE_VADDR // Save the offset between
+ ldr_l x4, kimage_vaddr // Save the offset between
sub x4, x4, x24 // the kernel virtual and
str_l x4, kimage_voffset, x5 // physical mappings
@@ -478,6 +487,15 @@ __mmap_switched:
#ifdef CONFIG_KASAN
bl kasan_early_init
#endif
+#ifdef CONFIG_RANDOMIZE_BASE
+ cbnz x23, 0f // already running randomized?
+ mov x0, x21 // pass FDT address in x0
+ bl kaslr_early_init // parse FDT for KASLR options
+ cbz x0, 0f // KASLR disabled? just proceed
+ ret x28 // we must enable KASLR, return
+ // to __enable_mmu()
+0:
+#endif
b start_kernel
ENDPROC(__mmap_switched)
@@ -486,6 +504,10 @@ ENDPROC(__mmap_switched)
* hotplug and needs to have the same protections as the text region
*/
.section ".text","ax"
+
+ENTRY(kimage_vaddr)
+ .quad _text - TEXT_OFFSET
+
/*
* If we're fortunate enough to boot at EL2, ensure that the world is
* sane before dropping to EL1.
@@ -646,7 +668,7 @@ ENTRY(secondary_startup)
adrp x26, swapper_pg_dir
bl __cpu_setup // initialise processor
- ldr x8, =KIMAGE_VADDR
+ ldr x8, kimage_vaddr
ldr w9, 0f
sub x27, x8, w9, sxtw // address to jump to after enabling the MMU
b __enable_mmu
@@ -679,6 +701,7 @@ ENDPROC(__secondary_switched)
*/
.section ".idmap.text", "ax"
__enable_mmu:
+ mrs x18, sctlr_el1 // preserve old SCTLR_EL1 value
mrs x1, ID_AA64MMFR0_EL1
ubfx x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
cmp x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
@@ -696,6 +719,26 @@ __enable_mmu:
ic iallu
dsb nsh
isb
+#ifdef CONFIG_RANDOMIZE_BASE
+ mov x19, x0 // preserve new SCTLR_EL1 value
+ blr x27
+
+ /*
+ * If we return here, we have a KASLR displacement in x0 which we need
+ * to record and take into account by discarding the current kernel
+ * mapping and creating a new one.
+ */
+ mov x23, x0 // record the KASLR offset
+ msr sctlr_el1, x18 // disable the MMU
+ isb
+ bl __create_page_tables // recreate kernel mapping
+
+ msr sctlr_el1, x19 // re-enable the MMU
+ isb
+ ic ialluis // flush instructions fetched
+ isb // via old mapping
+ add x27, x27, x23 // relocated __mmap_switched
+#endif
br x27
ENDPROC(__enable_mmu)
diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c
new file mode 100644
index 000000000000..9ddb01f65a1a
--- /dev/null
+++ b/arch/arm64/kernel/kaslr.c
@@ -0,0 +1,169 @@
+/*
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/crc32.h>
+#include <linux/init.h>
+#include <linux/libfdt.h>
+#include <linux/mm_types.h>
+#include <linux/sched.h>
+#include <linux/types.h>
+
+#include <asm/fixmap.h>
+#include <asm/kernel-pgtable.h>
+#include <asm/memory.h>
+#include <asm/mmu.h>
+#include <asm/pgtable.h>
+#include <asm/sections.h>
+
+u32 module_load_offset;
+
+static __init u64 get_kaslr_seed(void *fdt)
+{
+ int node, len;
+ u64 *prop;
+ u64 ret;
+
+ node = fdt_path_offset(fdt, "/chosen");
+ if (node < 0)
+ return 0;
+
+ prop = fdt_getprop_w(fdt, node, "kaslr-seed", &len);
+ if (!prop || len != sizeof(u64))
+ return 0;
+
+ ret = fdt64_to_cpu(*prop);
+ *prop = 0;
+ return ret;
+}
+
+static __init const u8 *get_cmdline(void *fdt)
+{
+ static const u8 default_cmdline[] = CONFIG_CMDLINE;
+
+ if (!IS_ENABLED(CONFIG_CMDLINE_FORCE)) {
+ int node;
+ const u8 *prop;
+
+ node = fdt_path_offset(fdt, "/chosen");
+ if (node < 0)
+ goto out;
+
+ prop = fdt_getprop(fdt, node, "bootargs", NULL);
+ if (!prop)
+ goto out;
+ return prop;
+ }
+out:
+ return default_cmdline;
+}
+
+static u32 get_kernel_crc(void)
+{
+ u64 stack_start = (u64)&init_thread_union.stack;
+ u64 stack_end = stack_start + sizeof(init_thread_union.stack);
+ u32 crc;
+
+ crc = crc32_le(~0, _text, stack_start - (u64)_text);
+ crc = crc32_le(crc, (void *)stack_end, (u64)_edata - stack_end);
+
+ return crc;
+}
+
+extern void *__init __fixmap_remap_fdt(phys_addr_t dt_phys, int *size,
+ pgprot_t prot);
+
+/*
+ * This routine will be executed with the kernel mapped at its default virtual
+ * address, and if it returns successfully, the kernel will be remapped, and
+ * start_kernel() will be executed from a randomized virtual offset. The
+ * relocation will result in all absolute references (e.g., static variables
+ * containing function pointers) to be reinitialized, and zero-initialized
+ * .bss variables will be reset to 0. However, other .data manipulations will
+ * persist across the change from the default mapping to the randomized mapping,
+ * and thus should not be performed before we have moved the kernel to its final
+ * address. This will be caught by the CRC check, and KASLR will be disabled if
+ * we catch any inadvertent modifications.
+ */
+u64 __init kaslr_early_init(u64 dt_phys)
+{
+ void *fdt;
+ u64 seed, offset, mask, module_range;
+ const u8 *cmdline, *str;
+ int size;
+ u32 crc;
+
+ /*
+ * Record the CRC of the entire [_text, _edata] interval, except the
+ * region we are using for the stack. If we detect any changes made
+ * during the course of this function, we bail. This may seem a bit
+ * drastic, but in this case , we have no way of guaranteeing we won't
+ * corrupt anything by moving the kernel image before reentering it.
+ */
+ crc = get_kernel_crc();
+
+ /*
+ * Try to map the FDT early. If this fails, we simply bail,
+ * and proceed with KASLR disabled. We will make another
+ * attempt@mapping the FDT in setup_machine()
+ */
+ early_fixmap_init();
+ fdt = __fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL);
+ if (!fdt)
+ return 0;
+
+ /*
+ * Retrieve (and wipe) the seed from the FDT
+ */
+ seed = get_kaslr_seed(fdt);
+ if (!seed)
+ return 0;
+
+ /*
+ * Check if 'nokaslr' appears on the command line, and
+ * return 0 if that is the case.
+ */
+ cmdline = get_cmdline(fdt);
+ str = strstr(cmdline, "nokaslr");
+ if (str == cmdline || (str > cmdline && *(str - 1) == ' '))
+ return 0;
+
+ /* check if we made any inadvertent changes to the kernel text */
+ if (crc != get_kernel_crc())
+ return 0;
+
+ /*
+ * OK, so we are proceeding with KASLR enabled. Calculate a suitable
+ * kernel image offset from the seed. Let's place the kernel in the
+ * lower half of the VMALLOC area (VA_BITS - 2).
+ * Even if we could randomize at page granularity for 16k and 64k pages,
+ * let's always round to 2 MB so we don't interfere with the ability to
+ * map using contiguous PTEs
+ */
+ mask = ((1UL << (VA_BITS - 2)) - 1) & ~(SZ_2M - 1);
+ offset = seed & mask;
+
+ /*
+ * The kernel Image should not extend across a 1GB/32MB/512MB alignment
+ * boundary (for 4KB/16KB/64KB granule kernels, respectively). If this
+ * happens, increase the KASLR offset by the size of the kernel image.
+ */
+ if ((((u64)_text + offset) >> SWAPPER_TABLE_SHIFT) !=
+ (((u64)_end + offset) >> SWAPPER_TABLE_SHIFT))
+ offset = (offset + (u64)(_end - _text)) & mask;
+
+ /*
+ * Randomize the module region, by setting module_load_offset to
+ * a PAGE_SIZE multiple in the interval [0, module_range). This
+ * ensures that the resulting region still covers [_stext, _etext],
+ * and that all relative branches can be resolved without veneers.
+ */
+ module_range = MODULES_VSIZE - (u64)(_etext - _stext);
+ module_load_offset = ((module_range * (u16)seed) >> 16) & PAGE_MASK;
+
+ return offset;
+}
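To make the offset computation above concrete, here is a minimal standalone C sketch of the same arithmetic; it is an illustration, not part of the patch, and VA_BITS = 48 and the example seed are assumptions:
#include <stdio.h>
#include <stdint.h>

#define VA_BITS	48
#define SZ_2M	0x200000ULL

int main(void)
{
	/* Same mask construction as kaslr_early_init(): keep bits
	 * [21, VA_BITS - 3], i.e. 2 MB-aligned offsets within the
	 * lower half of the vmalloc area. */
	uint64_t mask = ((1ULL << (VA_BITS - 2)) - 1) & ~(SZ_2M - 1);
	uint64_t seed = 0x123456789abcdef0ULL;	/* example seed */
	uint64_t offset = seed & mask;

	/* For VA_BITS = 48 this leaves 2^25 possible offsets. */
	printf("mask   = 0x%016llx\n", (unsigned long long)mask);
	printf("offset = 0x%016llx\n", (unsigned long long)offset);
	return 0;
}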
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 84113d3e1df1..54702d456680 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -33,8 +33,14 @@
void *module_alloc(unsigned long size)
{
void *p;
+ u64 base = (u64)_etext - MODULES_VSIZE;
- p = __vmalloc_node_range(size, MODULE_ALIGN, MODULES_VADDR, MODULES_END,
+ if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+ extern u32 module_load_offset;
+ base += module_load_offset;
+ }
+
+ p = __vmalloc_node_range(size, MODULE_ALIGN, base, base + MODULES_VSIZE,
GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
NUMA_NO_NODE, __builtin_return_address(0));
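The randomized base above depends on module_load_offset, which kaslr_early_init() derives with a simple fixed-point trick: multiplying a range by the low 16 bits of the seed and shifting right by 16 scales the seed uniformly onto [0, range). A standalone sketch of that construction, with an invented range value:
#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE	4096ULL

/* Scale the low 16 bits of the seed onto [0, range), then round the
 * result down to a page multiple, as done for module_load_offset. */
static uint64_t scale_seed(uint64_t range, uint64_t seed)
{
	return ((range * (uint16_t)seed) >> 16) & ~(PAGE_SIZE - 1);
}

int main(void)
{
	uint64_t module_range = 0x4000000ULL;	/* invented: 64 MB of slack */

	printf("%#llx\n", (unsigned long long)scale_seed(module_range, 0x0000));
	printf("%#llx\n", (unsigned long long)scale_seed(module_range, 0xffff));
	return 0;
}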
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index cfed56f0ad26..42371f69def3 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -388,3 +388,32 @@ static int __init topology_init(void)
return 0;
}
subsys_initcall(topology_init);
+
+/*
+ * Dump out kernel offset information on panic.
+ */
+static int dump_kernel_offset(struct notifier_block *self, unsigned long v,
+ void *p)
+{
+ u64 const kaslr_offset = kimage_vaddr - KIMAGE_VADDR;
+
+ if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_offset > 0) {
+ pr_emerg("Kernel Offset: 0x%llx from 0x%lx\n",
+ kaslr_offset, KIMAGE_VADDR);
+ } else {
+ pr_emerg("Kernel Offset: disabled\n");
+ }
+ return 0;
+}
+
+static struct notifier_block kernel_offset_notifier = {
+ .notifier_call = dump_kernel_offset
+};
+
+static int __init register_kernel_offset_dumper(void)
+{
+ atomic_notifier_chain_register(&panic_notifier_list,
+ &kernel_offset_notifier);
+ return 0;
+}
+__initcall(register_kernel_offset_dumper);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 8dda38378959..5d7e0b801ab7 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -636,7 +636,8 @@ void __init early_fixmap_init(void)
unsigned long addr = FIXADDR_START;
pgd = pgd_offset_k(addr);
- if (CONFIG_PGTABLE_LEVELS > 3 && !pgd_none(*pgd)) {
+ if (CONFIG_PGTABLE_LEVELS > 3 &&
+ !(pgd_none(*pgd) || pgd_page_paddr(*pgd) == __pa(bm_pud))) {
/*
* We only end up here if the kernel mapping and the fixmap
* share the top level pgd entry, which should only happen on
@@ -693,11 +694,10 @@ void __set_fixmap(enum fixed_addresses idx,
}
}
-void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
+void *__init __fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot)
{
const u64 dt_virt_base = __fix_to_virt(FIX_FDT);
- pgprot_t prot = PAGE_KERNEL_RO;
- int size, offset;
+ int offset;
void *dt_virt;
/*
@@ -736,16 +736,29 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
if (fdt_check_header(dt_virt) != 0)
return NULL;
- size = fdt_totalsize(dt_virt);
- if (size > MAX_FDT_SIZE)
+ *size = fdt_totalsize(dt_virt);
+ if (*size > MAX_FDT_SIZE)
return NULL;
- if (offset + size > SWAPPER_BLOCK_SIZE)
- create_mapping(round_down(dt_phys, SWAPPER_BLOCK_SIZE), dt_virt_base,
- round_up(offset + size, SWAPPER_BLOCK_SIZE), prot);
+ if (offset + *size > SWAPPER_BLOCK_SIZE)
+ create_mapping(round_down(dt_phys, SWAPPER_BLOCK_SIZE),
+ dt_virt_base,
+ round_up(offset + *size, SWAPPER_BLOCK_SIZE),
+ prot);
- memblock_reserve(dt_phys, size);
+ return dt_virt;
+}
+void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
+{
+ void *dt_virt;
+ int size;
+
+ dt_virt = __fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL_RO);
+ if (!dt_virt)
+ return NULL;
+
+ memblock_reserve(dt_phys, size);
return dt_virt;
}
--
2.5.0
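As an illustration of the checksum-with-a-hole technique used by get_kernel_crc() above, the same chaining can be sketched in userspace with zlib. The buffer and hole position are invented, and zlib's crc32() conditions its result differently from the kernel's crc32_le(~0, ...), but the structure is identical:
#include <stdio.h>
#include <string.h>
#include <zlib.h>	/* link with -lz */

/* CRC a buffer while skipping [hole, hole + hole_len), by chaining
 * two crc32() calls, just as get_kernel_crc() chains crc32_le(). */
static unsigned long crc_with_hole(const unsigned char *buf, size_t len,
				   size_t hole, size_t hole_len)
{
	unsigned long crc = crc32(0L, Z_NULL, 0);

	crc = crc32(crc, buf, hole);
	crc = crc32(crc, buf + hole + hole_len, len - hole - hole_len);
	return crc;
}

int main(void)
{
	unsigned char image[256];

	memset(image, 0xa5, sizeof(image));
	/* Skip a 32-byte "stack" hole at offset 64. */
	printf("%08lx\n", crc_with_hole(image, sizeof(image), 64, 32));
	return 0;
}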
* [PATCH v4 19/22] efi: stub: implement efi_get_random_bytes() based on EFI_RNG_PROTOCOL
@ 2016-01-26 17:10 ` Ard Biesheuvel
-1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
This exposes the firmware's implementation of EFI_RNG_PROTOCOL via a new
function efi_get_random_bytes().
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
drivers/firmware/efi/libstub/Makefile | 2 +-
drivers/firmware/efi/libstub/efistub.h | 3 ++
drivers/firmware/efi/libstub/random.c | 35 ++++++++++++++++++++
include/linux/efi.h | 5 ++-
4 files changed, 43 insertions(+), 2 deletions(-)
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index aaf9c0bab42e..ad077944aa0e 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -36,7 +36,7 @@ lib-$(CONFIG_EFI_ARMSTUB) += arm-stub.o fdt.o string.o \
$(patsubst %.c,lib-%.o,$(arm-deps))
lib-$(CONFIG_ARM) += arm32-stub.o
-lib-$(CONFIG_ARM64) += arm64-stub.o
+lib-$(CONFIG_ARM64) += arm64-stub.o random.o
CFLAGS_arm64-stub.o := -DTEXT_OFFSET=$(TEXT_OFFSET)
#
diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/libstub/efistub.h
index 6b6548fda089..206b7252b9d1 100644
--- a/drivers/firmware/efi/libstub/efistub.h
+++ b/drivers/firmware/efi/libstub/efistub.h
@@ -43,4 +43,7 @@ void efi_get_virtmap(efi_memory_desc_t *memory_map, unsigned long map_size,
unsigned long desc_size, efi_memory_desc_t *runtime_map,
int *count);
+efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table,
+ unsigned long size, u8 *out);
+
#endif
diff --git a/drivers/firmware/efi/libstub/random.c b/drivers/firmware/efi/libstub/random.c
new file mode 100644
index 000000000000..97941ee5954f
--- /dev/null
+++ b/drivers/firmware/efi/libstub/random.c
@@ -0,0 +1,35 @@
+/*
+ * Copyright (C) 2016 Linaro Ltd; <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include <linux/efi.h>
+#include <asm/efi.h>
+
+#include "efistub.h"
+
+struct efi_rng_protocol {
+ efi_status_t (*get_info)(struct efi_rng_protocol *,
+ unsigned long *, efi_guid_t *);
+ efi_status_t (*get_rng)(struct efi_rng_protocol *,
+ efi_guid_t *, unsigned long, u8 *out);
+};
+
+efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table_arg,
+ unsigned long size, u8 *out)
+{
+ efi_guid_t rng_proto = EFI_RNG_PROTOCOL_GUID;
+ efi_status_t status;
+ struct efi_rng_protocol *rng;
+
+ status = efi_call_early(locate_protocol, &rng_proto, NULL,
+ (void **)&rng);
+ if (status != EFI_SUCCESS)
+ return status;
+
+ return rng->get_rng(rng, NULL, size, out);
+}
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 569b5a866bb1..13783fdc9bdd 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -299,7 +299,7 @@ typedef struct {
void *open_protocol_information;
void *protocols_per_handle;
void *locate_handle_buffer;
- void *locate_protocol;
+ efi_status_t (*locate_protocol)(efi_guid_t *, void *, void **);
void *install_multiple_protocol_interfaces;
void *uninstall_multiple_protocol_interfaces;
void *calculate_crc32;
@@ -599,6 +599,9 @@ void efi_native_runtime_setup(void);
#define EFI_PROPERTIES_TABLE_GUID \
EFI_GUID( 0x880aaca3, 0x4adc, 0x4a04, 0x90, 0x79, 0xb7, 0x47, 0x34, 0x08, 0x25, 0xe5 )
+#define EFI_RNG_PROTOCOL_GUID \
+ EFI_GUID( 0x3152bca5, 0xeade, 0x433d, 0x86, 0x2e, 0xc0, 0x1c, 0xdc, 0x29, 0x1f, 0x44 )
+
typedef struct {
efi_guid_t guid;
u64 table;
--
2.5.0
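As a usage illustration only, a hypothetical stub-side caller could fill a 64-bit seed as follows; the function name and the fall-back-to-zero behaviour are assumptions rather than part of this patch:
/* Hypothetical caller inside the EFI stub. */
static u64 stub_get_seed(efi_system_table_t *sys_table)
{
	u64 seed = 0;
	efi_status_t status;

	/* Ask the firmware's RNG for sizeof(seed) bytes. */
	status = efi_get_random_bytes(sys_table, sizeof(seed), (u8 *)&seed);
	if (status != EFI_SUCCESS)
		return 0;	/* no EFI_RNG_PROTOCOL: proceed without KASLR */

	return seed;
}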
* [PATCH v4 20/22] efi: stub: add implementation of efi_random_alloc()
@ 2016-01-26 17:10 ` Ard Biesheuvel
-1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
This implements efi_random_alloc(), which allocates a chunk of memory of
a certain size at a certain alignment, and uses the random_seed argument
it receives to randomize the address of the allocation.
This is implemented by iterating over the UEFI memory map, counting the
number of suitable slots (aligned offsets) within each region, and picking
a random number between 0 and 'number of slots - 1' to select the slot.
This should guarantee that each possible offset is equally likely to be
chosen.
Suggested-by: Kees Cook <keescook@chromium.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
drivers/firmware/efi/libstub/efistub.h | 4 +
drivers/firmware/efi/libstub/random.c | 100 ++++++++++++++++++++
2 files changed, 104 insertions(+)
diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/libstub/efistub.h
index 206b7252b9d1..5ed3d3f38166 100644
--- a/drivers/firmware/efi/libstub/efistub.h
+++ b/drivers/firmware/efi/libstub/efistub.h
@@ -46,4 +46,8 @@ void efi_get_virtmap(efi_memory_desc_t *memory_map, unsigned long map_size,
efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table,
unsigned long size, u8 *out);
+efi_status_t efi_random_alloc(efi_system_table_t *sys_table_arg,
+ unsigned long size, unsigned long align,
+ unsigned long *addr, unsigned long random_seed);
+
#endif
diff --git a/drivers/firmware/efi/libstub/random.c b/drivers/firmware/efi/libstub/random.c
index 97941ee5954f..b98346350230 100644
--- a/drivers/firmware/efi/libstub/random.c
+++ b/drivers/firmware/efi/libstub/random.c
@@ -33,3 +33,103 @@ efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table_arg,
return rng->get_rng(rng, NULL, size, out);
}
+
+/*
+ * Return the number of slots covered by this entry, i.e., the number of
+ * addresses it covers that are suitably aligned and supply enough room
+ * for the allocation.
+ */
+static unsigned long get_entry_num_slots(efi_memory_desc_t *md,
+ unsigned long size,
+ unsigned long align)
+{
+ u64 start, end;
+
+ if (md->type != EFI_CONVENTIONAL_MEMORY)
+ return 0;
+
+ start = round_up(md->phys_addr, align);
+ end = round_down(md->phys_addr + md->num_pages * EFI_PAGE_SIZE - size,
+ align);
+
+ if (start > end)
+ return 0;
+
+ return (end - start + 1) / align;
+}
+
+/*
+ * The UEFI memory descriptors have a virtual address field that is only used
+ * when installing the virtual mapping using SetVirtualAddressMap(). Since it
+ * is unused here, we can reuse it to keep track of each descriptor's slot
+ * count.
+ */
+#define MD_NUM_SLOTS(md) ((md)->virt_addr)
+
+efi_status_t efi_random_alloc(efi_system_table_t *sys_table_arg,
+ unsigned long size,
+ unsigned long align,
+ unsigned long *addr,
+ unsigned long random_seed)
+{
+ unsigned long map_size, desc_size, total_slots = 0, target_slot;
+ efi_status_t status = EFI_NOT_FOUND;
+ efi_memory_desc_t *memory_map;
+ int map_offset;
+
+ status = efi_get_memory_map(sys_table_arg, &memory_map, &map_size,
+ &desc_size, NULL, NULL);
+ if (status != EFI_SUCCESS)
+ return status;
+
+ if (align < EFI_ALLOC_ALIGN)
+ align = EFI_ALLOC_ALIGN;
+
+ /* count the suitable slots in each memory map entry */
+ for (map_offset = 0; map_offset < map_size; map_offset += desc_size) {
+ efi_memory_desc_t *md = (void *)memory_map + map_offset;
+ unsigned long slots;
+
+ slots = get_entry_num_slots(md, size, align);
+ MD_NUM_SLOTS(md) = slots;
+ total_slots += slots;
+ }
+
+ /* pick a random slot index in [0, total_slots) */
+ target_slot = (total_slots * (u16)random_seed) >> 16;
+
+ /*
+ * target_slot is now a value in the range [0, total_slots), and so
+ * it corresponds with exactly one of the suitable slots we recorded
+ * when iterating over the memory map the first time around.
+ *
+ * So iterate over the memory map again, subtracting the number of
+ * slots of each entry at each iteration, until we have found the entry
+ * that covers our chosen slot. Use the residual value of target_slot
+ * to calculate the randomly chosen address, and allocate it directly
+ * using EFI_ALLOCATE_ADDRESS.
+ */
+ for (map_offset = 0; map_offset < map_size; map_offset += desc_size) {
+ efi_memory_desc_t *md = (void *)memory_map + map_offset;
+ efi_physical_addr_t target;
+ unsigned long pages;
+
+ if (target_slot >= MD_NUM_SLOTS(md)) {
+ target_slot -= MD_NUM_SLOTS(md);
+ continue;
+ }
+
+ target = round_up(md->phys_addr, align) + target_slot * align;
+ pages = round_up(size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE;
+
+ status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
+ EFI_LOADER_DATA, pages, &target);
+ if (status == EFI_SUCCESS)
+ *addr = target;
+ break;
+ }
+
+ efi_call_early(free_pool, memory_map);
+
+ return status;
+}
--
2.5.0
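To see the two-pass selection in action, consider a toy memory map whose entries contribute 10, 0 and 30 slots; the following standalone sketch (all values invented) reproduces the slot arithmetic and prints which entry and residual slot a 16-bit seed of 0xc0de selects:
#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* Invented per-entry slot counts, as the first pass would
	 * compute them via get_entry_num_slots(). */
	unsigned long slots[] = { 10, 0, 30 };
	unsigned long total_slots = 40;
	unsigned long seed = 0xc0de;
	unsigned long target_slot = (total_slots * (uint16_t)seed) >> 16;
	unsigned int i;

	/* Second pass: subtract each entry's count until the entry
	 * covering the chosen slot is found. */
	for (i = 0; i < 3; i++) {
		if (target_slot >= slots[i]) {
			target_slot -= slots[i];
			continue;
		}
		/* prints "entry 2, residual slot 20" for this seed */
		printf("entry %u, residual slot %lu\n", i, target_slot);
		break;
	}
	return 0;
}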
* Re: [PATCH v4 20/22] efi: stub: add implementation of efi_random_alloc()
@ 2016-01-26 23:52 ` Kees Cook
-1 siblings, 0 replies; 93+ messages in thread
From: Kees Cook @ 2016-01-26 23:52 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
Mark Rutland, Leif Lindholm, LKML, stuart.yoder, bhupesh.sharma,
Arnd Bergmann, Marc Zyngier, Christoffer Dall, Laura Abbott,
Matt Fleming
On Tue, Jan 26, 2016 at 9:10 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> This implements efi_random_alloc(), which allocates a chunk of memory of
> a certain size at a certain alignment, and uses the random_seed argument
> it receives to randomize the address of the allocation.
>
> This is implemented by iterating over the UEFI memory map, counting the
> number of suitable slots (aligned offsets) within each region, and picking
> a random number between 0 and 'number of slots - 1' to select the slot,
> This should guarantee that each possible offset is chosen equally likely.
>
> Suggested-by: Kees Cook <keescook@chromium.org>
> Cc: Matt Fleming <matt@codeblueprint.co.uk>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
(When a third arch implements kASLR, we probably want to merge this
and the x86 logic into some kind of reusable code...)
-Kees
> ---
> drivers/firmware/efi/libstub/efistub.h | 4 +
> drivers/firmware/efi/libstub/random.c | 100 ++++++++++++++++++++
> 2 files changed, 104 insertions(+)
>
> diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/libstub/efistub.h
> index 206b7252b9d1..5ed3d3f38166 100644
> --- a/drivers/firmware/efi/libstub/efistub.h
> +++ b/drivers/firmware/efi/libstub/efistub.h
> @@ -46,4 +46,8 @@ void efi_get_virtmap(efi_memory_desc_t *memory_map, unsigned long map_size,
> efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table,
> unsigned long size, u8 *out);
>
> +efi_status_t efi_random_alloc(efi_system_table_t *sys_table_arg,
> + unsigned long size, unsigned long align,
> + unsigned long *addr, unsigned long random_seed);
> +
> #endif
> diff --git a/drivers/firmware/efi/libstub/random.c b/drivers/firmware/efi/libstub/random.c
> index 97941ee5954f..b98346350230 100644
> --- a/drivers/firmware/efi/libstub/random.c
> +++ b/drivers/firmware/efi/libstub/random.c
> @@ -33,3 +33,103 @@ efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table_arg,
>
> return rng->get_rng(rng, NULL, size, out);
> }
> +
> +/*
> + * Return the number of slots covered by this entry, i.e., the number of
> + * addresses it covers that are suitably aligned and supply enough room
> + * for the allocation.
> + */
> +static unsigned long get_entry_num_slots(efi_memory_desc_t *md,
> + unsigned long size,
> + unsigned long align)
> +{
> + u64 start, end;
> +
> + if (md->type != EFI_CONVENTIONAL_MEMORY)
> + return 0;
> +
> + start = round_up(md->phys_addr, align);
> + end = round_down(md->phys_addr + md->num_pages * EFI_PAGE_SIZE - size,
> + align);
> +
> + if (start > end)
> + return 0;
> +
> + return (end - start + 1) / align;
> +}
> +
> +/*
> + * The UEFI memory descriptors have a virtual address field that is only used
> + * when installing the virtual mapping using SetVirtualAddressMap(). Since it
> + * is unused here, we can reuse it to keep track of each descriptor's slot
> + * count.
> + */
> +#define MD_NUM_SLOTS(md) ((md)->virt_addr)
> +
> +efi_status_t efi_random_alloc(efi_system_table_t *sys_table_arg,
> + unsigned long size,
> + unsigned long align,
> + unsigned long *addr,
> + unsigned long random_seed)
> +{
> + unsigned long map_size, desc_size, total_slots = 0, target_slot;
> + efi_status_t status = EFI_NOT_FOUND;
> + efi_memory_desc_t *memory_map;
> + int map_offset;
> +
> + status = efi_get_memory_map(sys_table_arg, &memory_map, &map_size,
> + &desc_size, NULL, NULL);
> + if (status != EFI_SUCCESS)
> + return status;
> +
> + if (align < EFI_ALLOC_ALIGN)
> + align = EFI_ALLOC_ALIGN;
> +
> + /* count the suitable slots in each memory map entry */
> + for (map_offset = 0; map_offset < map_size; map_offset += desc_size) {
> + efi_memory_desc_t *md = (void *)memory_map + map_offset;
> + unsigned long slots;
> +
> + slots = get_entry_num_slots(md, size, align);
> + MD_NUM_SLOTS(md) = slots;
> + total_slots += slots;
> + }
> +
> + /* find a random number between 0 and total_slots */
> + target_slot = (total_slots * (u16)random_seed) >> 16;
> +
> + /*
> + * target_slot is now a value in the range [0, total_slots), and so
> + * it corresponds with exactly one of the suitable slots we recorded
> + * when iterating over the memory map the first time around.
> + *
> + * So iterate over the memory map again, subtracting the number of
> + * slots of each entry at each iteration, until we have found the entry
> + * that covers our chosen slot. Use the residual value of target_slot
> + * to calculate the randomly chosen address, and allocate it directly
> + * using EFI_ALLOCATE_ADDRESS.
> + */
> + for (map_offset = 0; map_offset < map_size; map_offset += desc_size) {
> + efi_memory_desc_t *md = (void *)memory_map + map_offset;
> + efi_physical_addr_t target;
> + unsigned long pages;
> +
> + if (target_slot >= MD_NUM_SLOTS(md)) {
> + target_slot -= MD_NUM_SLOTS(md);
> + continue;
> + }
> +
> + target = round_up(md->phys_addr, align) + target_slot * align;
> + pages = round_up(size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE;
> +
> + status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
> + EFI_LOADER_DATA, pages, &target);
> + if (status == EFI_SUCCESS)
> + *addr = target;
> + break;
> + }
> +
> + efi_call_early(free_pool, memory_map);
> +
> + return status;
> +}
> --
> 2.5.0
>
--
Kees Cook
Chrome OS & Brillo Security
* Re: [PATCH v4 20/22] efi: stub: add implementation of efi_random_alloc()
@ 2016-01-29 15:39 ` Matt Fleming
-1 siblings, 0 replies; 93+ messages in thread
From: Matt Fleming @ 2016-01-29 15:39 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel,
stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott
On Tue, 26 Jan, at 06:10:47PM, Ard Biesheuvel wrote:
> +
> +/*
> + * The UEFI memory descriptors have a virtual address field that is only used
> + * when installing the virtual mapping using SetVirtualAddressMap(). Since it
> + * is unused here, we can reuse it to keep track of each descriptor's slot
> + * count.
> + */
> +#define MD_NUM_SLOTS(md) ((md)->virt_addr)
> +
> +efi_status_t efi_random_alloc(efi_system_table_t *sys_table_arg,
> + unsigned long size,
> + unsigned long align,
> + unsigned long *addr,
> + unsigned long random_seed)
> +{
> + unsigned long map_size, desc_size, total_slots = 0, target_slot;
> + efi_status_t status = EFI_NOT_FOUND;
> + efi_memory_desc_t *memory_map;
> + int map_offset;
The assignment of 'status' is not needed since it gets assigned a new
value immediately.
Otherwise looks good.
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
* [PATCH v4 21/22] efi: stub: use high allocation for converted command line
@ 2016-01-26 17:10 ` Ard Biesheuvel
-1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
Command line processing needs to be moved before the allocation of the
kernel, since that is where the 'nokaslr' option controlling the allocation
must be detected. In preparation, move the converted command line higher up
in memory, to prevent it from interfering with the kernel itself.
Since x86 needs the address to fit in 32 bits, use UINT_MAX as the upper
bound there. Otherwise, use ULONG_MAX (i.e., no limit).
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/x86/include/asm/efi.h | 2 ++
drivers/firmware/efi/libstub/efi-stub-helper.c | 7 ++++++-
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 0010c78c4998..08b1f2f6ea50 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -25,6 +25,8 @@
#define EFI32_LOADER_SIGNATURE "EL32"
#define EFI64_LOADER_SIGNATURE "EL64"
+#define MAX_CMDLINE_ADDRESS UINT_MAX
+
#ifdef CONFIG_X86_32
diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c
index f07d4a67fa76..29ed2f9b218c 100644
--- a/drivers/firmware/efi/libstub/efi-stub-helper.c
+++ b/drivers/firmware/efi/libstub/efi-stub-helper.c
@@ -649,6 +649,10 @@ static u8 *efi_utf16_to_utf8(u8 *dst, const u16 *src, int n)
return dst;
}
+#ifndef MAX_CMDLINE_ADDRESS
+#define MAX_CMDLINE_ADDRESS ULONG_MAX
+#endif
+
/*
* Convert the unicode UEFI command line to ASCII to pass to kernel.
* Size of memory allocated return in *cmd_line_len.
@@ -684,7 +688,8 @@ char *efi_convert_cmdline(efi_system_table_t *sys_table_arg,
options_bytes++; /* NUL termination */
- status = efi_low_alloc(sys_table_arg, options_bytes, 0, &cmdline_addr);
+ status = efi_high_alloc(sys_table_arg, options_bytes, 0,
+ &cmdline_addr, MAX_CMDLINE_ADDRESS);
if (status != EFI_SUCCESS)
return NULL;
--
2.5.0
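The two definitions in this patch cooperate through the familiar #ifndef-default pattern; shown here in isolation, with comments marking which file each half lives in:
/* arch/x86/include/asm/efi.h: cap the command line below 4 GB */
#define MAX_CMDLINE_ADDRESS	UINT_MAX

/* efi-stub-helper.c: default to "no limit" unless an arch header
 * has already provided a bound */
#ifndef MAX_CMDLINE_ADDRESS
#define MAX_CMDLINE_ADDRESS	ULONG_MAX
#endif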
* [PATCH v4 22/22] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
@ 2016-01-26 17:10 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel
Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott, matt, Ard Biesheuvel
Since arm64 does not use a decompressor that could supply an execution
environment in which to obtain a source of randomness, the arm64 KASLR
kernel depends on the bootloader to supply some random bits in the
/chosen/kaslr-seed DT property upon kernel entry.
On UEFI systems, we can use the EFI_RNG_PROTOCOL, if implemented by the
firmware, to obtain those random bits. At the same time, use it to
randomize the offset of the kernel Image in physical memory.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/Kconfig | 5 ++
drivers/firmware/efi/libstub/arm-stub.c | 40 ++++++----
drivers/firmware/efi/libstub/arm64-stub.c | 78 ++++++++++++++------
drivers/firmware/efi/libstub/fdt.c | 9 +++
4 files changed, 97 insertions(+), 35 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index d7e31454d421..c6b5f71996e0 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -786,6 +786,11 @@ config RANDOMIZE_BASE
It is the bootloader's job to provide entropy, by passing a
random u64 value in /chosen/kaslr-seed at kernel entry.
+ When booting via the UEFI stub, it will invoke the firmware's
+ EFI_RNG_PROTOCOL implementation (if available) to supply entropy
+ to the kernel proper. In addition, it will randomise the physical
+ location of the kernel Image as well.
+
If unsure, say N.
endmenu
diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c
index 3397902e4040..4deb3e7faa0e 100644
--- a/drivers/firmware/efi/libstub/arm-stub.c
+++ b/drivers/firmware/efi/libstub/arm-stub.c
@@ -18,6 +18,8 @@
#include "efistub.h"
+bool __nokaslr;
+
static int efi_secureboot_enabled(efi_system_table_t *sys_table_arg)
{
static efi_guid_t const var_guid = EFI_GLOBAL_VARIABLE_GUID;
@@ -207,14 +209,6 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
pr_efi_err(sys_table, "Failed to find DRAM base\n");
goto fail;
}
- status = handle_kernel_image(sys_table, image_addr, &image_size,
- &reserve_addr,
- &reserve_size,
- dram_base, image);
- if (status != EFI_SUCCESS) {
- pr_efi_err(sys_table, "Failed to relocate kernel\n");
- goto fail;
- }
/*
* Get the command line from EFI, using the LOADED_IMAGE
@@ -224,7 +218,28 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
cmdline_ptr = efi_convert_cmdline(sys_table, image, &cmdline_size);
if (!cmdline_ptr) {
pr_efi_err(sys_table, "getting command line via LOADED_IMAGE_PROTOCOL\n");
- goto fail_free_image;
+ goto fail;
+ }
+
+ /* check whether 'nokaslr' was passed on the command line */
+ if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+ static const u8 default_cmdline[] = CONFIG_CMDLINE;
+ const u8 *str, *cmdline = cmdline_ptr;
+
+ if (IS_ENABLED(CONFIG_CMDLINE_FORCE))
+ cmdline = default_cmdline;
+ str = strstr(cmdline, "nokaslr");
+ if (str == cmdline || (str > cmdline && *(str - 1) == ' '))
+ __nokaslr = true;
+ }
+
+ status = handle_kernel_image(sys_table, image_addr, &image_size,
+ &reserve_addr,
+ &reserve_size,
+ dram_base, image);
+ if (status != EFI_SUCCESS) {
+ pr_efi_err(sys_table, "Failed to relocate kernel\n");
+ goto fail_free_cmdline;
}
status = efi_parse_options(cmdline_ptr);
@@ -244,7 +259,7 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
if (status != EFI_SUCCESS) {
pr_efi_err(sys_table, "Failed to load device tree!\n");
- goto fail_free_cmdline;
+ goto fail_free_image;
}
}
@@ -286,12 +301,11 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
efi_free(sys_table, initrd_size, initrd_addr);
efi_free(sys_table, fdt_size, fdt_addr);
-fail_free_cmdline:
- efi_free(sys_table, cmdline_size, (unsigned long)cmdline_ptr);
-
fail_free_image:
efi_free(sys_table, image_size, *image_addr);
efi_free(sys_table, reserve_size, reserve_addr);
+fail_free_cmdline:
+ efi_free(sys_table, cmdline_size, (unsigned long)cmdline_ptr);
fail:
return EFI_ERROR;
}
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index 78dfbd34b6bf..e0e6b74fef8f 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -13,6 +13,10 @@
#include <asm/efi.h>
#include <asm/sections.h>
+#include "efistub.h"
+
+extern bool __nokaslr;
+
efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
unsigned long *image_addr,
unsigned long *image_size,
@@ -23,26 +27,52 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
{
efi_status_t status;
unsigned long kernel_size, kernel_memsize = 0;
- unsigned long nr_pages;
void *old_image_addr = (void *)*image_addr;
unsigned long preferred_offset;
+ u64 phys_seed = 0;
+
+ if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+ if (!__nokaslr) {
+ status = efi_get_random_bytes(sys_table_arg,
+ sizeof(phys_seed),
+ (u8 *)&phys_seed);
+ if (status == EFI_NOT_FOUND) {
+ pr_efi(sys_table_arg, "EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
+ } else if (status != EFI_SUCCESS) {
+ pr_efi_err(sys_table_arg, "efi_get_random_bytes() failed\n");
+ return status;
+ }
+ } else {
+ pr_efi(sys_table_arg, "KASLR disabled on kernel command line\n");
+ }
+ }
/*
* The preferred offset of the kernel Image is TEXT_OFFSET bytes beyond
* a 2 MB aligned base, which itself may be lower than dram_base, as
* long as the resulting offset equals or exceeds it.
*/
- preferred_offset = round_down(dram_base, SZ_2M) + TEXT_OFFSET;
+ preferred_offset = round_down(dram_base, MIN_KIMG_ALIGN) + TEXT_OFFSET;
if (preferred_offset < dram_base)
- preferred_offset += SZ_2M;
+ preferred_offset += MIN_KIMG_ALIGN;
- /* Relocate the image, if required. */
kernel_size = _edata - _text;
- if (*image_addr != preferred_offset) {
- kernel_memsize = kernel_size + (_end - _edata);
+ kernel_memsize = kernel_size + (_end - _edata);
+
+ if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && phys_seed != 0) {
+ /*
+ * If KASLR is enabled, and we have some randomness available,
+ * locate the kernel at a randomized offset in physical memory.
+ */
+ *reserve_size = kernel_memsize + TEXT_OFFSET;
+ status = efi_random_alloc(sys_table_arg, *reserve_size,
+ MIN_KIMG_ALIGN, reserve_addr,
+ phys_seed);
+ *image_addr = *reserve_addr + TEXT_OFFSET;
+ } else {
/*
- * First, try a straight allocation at the preferred offset.
+ * Else, try a straight allocation at the preferred offset.
* This will work around the issue where, if dram_base == 0x0,
* efi_low_alloc() refuses to allocate at 0x0 (to prevent the
* address of the allocation to be mistaken for a FAIL return
@@ -52,27 +82,31 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
* Mustang), we can still place the kernel at the address
* 'dram_base + TEXT_OFFSET'.
*/
+ if (*image_addr == preferred_offset)
+ return EFI_SUCCESS;
+
*image_addr = *reserve_addr = preferred_offset;
- nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
- EFI_PAGE_SIZE;
+ *reserve_size = round_up(kernel_memsize, EFI_ALLOC_ALIGN);
+
status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
- EFI_LOADER_DATA, nr_pages,
+ EFI_LOADER_DATA,
+ *reserve_size / EFI_PAGE_SIZE,
(efi_physical_addr_t *)reserve_addr);
- if (status != EFI_SUCCESS) {
- kernel_memsize += TEXT_OFFSET;
- status = efi_low_alloc(sys_table_arg, kernel_memsize,
- SZ_2M, reserve_addr);
+ }
- if (status != EFI_SUCCESS) {
- pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
- return status;
- }
- *image_addr = *reserve_addr + TEXT_OFFSET;
+ if (status != EFI_SUCCESS) {
+ *reserve_size = kernel_memsize + TEXT_OFFSET;
+ status = efi_low_alloc(sys_table_arg, *reserve_size,
+ MIN_KIMG_ALIGN, reserve_addr);
+
+ if (status != EFI_SUCCESS) {
+ pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
+ *reserve_size = 0;
+ return status;
}
- memcpy((void *)*image_addr, old_image_addr, kernel_size);
- *reserve_size = kernel_memsize;
+ *image_addr = *reserve_addr + TEXT_OFFSET;
}
-
+ memcpy((void *)*image_addr, old_image_addr, kernel_size);
return EFI_SUCCESS;
}
diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
index cf7b7d46302a..04c9302b0ef1 100644
--- a/drivers/firmware/efi/libstub/fdt.c
+++ b/drivers/firmware/efi/libstub/fdt.c
@@ -147,6 +147,15 @@ efi_status_t update_fdt(efi_system_table_t *sys_table, void *orig_fdt,
if (status)
goto fdt_set_fail;
+ if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+ status = efi_get_random_bytes(sys_table, sizeof(fdt_val64),
+ (u8 *)&fdt_val64);
+ if (status == EFI_SUCCESS)
+ status = fdt_setprop(fdt, node, "kaslr-seed",
+ &fdt_val64, sizeof(fdt_val64));
+ else if (status != EFI_NOT_FOUND)
+ goto fdt_set_fail;
+ }
return EFI_SUCCESS;
fdt_set_fail:
--
2.5.0
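One detail worth noting in the arm-stub.c hunk above: the 'nokaslr' scan
validates only the leading token boundary (start of string or a preceding
space), so a longer option that begins with the same seven characters
would match as well, and only the first occurrence found by strstr() is
examined. A stricter whole-word check could look like the sketch below;
this is an illustrative standalone helper, not code from the series:

    #include <stdbool.h>
    #include <string.h>

    /* Sketch: accept "nokaslr" only as a whole word (first occurrence). */
    static bool cmdline_has_nokaslr(const char *cmdline)
    {
            const char *str = strstr(cmdline, "nokaslr");

            if (!str)
                    return false;
            if (str != cmdline && str[-1] != ' ')
                    return false;                    /* leading boundary */
            return str[7] == '\0' || str[7] == ' ';  /* trailing boundary */
    }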
* Re: [PATCH v4 22/22] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
@ 2016-01-29 15:57 ` Matt Fleming
0 siblings, 0 replies; 93+ messages in thread
From: Matt Fleming @ 2016-01-29 15:57 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
mark.rutland, leif.lindholm, keescook, linux-kernel,
stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
christoffer.dall, labbott
On Tue, 26 Jan, at 06:10:49PM, Ard Biesheuvel wrote:
> Since arm64 does not use a decompressor that could supply an execution
> environment in which to obtain a source of randomness, the arm64 KASLR
> kernel depends on the bootloader to supply some random bits in the
> /chosen/kaslr-seed DT property upon kernel entry.
>
> On UEFI systems, we can use the EFI_RNG_PROTOCOL, if implemented by the
> firmware, to obtain those random bits. At the same time, use it to
> randomize the offset of the kernel Image in physical memory.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
> arch/arm64/Kconfig | 5 ++
> drivers/firmware/efi/libstub/arm-stub.c | 40 ++++++----
> drivers/firmware/efi/libstub/arm64-stub.c | 78 ++++++++++++++------
> drivers/firmware/efi/libstub/fdt.c | 9 +++
> 4 files changed, 97 insertions(+), 35 deletions(-)
[...]
> diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
> index cf7b7d46302a..04c9302b0ef1 100644
> --- a/drivers/firmware/efi/libstub/fdt.c
> +++ b/drivers/firmware/efi/libstub/fdt.c
> @@ -147,6 +147,15 @@ efi_status_t update_fdt(efi_system_table_t *sys_table, void *orig_fdt,
> if (status)
> goto fdt_set_fail;
>
> + if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
> + status = efi_get_random_bytes(sys_table, sizeof(fdt_val64),
> + (u8 *)&fdt_val64);
> + if (status == EFI_SUCCESS)
> + status = fdt_setprop(fdt, node, "kaslr-seed",
> + &fdt_val64, sizeof(fdt_val64));
> + else if (status != EFI_NOT_FOUND)
> + goto fdt_set_fail;
> + }
> return EFI_SUCCESS;
>
> fdt_set_fail:
I think you want to handle the case where fdt_setprop() fails. With
this new code you'll silently return EFI_SUCCESS even if you fail to
set "kaslr-seed".
* Re: [PATCH v4 00/22] arm64: implement support for KASLR
@ 2016-01-29 18:26 ` Catalin Marinas
0 siblings, 0 replies; 93+ messages in thread
From: Catalin Marinas @ 2016-01-29 18:26 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, kernel-hardening, will.deacon, mark.rutland,
leif.lindholm, keescook, linux-kernel, matt, arnd,
bhupesh.sharma, stuart.yoder, marc.zyngier, christoffer.dall,
labbott
Hi Ard,
On Tue, Jan 26, 2016 at 06:10:27PM +0100, Ard Biesheuvel wrote:
> Code can be found here:
> git://git.linaro.org/people/ard.biesheuvel/linux-arm.git arm64-kaslr-v4a
> https://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-kaslr-v4a
The overall series looks fine but I'd like more time to review the KASLR
part together with the module PLT stuff.
So could you please split this series into 2-3 parts for easy merging
(possibly without the KASLR part, depending on how the review goes)? It
looks like a mix of features in random order like huge-vmap, relative
extable, kernel memory layout changes, kernel load address, PIE and
KASLR. So something like:
1. relative extable
2. huge-vmap
3. kernel memory layout changes (moving kernel to the base of vmalloc
range)
4. allow kernel loading at different phys offsets
5. module PLTs, relocations, PIE, relative kallsyms, KASLR
6. efi_get_random_bytes
1-4 can be in the same branch but I currently find it hard to
cherry-pick the non-PIE/non-KASLR patches without conflicts.
Thanks.
--
Catalin
* Re: [PATCH v4 00/22] arm64: implement support for KASLR
@ 2016-01-29 18:49 ` Ard Biesheuvel
0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-29 18:49 UTC (permalink / raw)
To: Catalin Marinas
Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Mark Rutland,
Leif Lindholm, Kees Cook, linux-kernel, Matt Fleming,
Arnd Bergmann, Sharma Bhupesh, Stuart Yoder, Marc Zyngier,
Christoffer Dall, Laura Abbott
On 29 January 2016 at 19:26, Catalin Marinas <catalin.marinas@arm.com> wrote:
> Hi Ard,
>
> On Tue, Jan 26, 2016 at 06:10:27PM +0100, Ard Biesheuvel wrote:
>> Code can be found here:
>> git://git.linaro.org/people/ard.biesheuvel/linux-arm.git arm64-kaslr-v4a
>> https://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-kaslr-v4a
>
> The overall series looks fine but I'd like more time to review the KASLR
> part together with the module PLT stuff.
>
> So could you please split this series in 2-3 parts for easy merging
> (possibly without the KASLR part, it depends on how the review goes)?
Sure
> It
> looks like a mix of features in random order like huge-vmap, relative
> extable, kernel memory layout changes, kernel load address, PIE and
> KASLR. So something like:
>
> 1. relative extable
This has been picked up by akpm in the mean time
> 2. huge-vmap
This is a single patch which is completely independent. I don't expect
it to conflict when applied in isolation.
> 3. kernel memory layout changes (moving kernel to the base of vmalloc
> range)
> 4. allow kernel loading at different phys offsets
> 5. module PLTs, relocations, PIE, relative kallsyms, KASLR
Relative kallsyms has also been picked up by akpm
> 6. efi_get_random_bytes
>
> 1-4 can be in the same branch but I currently find it hard to
> cherry-pick the non-PIE/non-KASLR patches without conflicts.
>
I think splitting this into 3 logically coherent series that apply in
sequence is feasible. I will look into this on Monday.
--
Ard.