* [PATCH v4 00/22] arm64: implement support for KASLR
@ 2016-01-26 17:10 ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

This series implements KASLR for arm64, by building the kernel as a PIE
executable that can relocate itself at runtime, and moving it to a random
offset in the vmalloc area. v2 and up also implement physical randomization,
i.e., it allows the kernel to deal with being loaded at any physical offset
(modulo the required alignment), and invokes the EFI_RNG_PROTOCOL from the
UEFI stub to obtain random bits and perform the actual randomization of the
physical load address.

Changes since v3:
- Implemented base-relative kallsyms address tables. This saves 250 KB of
  permanent .rodata (for a defconfig build), but more importantly, saves about
  1.4 MB of __init data in the dynamic relocation table. This patch has been
  picked up by akpm in the meantime, but it is reproduced here for completeness.
- Reimplemented the KASLR init code in C. This has a couple of benefits: we can
  now parse the 'nokaslr' command line option in a timely fashion, and we can
  pass the KASLR random seed via /chosen/kaslr-seed rather than via x1, which
  means fewer changes to bootloaders. This is implemented using a two-pass
  approach: the kernel is booted without KASLR to a state where it can invoke
  ordinary C code, and it re-enters the early boot code to recreate the kernel
  mappings at an offset if it finds the prerequisite data in the FDT.
  Note that this requires some mild refactoring to ensure that early_fixmap_init
  and fixmap_remap_fdt can cope with being called twice.
- Added a patch to enable the inappropriately named huge-vmap feature for arm64,
  which allows ioremap() (but not vmap or vmalloc) to use block mappings. This
  should be an improvement in itself, but the significance for this series is
  that it also allows the __init region to be unmapped entirely via
  unmap_kernel_range() [which complains about block mappings without this
  feature enabled]
- Split the implementation of CONFIG_RELOCATABLE and CONFIG_RANDOMIZE_BASE into
  separate patches.
- Randomize the module region independently from the core kernel. It is chosen
  such that it covers the [_stext, _etext] interval of the core kernel to avoid
  using PLT entries unless we really have to.
- Update the module PLT patch to replace the O(n^2) searches with sorting, and
  use a single .plt section for __init and ordinary code.
- Added panic notifiers to report the KASLR offset, and whether a mem= limit is
  in effect.
- Replaced Mark Rutland's asm/elf.h split-off patch with one that puts the C
  declarations between #ifndef __ASSEMBLY__/#endif.
- Incorporated feedback (and tags) from Mark Rutland and Matt Fleming
- Minor tweaks and fixes.
  
Changes since v2:
- Incorporated feedback from Marc Zyngier into the KVM patch (#5)
- Dropped the pgdir section and the patch that memblock_reserve()'s the kernel
  sections at a smaller granularity. This is no longer necessary with the pgdir
  section gone. This also fixes an issue spotted by James Morse where the fixmap
  page tables are not zeroed correctly; these have been moved back to the .bss
  section.
- Got rid of all ifdef'ery regarding the number of translation levels in the
  changed .c files, by introducing new definitions in pgtable.h (#3, #6)
- Fixed KASAN support, which was broken by all earlier versions.
- Moved module region along with the virtually randomized kernel, so that module
  addresses become unpredictable as well, and we only have to rely on veneers in
  the PLTs when the module region is exhausted (which is somewhat more likely
  since the module region is now shared with other uses of the vmalloc area)
- Added support for the 'nokaslr' command line option. This affects the
  randomization performed by the stub, and results in a warning if passed while
  the bootloader also presented a random seed for virtual KASLR in register x1.
- The .text/.rodata sections of the kernel are no longer aliased in the linear
  region with a writable mapping.
- Added a separate image header flag for kernel images that may be loaded at any
  2 MB aligned offset (+ TEXT_OFFSET)
- The KASLR displacement is now corrected if it results in the kernel image
  intersecting a PUD/PMD boundary (4k and 16k/64k granule kernels, respectively)
- Split out UEFI stub random routines into separate patches.
- Implemented a weight-based EFI random allocation routine so that each
  suitable offset in available memory is equally likely to be selected (as
  suggested by Kees Cook).
- Reused CONFIG_RELOCATABLE and CONFIG_RANDOMIZE_BASE instead of introducing
  new Kconfig symbols to describe the same functionality.
- Reimplemented mem= logic so memory is clipped from the top first.

Changes since v1/RFC:
- This series now implements fully independent virtual and physical address
  randomization at load time. I have recycled some patches from this series:
  http://thread.gmane.org/gmane.linux.ports.arm.kernel/455151, and updated the
  final UEFI stub patch to randomize the physical address as well.
- Added a patch to deal with the way KVM on arm64 makes assumptions about the
  relation between kernel symbols and the linear mapping (on which the HYP
  mapping is based), as these assumptions cease to be valid once we move the
  kernel Image out of the linear mapping.
- Updated the module PLT patch so it works on BE kernels as well.
- Moved the constant Image header values to head.S, and updated the linker
  script to provide the kernel size using an R_AARCH64_ABS32 relocation rather
  than an R_AARCH64_ABS64 relocation, since the former are always resolved at
  build time. This allows me to get rid of the post-build perl script to swab
  header values on BE kernels.
- Minor style tweaks.

Notes:
- These patches apply on top of Mark Rutland's pagetable rework series:
  http://thread.gmane.org/gmane.linux.ports.arm.kernel/462438
- The arm64 Image is uncompressed by default, and the Elf64_Rela format uses
  24 bytes per relocation entry. This results in considerable bloat (i.e., a
  couple of MBs worth of relocation data in an .init section). However, no
  build time postprocessing is required; we rely fully on the toolchain to
  produce the image.
- We have to rely on the bootloader to supply some randomness in
  /chosen/kaslr-seed upon kernel entry. Since we have no decompressor, it is
  simply not feasible to collect randomness in the head.S code path before
  mapping the kernel and enabling the MMU.
- The EFI_RNG_PROTOCOL that is invoked in patch #22 to supply randomness on
  UEFI systems is not universally available. A QEMU/KVM firmware image that
  implements a pseudo-random version is available here:
  http://people.linaro.org/~ard.biesheuvel/QEMU_EFI.fd.aarch64-rng.bz2
  (requires access to PMCCNTR_EL0 and support for AES instructions)
  See below for instructions on how to run the pseudo-random version on real
  hardware.

Code can be found here:
git://git.linaro.org/people/ard.biesheuvel/linux-arm.git arm64-kaslr-v4a
https://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-kaslr-v4a

Patch #1 updates the OF code to allow the minimum memblock physical address to
be overridden by the arch.

Patch #2 introduces KIMAGE_VADDR as the base of the kernel virtual region.

Patch #3 introduces pte_offset_kimg(), pmd_offset_kimg() and pud_offset_kimg()
that allow statically allocated page tables (i.e., by fixmap and kasan) to be
traversed before the linear mapping is installed.
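
As an illustration only (the actual definitions in the patch may differ), the
idea is to resolve the next-level table via the kernel image mapping instead of
phys_to_virt(); assuming a __phys_to_kimg() physical-to-kernel-image address
converter and the usual *_page_paddr()/*_index() helpers:

  #define pmd_offset_kimg(pud, addr) \
          ((pmd_t *)__phys_to_kimg(pud_page_paddr(*(pud))) + pmd_index(addr))
  #define pte_offset_kimg(pmd, addr) \
          ((pte_t *)__phys_to_kimg(pmd_page_paddr(*(pmd))) + pte_index(addr))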

Patch #4 rewrites early_fixmap_init() so it does not rely on the linear mapping
(i.e., the use of phys_to_virt() is avoided)

Patch #5 updates KVM on arm64 so it can deal with kernel symbols whose addresses
are not covered by the linear mapping.

Patch #6 wires up the huge-vmap generic feature for arm64.

Patch #7 moves the kernel virtual mapping to the vmalloc area, along with the
module region which is kept right below it, as before.

Patch #8 adds support for PLTs in modules so that relative branches can be
resolved via a PLT if the target is out of range. This is required for KASLR,
since modules may be loaded far away from the core kernel.
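
For reference, a module PLT entry on arm64 is just a small veneer that
materializes the full 64-bit target address in a scratch register and branches
to it; a sketch of such an entry (the exact encoding used by the patch may
differ):

  struct plt_entry {
          __le32  mov0;   /* movn x16, #0x....              */
          __le32  mov1;   /* movk x16, #0x...., lsl #16     */
          __le32  mov2;   /* movk x16, #0x...., lsl #32     */
          __le32  br;     /* br   x16                       */
  };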

Patches #9 and #10 move arm64 to a new generic relative version of the extable
implementation, so that it no longer contains absolute addresses that require
fixing up at relocation time, but uses relative offsets instead.
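
A sketch of what a relative extable entry looks like (the generic version may
differ in naming): each field holds a 32-bit offset from its own location, and
the absolute address is recovered at lookup time, so no R_AARCH64_ABS64
relocations are emitted for the table:

  struct exception_table_entry {
          int insn, fixup;        /* offsets relative to each field */
  };

  #define ex_insn_addr(e)   ((unsigned long)&(e)->insn  + (e)->insn)
  #define ex_fixup_addr(e)  ((unsigned long)&(e)->fixup + (e)->fixup)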

Patch #11 reverts some changes to the Image header population code so we no
longer depend on the linker to populate the header fields. This is necessary
since the R_AARCH64_ABS64 relocations that are emitted for these fields are not
resolved at build time for PIE executables.

Patch #12 updates the code in head.S that needs to execute before relocation to
avoid the use of values that are subject to dynamic relocation. These values
will not be populated in PIE executables.

Patch #13 allows the kernel Image to be loaded anywhere in physical memory, by
decoupling PHYS_OFFSET from the base of the kernel image.
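
Conceptually (a sketch rather than the literal patch), the base of physical
memory is then no longer a compile-time constant derived from the image base,
but a variable that is populated early at boot from the memory map:

  /* sketch: memstart_addr is discovered from the FDT/UEFI memory map */
  extern u64 memstart_addr;
  #define PHYS_OFFSET     ({ memstart_addr; })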

Patch #14 wraps C declarations in asm/elf.h inside #ifndef __ASSEMBLY__ so we
can include it in head.S

Patch #15 updates scripts/sortextable.c so it accepts ET_DYN (relocatable)
executables as well as ET_EXEC (static) executables.

Patch #16 implements base-relative kallsyms address tables.

Patch #17 implements CONFIG_RELOCATABLE, i.e., building vmlinux as a PIE
executable that relocates itself at early boot time.
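
The self-relocation amounts to applying the R_AARCH64_RELATIVE entries emitted
by the linker for a PIE; in C terms (the actual code runs from head.S before
the MMU is enabled, and the symbol names here are illustrative), the fixup loop
is roughly:

  /* sketch: apply RELA relocations using the runtime displacement */
  extern Elf64_Rela __rela_start[], __rela_end[];

  static void __init relocate_kernel(u64 offset)
  {
          Elf64_Rela *rela;

          for (rela = __rela_start; rela < __rela_end; rela++) {
                  if (ELF64_R_TYPE(rela->r_info) != R_AARCH64_RELATIVE)
                          continue;
                  *(u64 *)(rela->r_offset + offset) = rela->r_addend + offset;
          }
  }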

Patch #18 implements the core KASLR, by taking randomness supplied in
/chosen/kaslr-seed and using it to move the kernel inside the vmalloc area,
and randomize the module region around it.
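
A condensed sketch of that step (the real kaslr.c also handles the module
region, granule-boundary corrections and 'nokaslr'; the function name, helpers
and masking below are illustrative):

  /* read /chosen/kaslr-seed and turn it into a 2 MB aligned displacement */
  static u64 __init get_kaslr_offset(void *fdt)
  {
          int node = fdt_path_offset(fdt, "/chosen");
          const fdt64_t *prop;
          u64 seed;

          if (node < 0)
                  return 0;
          prop = fdt_getprop(fdt, node, "kaslr-seed", NULL);
          if (!prop)
                  return 0;
          seed = fdt64_to_cpu(*prop);

          /* confine the offset to a fraction of the vmalloc area */
          return (seed & GENMASK_ULL(VA_BITS - 3, 0)) & ~(SZ_2M - 1);
  }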

Patch #19 implements efi_get_random_bytes() based on the EFI_RNG_PROTOCOL.
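
In essence this boils down to locating the protocol and calling its GetRNG()
method with a NULL algorithm GUID (i.e., the default algorithm); a sketch
assuming the usual early-call helpers in the stub, with signatures that may
differ from the actual patch:

  struct efi_rng_protocol {
          efi_status_t (*get_info)(struct efi_rng_protocol *,
                                   unsigned long *, efi_guid_t *);
          efi_status_t (*get_rng)(struct efi_rng_protocol *,
                                  efi_guid_t *, unsigned long, u8 *);
  };

  efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table,
                                    unsigned long size, u8 *out)
  {
          efi_guid_t rng_proto = EFI_RNG_PROTOCOL_GUID;
          struct efi_rng_protocol *rng;
          efi_status_t status;

          status = efi_call_early(locate_protocol, &rng_proto, NULL,
                                  (void **)&rng);
          if (status != EFI_SUCCESS)
                  return status;

          return rng->get_rng(rng, NULL, size, out);
  }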

Patch #20 implements efi_random_alloc(), the weight-based random allocation
routine described under the v2 changes above.
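
That selection can be sketched as two passes over the UEFI memory map: count
every suitable, aligned slot first, pick one slot index uniformly at random,
then walk the map again to turn that index back into an address. Illustrative
only; md_slots() and md_address() are hypothetical helpers, and the map is
treated as a plain array for simplicity:

  static unsigned long pick_random_slot(efi_memory_desc_t *map, int nr_desc,
                                        unsigned long size,
                                        unsigned long align, u32 seed)
  {
          unsigned long total = 0, target, n;
          int i;

          for (i = 0; i < nr_desc; i++)           /* pass 1: count slots */
                  total += md_slots(&map[i], size, align);
          if (!total)
                  return 0;

          target = ((u64)seed * total) >> 32;     /* uniform pick */

          for (i = 0; i < nr_desc; i++) {         /* pass 2: locate slot */
                  n = md_slots(&map[i], size, align);
                  if (target < n)
                          return md_address(&map[i]) + target * align;
                  target -= n;
          }
          return 0;
  }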

Patch #21 moves the allocation for the converted command line (UTF-16 to ASCII)
away from the base of memory. This is necessary since the command line needs to
be converted and parsed (e.g., for 'nokaslr') before the kernel itself is
allocated, and a low allocation could interfere with the kernel's placement.

Patch #22 implements the actual randomization in the UEFI stub: it randomizes
the kernel's physical load address, and passes entropy in /chosen/kaslr-seed so
that the kernel proper can relocate itself virtually.

Ard Biesheuvel (22):
  of/fdt: make memblock minimum physical address arch configurable
  arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
  arm64: pgtable: implement static [pte|pmd|pud]_offset variants
  arm64: decouple early fixmap init from linear mapping
  arm64: kvm: deal with kernel symbols outside of linear mapping
  arm64: add support for ioremap() block mappings
  arm64: move kernel image to base of vmalloc area
  arm64: add support for module PLTs
  extable: add support for relative extables to search and sort routines
  arm64: switch to relative exception tables
  arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
  arm64: avoid dynamic relocations in early boot code
  arm64: allow kernel Image to be loaded anywhere in physical memory
  arm64: make asm/elf.h available to asm files
  scripts/sortextable: add support for ET_DYN binaries
  kallsyms: add support for relative offsets in kallsyms address table
  arm64: add support for building the kernel as a relocatable PIE binary
  arm64: add support for kernel ASLR
  efi: stub: implement efi_get_random_bytes() based on EFI_RNG_PROTOCOL
  efi: stub: add implementation of efi_random_alloc()
  efi: stub: use high allocation for converted command line
  arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness

 Documentation/arm64/booting.txt                      |  20 +-
 Documentation/features/vm/huge-vmap/arch-support.txt |   2 +-
 arch/arm/include/asm/kvm_asm.h                       |   2 +
 arch/arm/kvm/arm.c                                   |   8 +-
 arch/arm64/Kconfig                                   |  41 ++++
 arch/arm64/Makefile                                  |  10 +-
 arch/arm64/include/asm/assembler.h                   |  26 ++-
 arch/arm64/include/asm/boot.h                        |   6 +
 arch/arm64/include/asm/elf.h                         |  24 ++-
 arch/arm64/include/asm/futex.h                       |  12 +-
 arch/arm64/include/asm/kasan.h                       |   2 +-
 arch/arm64/include/asm/kernel-pgtable.h              |  11 ++
 arch/arm64/include/asm/kvm_asm.h                     |   2 +
 arch/arm64/include/asm/kvm_host.h                    |   8 +-
 arch/arm64/include/asm/memory.h                      |  47 +++--
 arch/arm64/include/asm/module.h                      |  11 ++
 arch/arm64/include/asm/pgtable.h                     |  23 ++-
 arch/arm64/include/asm/uaccess.h                     |  30 +--
 arch/arm64/include/asm/word-at-a-time.h              |   7 +-
 arch/arm64/kernel/Makefile                           |   2 +
 arch/arm64/kernel/armv8_deprecated.c                 |   7 +-
 arch/arm64/kernel/efi-entry.S                        |   2 +-
 arch/arm64/kernel/head.S                             | 134 +++++++++++--
 arch/arm64/kernel/image.h                            |  45 +++--
 arch/arm64/kernel/kaslr.c                            | 169 ++++++++++++++++
 arch/arm64/kernel/module-plts.c                      | 201 ++++++++++++++++++++
 arch/arm64/kernel/module.c                           |  20 +-
 arch/arm64/kernel/module.lds                         |   3 +
 arch/arm64/kernel/setup.c                            |  29 +++
 arch/arm64/kernel/vmlinux.lds.S                      |  20 +-
 arch/arm64/kvm/hyp.S                                 |   6 +-
 arch/arm64/mm/dump.c                                 |  12 +-
 arch/arm64/mm/extable.c                              |   2 +-
 arch/arm64/mm/init.c                                 | 119 ++++++++++--
 arch/arm64/mm/kasan_init.c                           |  18 +-
 arch/arm64/mm/mmu.c                                  | 186 +++++++++++++-----
 arch/x86/include/asm/efi.h                           |   2 +
 drivers/firmware/efi/libstub/Makefile                |   2 +-
 drivers/firmware/efi/libstub/arm-stub.c              |  40 ++--
 drivers/firmware/efi/libstub/arm64-stub.c            |  78 +++++---
 drivers/firmware/efi/libstub/efi-stub-helper.c       |   7 +-
 drivers/firmware/efi/libstub/efistub.h               |   7 +
 drivers/firmware/efi/libstub/fdt.c                   |   9 +
 drivers/firmware/efi/libstub/random.c                | 135 +++++++++++++
 drivers/of/fdt.c                                     |   5 +-
 include/linux/efi.h                                  |   5 +-
 init/Kconfig                                         |  16 ++
 kernel/kallsyms.c                                    |  38 +++-
 lib/extable.c                                        |  50 ++++-
 scripts/kallsyms.c                                   |  88 ++++++++-
 scripts/link-vmlinux.sh                              |   4 +
 scripts/namespace.pl                                 |   2 +
 scripts/sortextable.c                                |  10 +-
 53 files changed, 1496 insertions(+), 269 deletions(-)
 create mode 100644 arch/arm64/kernel/kaslr.c
 create mode 100644 arch/arm64/kernel/module-plts.c
 create mode 100644 arch/arm64/kernel/module.lds
 create mode 100644 drivers/firmware/efi/libstub/random.c

EFI_RNG_PROTOCOL on real hardware
=================================

To test whether your UEFI implements the EFI_RNG_PROTOCOL, download the
following executable and run it from the UEFI Shell:
http://people.linaro.org/~ard.biesheuvel/RngTest.efi

FS0:\> rngtest
UEFI RNG Protocol Testing :
----------------------------
 -- Locate UEFI RNG Protocol : [Fail - Status = Not Found]

If your UEFI does not implement the EFI_RNG_PROTOCOL, you can download and
install the pseudo-random version that uses the generic timer and PMCCNTR_EL0
values and permutes them using a couple of rounds of AES.
http://people.linaro.org/~ard.biesheuvel/RngDxe.efi

NOTE: not for production!! This is a quick and dirty hack to test the KASLR
code, and is not suitable for anything else.

FS0:\> rngdxe
FS0:\> rngtest
UEFI RNG Protocol Testing :
----------------------------
 -- Locate UEFI RNG Protocol : [Pass]
 -- Call RNG->GetInfo() interface :
     >> Supported RNG Algorithm (Count = 2) :
          0) 44F0DE6E-4D8C-4045-A8C7-4DD168856B9E
          1) E43176D7-B6E8-4827-B784-7FFDC4B68561
 -- Call RNG->GetRNG() interface :
     >> RNG with default algorithm : [Pass]
     >> RNG with SP800-90-HMAC-256 : [Fail - Status = Unsupported]
     >> RNG with SP800-90-Hash-256 : [Fail - Status = Unsupported]
     >> RNG with SP800-90-CTR-256 : [Pass]
     >> RNG with X9.31-3DES : [Fail - Status = Unsupported]
     >> RNG with X9.31-AES : [Fail - Status = Unsupported]
     >> RNG with RAW Entropy : [Pass]
 -- Random Number Generation Test with default RNG Algorithm (20 Rounds):
          01) - 27
          02) - 61E8
          03) - 496FD8
          04) - DDD793BF
          05) - B6C37C8E23
          06) - 4D183C604A96
          07) - 9363311DB61298
          08) - 5715A7294F4E436E
          09) - F0D4D7BAA0DD52318E
          10) - C88C6EBCF4C0474D87C3
          11) - B5594602B482A643932172
          12) - CA7573F704B2089B726B9CF1
          13) - A93E9451CB533DCFBA87B97C33
          14) - 45AA7B83DB6044F7BBAB031F0D24
          15) - 3DD7A4D61F34ADCB400B5976730DCF
          16) - 4DD168D21FAB8F59708330D6A9BEB021
          17) - 4BBB225E61C465F174254159467E65939F
          18) - 030A156C9616337A20070941E702827DA8E1
          19) - AB0FC11C9A4E225011382A9D164D9D55CA2B64
          20) - 72B9B4735DC445E5DA6AF88DE965B7E87CB9A23C

* [PATCH v4 01/22] of/fdt: make memblock minimum physical address arch configurable
  2016-01-26 17:10 ` Ard Biesheuvel
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

By default, early_init_dt_add_memory_arch() ignores memory below
the base of the kernel image since it won't be addressable via the
linear mapping. However, this is not appropriate anymore once we
decouple the kernel text mapping from the linear mapping, so archs
may want to drop the low limit entirely. So allow the minimum to be
overridden by setting MIN_MEMBLOCK_ADDR.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 drivers/of/fdt.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 655f79db7899..1f98156f8996 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -976,13 +976,16 @@ int __init early_init_dt_scan_chosen(unsigned long node, const char *uname,
 }
 
 #ifdef CONFIG_HAVE_MEMBLOCK
+#ifndef MIN_MEMBLOCK_ADDR
+#define MIN_MEMBLOCK_ADDR	__pa(PAGE_OFFSET)
+#endif
 #ifndef MAX_MEMBLOCK_ADDR
 #define MAX_MEMBLOCK_ADDR	((phys_addr_t)~0)
 #endif
 
 void __init __weak early_init_dt_add_memory_arch(u64 base, u64 size)
 {
-	const u64 phys_offset = __pa(PAGE_OFFSET);
+	const u64 phys_offset = MIN_MEMBLOCK_ADDR;
 
 	if (!PAGE_ALIGNED(base)) {
 		if (size < PAGE_SIZE - (base & ~PAGE_MASK)) {
-- 
2.5.0

* [PATCH v4 02/22] arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
  2016-01-26 17:10 ` Ard Biesheuvel
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

This introduces the preprocessor symbol KIMAGE_VADDR which will serve as
the symbolic virtual base of the kernel region, i.e., the kernel's virtual
offset will be KIMAGE_VADDR + TEXT_OFFSET. For now, we define it as being
equal to PAGE_OFFSET, but in the future, it will be moved below PAGE_OFFSET
once we move the kernel virtual mapping out of the linear mapping.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/memory.h | 10 ++++++++--
 arch/arm64/kernel/head.S        |  2 +-
 arch/arm64/kernel/vmlinux.lds.S |  4 ++--
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 853953cd1f08..bea9631b34a8 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -51,7 +51,8 @@
 #define VA_BITS			(CONFIG_ARM64_VA_BITS)
 #define VA_START		(UL(0xffffffffffffffff) << VA_BITS)
 #define PAGE_OFFSET		(UL(0xffffffffffffffff) << (VA_BITS - 1))
-#define MODULES_END		(PAGE_OFFSET)
+#define KIMAGE_VADDR		(PAGE_OFFSET)
+#define MODULES_END		(KIMAGE_VADDR)
 #define MODULES_VADDR		(MODULES_END - SZ_64M)
 #define PCI_IO_END		(MODULES_VADDR - SZ_2M)
 #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
@@ -75,8 +76,13 @@
  * private definitions which should NOT be used outside memory.h
  * files.  Use virt_to_phys/phys_to_virt/__pa/__va instead.
  */
-#define __virt_to_phys(x)	(((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
+#define __virt_to_phys(x) ({						\
+	phys_addr_t __x = (phys_addr_t)(x);				\
+	__x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) :	\
+			     (__x - KIMAGE_VADDR + PHYS_OFFSET); })
+
 #define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
+#define __phys_to_kimg(x)	((unsigned long)((x) - PHYS_OFFSET + KIMAGE_VADDR))
 
 /*
  * Convert a page to/from a physical address
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 51370f8808bc..85181cb60f46 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -389,7 +389,7 @@ __create_page_tables:
 	 * Map the kernel image (starting with PHYS_OFFSET).
 	 */
 	mov	x0, x26				// swapper_pg_dir
-	mov	x5, #PAGE_OFFSET
+	ldr	x5, =KIMAGE_VADDR
 	create_pgd_entry x0, x5, x3, x6
 	ldr	x6, =KERNEL_END			// __va(KERNEL_END)
 	mov	x3, x24				// phys offset
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index b78a3c772294..282e3e64a17e 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -89,7 +89,7 @@ SECTIONS
 		*(.discard.*)
 	}
 
-	. = PAGE_OFFSET + TEXT_OFFSET;
+	. = KIMAGE_VADDR + TEXT_OFFSET;
 
 	.head.text : {
 		_text = .;
@@ -186,4 +186,4 @@ ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
 /*
  * If padding is applied before .head.text, virt<->phys conversions will fail.
  */
-ASSERT(_text == (PAGE_OFFSET + TEXT_OFFSET), "HEAD is misaligned")
+ASSERT(_text == (KIMAGE_VADDR + TEXT_OFFSET), "HEAD is misaligned")
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread
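
The interesting part of this patch is that __virt_to_phys() becomes a
two-range translation. The standalone model below shows that arithmetic with
made-up example constants (the real values depend on VA_BITS and on where the
kernel is loaded); as long as KIMAGE_VADDR equals PAGE_OFFSET, as in this
patch, both branches yield the same result, so behaviour does not change yet.

/*
 * Standalone model of the two-range __virt_to_phys() above, with the
 * constants already decoupled to show where the branches would differ.
 */
#include <stdint.h>
#include <stdio.h>

#define PAGE_OFFSET	0xffff800000000000ULL	/* linear map base (example) */
#define KIMAGE_VADDR	0xffff000008000000ULL	/* kernel image base (example) */
#define PHYS_OFFSET	0x0000000080000000ULL	/* start of RAM (example) */

static uint64_t virt_to_phys(uint64_t va)
{
	return va >= PAGE_OFFSET ? va - PAGE_OFFSET + PHYS_OFFSET
				 : va - KIMAGE_VADDR + PHYS_OFFSET;
}

int main(void)
{
	/* a linear-map address and a kernel-image address, 4 KB into each range */
	printf("linear VA -> %#llx\n",
	       (unsigned long long)virt_to_phys(PAGE_OFFSET + 0x1000));
	printf("image  VA -> %#llx\n",
	       (unsigned long long)virt_to_phys(KIMAGE_VADDR + 0x1000));
	return 0;
}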

* [PATCH v4 03/22] arm64: pgtable: implement static [pte|pmd|pud]_offset variants
  2016-01-26 17:10 ` Ard Biesheuvel
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

The page table accessors pte_offset(), pud_offset() and pmd_offset()
rely on __va translations, so they can only be used after the linear
mapping has been installed. For the early fixmap and kasan init routines,
whose page tables are allocated statically in the kernel image, these
functions will return bogus values. So implement pte_offset_kimg(),
pmd_offset_kimg() and pud_offset_kimg(), which can be used instead
before any page tables have been allocated dynamically.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/pgtable.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 2fb94ef881f5..e9eaf6e9262d 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -446,6 +446,9 @@ static inline phys_addr_t pmd_page_paddr(pmd_t pmd)
 
 #define pmd_page(pmd)		pfn_to_page(__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))
 
+/* use ONLY for statically allocated translation tables */
+#define pte_offset_kimg(dir,addr)	((pte_t *)__phys_to_kimg(pte_offset_phys((dir), (addr))))
+
 /*
  * Conversion functions: convert a page and protection to a page entry,
  * and a page entry and page directory to the page they refer to.
@@ -489,6 +492,9 @@ static inline phys_addr_t pud_page_paddr(pud_t pud)
 
 #define pud_page(pud)		pfn_to_page(__phys_to_pfn(pud_val(pud) & PHYS_MASK))
 
+/* use ONLY for statically allocated translation tables */
+#define pmd_offset_kimg(dir,addr)	((pmd_t *)__phys_to_kimg(pmd_offset_phys((dir), (addr))))
+
 #else
 
 #define pud_page_paddr(pud)	({ BUILD_BUG(); 0; })
@@ -498,6 +504,8 @@ static inline phys_addr_t pud_page_paddr(pud_t pud)
 #define pmd_set_fixmap_offset(pudp, addr)	((pmd_t *)pudp)
 #define pmd_clear_fixmap()
 
+#define pmd_offset_kimg(dir,addr)	((pmd_t *)dir)
+
 #endif	/* CONFIG_PGTABLE_LEVELS > 2 */
 
 #if CONFIG_PGTABLE_LEVELS > 3
@@ -536,6 +544,9 @@ static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
 
 #define pgd_page(pgd)		pfn_to_page(__phys_to_pfn(pgd_val(pgd) & PHYS_MASK))
 
+/* use ONLY for statically allocated translation tables */
+#define pud_offset_kimg(dir,addr)	((pud_t *)__phys_to_kimg(pud_offset_phys((dir), (addr))))
+
 #else
 
 #define pgd_page_paddr(pgd)	({ BUILD_BUG(); 0;})
@@ -545,6 +556,8 @@ static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
 #define pud_set_fixmap_offset(pgdp, addr)	((pud_t *)pgdp)
 #define pud_clear_fixmap()
 
+#define pud_offset_kimg(dir,addr)	((pud_t *)dir)
+
 #endif  /* CONFIG_PGTABLE_LEVELS > 3 */
 
 #define pgd_ERROR(pgd)		__pgd_error(__FILE__, __LINE__, pgd_val(pgd))
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread
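
A rough standalone model of what such a *_kimg accessor computes: the physical
address of the next-level table (which is what a pud entry holds, ignoring
attribute bits) is indexed and then translated through the kernel-image mapping
rather than the linear one. The simplified signature, the constants and the
table geometry are assumptions for illustration; the real macros operate on
pud_t pointers and mask with PHYS_MASK.

/*
 * Standalone model of pmd_offset_kimg(): locate a pmd entry inside a
 * statically allocated table using only the kernel-image translation.
 */
#include <stdint.h>
#include <stdio.h>

#define PAGE_OFFSET	0xffff800000000000ULL	/* linear map base (example) */
#define KIMAGE_VADDR	0xffff000008000000ULL	/* kernel image base (example) */
#define PHYS_OFFSET	0x0000000080000000ULL	/* start of RAM (example) */

#define PMD_SHIFT	21			/* 4K granule geometry (example) */
#define PTRS_PER_PMD	512

static uint64_t phys_to_kimg(uint64_t pa)
{
	return pa - PHYS_OFFSET + KIMAGE_VADDR;
}

/* physical address of a pmd table -> virtual address of the entry for 'addr' */
static uint64_t pmd_offset_kimg(uint64_t pmd_table_phys, uint64_t addr)
{
	unsigned int idx = (addr >> PMD_SHIFT) & (PTRS_PER_PMD - 1);

	return phys_to_kimg(pmd_table_phys) + idx * sizeof(uint64_t);
}

int main(void)
{
	uint64_t bm_pmd_phys = PHYS_OFFSET + 0x00200000ULL;	/* example table */
	uint64_t addr = 0xffff7dfffe600000ULL;			/* example VA    */

	printf("pmd entry lives at %#llx\n",
	       (unsigned long long)pmd_offset_kimg(bm_pmd_phys, addr));
	return 0;
}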

* [PATCH v4 04/22] arm64: decouple early fixmap init from linear mapping
  2016-01-26 17:10 ` Ard Biesheuvel
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

Since the early fixmap page tables are populated using pages that are
part of the static footprint of the kernel, they are covered by the
initial kernel mapping, and we can refer to them without using __va/__pa
translations, which are tied to the linear mapping.

Since the fixmap page tables are disjoint from the kernel mapping up
to the top level pgd entry, we can refer to bm_pte[] directly, and there
is no need to walk the page tables and perform __pa()/__va() translations
at each step.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/mm/mmu.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 7711554a94f4..cb3a7bdb4e23 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -583,7 +583,7 @@ static inline pud_t * fixmap_pud(unsigned long addr)
 
 	BUG_ON(pgd_none(*pgd) || pgd_bad(*pgd));
 
-	return pud_offset(pgd, addr);
+	return pud_offset_kimg(pgd, addr);
 }
 
 static inline pmd_t * fixmap_pmd(unsigned long addr)
@@ -592,16 +592,12 @@ static inline pmd_t * fixmap_pmd(unsigned long addr)
 
 	BUG_ON(pud_none(*pud) || pud_bad(*pud));
 
-	return pmd_offset(pud, addr);
+	return pmd_offset_kimg(pud, addr);
 }
 
 static inline pte_t * fixmap_pte(unsigned long addr)
 {
-	pmd_t *pmd = fixmap_pmd(addr);
-
-	BUG_ON(pmd_none(*pmd) || pmd_bad(*pmd));
-
-	return pte_offset_kernel(pmd, addr);
+	return &bm_pte[pte_index(addr)];
 }
 
 void __init early_fixmap_init(void)
@@ -613,14 +609,14 @@ void __init early_fixmap_init(void)
 
 	pgd = pgd_offset_k(addr);
 	pgd_populate(&init_mm, pgd, bm_pud);
-	pud = pud_offset(pgd, addr);
+	pud = fixmap_pud(addr);
 	pud_populate(&init_mm, pud, bm_pmd);
-	pmd = pmd_offset(pud, addr);
+	pmd = fixmap_pmd(addr);
 	pmd_populate_kernel(&init_mm, pmd, bm_pte);
 
 	/*
 	 * The boot-ioremap range spans multiple pmds, for which
-	 * we are not preparted:
+	 * we are not prepared:
 	 */
 	BUILD_BUG_ON((__fix_to_virt(FIX_BTMAP_BEGIN) >> PMD_SHIFT)
 		     != (__fix_to_virt(FIX_BTMAP_END) >> PMD_SHIFT));
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread
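
The simplification in fixmap_pte() boils down to bm_pte[] being a statically
allocated array inside the kernel image, so the entry for an address is just an
array reference and no __va() translation is involved. Below is a standalone
sketch of that indexing; the table geometry and the example address are
illustrative assumptions rather than the real arm64 values.

/*
 * Standalone sketch: index a statically allocated pte table directly,
 * the way fixmap_pte() does after this patch.
 */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT	12
#define PTRS_PER_PTE	512	/* 4K granule geometry (example) */

static uint64_t bm_pte[PTRS_PER_PTE];	/* lives in the image, like bm_pte[] */

static unsigned int pte_index(uint64_t addr)
{
	return (addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}

int main(void)
{
	uint64_t addr = 0xffff7dfffe7fb000ULL;	/* example fixmap address */

	/* &bm_pte[pte_index(addr)] is an ordinary image-relative reference */
	printf("pte entry %u at %p\n", pte_index(addr),
	       (void *)&bm_pte[pte_index(addr)]);
	return 0;
}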

* [PATCH v4 05/22] arm64: kvm: deal with kernel symbols outside of linear mapping
  2016-01-26 17:10 ` Ard Biesheuvel
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
the HYP mapping at EL2. Before we can move the kernel virtual mapping
out of the linear mapping, we have to make sure that references to kernel
symbols that are accessed via the HYP mapping are translated to their
linear equivalent.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm/include/asm/kvm_asm.h    | 2 ++
 arch/arm/kvm/arm.c                | 8 +++++---
 arch/arm64/include/asm/kvm_asm.h  | 2 ++
 arch/arm64/include/asm/kvm_host.h | 8 +++++---
 arch/arm64/kvm/hyp.S              | 6 +++---
 5 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index 194c91b610ff..c35c349da069 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -79,6 +79,8 @@
 #define rr_lo_hi(a1, a2) a1, a2
 #endif
 
+#define kvm_ksym_ref(kva)	(kva)
+
 #ifndef __ASSEMBLY__
 struct kvm;
 struct kvm_vcpu;
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index dda1959f0dde..975da6cfbf59 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -982,7 +982,7 @@ static void cpu_init_hyp_mode(void *dummy)
 	pgd_ptr = kvm_mmu_get_httbr();
 	stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
 	hyp_stack_ptr = stack_page + PAGE_SIZE;
-	vector_ptr = (unsigned long)__kvm_hyp_vector;
+	vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
 
 	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
 
@@ -1074,13 +1074,15 @@ static int init_hyp_mode(void)
 	/*
 	 * Map the Hyp-code called directly from the host
 	 */
-	err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
+	err = create_hyp_mappings(kvm_ksym_ref(__kvm_hyp_code_start),
+				  kvm_ksym_ref(__kvm_hyp_code_end));
 	if (err) {
 		kvm_err("Cannot map world-switch code\n");
 		goto out_free_mappings;
 	}
 
-	err = create_hyp_mappings(__start_rodata, __end_rodata);
+	err = create_hyp_mappings(kvm_ksym_ref(__start_rodata),
+				  kvm_ksym_ref(__end_rodata));
 	if (err) {
 		kvm_err("Cannot map rodata section\n");
 		goto out_free_mappings;
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 52b777b7d407..30e0e04c9f16 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -26,6 +26,8 @@
 #define KVM_ARM64_DEBUG_DIRTY_SHIFT	0
 #define KVM_ARM64_DEBUG_DIRTY		(1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
 
+#define kvm_ksym_ref(sym)	((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
+
 #ifndef __ASSEMBLY__
 struct kvm;
 struct kvm_vcpu;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 689d4c95e12f..e3d67ff8798b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -307,7 +307,7 @@ static inline void kvm_arch_mmu_notifier_invalidate_page(struct kvm *kvm,
 struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
 struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
 
-u64 kvm_call_hyp(void *hypfn, ...);
+u64 __kvm_call_hyp(void *hypfn, ...);
 void force_vm_exit(const cpumask_t *mask);
 void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
@@ -328,8 +328,8 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 	 * Call initialization code, and switch to the full blown
 	 * HYP code.
 	 */
-	kvm_call_hyp((void *)boot_pgd_ptr, pgd_ptr,
-		     hyp_stack_ptr, vector_ptr);
+	__kvm_call_hyp((void *)boot_pgd_ptr, pgd_ptr,
+		       hyp_stack_ptr, vector_ptr);
 }
 
 static inline void kvm_arch_hardware_disable(void) {}
@@ -343,4 +343,6 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
 
+#define kvm_call_hyp(f, ...) __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__)
+
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 0ccdcbbef3c2..870578f84b1c 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -20,7 +20,7 @@
 #include <asm/assembler.h>
 
 /*
- * u64 kvm_call_hyp(void *hypfn, ...);
+ * u64 __kvm_call_hyp(void *hypfn, ...);
  *
  * This is not really a variadic function in the classic C-way and care must
  * be taken when calling this to ensure parameters are passed in registers
@@ -37,7 +37,7 @@
  * used to implement __hyp_get_vectors in the same way as in
  * arch/arm64/kernel/hyp_stub.S.
  */
-ENTRY(kvm_call_hyp)
+ENTRY(__kvm_call_hyp)
 	hvc	#0
 	ret
-ENDPROC(kvm_call_hyp)
+ENDPROC(__kvm_call_hyp)
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread
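
kvm_ksym_ref() on arm64 simply rebases a kernel-image address onto its
linear-map alias, which is the range the HYP offset mapping is derived from
(on 32-bit ARM it stays an identity macro). The standalone model below
reproduces that arithmetic; the address constants and the symbol location are
example values only.

/*
 * Standalone model of the arm64 kvm_ksym_ref() arithmetic:
 * kernel-image address -> linear-map alias.
 */
#include <stdint.h>
#include <stdio.h>

#define PAGE_OFFSET	0xffff800000000000ULL	/* linear map base (example) */
#define KIMAGE_VADDR	0xffff000008000000ULL	/* kernel image base (example) */

static uint64_t kvm_ksym_ref(uint64_t image_va)
{
	return image_va - KIMAGE_VADDR + PAGE_OFFSET;
}

int main(void)
{
	/* pretend __kvm_hyp_vector ended up at this image address */
	uint64_t hyp_vector_va = KIMAGE_VADDR + 0x123000ULL;

	printf("image  VA %#llx\n", (unsigned long long)hyp_vector_va);
	printf("linear VA %#llx\n",
	       (unsigned long long)kvm_ksym_ref(hyp_vector_va));
	return 0;
}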

* [PATCH v4 06/22] arm64: add support for ioremap() block mappings
  2016-01-26 17:10 ` Ard Biesheuvel
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

This wires up the existing generic huge-vmap feature, which allows
ioremap() to use PMD or PUD sized block mappings.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 Documentation/features/vm/huge-vmap/arch-support.txt |  2 +-
 arch/arm64/Kconfig                                   |  1 +
 arch/arm64/include/asm/memory.h                      |  6 +++
 arch/arm64/mm/mmu.c                                  | 41 ++++++++++++++++++++
 4 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/Documentation/features/vm/huge-vmap/arch-support.txt b/Documentation/features/vm/huge-vmap/arch-support.txt
index af6816bccb43..df1d1f3c9af2 100644
--- a/Documentation/features/vm/huge-vmap/arch-support.txt
+++ b/Documentation/features/vm/huge-vmap/arch-support.txt
@@ -9,7 +9,7 @@
     |       alpha: | TODO |
     |         arc: | TODO |
     |         arm: | TODO |
-    |       arm64: | TODO |
+    |       arm64: |  ok  |
     |       avr32: | TODO |
     |    blackfin: | TODO |
     |         c6x: | TODO |
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 8cc62289a63e..cd767fa3037a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -49,6 +49,7 @@ config ARM64
 	select HAVE_ALIGNED_STRUCT_PAGE if SLUB
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_ARCH_BITREVERSE
+	select HAVE_ARCH_HUGE_VMAP
 	select HAVE_ARCH_JUMP_LABEL
 	select HAVE_ARCH_KASAN if SPARSEMEM_VMEMMAP && !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
 	select HAVE_ARCH_KGDB
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index bea9631b34a8..aebc739f5a11 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -106,6 +106,12 @@
 #define MT_S2_NORMAL		0xf
 #define MT_S2_DEVICE_nGnRE	0x1
 
+#ifdef CONFIG_ARM64_4K_PAGES
+#define IOREMAP_MAX_ORDER	(PUD_SHIFT)
+#else
+#define IOREMAP_MAX_ORDER	(PMD_SHIFT)
+#endif
+
 #ifndef __ASSEMBLY__
 
 extern phys_addr_t		memstart_addr;
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index cb3a7bdb4e23..b84915723ea0 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -710,3 +710,44 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
 
 	return dt_virt;
 }
+
+int __init arch_ioremap_pud_supported(void)
+{
+	/* only 4k granule supports level 1 block mappings */
+	return IS_ENABLED(CONFIG_ARM64_4K_PAGES);
+}
+
+int __init arch_ioremap_pmd_supported(void)
+{
+	return 1;
+}
+
+int pud_set_huge(pud_t *pud, phys_addr_t phys, pgprot_t prot)
+{
+	BUG_ON(phys & ~PUD_MASK);
+	set_pud(pud, __pud(phys | PUD_TYPE_SECT | pgprot_val(mk_sect_prot(prot))));
+	return 1;
+}
+
+int pmd_set_huge(pmd_t *pmd, phys_addr_t phys, pgprot_t prot)
+{
+	BUG_ON(phys & ~PMD_MASK);
+	set_pmd(pmd, __pmd(phys | PMD_TYPE_SECT | pgprot_val(mk_sect_prot(prot))));
+	return 1;
+}
+
+int pud_clear_huge(pud_t *pud)
+{
+	if (!pud_sect(*pud))
+		return 0;
+	pud_clear(pud);
+	return 1;
+}
+
+int pmd_clear_huge(pmd_t *pmd)
+{
+	if (!pmd_sect(*pmd))
+		return 0;
+	pmd_clear(pmd);
+	return 1;
+}
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread
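
What the new hooks buy is that a large, well-aligned ioremap() range can be
covered by level 1 or level 2 block entries instead of individual pages. The
standalone model below only illustrates the size/alignment arithmetic behind
that choice; the real decision is made by the generic ioremap code through
arch_ioremap_pud_supported()/arch_ioremap_pmd_supported(), and the block sizes
shown assume a 4K granule.

/*
 * Standalone model: pick the largest mapping size (block or page) that
 * the current alignment and remaining length allow.
 */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE	(1ULL << 12)
#define PMD_SIZE	(1ULL << 21)	/* 2 MB block, 4K granule (example) */
#define PUD_SIZE	(1ULL << 30)	/* 1 GB block, 4K granule (example) */

static int pud_blocks_supported(void) { return 1; }	/* e.g. 4K granule */
static int pmd_blocks_supported(void) { return 1; }

static void map_range(uint64_t phys, uint64_t size)
{
	uint64_t mapped = 0;

	while (mapped < size) {
		uint64_t addr = phys + mapped, left = size - mapped, step;

		if (pud_blocks_supported() &&
		    !(addr & (PUD_SIZE - 1)) && left >= PUD_SIZE)
			step = PUD_SIZE;		/* level 1 block */
		else if (pmd_blocks_supported() &&
			 !(addr & (PMD_SIZE - 1)) && left >= PMD_SIZE)
			step = PMD_SIZE;		/* level 2 block */
		else
			step = PAGE_SIZE;		/* fall back to pages */

		printf("map %#llx + %#llx\n",
		       (unsigned long long)addr, (unsigned long long)step);
		mapped += step;
	}
}

int main(void)
{
	/* 1 GB + 4 MB + 12 KB starting on a 1 GB boundary */
	map_range(0x40000000ULL, PUD_SIZE + 2 * PMD_SIZE + 3 * PAGE_SIZE);
	return 0;
}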

* [PATCH v4 07/22] arm64: move kernel image to base of vmalloc area
  2016-01-26 17:10 ` Ard Biesheuvel
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

This moves the module area to directly below the vmalloc area, and
moves the kernel image to the base of the vmalloc area. This is
an intermediate step towards implementing KASLR, which allows the
kernel image to be located anywhere in the vmalloc area.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/kasan.h   |  2 +-
 arch/arm64/include/asm/memory.h  | 21 +++--
 arch/arm64/include/asm/pgtable.h | 10 +-
 arch/arm64/mm/dump.c             | 12 +--
 arch/arm64/mm/init.c             | 23 ++---
 arch/arm64/mm/kasan_init.c       | 18 +++-
 arch/arm64/mm/mmu.c              | 97 +++++++++++++-------
 7 files changed, 116 insertions(+), 67 deletions(-)

diff --git a/arch/arm64/include/asm/kasan.h b/arch/arm64/include/asm/kasan.h
index de0d21211c34..71ad0f93eb71 100644
--- a/arch/arm64/include/asm/kasan.h
+++ b/arch/arm64/include/asm/kasan.h
@@ -14,7 +14,7 @@
  * KASAN_SHADOW_END: KASAN_SHADOW_START + 1/8 of kernel virtual addresses.
  */
 #define KASAN_SHADOW_START      (VA_START)
-#define KASAN_SHADOW_END        (KASAN_SHADOW_START + (1UL << (VA_BITS - 3)))
+#define KASAN_SHADOW_END        (KASAN_SHADOW_START + KASAN_SHADOW_SIZE)
 
 /*
  * This value is used to map an address to the corresponding shadow
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index aebc739f5a11..4388651d1f0d 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -45,16 +45,15 @@
  * VA_START - the first kernel virtual address.
  * TASK_SIZE - the maximum size of a user space task.
  * TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area.
- * The module space lives between the addresses given by TASK_SIZE
- * and PAGE_OFFSET - it must be within 128MB of the kernel text.
  */
 #define VA_BITS			(CONFIG_ARM64_VA_BITS)
 #define VA_START		(UL(0xffffffffffffffff) << VA_BITS)
 #define PAGE_OFFSET		(UL(0xffffffffffffffff) << (VA_BITS - 1))
-#define KIMAGE_VADDR		(PAGE_OFFSET)
-#define MODULES_END		(KIMAGE_VADDR)
-#define MODULES_VADDR		(MODULES_END - SZ_64M)
-#define PCI_IO_END		(MODULES_VADDR - SZ_2M)
+#define KIMAGE_VADDR		(MODULES_END)
+#define MODULES_END		(MODULES_VADDR + MODULES_VSIZE)
+#define MODULES_VADDR		(VA_START + KASAN_SHADOW_SIZE)
+#define MODULES_VSIZE		(SZ_64M)
+#define PCI_IO_END		(PAGE_OFFSET - SZ_2M)
 #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
 #define FIXADDR_TOP		(PCI_IO_START - SZ_2M)
 #define TASK_SIZE_64		(UL(1) << VA_BITS)
@@ -72,6 +71,16 @@
 #define TASK_UNMAPPED_BASE	(PAGE_ALIGN(TASK_SIZE / 4))
 
 /*
+ * The size of the KASAN shadow region. This should be 1/8th of the
+ * size of the entire kernel virtual address space.
+ */
+#ifdef CONFIG_KASAN
+#define KASAN_SHADOW_SIZE	(UL(1) << (VA_BITS - 3))
+#else
+#define KASAN_SHADOW_SIZE	(0)
+#endif
+
+/*
  * Physical vs virtual RAM address space conversion.  These are
  * private definitions which should NOT be used outside memory.h
  * files.  Use virt_to_phys/phys_to_virt/__pa/__va instead.
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index e9eaf6e9262d..550d21574d0d 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -36,19 +36,13 @@
  *
  * VMEMAP_SIZE: allows the whole VA space to be covered by a struct page array
  *	(rounded up to PUD_SIZE).
- * VMALLOC_START: beginning of the kernel VA space
+ * VMALLOC_START: beginning of the kernel vmalloc space
  * VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space,
  *	fixed mappings and modules
  */
 #define VMEMMAP_SIZE		ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
 
-#ifndef CONFIG_KASAN
-#define VMALLOC_START		(VA_START)
-#else
-#include <asm/kasan.h>
-#define VMALLOC_START		(KASAN_SHADOW_END + SZ_64K)
-#endif
-
+#define VMALLOC_START		(MODULES_END)
 #define VMALLOC_END		(PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
 
 #define vmemmap			((struct page *)(VMALLOC_END + SZ_64K))
diff --git a/arch/arm64/mm/dump.c b/arch/arm64/mm/dump.c
index 5a22a119a74c..e83ffb00560c 100644
--- a/arch/arm64/mm/dump.c
+++ b/arch/arm64/mm/dump.c
@@ -35,7 +35,9 @@ struct addr_marker {
 };
 
 enum address_markers_idx {
-	VMALLOC_START_NR = 0,
+	MODULES_START_NR = 0,
+	MODULES_END_NR,
+	VMALLOC_START_NR,
 	VMALLOC_END_NR,
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 	VMEMMAP_START_NR,
@@ -45,12 +47,12 @@ enum address_markers_idx {
 	FIXADDR_END_NR,
 	PCI_START_NR,
 	PCI_END_NR,
-	MODULES_START_NR,
-	MODUELS_END_NR,
 	KERNEL_SPACE_NR,
 };
 
 static struct addr_marker address_markers[] = {
+	{ MODULES_VADDR,	"Modules start" },
+	{ MODULES_END,		"Modules end" },
 	{ VMALLOC_START,	"vmalloc() Area" },
 	{ VMALLOC_END,		"vmalloc() End" },
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
@@ -61,9 +63,7 @@ static struct addr_marker address_markers[] = {
 	{ FIXADDR_TOP,		"Fixmap end" },
 	{ PCI_IO_START,		"PCI I/O start" },
 	{ PCI_IO_END,		"PCI I/O end" },
-	{ MODULES_VADDR,	"Modules start" },
-	{ MODULES_END,		"Modules end" },
-	{ PAGE_OFFSET,		"Kernel Mapping" },
+	{ PAGE_OFFSET,		"Linear Mapping" },
 	{ -1,			NULL },
 };
 
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f3b061e67bfe..1d627cd8121c 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -36,6 +36,7 @@
 #include <linux/swiotlb.h>
 
 #include <asm/fixmap.h>
+#include <asm/kasan.h>
 #include <asm/memory.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
@@ -302,22 +303,26 @@ void __init mem_init(void)
 #ifdef CONFIG_KASAN
 		  "    kasan   : 0x%16lx - 0x%16lx   (%6ld GB)\n"
 #endif
+		  "    modules : 0x%16lx - 0x%16lx   (%6ld MB)\n"
 		  "    vmalloc : 0x%16lx - 0x%16lx   (%6ld GB)\n"
+		  "      .init : 0x%p" " - 0x%p" "   (%6ld KB)\n"
+		  "      .text : 0x%p" " - 0x%p" "   (%6ld KB)\n"
+		  "      .data : 0x%p" " - 0x%p" "   (%6ld KB)\n"
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 		  "    vmemmap : 0x%16lx - 0x%16lx   (%6ld GB maximum)\n"
 		  "              0x%16lx - 0x%16lx   (%6ld MB actual)\n"
 #endif
 		  "    fixed   : 0x%16lx - 0x%16lx   (%6ld KB)\n"
 		  "    PCI I/O : 0x%16lx - 0x%16lx   (%6ld MB)\n"
-		  "    modules : 0x%16lx - 0x%16lx   (%6ld MB)\n"
-		  "    memory  : 0x%16lx - 0x%16lx   (%6ld MB)\n"
-		  "      .init : 0x%p" " - 0x%p" "   (%6ld KB)\n"
-		  "      .text : 0x%p" " - 0x%p" "   (%6ld KB)\n"
-		  "      .data : 0x%p" " - 0x%p" "   (%6ld KB)\n",
+		  "    memory  : 0x%16lx - 0x%16lx   (%6ld MB)\n",
 #ifdef CONFIG_KASAN
 		  MLG(KASAN_SHADOW_START, KASAN_SHADOW_END),
 #endif
+		  MLM(MODULES_VADDR, MODULES_END),
 		  MLG(VMALLOC_START, VMALLOC_END),
+		  MLK_ROUNDUP(__init_begin, __init_end),
+		  MLK_ROUNDUP(_text, _etext),
+		  MLK_ROUNDUP(_sdata, _edata),
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 		  MLG((unsigned long)vmemmap,
 		      (unsigned long)vmemmap + VMEMMAP_SIZE),
@@ -326,11 +331,7 @@ void __init mem_init(void)
 #endif
 		  MLK(FIXADDR_START, FIXADDR_TOP),
 		  MLM(PCI_IO_START, PCI_IO_END),
-		  MLM(MODULES_VADDR, MODULES_END),
-		  MLM(PAGE_OFFSET, (unsigned long)high_memory),
-		  MLK_ROUNDUP(__init_begin, __init_end),
-		  MLK_ROUNDUP(_text, _etext),
-		  MLK_ROUNDUP(_sdata, _edata));
+		  MLM(PAGE_OFFSET, (unsigned long)high_memory));
 
 #undef MLK
 #undef MLM
@@ -358,8 +359,8 @@ void __init mem_init(void)
 
 void free_initmem(void)
 {
-	fixup_init();
 	free_initmem_default(0);
+	fixup_init();
 }
 
 #ifdef CONFIG_BLK_DEV_INITRD
diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index 0ca411fc5ea3..4d0623bfb781 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -17,9 +17,11 @@
 #include <linux/start_kernel.h>
 
 #include <asm/mmu_context.h>
+#include <asm/kernel-pgtable.h>
 #include <asm/page.h>
 #include <asm/pgalloc.h>
 #include <asm/pgtable.h>
+#include <asm/sections.h>
 #include <asm/tlbflush.h>
 
 static pgd_t tmp_pg_dir[PTRS_PER_PGD] __initdata __aligned(PGD_SIZE);
@@ -33,7 +35,7 @@ static void __init kasan_early_pte_populate(pmd_t *pmd, unsigned long addr,
 	if (pmd_none(*pmd))
 		pmd_populate_kernel(&init_mm, pmd, kasan_zero_pte);
 
-	pte = pte_offset_kernel(pmd, addr);
+	pte = pte_offset_kimg(pmd, addr);
 	do {
 		next = addr + PAGE_SIZE;
 		set_pte(pte, pfn_pte(virt_to_pfn(kasan_zero_page),
@@ -51,7 +53,7 @@ static void __init kasan_early_pmd_populate(pud_t *pud,
 	if (pud_none(*pud))
 		pud_populate(&init_mm, pud, kasan_zero_pmd);
 
-	pmd = pmd_offset(pud, addr);
+	pmd = pmd_offset_kimg(pud, addr);
 	do {
 		next = pmd_addr_end(addr, end);
 		kasan_early_pte_populate(pmd, addr, next);
@@ -68,7 +70,7 @@ static void __init kasan_early_pud_populate(pgd_t *pgd,
 	if (pgd_none(*pgd))
 		pgd_populate(&init_mm, pgd, kasan_zero_pud);
 
-	pud = pud_offset(pgd, addr);
+	pud = pud_offset_kimg(pgd, addr);
 	do {
 		next = pud_addr_end(addr, end);
 		kasan_early_pmd_populate(pud, addr, next);
@@ -126,8 +128,12 @@ static void __init clear_pgds(unsigned long start,
 
 void __init kasan_init(void)
 {
+	u64 kimg_shadow_start, kimg_shadow_end;
 	struct memblock_region *reg;
 
+	kimg_shadow_start = (u64)kasan_mem_to_shadow(_text);
+	kimg_shadow_end = (u64)kasan_mem_to_shadow(_end);
+
 	/*
 	 * We are going to perform proper setup of shadow memory.
 	 * At first we should unmap early shadow (clear_pgds() call bellow).
@@ -141,8 +147,12 @@ void __init kasan_init(void)
 
 	clear_pgds(KASAN_SHADOW_START, KASAN_SHADOW_END);
 
+	vmemmap_populate(kimg_shadow_start, kimg_shadow_end, NUMA_NO_NODE);
+
 	kasan_populate_zero_shadow((void *)KASAN_SHADOW_START,
-			kasan_mem_to_shadow((void *)MODULES_VADDR));
+				   (void *)kimg_shadow_start);
+	kasan_populate_zero_shadow((void *)kimg_shadow_end,
+				   kasan_mem_to_shadow((void *)PAGE_OFFSET));
 
 	for_each_memblock(memory, reg) {
 		void *start = (void *)__phys_to_virt(reg->base);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index b84915723ea0..4c4b15932963 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -53,6 +53,10 @@ u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
 unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)] __page_aligned_bss;
 EXPORT_SYMBOL(empty_zero_page);
 
+static pte_t bm_pte[PTRS_PER_PTE] __page_aligned_bss;
+static pmd_t bm_pmd[PTRS_PER_PMD] __page_aligned_bss;
+static pud_t bm_pud[PTRS_PER_PUD] __page_aligned_bss;
+
 pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 			      unsigned long size, pgprot_t vma_prot)
 {
@@ -349,14 +353,14 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
 {
 
 	unsigned long kernel_start = __pa(_stext);
-	unsigned long kernel_end = __pa(_end);
+	unsigned long kernel_end = __pa(_etext);
 
 	/*
-	 * The kernel itself is mapped at page granularity. Map all other
-	 * memory, making sure we don't overwrite the existing kernel mappings.
+	 * Take care not to create a writable alias for the
+	 * read-only text and rodata sections of the kernel image.
 	 */
 
-	/* No overlap with the kernel. */
+	/* No overlap with the kernel text */
 	if (end < kernel_start || start >= kernel_end) {
 		__create_pgd_mapping(pgd, start, __phys_to_virt(start),
 				     end - start, PAGE_KERNEL,
@@ -365,7 +369,7 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
 	}
 
 	/*
-	 * This block overlaps the kernel mapping. Map the portion(s) which
+	 * This block overlaps the kernel text mapping. Map the portion(s) which
 	 * don't overlap.
 	 */
 	if (start < kernel_start)
@@ -398,25 +402,28 @@ static void __init map_mem(pgd_t *pgd)
 	}
 }
 
-#ifdef CONFIG_DEBUG_RODATA
 void mark_rodata_ro(void)
 {
+	if (!IS_ENABLED(CONFIG_DEBUG_RODATA))
+		return;
+
 	create_mapping_late(__pa(_stext), (unsigned long)_stext,
 				(unsigned long)_etext - (unsigned long)_stext,
 				PAGE_KERNEL_ROX);
-
 }
-#endif
 
 void fixup_init(void)
 {
-	create_mapping_late(__pa(__init_begin), (unsigned long)__init_begin,
-			(unsigned long)__init_end - (unsigned long)__init_begin,
-			PAGE_KERNEL);
+	/*
+	 * Unmap the __init region but leave the VM area in place. This
+	 * prevents the region from being reused for kernel modules, which
+	 * is not supported by kallsyms.
+	 */
+	unmap_kernel_range((u64)__init_begin, (u64)(__init_end - __init_begin));
 }
 
 static void __init map_kernel_chunk(pgd_t *pgd, void *va_start, void *va_end,
-				    pgprot_t prot)
+				    pgprot_t prot, struct vm_struct *vma)
 {
 	phys_addr_t pa_start = __pa(va_start);
 	unsigned long size = va_end - va_start;
@@ -426,6 +433,14 @@ static void __init map_kernel_chunk(pgd_t *pgd, void *va_start, void *va_end,
 
 	__create_pgd_mapping(pgd, pa_start, (unsigned long)va_start, size, prot,
 			     early_pgtable_alloc);
+
+	vma->addr	= va_start;
+	vma->phys_addr	= pa_start;
+	vma->size	= size;
+	vma->flags	= VM_MAP;
+	vma->caller	= map_kernel_chunk;
+
+	vm_area_add_early(vma);
 }
 
 /*
@@ -433,17 +448,35 @@ static void __init map_kernel_chunk(pgd_t *pgd, void *va_start, void *va_end,
  */
 static void __init map_kernel(pgd_t *pgd)
 {
+	static struct vm_struct vmlinux_text, vmlinux_init, vmlinux_data;
 
-	map_kernel_chunk(pgd, _stext, _etext, PAGE_KERNEL_EXEC);
-	map_kernel_chunk(pgd, __init_begin, __init_end, PAGE_KERNEL_EXEC);
-	map_kernel_chunk(pgd, _data, _end, PAGE_KERNEL);
+	map_kernel_chunk(pgd, _stext, _etext, PAGE_KERNEL_EXEC, &vmlinux_text);
+	map_kernel_chunk(pgd, __init_begin, __init_end, PAGE_KERNEL_EXEC,
+			 &vmlinux_init);
+	map_kernel_chunk(pgd, _data, _end, PAGE_KERNEL, &vmlinux_data);
 
-	/*
-	 * The fixmap falls in a separate pgd to the kernel, and doesn't live
-	 * in the carveout for the swapper_pg_dir. We can simply re-use the
-	 * existing dir for the fixmap.
-	 */
-	set_pgd(pgd_offset_raw(pgd, FIXADDR_START), *pgd_offset_k(FIXADDR_START));
+	if (!pgd_val(*pgd_offset_raw(pgd, FIXADDR_START))) {
+		/*
+		 * The fixmap falls in a separate pgd to the kernel, and doesn't
+		 * live in the carveout for the swapper_pg_dir. We can simply
+		 * re-use the existing dir for the fixmap.
+		 */
+		set_pgd(pgd_offset_raw(pgd, FIXADDR_START),
+			*pgd_offset_k(FIXADDR_START));
+	} else if (CONFIG_PGTABLE_LEVELS > 3) {
+		/*
+		 * The fixmap shares its top level pgd entry with the kernel
+		 * mapping. This can really only occur when we are running
+		 * with 16k/4 levels, so we can simply reuse the pud level
+		 * entry instead.
+		 */
+		BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
+		set_pud(pud_set_fixmap_offset(pgd, FIXADDR_START),
+			__pud(__pa(bm_pmd) | PUD_TYPE_TABLE));
+		pud_clear_fixmap();
+	} else {
+		BUG();
+	}
 
 	kasan_copy_shadow(pgd);
 }
@@ -569,14 +602,6 @@ void vmemmap_free(unsigned long start, unsigned long end)
 }
 #endif	/* CONFIG_SPARSEMEM_VMEMMAP */
 
-static pte_t bm_pte[PTRS_PER_PTE] __page_aligned_bss;
-#if CONFIG_PGTABLE_LEVELS > 2
-static pmd_t bm_pmd[PTRS_PER_PMD] __page_aligned_bss;
-#endif
-#if CONFIG_PGTABLE_LEVELS > 3
-static pud_t bm_pud[PTRS_PER_PUD] __page_aligned_bss;
-#endif
-
 static inline pud_t * fixmap_pud(unsigned long addr)
 {
 	pgd_t *pgd = pgd_offset_k(addr);
@@ -608,8 +633,18 @@ void __init early_fixmap_init(void)
 	unsigned long addr = FIXADDR_START;
 
 	pgd = pgd_offset_k(addr);
-	pgd_populate(&init_mm, pgd, bm_pud);
-	pud = fixmap_pud(addr);
+	if (CONFIG_PGTABLE_LEVELS > 3 && !pgd_none(*pgd)) {
+		/*
+		 * We only end up here if the kernel mapping and the fixmap
+		 * share the top level pgd entry, which should only happen on
+		 * 16k/4 levels configurations.
+		 */
+		BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
+		pud = pud_offset_kimg(pgd, addr);
+	} else {
+		pgd_populate(&init_mm, pgd, bm_pud);
+		pud = fixmap_pud(addr);
+	}
 	pud_populate(&init_mm, pud, bm_pmd);
 	pmd = fixmap_pmd(addr);
 	pmd_populate_kernel(&init_mm, pmd, bm_pte);
-- 
2.5.0
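
A note on the 16k/4-levels special case handled in map_kernel() and
early_fixmap_init() above: with a 16 KB granule each translation level
resolves 11 bits, so a single top-level pgd entry spans 2^(14 + 3*11) =
2^47 bytes. With VA_BITS == 48 that leaves only two pgd entries for the
entire kernel half of the address space, and since the fixmap sits just
below PAGE_OFFSET (the 2^47 boundary) while the kernel image now lives
near VA_START, both fall into pgd entry 0, which is why the code shares
the entry and descends to the pud level instead.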

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 08/22] arm64: add support for module PLTs
  2016-01-26 17:10 ` Ard Biesheuvel
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

This adds support for emitting PLTs at module load time for relative
branches that are out of range. This is a prerequisite for KASLR, which
may place the kernel and the modules anywhere in the vmalloc area,
making it more likely that branch target offsets exceed the maximum
range of +/- 128 MB.
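
(The 128 MB figure follows from the instruction encoding: B and BL carry
a 26-bit signed immediate counted in 4-byte units, i.e. a reach of
+/- 2^27 bytes.)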

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---

In this version, I removed the distinction between relocations against
.init executable sections and ordinary executable sections. The reason
is that it is hardly worth the trouble, given that .init.text usually
does not contain that many far branches, and this version now only
reserves PLT entry space for jump and call relocations against undefined
symbols (since symbols defined in the same module can be assumed to be
within +/- 128 MB).

For example, the mac80211.ko module (which is fairly sizable at ~400 KB)
built with -mcmodel=large gives the following relocation counts:

                    relocs    branches   unique     !local
  .text              3925       3347       518        219
  .init.text           11          8         7          1
  .exit.text            4          4         4          1
  .text.unlikely       81         67        36         17

('unique' means branches to unique type/symbol/addend combos, of which
!local is the subset referring to undefined symbols)

IOW, we are only emitting a single PLT entry for the .init sections, and
we are better off just adding it to the core PLT section instead.
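
To make the MOVN/MOVK encoding in module_emit_plt_entry() easier to
follow, here is a worked example for a hypothetical branch target
0xffff000008123456 (not taken from a real build):

/*
 *   movn x16, #0xcba9              // x16 = 0xffffffffffff3456
 *   movk x16, #0x0812, lsl #16     // x16 = 0xffffffff08123456
 *   movk x16, #0x0000, lsl #32     // x16 = 0xffff000008123456
 *   br   x16
 *
 * The MOVN of the inverted low 16 bits sets bits [63:16] to all ones, so
 * the two MOVKs only need to patch in bits [47:16]; bits [63:48] are
 * already correct for a kernel virtual address.
 */
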
---
 arch/arm64/Kconfig              |   9 +
 arch/arm64/Makefile             |   6 +-
 arch/arm64/include/asm/module.h |  11 ++
 arch/arm64/kernel/Makefile      |   1 +
 arch/arm64/kernel/module-plts.c | 201 ++++++++++++++++++++
 arch/arm64/kernel/module.c      |  12 ++
 arch/arm64/kernel/module.lds    |   3 +
 7 files changed, 242 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index cd767fa3037a..141f65ab0ed5 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -394,6 +394,7 @@ config ARM64_ERRATUM_843419
 	bool "Cortex-A53: 843419: A load or store might access an incorrect address"
 	depends on MODULES
 	default y
+	select ARM64_MODULE_CMODEL_LARGE
 	help
 	  This option builds kernel modules using the large memory model in
 	  order to avoid the use of the ADRP instruction, which can cause
@@ -753,6 +754,14 @@ config ARM64_LSE_ATOMICS
 
 endmenu
 
+config ARM64_MODULE_CMODEL_LARGE
+	bool
+
+config ARM64_MODULE_PLTS
+	bool
+	select ARM64_MODULE_CMODEL_LARGE
+	select HAVE_MOD_ARCH_SPECIFIC
+
 endmenu
 
 menu "Boot options"
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index cd822d8454c0..db462980c6be 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -41,10 +41,14 @@ endif
 
 CHECKFLAGS	+= -D__aarch64__
 
-ifeq ($(CONFIG_ARM64_ERRATUM_843419), y)
+ifeq ($(CONFIG_ARM64_MODULE_CMODEL_LARGE), y)
 KBUILD_CFLAGS_MODULE	+= -mcmodel=large
 endif
 
+ifeq ($(CONFIG_ARM64_MODULE_PLTS),y)
+KBUILD_LDFLAGS_MODULE	+= -T $(srctree)/arch/arm64/kernel/module.lds
+endif
+
 # Default value
 head-y		:= arch/arm64/kernel/head.o
 
diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
index e80e232b730e..8652fb613304 100644
--- a/arch/arm64/include/asm/module.h
+++ b/arch/arm64/include/asm/module.h
@@ -20,4 +20,15 @@
 
 #define MODULE_ARCH_VERMAGIC	"aarch64"
 
+#ifdef CONFIG_ARM64_MODULE_PLTS
+struct mod_arch_specific {
+	struct elf64_shdr	*plt;
+	int			plt_num_entries;
+	int			plt_max_entries;
+};
+#endif
+
+u64 module_emit_plt_entry(struct module *mod, const Elf64_Rela *rela,
+			  Elf64_Sym *sym);
+
 #endif /* __ASM_MODULE_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 83cd7e68e83b..e2f0a755beaa 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -30,6 +30,7 @@ arm64-obj-$(CONFIG_COMPAT)		+= sys32.o kuser32.o signal32.o 	\
 					   ../../arm/kernel/opcodes.o
 arm64-obj-$(CONFIG_FUNCTION_TRACER)	+= ftrace.o entry-ftrace.o
 arm64-obj-$(CONFIG_MODULES)		+= arm64ksyms.o module.o
+arm64-obj-$(CONFIG_ARM64_MODULE_PLTS)	+= module-plts.o
 arm64-obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o perf_callchain.o
 arm64-obj-$(CONFIG_HW_PERF_EVENTS)	+= perf_event.o
 arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= hw_breakpoint.o
diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
new file mode 100644
index 000000000000..1ce90d8450ae
--- /dev/null
+++ b/arch/arm64/kernel/module-plts.c
@@ -0,0 +1,201 @@
+/*
+ * Copyright (C) 2014-2016 Linaro Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/elf.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/sort.h>
+
+struct plt_entry {
+	/*
+	 * A program that conforms to the AArch64 Procedure Call Standard
+	 * (AAPCS64) must assume that a veneer that alters IP0 (x16) and/or
+	 * IP1 (x17) may be inserted at any branch instruction that is
+	 * exposed to a relocation that supports long branches. Since that
+	 * is exactly what we are dealing with here, we are free to use x16
+	 * as a scratch register in the PLT veneers.
+	 */
+	__le32	mov0;	/* movn	x16, #0x....			*/
+	__le32	mov1;	/* movk	x16, #0x...., lsl #16		*/
+	__le32	mov2;	/* movk	x16, #0x...., lsl #32		*/
+	__le32	br;	/* br	x16				*/
+};
+
+u64 module_emit_plt_entry(struct module *mod, const Elf64_Rela *rela,
+			  Elf64_Sym *sym)
+{
+	struct plt_entry *plt = (struct plt_entry *)mod->arch.plt->sh_addr;
+	int i = mod->arch.plt_num_entries;
+	u64 val = sym->st_value + rela->r_addend;
+
+	/*
+	 * We only emit PLT entries against undefined (SHN_UNDEF) symbols,
+	 * which are listed in the ELF symtab section, but without a type
+	 * or a size.
+	 * So, similar to how the module loader uses the Elf64_Sym::st_value
+	 * field to store the resolved addresses of undefined symbols, let's
+	 * borrow the Elf64_Sym::st_size field (whose value is never used by
+	 * the module loader, even for symbols that are defined) to record
+	 * the address of a symbol's associated PLT entry as we emit it for a
+	 * zero addend relocation (which is the only kind we have to deal with
+	 * in practice). This allows us to find duplicates without having to
+	 * go through the table every time.
+	 */
+	if (rela->r_addend == 0 && sym->st_size != 0) {
+		BUG_ON(sym->st_size < (u64)plt || sym->st_size >= (u64)&plt[i]);
+		return sym->st_size;
+	}
+
+	mod->arch.plt_num_entries++;
+	BUG_ON(mod->arch.plt_num_entries > mod->arch.plt_max_entries);
+
+	/*
+	 * MOVK/MOVN/MOVZ opcode:
+	 * +--------+------------+--------+-----------+-------------+---------+
+	 * | sf[31] | opc[30:29] | 100101 | hw[22:21] | imm16[20:5] | Rd[4:0] |
+	 * +--------+------------+--------+-----------+-------------+---------+
+	 *
+	 * Rd     := 0x10 (x16)
+	 * hw     := 0b00 (no shift), 0b01 (lsl #16), 0b10 (lsl #32)
+	 * opc    := 0b11 (MOVK), 0b00 (MOVN), 0b10 (MOVZ)
+	 * sf     := 1 (64-bit variant)
+	 */
+	plt[i] = (struct plt_entry){
+		cpu_to_le32(0x92800010 | (((~val      ) & 0xffff)) << 5),
+		cpu_to_le32(0xf2a00010 | ((( val >> 16) & 0xffff)) << 5),
+		cpu_to_le32(0xf2c00010 | ((( val >> 32) & 0xffff)) << 5),
+		cpu_to_le32(0xd61f0200)
+	};
+
+	if (rela->r_addend == 0)
+		sym->st_size = (u64)&plt[i];
+
+	return (u64)&plt[i];
+}
+
+#define cmp_3way(a,b)	((a) < (b) ? -1 : (a) > (b))
+
+static int cmp_rela(const void *a, const void *b)
+{
+	const Elf64_Rela *x = a, *y = b;
+	int i;
+
+	/* sort by type, symbol index and addend */
+	i = cmp_3way(ELF64_R_TYPE(x->r_info), ELF64_R_TYPE(y->r_info));
+	if (i == 0)
+		i = cmp_3way(ELF64_R_SYM(x->r_info), ELF64_R_SYM(y->r_info));
+	if (i == 0)
+		i = cmp_3way(x->r_addend, y->r_addend);
+	return i;
+}
+
+static bool duplicate_rel(const Elf64_Rela *rela, int num)
+{
+	/*
+	 * Entries are sorted by type, symbol index and addend. That means
+	 * that, if a duplicate entry exists, it must be in the preceding
+	 * slot.
+	 */
+	return num > 0 && cmp_rela(rela + num, rela + num - 1) == 0;
+}
+
+static unsigned int count_plts(Elf64_Sym *syms, Elf64_Rela *rela, int num)
+{
+	unsigned int ret = 0;
+	Elf64_Sym *s;
+	int i;
+
+	for (i = 0; i < num; i++) {
+		switch (ELF64_R_TYPE(rela[i].r_info)) {
+		case R_AARCH64_JUMP26:
+		case R_AARCH64_CALL26:
+			/*
+			 * We only have to consider branch targets that resolve
+			 * to undefined symbols. This is not simply a heuristic,
+			 * it is a fundamental limitation, since the PLT itself
+			 * is part of the module, and needs to be within 128 MB
+			 * as well, so modules can never grow beyond that limit.
+			 */
+			s = syms + ELF64_R_SYM(rela[i].r_info);
+			if (s->st_shndx != SHN_UNDEF)
+				break;
+
+			/*
+			 * Jump relocations with non-zero addends against
+			 * undefined symbols are supported by the ELF spec, but
+			 * do not occur in practice (e.g., 'jump n bytes past
+			 * the entry point of undefined function symbol f').
+			 * So we need to support them, but there is no need to
+			 * take them into consideration when trying to optimize
+			 * this code. So let's only check for duplicates when
+			 * the addend is zero: this allows us to record the PLT
+			 * entry address in the symbol table itself, rather than
+			 * having to search the list for duplicates each time we
+			 * emit one.
+			 */
+			if (rela[i].r_addend != 0 || !duplicate_rel(rela, i))
+				ret++;
+			break;
+		}
+	}
+	return ret;
+}
+
+int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
+			      char *secstrings, struct module *mod)
+{
+	unsigned long plt_max_entries = 0;
+	Elf64_Sym *syms = NULL;
+	int i;
+
+	/*
+	 * Find the empty .plt section so we can expand it to store the PLT
+	 * entries. Record the symtab address as well.
+	 */
+	for (i = 0; i < ehdr->e_shnum; i++) {
+		if (strcmp(".plt", secstrings + sechdrs[i].sh_name) == 0)
+			mod->arch.plt = sechdrs + i;
+		else if (sechdrs[i].sh_type == SHT_SYMTAB)
+			syms = (Elf64_Sym *)sechdrs[i].sh_addr;
+	}
+
+	if (!mod->arch.plt) {
+		pr_err("%s: module PLT section missing\n", mod->name);
+		return -ENOEXEC;
+	}
+	if (!syms) {
+		pr_err("%s: module symtab section missing\n", mod->name);
+		return -ENOEXEC;
+	}
+
+	for (i = 0; i < ehdr->e_shnum; i++) {
+		Elf64_Rela *rels = (void *)ehdr + sechdrs[i].sh_offset;
+		int numrels = sechdrs[i].sh_size / sizeof(Elf64_Rela);
+		Elf64_Shdr *dstsec = sechdrs + sechdrs[i].sh_info;
+
+		if (sechdrs[i].sh_type != SHT_RELA)
+			continue;
+
+		/* ignore relocations that operate on non-exec sections */
+		if (!(dstsec->sh_flags & SHF_EXECINSTR))
+			continue;
+
+		/* sort by type, symbol index and addend */
+		sort(rels, numrels, sizeof(Elf64_Rela), cmp_rela, NULL);
+
+		plt_max_entries += count_plts(syms, rels, numrels);
+	}
+
+	mod->arch.plt->sh_type = SHT_NOBITS;
+	mod->arch.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+	mod->arch.plt->sh_addralign = L1_CACHE_BYTES;
+	mod->arch.plt->sh_size = plt_max_entries * sizeof(struct plt_entry);
+	mod->arch.plt_num_entries = 0;
+	mod->arch.plt_max_entries = plt_max_entries;
+	return 0;
+}
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 93e970231ca9..84113d3e1df1 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -38,6 +38,11 @@ void *module_alloc(unsigned long size)
 				GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
 				NUMA_NO_NODE, __builtin_return_address(0));
 
+	if (!p && IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
+		p = __vmalloc_node_range(size, MODULE_ALIGN, VMALLOC_START,
+				VMALLOC_END, GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
+				NUMA_NO_NODE, __builtin_return_address(0));
+
 	if (p && (kasan_module_alloc(p, size) < 0)) {
 		vfree(p);
 		return NULL;
@@ -361,6 +366,13 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 		case R_AARCH64_CALL26:
 			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 26,
 					     AARCH64_INSN_IMM_26);
+
+			if (IS_ENABLED(CONFIG_ARM64_MODULE_PLTS) &&
+			    ovf == -ERANGE) {
+				val = module_emit_plt_entry(me, &rel[i], sym);
+				ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2,
+						     26, AARCH64_INSN_IMM_26);
+			}
 			break;
 
 		default:
diff --git a/arch/arm64/kernel/module.lds b/arch/arm64/kernel/module.lds
new file mode 100644
index 000000000000..8949f6c6f729
--- /dev/null
+++ b/arch/arm64/kernel/module.lds
@@ -0,0 +1,3 @@
+SECTIONS {
+	.plt (NOLOAD) : { BYTE(0) }
+}
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread
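
Editorial note, not part of the posted patch: the movn/movk/movk sequence built by module_emit_plt_entry() above is easiest to follow with a concrete value. The sketch below uses a made-up branch target; it relies on arm64 kernel virtual addresses having all ones in bits [63:48], which is why the veneer can start with MOVN and still get away with only three moves.

	#include <linux/types.h>

	/* Illustrative sketch only: encode the PLT veneer for a hypothetical target */
	static void plt_encoding_example(void)
	{
		u64 val = 0xffffffc012345678UL;	/* made-up module/vmalloc address */
		u32 insn[4];

		insn[0] = 0x92800010 | ((~val        & 0xffff) << 5);
				/* movn x16, #0xa987           x16 = 0xffffffffffff5678 */
		insn[1] = 0xf2a00010 | (((val >> 16) & 0xffff) << 5);
				/* movk x16, #0x1234, lsl #16  x16 = 0xffffffff12345678 */
		insn[2] = 0xf2c00010 | (((val >> 32) & 0xffff) << 5);
				/* movk x16, #0xffc0, lsl #32  x16 = 0xffffffc012345678 */
		insn[3] = 0xd61f0200;	/* br x16 */
		(void)insn;
	}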

* [PATCH v4 09/22] extable: add support for relative extables to search and sort routines
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

This adds support to the generic search_extable() and sort_extable()
implementations for dealing with exception table entries whose fields
contain relative offsets rather than absolute addresses.
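
As an illustration (not part of the patch; the addresses are made up): if an entry lives at address E and describes a faulting instruction at address I, the relative insn field stores the 32-bit difference between I and the field's own location rather than I itself. Under KASLR, both addresses shift by the same runtime offset, so the stored difference never needs a dynamic relocation; recovering the absolute address is a single addition, as the ex_to_insn() helper below does:

	/* illustrative sketch mirroring the ex_to_insn() helper below */
	static inline unsigned long resolve_insn(const struct exception_table_entry *x)
	{
		/* e.g. an entry at 0xffff000008d00000 storing -0xbdcbb0
		 * resolves to 0xffff000008123450, wherever the image runs */
		return (unsigned long)&x->insn + x->insn;
	}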

Acked-by: Helge Deller <deller@gmx.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 lib/extable.c | 50 ++++++++++++++++----
 1 file changed, 41 insertions(+), 9 deletions(-)

diff --git a/lib/extable.c b/lib/extable.c
index 4cac81ec225e..0be02ad561e9 100644
--- a/lib/extable.c
+++ b/lib/extable.c
@@ -14,7 +14,37 @@
 #include <linux/sort.h>
 #include <asm/uaccess.h>
 
+#ifndef ARCH_HAS_RELATIVE_EXTABLE
+#define ex_to_insn(x)	((x)->insn)
+#else
+static inline unsigned long ex_to_insn(const struct exception_table_entry *x)
+{
+	return (unsigned long)&x->insn + x->insn;
+}
+#endif
+
 #ifndef ARCH_HAS_SORT_EXTABLE
+#ifndef ARCH_HAS_RELATIVE_EXTABLE
+#define swap_ex		NULL
+#else
+static void swap_ex(void *a, void *b, int size)
+{
+	struct exception_table_entry *x = a, *y = b, tmp;
+	int delta = b - a;
+
+	tmp = *x;
+	x->insn = y->insn + delta;
+	y->insn = tmp.insn - delta;
+
+#ifdef swap_ex_entry_fixup
+	swap_ex_entry_fixup(x, y, tmp, delta);
+#else
+	x->fixup = y->fixup + delta;
+	y->fixup = tmp.fixup - delta;
+#endif
+}
+#endif /* ARCH_HAS_RELATIVE_EXTABLE */
+
 /*
  * The exception table needs to be sorted so that the binary
  * search that we use to find entries in it works properly.
@@ -26,9 +56,9 @@ static int cmp_ex(const void *a, const void *b)
 	const struct exception_table_entry *x = a, *y = b;
 
 	/* avoid overflow */
-	if (x->insn > y->insn)
+	if (ex_to_insn(x) > ex_to_insn(y))
 		return 1;
-	if (x->insn < y->insn)
+	if (ex_to_insn(x) < ex_to_insn(y))
 		return -1;
 	return 0;
 }
@@ -37,7 +67,7 @@ void sort_extable(struct exception_table_entry *start,
 		  struct exception_table_entry *finish)
 {
 	sort(start, finish - start, sizeof(struct exception_table_entry),
-	     cmp_ex, NULL);
+	     cmp_ex, swap_ex);
 }
 
 #ifdef CONFIG_MODULES
@@ -48,13 +78,15 @@ void sort_extable(struct exception_table_entry *start,
 void trim_init_extable(struct module *m)
 {
 	/*trim the beginning*/
-	while (m->num_exentries && within_module_init(m->extable[0].insn, m)) {
+	while (m->num_exentries &&
+	       within_module_init(ex_to_insn(&m->extable[0]), m)) {
 		m->extable++;
 		m->num_exentries--;
 	}
 	/*trim the end*/
 	while (m->num_exentries &&
-		within_module_init(m->extable[m->num_exentries-1].insn, m))
+	       within_module_init(ex_to_insn(&m->extable[m->num_exentries - 1]),
+				  m))
 		m->num_exentries--;
 }
 #endif /* CONFIG_MODULES */
@@ -81,13 +113,13 @@ search_extable(const struct exception_table_entry *first,
 		 * careful, the distance between value and insn
 		 * can be larger than MAX_LONG:
 		 */
-		if (mid->insn < value)
+		if (ex_to_insn(mid) < value)
 			first = mid + 1;
-		else if (mid->insn > value)
+		else if (ex_to_insn(mid) > value)
 			last = mid - 1;
 		else
 			return mid;
-        }
-        return NULL;
+	}
+	return NULL;
 }
 #endif
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread
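
Editorial aside on the swap_ex() helper in the lib/extable.c hunk above (the numbers below are made up): because each field is relative to its own location, swapping two entries in place must re-bias the stored offsets by the distance between the two slots, or they would end up pointing delta bytes away from their original targets.

	/*
	 * Illustrative worked example: two adjacent 8-byte slots swapped
	 * by sort():
	 *
	 *   slot A at 0x1000, insn = +0x200  -> target 0x1200
	 *   slot B at 0x1008, insn = +0x300  -> target 0x1308
	 *   delta = B - A = 0x8
	 *
	 * The entry moved into slot A must keep pointing at 0x1308 from its
	 * new, lower location, so its offset grows by delta:
	 *   0x300 + 0x8 = 0x308, and 0x1000 + 0x308 = 0x1308.
	 * The entry moved into slot B shrinks by delta:
	 *   0x200 - 0x8 = 0x1f8, and 0x1008 + 0x1f8 = 0x1200.
	 * This is exactly what x->insn = y->insn + delta and
	 * y->insn = tmp.insn - delta do above, for the fixup field as well.
	 */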

* [PATCH v4 10/22] arm64: switch to relative exception tables
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

Instead of using absolute addresses for both the exception location
and the fixup, use offsets relative to the exception table entry values.
Not only does this cut the size of the exception table in half, it is
also a prerequisite for KASLR, since absolute exception table entries
are subject to dynamic relocation, which is incompatible with the sorting
of the exception table that occurs at build time.

This patch also introduces the _ASM_EXTABLE preprocessor macro (which
exists on x86 as well) and its _asm_extable assembly counterpart, as
shorthands to emit exception table entries.
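
As a minimal usage sketch (not part of the patch; the helper name is made up): a load that is allowed to fault pairs a numbered label with a fixup label, and _ASM_EXTABLE() emits the relative exception table entry tying the two together, in the same style as the uaccess.h fixups converted below.

	/* illustrative only: read one long, returning -EFAULT instead of faulting */
	static inline int probe_read_long(unsigned long *dst, const unsigned long *addr)
	{
		unsigned long val;
		int err = 0;

		asm volatile(
		"1:	ldr	%1, [%2]\n"
		"2:\n"
		"	.section .fixup, \"ax\"\n"
		"	.align	2\n"
		"3:	mov	%w0, %3\n"
		"	mov	%1, #0\n"
		"	b	2b\n"
		"	.previous\n"
		_ASM_EXTABLE(1b, 3b)
		: "+r" (err), "=&r" (val)
		: "r" (addr), "i" (-EFAULT));

		*dst = val;
		return err;
	}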

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/assembler.h      | 15 +++++++---
 arch/arm64/include/asm/futex.h          | 12 +++-----
 arch/arm64/include/asm/uaccess.h        | 30 +++++++++++---------
 arch/arm64/include/asm/word-at-a-time.h |  7 ++---
 arch/arm64/kernel/armv8_deprecated.c    |  7 ++---
 arch/arm64/mm/extable.c                 |  2 +-
 scripts/sortextable.c                   |  2 +-
 7 files changed, 38 insertions(+), 37 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index bb7b72734c24..d8bfcc1ce923 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -94,12 +94,19 @@
 	dmb	\opt
 	.endm
 
+/*
+ * Emit an entry into the exception table
+ */
+	.macro		_asm_extable, from, to
+	.pushsection	__ex_table, "a"
+	.align		3
+	.long		(\from - .), (\to - .)
+	.popsection
+	.endm
+
 #define USER(l, x...)				\
 9999:	x;					\
-	.section __ex_table,"a";		\
-	.align	3;				\
-	.quad	9999b,l;			\
-	.previous
+	_asm_extable	9999b, l
 
 /*
  * Register aliases.
diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h
index 007a69fc4f40..1ab15a3b5a0e 100644
--- a/arch/arm64/include/asm/futex.h
+++ b/arch/arm64/include/asm/futex.h
@@ -42,10 +42,8 @@
 "4:	mov	%w0, %w5\n"						\
 "	b	3b\n"							\
 "	.popsection\n"							\
-"	.pushsection __ex_table,\"a\"\n"				\
-"	.align	3\n"							\
-"	.quad	1b, 4b, 2b, 4b\n"					\
-"	.popsection\n"							\
+	_ASM_EXTABLE(1b, 4b)						\
+	_ASM_EXTABLE(2b, 4b)						\
 	ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,		\
 		    CONFIG_ARM64_PAN)					\
 	: "=&r" (ret), "=&r" (oldval), "+Q" (*uaddr), "=&r" (tmp)	\
@@ -133,10 +131,8 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 "4:	mov	%w0, %w6\n"
 "	b	3b\n"
 "	.popsection\n"
-"	.pushsection __ex_table,\"a\"\n"
-"	.align	3\n"
-"	.quad	1b, 4b, 2b, 4b\n"
-"	.popsection\n"
+	_ASM_EXTABLE(1b, 4b)
+	_ASM_EXTABLE(2b, 4b)
 	: "+r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp)
 	: "r" (oldval), "r" (newval), "Ir" (-EFAULT)
 	: "memory");
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index b2ede967fe7d..dc11577fab7e 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -36,11 +36,11 @@
 #define VERIFY_WRITE 1
 
 /*
- * The exception table consists of pairs of addresses: the first is the
- * address of an instruction that is allowed to fault, and the second is
- * the address at which the program should continue.  No registers are
- * modified, so it is entirely up to the continuation code to figure out
- * what to do.
+ * The exception table consists of pairs of relative offsets: the first
+ * is the relative offset to an instruction that is allowed to fault,
+ * and the second is the relative offset at which the program should
+ * continue. No registers are modified, so it is entirely up to the
+ * continuation code to figure out what to do.
  *
  * All the routines below use bits of fixup code that are out of line
  * with the main instruction path.  This means when everything is well,
@@ -50,9 +50,11 @@
 
 struct exception_table_entry
 {
-	unsigned long insn, fixup;
+	int insn, fixup;
 };
 
+#define ARCH_HAS_RELATIVE_EXTABLE
+
 extern int fixup_exception(struct pt_regs *regs);
 
 #define KERNEL_DS	(-1UL)
@@ -105,6 +107,12 @@ static inline void set_fs(mm_segment_t fs)
 #define access_ok(type, addr, size)	__range_ok(addr, size)
 #define user_addr_max			get_fs
 
+#define _ASM_EXTABLE(from, to)						\
+	"	.pushsection	__ex_table, \"a\"\n"			\
+	"	.align		3\n"					\
+	"	.long		(" #from " - .), (" #to " - .)\n"	\
+	"	.popsection\n"
+
 /*
  * The "__xxx" versions of the user access functions do not verify the address
  * space - it must have been done previously with a separate "access_ok()"
@@ -123,10 +131,7 @@ static inline void set_fs(mm_segment_t fs)
 	"	mov	%1, #0\n"					\
 	"	b	2b\n"						\
 	"	.previous\n"						\
-	"	.section __ex_table,\"a\"\n"				\
-	"	.align	3\n"						\
-	"	.quad	1b, 3b\n"					\
-	"	.previous"						\
+	_ASM_EXTABLE(1b, 3b)						\
 	: "+r" (err), "=&r" (x)						\
 	: "r" (addr), "i" (-EFAULT))
 
@@ -190,10 +195,7 @@ do {									\
 	"3:	mov	%w0, %3\n"					\
 	"	b	2b\n"						\
 	"	.previous\n"						\
-	"	.section __ex_table,\"a\"\n"				\
-	"	.align	3\n"						\
-	"	.quad	1b, 3b\n"					\
-	"	.previous"						\
+	_ASM_EXTABLE(1b, 3b)						\
 	: "+r" (err)							\
 	: "r" (x), "r" (addr), "i" (-EFAULT))
 
diff --git a/arch/arm64/include/asm/word-at-a-time.h b/arch/arm64/include/asm/word-at-a-time.h
index aab5bf09e9d9..2b79b8a89457 100644
--- a/arch/arm64/include/asm/word-at-a-time.h
+++ b/arch/arm64/include/asm/word-at-a-time.h
@@ -16,6 +16,8 @@
 #ifndef __ASM_WORD_AT_A_TIME_H
 #define __ASM_WORD_AT_A_TIME_H
 
+#include <asm/uaccess.h>
+
 #ifndef __AARCH64EB__
 
 #include <linux/kernel.h>
@@ -81,10 +83,7 @@ static inline unsigned long load_unaligned_zeropad(const void *addr)
 #endif
 	"	b	2b\n"
 	"	.popsection\n"
-	"	.pushsection __ex_table,\"a\"\n"
-	"	.align	3\n"
-	"	.quad	1b, 3b\n"
-	"	.popsection"
+	_ASM_EXTABLE(1b, 3b)
 	: "=&r" (ret), "=&r" (offset)
 	: "r" (addr), "Q" (*(unsigned long *)addr));
 
diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c
index 3e01207917b1..c37202c0c838 100644
--- a/arch/arm64/kernel/armv8_deprecated.c
+++ b/arch/arm64/kernel/armv8_deprecated.c
@@ -297,11 +297,8 @@ static void __init register_insn_emulation_sysctl(struct ctl_table *table)
 	"4:	mov		%w0, %w5\n"			\
 	"	b		3b\n"				\
 	"	.popsection"					\
-	"	.pushsection	 __ex_table,\"a\"\n"		\
-	"	.align		3\n"				\
-	"	.quad		0b, 4b\n"			\
-	"	.quad		1b, 4b\n"			\
-	"	.popsection\n"					\
+	_ASM_EXTABLE(0b, 4b)					\
+	_ASM_EXTABLE(1b, 4b)					\
 	ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,	\
 		CONFIG_ARM64_PAN)				\
 	: "=&r" (res), "+r" (data), "=&r" (temp)		\
diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
index 79444279ba8c..81acd4706878 100644
--- a/arch/arm64/mm/extable.c
+++ b/arch/arm64/mm/extable.c
@@ -11,7 +11,7 @@ int fixup_exception(struct pt_regs *regs)
 
 	fixup = search_exception_tables(instruction_pointer(regs));
 	if (fixup)
-		regs->pc = fixup->fixup;
+		regs->pc = (unsigned long)&fixup->fixup + fixup->fixup;
 
 	return fixup != NULL;
 }
diff --git a/scripts/sortextable.c b/scripts/sortextable.c
index c2423d913b46..af247c70fb66 100644
--- a/scripts/sortextable.c
+++ b/scripts/sortextable.c
@@ -282,12 +282,12 @@ do_file(char const *const fname)
 	case EM_386:
 	case EM_X86_64:
 	case EM_S390:
+	case EM_AARCH64:
 		custom_sort = sort_relative_table;
 		break;
 	case EM_ARCOMPACT:
 	case EM_ARCV2:
 	case EM_ARM:
-	case EM_AARCH64:
 	case EM_MICROBLAZE:
 	case EM_MIPS:
 	case EM_XTENSA:
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [kernel-hardening] [PATCH v4 10/22] arm64: switch to relative exception tables
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

Instead of using absolute addresses for both the exception location
and the fixup, use offsets relative to the exception table entry values.
Not only does this cut the size of the exception table in half, it is
also a prerequisite for KASLR, since absolute exception table entries
are subject to dynamic relocation, which is incompatible with the sorting
of the exception table that occurs at build time.

This patch also introduces the _ASM_EXTABLE preprocessor macro (which
exists on x86 as well) and its _asm_extable assembly counterpart, as
shorthands to emit exception table entries.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/assembler.h      | 15 +++++++---
 arch/arm64/include/asm/futex.h          | 12 +++-----
 arch/arm64/include/asm/uaccess.h        | 30 +++++++++++---------
 arch/arm64/include/asm/word-at-a-time.h |  7 ++---
 arch/arm64/kernel/armv8_deprecated.c    |  7 ++---
 arch/arm64/mm/extable.c                 |  2 +-
 scripts/sortextable.c                   |  2 +-
 7 files changed, 38 insertions(+), 37 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index bb7b72734c24..d8bfcc1ce923 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -94,12 +94,19 @@
 	dmb	\opt
 	.endm
 
+/*
+ * Emit an entry into the exception table
+ */
+	.macro		_asm_extable, from, to
+	.pushsection	__ex_table, "a"
+	.align		3
+	.long		(\from - .), (\to - .)
+	.popsection
+	.endm
+
 #define USER(l, x...)				\
 9999:	x;					\
-	.section __ex_table,"a";		\
-	.align	3;				\
-	.quad	9999b,l;			\
-	.previous
+	_asm_extable	9999b, l
 
 /*
  * Register aliases.
diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h
index 007a69fc4f40..1ab15a3b5a0e 100644
--- a/arch/arm64/include/asm/futex.h
+++ b/arch/arm64/include/asm/futex.h
@@ -42,10 +42,8 @@
 "4:	mov	%w0, %w5\n"						\
 "	b	3b\n"							\
 "	.popsection\n"							\
-"	.pushsection __ex_table,\"a\"\n"				\
-"	.align	3\n"							\
-"	.quad	1b, 4b, 2b, 4b\n"					\
-"	.popsection\n"							\
+	_ASM_EXTABLE(1b, 4b)						\
+	_ASM_EXTABLE(2b, 4b)						\
 	ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,		\
 		    CONFIG_ARM64_PAN)					\
 	: "=&r" (ret), "=&r" (oldval), "+Q" (*uaddr), "=&r" (tmp)	\
@@ -133,10 +131,8 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 "4:	mov	%w0, %w6\n"
 "	b	3b\n"
 "	.popsection\n"
-"	.pushsection __ex_table,\"a\"\n"
-"	.align	3\n"
-"	.quad	1b, 4b, 2b, 4b\n"
-"	.popsection\n"
+	_ASM_EXTABLE(1b, 4b)
+	_ASM_EXTABLE(2b, 4b)
 	: "+r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp)
 	: "r" (oldval), "r" (newval), "Ir" (-EFAULT)
 	: "memory");
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index b2ede967fe7d..dc11577fab7e 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -36,11 +36,11 @@
 #define VERIFY_WRITE 1
 
 /*
- * The exception table consists of pairs of addresses: the first is the
- * address of an instruction that is allowed to fault, and the second is
- * the address at which the program should continue.  No registers are
- * modified, so it is entirely up to the continuation code to figure out
- * what to do.
+ * The exception table consists of pairs of relative offsets: the first
+ * is the relative offset to an instruction that is allowed to fault,
+ * and the second is the relative offset at which the program should
+ * continue. No registers are modified, so it is entirely up to the
+ * continuation code to figure out what to do.
  *
  * All the routines below use bits of fixup code that are out of line
  * with the main instruction path.  This means when everything is well,
@@ -50,9 +50,11 @@
 
 struct exception_table_entry
 {
-	unsigned long insn, fixup;
+	int insn, fixup;
 };
 
+#define ARCH_HAS_RELATIVE_EXTABLE
+
 extern int fixup_exception(struct pt_regs *regs);
 
 #define KERNEL_DS	(-1UL)
@@ -105,6 +107,12 @@ static inline void set_fs(mm_segment_t fs)
 #define access_ok(type, addr, size)	__range_ok(addr, size)
 #define user_addr_max			get_fs
 
+#define _ASM_EXTABLE(from, to)						\
+	"	.pushsection	__ex_table, \"a\"\n"			\
+	"	.align		3\n"					\
+	"	.long		(" #from " - .), (" #to " - .)\n"	\
+	"	.popsection\n"
+
 /*
  * The "__xxx" versions of the user access functions do not verify the address
  * space - it must have been done previously with a separate "access_ok()"
@@ -123,10 +131,7 @@ static inline void set_fs(mm_segment_t fs)
 	"	mov	%1, #0\n"					\
 	"	b	2b\n"						\
 	"	.previous\n"						\
-	"	.section __ex_table,\"a\"\n"				\
-	"	.align	3\n"						\
-	"	.quad	1b, 3b\n"					\
-	"	.previous"						\
+	_ASM_EXTABLE(1b, 3b)						\
 	: "+r" (err), "=&r" (x)						\
 	: "r" (addr), "i" (-EFAULT))
 
@@ -190,10 +195,7 @@ do {									\
 	"3:	mov	%w0, %3\n"					\
 	"	b	2b\n"						\
 	"	.previous\n"						\
-	"	.section __ex_table,\"a\"\n"				\
-	"	.align	3\n"						\
-	"	.quad	1b, 3b\n"					\
-	"	.previous"						\
+	_ASM_EXTABLE(1b, 3b)						\
 	: "+r" (err)							\
 	: "r" (x), "r" (addr), "i" (-EFAULT))
 
diff --git a/arch/arm64/include/asm/word-at-a-time.h b/arch/arm64/include/asm/word-at-a-time.h
index aab5bf09e9d9..2b79b8a89457 100644
--- a/arch/arm64/include/asm/word-at-a-time.h
+++ b/arch/arm64/include/asm/word-at-a-time.h
@@ -16,6 +16,8 @@
 #ifndef __ASM_WORD_AT_A_TIME_H
 #define __ASM_WORD_AT_A_TIME_H
 
+#include <asm/uaccess.h>
+
 #ifndef __AARCH64EB__
 
 #include <linux/kernel.h>
@@ -81,10 +83,7 @@ static inline unsigned long load_unaligned_zeropad(const void *addr)
 #endif
 	"	b	2b\n"
 	"	.popsection\n"
-	"	.pushsection __ex_table,\"a\"\n"
-	"	.align	3\n"
-	"	.quad	1b, 3b\n"
-	"	.popsection"
+	_ASM_EXTABLE(1b, 3b)
 	: "=&r" (ret), "=&r" (offset)
 	: "r" (addr), "Q" (*(unsigned long *)addr));
 
diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c
index 3e01207917b1..c37202c0c838 100644
--- a/arch/arm64/kernel/armv8_deprecated.c
+++ b/arch/arm64/kernel/armv8_deprecated.c
@@ -297,11 +297,8 @@ static void __init register_insn_emulation_sysctl(struct ctl_table *table)
 	"4:	mov		%w0, %w5\n"			\
 	"	b		3b\n"				\
 	"	.popsection"					\
-	"	.pushsection	 __ex_table,\"a\"\n"		\
-	"	.align		3\n"				\
-	"	.quad		0b, 4b\n"			\
-	"	.quad		1b, 4b\n"			\
-	"	.popsection\n"					\
+	_ASM_EXTABLE(0b, 4b)					\
+	_ASM_EXTABLE(1b, 4b)					\
 	ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,	\
 		CONFIG_ARM64_PAN)				\
 	: "=&r" (res), "+r" (data), "=&r" (temp)		\
diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
index 79444279ba8c..81acd4706878 100644
--- a/arch/arm64/mm/extable.c
+++ b/arch/arm64/mm/extable.c
@@ -11,7 +11,7 @@ int fixup_exception(struct pt_regs *regs)
 
 	fixup = search_exception_tables(instruction_pointer(regs));
 	if (fixup)
-		regs->pc = fixup->fixup;
+		regs->pc = (unsigned long)&fixup->fixup + fixup->fixup;
 
 	return fixup != NULL;
 }
diff --git a/scripts/sortextable.c b/scripts/sortextable.c
index c2423d913b46..af247c70fb66 100644
--- a/scripts/sortextable.c
+++ b/scripts/sortextable.c
@@ -282,12 +282,12 @@ do_file(char const *const fname)
 	case EM_386:
 	case EM_X86_64:
 	case EM_S390:
+	case EM_AARCH64:
 		custom_sort = sort_relative_table;
 		break;
 	case EM_ARCOMPACT:
 	case EM_ARCV2:
 	case EM_ARM:
-	case EM_AARCH64:
 	case EM_MICROBLAZE:
 	case EM_MIPS:
 	case EM_XTENSA:
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 11/22] arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
  2016-01-26 17:10 ` Ard Biesheuvel
  (?)
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

Unfortunately, the current way of using the linker to emit build time
constants into the Image header will no longer work once we switch to
the use of PIE executables. The reason is that such constants are emitted
into the binary using R_AARCH64_ABS64 relocations, which are resolved at
runtime, not at build time, and the places targeted by those relocations
will contain zeroes before that.

So refactor the code that generates the endian-swapped linker script constants
so that it emits the upper and lower 32-bit words separately.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/assembler.h | 11 +++++++
 arch/arm64/kernel/head.S           |  6 ++--
 arch/arm64/kernel/image.h          | 32 ++++++++++++--------
 3 files changed, 33 insertions(+), 16 deletions(-)
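
[Editor's note, not part of the patch: a hedged sketch of why the lo32/hi32
split works. Each half is a plain 32-bit build-time value, so the two .long
directives need no R_AARCH64_ABS64 relocation, and laid out back to back they
still form the original little-endian 64-bit header field. The reader function
below is hypothetical and assumes a little-endian (or byte-swapping) host.]

	#include <stdint.h>
	#include <string.h>

	/* reassemble a le64sym field from its _lo32/_hi32 halves */
	static uint64_t read_le64_field(const uint8_t *p)
	{
		uint32_t lo, hi;

		memcpy(&lo, p, sizeof(lo));	/* \sym\()_lo32, emitted first */
		memcpy(&hi, p + 4, sizeof(hi));	/* \sym\()_hi32, emitted second */
		return ((uint64_t)hi << 32) | lo;
	}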

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index d8bfcc1ce923..70f7b9e04598 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -222,4 +222,15 @@ lr	.req	x30		// link register
 	.size	__pi_##x, . - x;	\
 	ENDPROC(x)
 
+	/*
+	 * Emit a 64-bit absolute little endian symbol reference in a way that
+	 * ensures that it will be resolved at build time, even when building a
+	 * PIE binary. This requires cooperation from the linker script, which
+	 * must emit the lo32/hi32 halves individually.
+	 */
+	.macro	le64sym, sym
+	.long	\sym\()_lo32
+	.long	\sym\()_hi32
+	.endm
+
 #endif	/* __ASM_ASSEMBLER_H */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 85181cb60f46..2a39d4ab02bf 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -83,9 +83,9 @@ efi_head:
 	b	stext				// branch to kernel start, magic
 	.long	0				// reserved
 #endif
-	.quad	_kernel_offset_le		// Image load offset from start of RAM, little-endian
-	.quad	_kernel_size_le			// Effective size of kernel image, little-endian
-	.quad	_kernel_flags_le		// Informative flags, little-endian
+	le64sym	_kernel_offset_le		// Image load offset from start of RAM, little-endian
+	le64sym	_kernel_size_le			// Effective size of kernel image, little-endian
+	le64sym	_kernel_flags_le		// Informative flags, little-endian
 	.quad	0				// reserved
 	.quad	0				// reserved
 	.quad	0				// reserved
diff --git a/arch/arm64/kernel/image.h b/arch/arm64/kernel/image.h
index bc2abb8b1599..b64c9b0a4492 100644
--- a/arch/arm64/kernel/image.h
+++ b/arch/arm64/kernel/image.h
@@ -26,21 +26,27 @@
  * There aren't any ELF relocations we can use to endian-swap values known only
  * at link time (e.g. the subtraction of two symbol addresses), so we must get
  * the linker to endian-swap certain values before emitting them.
+ *
+ * Note that, in order for this to work when building the ELF64 PIE executable
+ * (for KASLR), these values should not be referenced via R_AARCH64_ABS64
+ * relocations, since these are fixed up at runtime rather than at build time
+ * when PIE is in effect. So we need to split them up in 32-bit high and low
+ * words.
  */
 #ifdef CONFIG_CPU_BIG_ENDIAN
-#define DATA_LE64(data)					\
-	((((data) & 0x00000000000000ff) << 56) |	\
-	 (((data) & 0x000000000000ff00) << 40) |	\
-	 (((data) & 0x0000000000ff0000) << 24) |	\
-	 (((data) & 0x00000000ff000000) << 8)  |	\
-	 (((data) & 0x000000ff00000000) >> 8)  |	\
-	 (((data) & 0x0000ff0000000000) >> 24) |	\
-	 (((data) & 0x00ff000000000000) >> 40) |	\
-	 (((data) & 0xff00000000000000) >> 56))
+#define DATA_LE32(data)				\
+	((((data) & 0x000000ff) << 24) |	\
+	 (((data) & 0x0000ff00) << 8)  |	\
+	 (((data) & 0x00ff0000) >> 8)  |	\
+	 (((data) & 0xff000000) >> 24))
 #else
-#define DATA_LE64(data) ((data) & 0xffffffffffffffff)
+#define DATA_LE32(data) ((data) & 0xffffffff)
 #endif
 
+#define DEFINE_IMAGE_LE64(sym, data)				\
+	sym##_lo32 = DATA_LE32((data) & 0xffffffff);		\
+	sym##_hi32 = DATA_LE32((data) >> 32)
+
 #ifdef CONFIG_CPU_BIG_ENDIAN
 #define __HEAD_FLAG_BE	1
 #else
@@ -58,9 +64,9 @@
  * endian swapped in head.S, all are done here for consistency.
  */
 #define HEAD_SYMBOLS						\
-	_kernel_size_le		= DATA_LE64(_end - _text);	\
-	_kernel_offset_le	= DATA_LE64(TEXT_OFFSET);	\
-	_kernel_flags_le	= DATA_LE64(__HEAD_FLAGS);
+	DEFINE_IMAGE_LE64(_kernel_size_le, _end - _text);	\
+	DEFINE_IMAGE_LE64(_kernel_offset_le, TEXT_OFFSET);	\
+	DEFINE_IMAGE_LE64(_kernel_flags_le, __HEAD_FLAGS);
 
 #ifdef CONFIG_EFI
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 11/22] arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel

Unfortunately, the current way of using the linker to emit build time
constants into the Image header will no longer work once we switch to
the use of PIE executables. The reason is that such constants are emitted
into the binary using R_AARCH64_ABS64 relocations, which are resolved at
runtime, not at build time, and the places targeted by those relocations
will contain zeroes before that.

So refactor the code that generates the endian-swapped linker script constants
so that it emits the upper and lower 32-bit words separately.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/assembler.h | 11 +++++++
 arch/arm64/kernel/head.S           |  6 ++--
 arch/arm64/kernel/image.h          | 32 ++++++++++++--------
 3 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index d8bfcc1ce923..70f7b9e04598 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -222,4 +222,15 @@ lr	.req	x30		// link register
 	.size	__pi_##x, . - x;	\
 	ENDPROC(x)
 
+	/*
+	 * Emit a 64-bit absolute little endian symbol reference in a way that
+	 * ensures that it will be resolved at build time, even when building a
+	 * PIE binary. This requires cooperation from the linker script, which
+	 * must emit the lo32/hi32 halves individually.
+	 */
+	.macro	le64sym, sym
+	.long	\sym\()_lo32
+	.long	\sym\()_hi32
+	.endm
+
 #endif	/* __ASM_ASSEMBLER_H */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 85181cb60f46..2a39d4ab02bf 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -83,9 +83,9 @@ efi_head:
 	b	stext				// branch to kernel start, magic
 	.long	0				// reserved
 #endif
-	.quad	_kernel_offset_le		// Image load offset from start of RAM, little-endian
-	.quad	_kernel_size_le			// Effective size of kernel image, little-endian
-	.quad	_kernel_flags_le		// Informative flags, little-endian
+	le64sym	_kernel_offset_le		// Image load offset from start of RAM, little-endian
+	le64sym	_kernel_size_le			// Effective size of kernel image, little-endian
+	le64sym	_kernel_flags_le		// Informative flags, little-endian
 	.quad	0				// reserved
 	.quad	0				// reserved
 	.quad	0				// reserved
diff --git a/arch/arm64/kernel/image.h b/arch/arm64/kernel/image.h
index bc2abb8b1599..b64c9b0a4492 100644
--- a/arch/arm64/kernel/image.h
+++ b/arch/arm64/kernel/image.h
@@ -26,21 +26,27 @@
  * There aren't any ELF relocations we can use to endian-swap values known only
  * at link time (e.g. the subtraction of two symbol addresses), so we must get
  * the linker to endian-swap certain values before emitting them.
+ *
+ * Note that, in order for this to work when building the ELF64 PIE executable
+ * (for KASLR), these values should not be referenced via R_AARCH64_ABS64
+ * relocations, since these are fixed up at runtime rather than at build time
+ * when PIE is in effect. So we need to split them up in 32-bit high and low
+ * words.
  */
 #ifdef CONFIG_CPU_BIG_ENDIAN
-#define DATA_LE64(data)					\
-	((((data) & 0x00000000000000ff) << 56) |	\
-	 (((data) & 0x000000000000ff00) << 40) |	\
-	 (((data) & 0x0000000000ff0000) << 24) |	\
-	 (((data) & 0x00000000ff000000) << 8)  |	\
-	 (((data) & 0x000000ff00000000) >> 8)  |	\
-	 (((data) & 0x0000ff0000000000) >> 24) |	\
-	 (((data) & 0x00ff000000000000) >> 40) |	\
-	 (((data) & 0xff00000000000000) >> 56))
+#define DATA_LE32(data)				\
+	((((data) & 0x000000ff) << 24) |	\
+	 (((data) & 0x0000ff00) << 8)  |	\
+	 (((data) & 0x00ff0000) >> 8)  |	\
+	 (((data) & 0xff000000) >> 24))
 #else
-#define DATA_LE64(data) ((data) & 0xffffffffffffffff)
+#define DATA_LE32(data) ((data) & 0xffffffff)
 #endif
 
+#define DEFINE_IMAGE_LE64(sym, data)				\
+	sym##_lo32 = DATA_LE32((data) & 0xffffffff);		\
+	sym##_hi32 = DATA_LE32((data) >> 32)
+
 #ifdef CONFIG_CPU_BIG_ENDIAN
 #define __HEAD_FLAG_BE	1
 #else
@@ -58,9 +64,9 @@
  * endian swapped in head.S, all are done here for consistency.
  */
 #define HEAD_SYMBOLS						\
-	_kernel_size_le		= DATA_LE64(_end - _text);	\
-	_kernel_offset_le	= DATA_LE64(TEXT_OFFSET);	\
-	_kernel_flags_le	= DATA_LE64(__HEAD_FLAGS);
+	DEFINE_IMAGE_LE64(_kernel_size_le, _end - _text);	\
+	DEFINE_IMAGE_LE64(_kernel_offset_le, TEXT_OFFSET);	\
+	DEFINE_IMAGE_LE64(_kernel_flags_le, __HEAD_FLAGS);
 
 #ifdef CONFIG_EFI
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [kernel-hardening] [PATCH v4 11/22] arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

Unfortunately, the current way of using the linker to emit build time
constants into the Image header will no longer work once we switch to
the use of PIE executables. The reason is that such constants are emitted
into the binary using R_AARCH64_ABS64 relocations, which are resolved at
runtime, not at build time, and the places targeted by those relocations
will contain zeroes before that.

So refactor the code that generates the endian-swapped linker script constants
so that it emits the upper and lower 32-bit words separately.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/assembler.h | 11 +++++++
 arch/arm64/kernel/head.S           |  6 ++--
 arch/arm64/kernel/image.h          | 32 ++++++++++++--------
 3 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index d8bfcc1ce923..70f7b9e04598 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -222,4 +222,15 @@ lr	.req	x30		// link register
 	.size	__pi_##x, . - x;	\
 	ENDPROC(x)
 
+	/*
+	 * Emit a 64-bit absolute little endian symbol reference in a way that
+	 * ensures that it will be resolved at build time, even when building a
+	 * PIE binary. This requires cooperation from the linker script, which
+	 * must emit the lo32/hi32 halves individually.
+	 */
+	.macro	le64sym, sym
+	.long	\sym\()_lo32
+	.long	\sym\()_hi32
+	.endm
+
 #endif	/* __ASM_ASSEMBLER_H */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 85181cb60f46..2a39d4ab02bf 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -83,9 +83,9 @@ efi_head:
 	b	stext				// branch to kernel start, magic
 	.long	0				// reserved
 #endif
-	.quad	_kernel_offset_le		// Image load offset from start of RAM, little-endian
-	.quad	_kernel_size_le			// Effective size of kernel image, little-endian
-	.quad	_kernel_flags_le		// Informative flags, little-endian
+	le64sym	_kernel_offset_le		// Image load offset from start of RAM, little-endian
+	le64sym	_kernel_size_le			// Effective size of kernel image, little-endian
+	le64sym	_kernel_flags_le		// Informative flags, little-endian
 	.quad	0				// reserved
 	.quad	0				// reserved
 	.quad	0				// reserved
diff --git a/arch/arm64/kernel/image.h b/arch/arm64/kernel/image.h
index bc2abb8b1599..b64c9b0a4492 100644
--- a/arch/arm64/kernel/image.h
+++ b/arch/arm64/kernel/image.h
@@ -26,21 +26,27 @@
  * There aren't any ELF relocations we can use to endian-swap values known only
  * at link time (e.g. the subtraction of two symbol addresses), so we must get
  * the linker to endian-swap certain values before emitting them.
+ *
+ * Note that, in order for this to work when building the ELF64 PIE executable
+ * (for KASLR), these values should not be referenced via R_AARCH64_ABS64
+ * relocations, since these are fixed up at runtime rather than at build time
+ * when PIE is in effect. So we need to split them up in 32-bit high and low
+ * words.
  */
 #ifdef CONFIG_CPU_BIG_ENDIAN
-#define DATA_LE64(data)					\
-	((((data) & 0x00000000000000ff) << 56) |	\
-	 (((data) & 0x000000000000ff00) << 40) |	\
-	 (((data) & 0x0000000000ff0000) << 24) |	\
-	 (((data) & 0x00000000ff000000) << 8)  |	\
-	 (((data) & 0x000000ff00000000) >> 8)  |	\
-	 (((data) & 0x0000ff0000000000) >> 24) |	\
-	 (((data) & 0x00ff000000000000) >> 40) |	\
-	 (((data) & 0xff00000000000000) >> 56))
+#define DATA_LE32(data)				\
+	((((data) & 0x000000ff) << 24) |	\
+	 (((data) & 0x0000ff00) << 8)  |	\
+	 (((data) & 0x00ff0000) >> 8)  |	\
+	 (((data) & 0xff000000) >> 24))
 #else
-#define DATA_LE64(data) ((data) & 0xffffffffffffffff)
+#define DATA_LE32(data) ((data) & 0xffffffff)
 #endif
 
+#define DEFINE_IMAGE_LE64(sym, data)				\
+	sym##_lo32 = DATA_LE32((data) & 0xffffffff);		\
+	sym##_hi32 = DATA_LE32((data) >> 32)
+
 #ifdef CONFIG_CPU_BIG_ENDIAN
 #define __HEAD_FLAG_BE	1
 #else
@@ -58,9 +64,9 @@
  * endian swapped in head.S, all are done here for consistency.
  */
 #define HEAD_SYMBOLS						\
-	_kernel_size_le		= DATA_LE64(_end - _text);	\
-	_kernel_offset_le	= DATA_LE64(TEXT_OFFSET);	\
-	_kernel_flags_le	= DATA_LE64(__HEAD_FLAGS);
+	DEFINE_IMAGE_LE64(_kernel_size_le, _end - _text);	\
+	DEFINE_IMAGE_LE64(_kernel_offset_le, TEXT_OFFSET);	\
+	DEFINE_IMAGE_LE64(_kernel_flags_le, __HEAD_FLAGS);
 
 #ifdef CONFIG_EFI
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 12/22] arm64: avoid dynamic relocations in early boot code
  2016-01-26 17:10 ` Ard Biesheuvel
  (?)
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

Before implementing KASLR for arm64 by building a self-relocating PIE
executable, we have to ensure that values we use before the relocation
routine is executed are not subject to dynamic relocation themselves.
This applies not only to virtual addresses, but also to values that are
supplied by the linker at build time and relocated using R_AARCH64_ABS64
relocations.

So instead, use assembly-time constants, or force the use of static
relocations by folding the constants into the instructions.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/efi-entry.S |  2 +-
 arch/arm64/kernel/head.S      | 39 +++++++++++++-------
 2 files changed, 27 insertions(+), 14 deletions(-)
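
[Editor's note, not part of the patch: the literal that replaces
'ldr x27, =__mmap_switched' is composed purely of link-time constants, so
nothing is left for the dynamic relocation pass to patch. Expressed as a
hypothetical C helper, with the symbol values passed in as plain numbers,
the value dropped into the literal pool amounts to:]

	#include <stdint.h>

	/* virtual address to jump to once the MMU is on: the symbol's offset
	 * within the Image, rebased onto the fixed KIMAGE_VADDR mapping */
	static uint64_t mmap_switched_vaddr(uint64_t sym_mmap_switched,
					    uint64_t sym_head,
					    uint64_t text_offset,
					    uint64_t kimage_vaddr)
	{
		uint64_t image_base = sym_head - text_offset;	/* 2 MB aligned base */

		return (sym_mmap_switched - image_base) + kimage_vaddr;
	}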

diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
index a773db92908b..f82036e02485 100644
--- a/arch/arm64/kernel/efi-entry.S
+++ b/arch/arm64/kernel/efi-entry.S
@@ -61,7 +61,7 @@ ENTRY(entry)
 	 */
 	mov	x20, x0		// DTB address
 	ldr	x0, [sp, #16]	// relocated _text address
-	ldr	x21, =stext_offset
+	movz	x21, #:abs_g0:stext_offset
 	add	x21, x0, x21
 
 	/*
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 2a39d4ab02bf..ab4c8e93e31c 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -67,12 +67,11 @@
  * in the entry routines.
  */
 	__HEAD
-
+_head:
 	/*
 	 * DO NOT MODIFY. Image header expected by Linux boot-loaders.
 	 */
 #ifdef CONFIG_EFI
-efi_head:
 	/*
 	 * This add instruction has no meaningful effect except that
 	 * its opcode forms the magic "MZ" signature required by UEFI.
@@ -94,14 +93,14 @@ efi_head:
 	.byte	0x4d
 	.byte	0x64
 #ifdef CONFIG_EFI
-	.long	pe_header - efi_head		// Offset to the PE header.
+	.long	pe_header - _head		// Offset to the PE header.
 #else
 	.word	0				// reserved
 #endif
 
 #ifdef CONFIG_EFI
 	.globl	__efistub_stext_offset
-	.set	__efistub_stext_offset, stext - efi_head
+	.set	__efistub_stext_offset, stext - _head
 	.align 3
 pe_header:
 	.ascii	"PE"
@@ -124,7 +123,7 @@ optional_header:
 	.long	_end - stext			// SizeOfCode
 	.long	0				// SizeOfInitializedData
 	.long	0				// SizeOfUninitializedData
-	.long	__efistub_entry - efi_head	// AddressOfEntryPoint
+	.long	__efistub_entry - _head		// AddressOfEntryPoint
 	.long	__efistub_stext_offset		// BaseOfCode
 
 extra_header_fields:
@@ -139,7 +138,7 @@ extra_header_fields:
 	.short	0				// MinorSubsystemVersion
 	.long	0				// Win32VersionValue
 
-	.long	_end - efi_head			// SizeOfImage
+	.long	_end - _head			// SizeOfImage
 
 	// Everything before the kernel image is considered part of the header
 	.long	__efistub_stext_offset		// SizeOfHeaders
@@ -219,11 +218,13 @@ ENTRY(stext)
 	 * On return, the CPU will be ready for the MMU to be turned on and
 	 * the TCR will have been set.
 	 */
-	ldr	x27, =__mmap_switched		// address to jump to after
+	ldr	x27, 0f				// address to jump to after
 						// MMU has been enabled
 	adr_l	lr, __enable_mmu		// return (PIC) address
 	b	__cpu_setup			// initialise processor
 ENDPROC(stext)
+	.align	3
+0:	.quad	__mmap_switched - (_head - TEXT_OFFSET) + KIMAGE_VADDR
 
 /*
  * Preserve the arguments passed by the bootloader in x0 .. x3
@@ -391,7 +392,8 @@ __create_page_tables:
 	mov	x0, x26				// swapper_pg_dir
 	ldr	x5, =KIMAGE_VADDR
 	create_pgd_entry x0, x5, x3, x6
-	ldr	x6, =KERNEL_END			// __va(KERNEL_END)
+	ldr	w6, kernel_img_size
+	add	x6, x6, x5
 	mov	x3, x24				// phys offset
 	create_block_map x0, x7, x3, x5, x6
 
@@ -408,6 +410,9 @@ __create_page_tables:
 	mov	lr, x27
 	ret
 ENDPROC(__create_page_tables)
+
+kernel_img_size:
+	.long	_end - (_head - TEXT_OFFSET)
 	.ltorg
 
 /*
@@ -415,6 +420,10 @@ ENDPROC(__create_page_tables)
  */
 	.set	initial_sp, init_thread_union + THREAD_START_SP
 __mmap_switched:
+	adr_l	x8, vectors			// load VBAR_EL1 with virtual
+	msr	vbar_el1, x8			// vector table address
+	isb
+
 	// Clear BSS
 	adr_l	x0, __bss_start
 	mov	x1, xzr
@@ -601,13 +610,19 @@ ENTRY(secondary_startup)
 	adrp	x26, swapper_pg_dir
 	bl	__cpu_setup			// initialise processor
 
-	ldr	x21, =secondary_data
-	ldr	x27, =__secondary_switched	// address to jump to after enabling the MMU
+	ldr	x8, =KIMAGE_VADDR
+	ldr	w9, 0f
+	sub	x27, x8, w9, sxtw		// address to jump to after enabling the MMU
 	b	__enable_mmu
 ENDPROC(secondary_startup)
+0:	.long	(_text - TEXT_OFFSET) - __secondary_switched
 
 ENTRY(__secondary_switched)
-	ldr	x0, [x21]			// get secondary_data.stack
+	adr_l	x5, vectors
+	msr	vbar_el1, x5
+	isb
+
+	ldr_l	x0, secondary_data		// get secondary_data.stack
 	mov	sp, x0
 	and	x0, x0, #~(THREAD_SIZE - 1)
 	msr	sp_el0, x0			// save thread_info
@@ -632,8 +647,6 @@ __enable_mmu:
 	ubfx	x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
 	cmp	x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
 	b.ne	__no_granule_support
-	ldr	x5, =vectors
-	msr	vbar_el1, x5
 	msr	ttbr0_el1, x25			// load TTBR0
 	msr	ttbr1_el1, x26			// load TTBR1
 	isb
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 12/22] arm64: avoid dynamic relocations in early boot code
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel

Before implementing KASLR for arm64 by building a self-relocating PIE
executable, we have to ensure that values we use before the relocation
routine is executed are not subject to dynamic relocation themselves.
This applies not only to virtual addresses, but also to values that are
supplied by the linker at build time and relocated using R_AARCH64_ABS64
relocations.

So instead, use assembly-time constants, or force the use of static
relocations by folding the constants into the instructions.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/efi-entry.S |  2 +-
 arch/arm64/kernel/head.S      | 39 +++++++++++++-------
 2 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
index a773db92908b..f82036e02485 100644
--- a/arch/arm64/kernel/efi-entry.S
+++ b/arch/arm64/kernel/efi-entry.S
@@ -61,7 +61,7 @@ ENTRY(entry)
 	 */
 	mov	x20, x0		// DTB address
 	ldr	x0, [sp, #16]	// relocated _text address
-	ldr	x21, =stext_offset
+	movz	x21, #:abs_g0:stext_offset
 	add	x21, x0, x21
 
 	/*
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 2a39d4ab02bf..ab4c8e93e31c 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -67,12 +67,11 @@
  * in the entry routines.
  */
 	__HEAD
-
+_head:
 	/*
 	 * DO NOT MODIFY. Image header expected by Linux boot-loaders.
 	 */
 #ifdef CONFIG_EFI
-efi_head:
 	/*
 	 * This add instruction has no meaningful effect except that
 	 * its opcode forms the magic "MZ" signature required by UEFI.
@@ -94,14 +93,14 @@ efi_head:
 	.byte	0x4d
 	.byte	0x64
 #ifdef CONFIG_EFI
-	.long	pe_header - efi_head		// Offset to the PE header.
+	.long	pe_header - _head		// Offset to the PE header.
 #else
 	.word	0				// reserved
 #endif
 
 #ifdef CONFIG_EFI
 	.globl	__efistub_stext_offset
-	.set	__efistub_stext_offset, stext - efi_head
+	.set	__efistub_stext_offset, stext - _head
 	.align 3
 pe_header:
 	.ascii	"PE"
@@ -124,7 +123,7 @@ optional_header:
 	.long	_end - stext			// SizeOfCode
 	.long	0				// SizeOfInitializedData
 	.long	0				// SizeOfUninitializedData
-	.long	__efistub_entry - efi_head	// AddressOfEntryPoint
+	.long	__efistub_entry - _head		// AddressOfEntryPoint
 	.long	__efistub_stext_offset		// BaseOfCode
 
 extra_header_fields:
@@ -139,7 +138,7 @@ extra_header_fields:
 	.short	0				// MinorSubsystemVersion
 	.long	0				// Win32VersionValue
 
-	.long	_end - efi_head			// SizeOfImage
+	.long	_end - _head			// SizeOfImage
 
 	// Everything before the kernel image is considered part of the header
 	.long	__efistub_stext_offset		// SizeOfHeaders
@@ -219,11 +218,13 @@ ENTRY(stext)
 	 * On return, the CPU will be ready for the MMU to be turned on and
 	 * the TCR will have been set.
 	 */
-	ldr	x27, =__mmap_switched		// address to jump to after
+	ldr	x27, 0f				// address to jump to after
 						// MMU has been enabled
 	adr_l	lr, __enable_mmu		// return (PIC) address
 	b	__cpu_setup			// initialise processor
 ENDPROC(stext)
+	.align	3
+0:	.quad	__mmap_switched - (_head - TEXT_OFFSET) + KIMAGE_VADDR
 
 /*
  * Preserve the arguments passed by the bootloader in x0 .. x3
@@ -391,7 +392,8 @@ __create_page_tables:
 	mov	x0, x26				// swapper_pg_dir
 	ldr	x5, =KIMAGE_VADDR
 	create_pgd_entry x0, x5, x3, x6
-	ldr	x6, =KERNEL_END			// __va(KERNEL_END)
+	ldr	w6, kernel_img_size
+	add	x6, x6, x5
 	mov	x3, x24				// phys offset
 	create_block_map x0, x7, x3, x5, x6
 
@@ -408,6 +410,9 @@ __create_page_tables:
 	mov	lr, x27
 	ret
 ENDPROC(__create_page_tables)
+
+kernel_img_size:
+	.long	_end - (_head - TEXT_OFFSET)
 	.ltorg
 
 /*
@@ -415,6 +420,10 @@ ENDPROC(__create_page_tables)
  */
 	.set	initial_sp, init_thread_union + THREAD_START_SP
 __mmap_switched:
+	adr_l	x8, vectors			// load VBAR_EL1 with virtual
+	msr	vbar_el1, x8			// vector table address
+	isb
+
 	// Clear BSS
 	adr_l	x0, __bss_start
 	mov	x1, xzr
@@ -601,13 +610,19 @@ ENTRY(secondary_startup)
 	adrp	x26, swapper_pg_dir
 	bl	__cpu_setup			// initialise processor
 
-	ldr	x21, =secondary_data
-	ldr	x27, =__secondary_switched	// address to jump to after enabling the MMU
+	ldr	x8, =KIMAGE_VADDR
+	ldr	w9, 0f
+	sub	x27, x8, w9, sxtw		// address to jump to after enabling the MMU
 	b	__enable_mmu
 ENDPROC(secondary_startup)
+0:	.long	(_text - TEXT_OFFSET) - __secondary_switched
 
 ENTRY(__secondary_switched)
-	ldr	x0, [x21]			// get secondary_data.stack
+	adr_l	x5, vectors
+	msr	vbar_el1, x5
+	isb
+
+	ldr_l	x0, secondary_data		// get secondary_data.stack
 	mov	sp, x0
 	and	x0, x0, #~(THREAD_SIZE - 1)
 	msr	sp_el0, x0			// save thread_info
@@ -632,8 +647,6 @@ __enable_mmu:
 	ubfx	x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
 	cmp	x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
 	b.ne	__no_granule_support
-	ldr	x5, =vectors
-	msr	vbar_el1, x5
 	msr	ttbr0_el1, x25			// load TTBR0
 	msr	ttbr1_el1, x26			// load TTBR1
 	isb
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [kernel-hardening] [PATCH v4 12/22] arm64: avoid dynamic relocations in early boot code
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

Before implementing KASLR for arm64 by building a self-relocating PIE
executable, we have to ensure that values we use before the relocation
routine is executed are not subject to dynamic relocation themselves.
This applies not only to virtual addresses, but also to values that are
supplied by the linker at build time and relocated using R_AARCH64_ABS64
relocations.

So instead, use assembly-time constants, or force the use of static
relocations by folding the constants into the instructions.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/efi-entry.S |  2 +-
 arch/arm64/kernel/head.S      | 39 +++++++++++++-------
 2 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
index a773db92908b..f82036e02485 100644
--- a/arch/arm64/kernel/efi-entry.S
+++ b/arch/arm64/kernel/efi-entry.S
@@ -61,7 +61,7 @@ ENTRY(entry)
 	 */
 	mov	x20, x0		// DTB address
 	ldr	x0, [sp, #16]	// relocated _text address
-	ldr	x21, =stext_offset
+	movz	x21, #:abs_g0:stext_offset
 	add	x21, x0, x21
 
 	/*
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 2a39d4ab02bf..ab4c8e93e31c 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -67,12 +67,11 @@
  * in the entry routines.
  */
 	__HEAD
-
+_head:
 	/*
 	 * DO NOT MODIFY. Image header expected by Linux boot-loaders.
 	 */
 #ifdef CONFIG_EFI
-efi_head:
 	/*
 	 * This add instruction has no meaningful effect except that
 	 * its opcode forms the magic "MZ" signature required by UEFI.
@@ -94,14 +93,14 @@ efi_head:
 	.byte	0x4d
 	.byte	0x64
 #ifdef CONFIG_EFI
-	.long	pe_header - efi_head		// Offset to the PE header.
+	.long	pe_header - _head		// Offset to the PE header.
 #else
 	.word	0				// reserved
 #endif
 
 #ifdef CONFIG_EFI
 	.globl	__efistub_stext_offset
-	.set	__efistub_stext_offset, stext - efi_head
+	.set	__efistub_stext_offset, stext - _head
 	.align 3
 pe_header:
 	.ascii	"PE"
@@ -124,7 +123,7 @@ optional_header:
 	.long	_end - stext			// SizeOfCode
 	.long	0				// SizeOfInitializedData
 	.long	0				// SizeOfUninitializedData
-	.long	__efistub_entry - efi_head	// AddressOfEntryPoint
+	.long	__efistub_entry - _head		// AddressOfEntryPoint
 	.long	__efistub_stext_offset		// BaseOfCode
 
 extra_header_fields:
@@ -139,7 +138,7 @@ extra_header_fields:
 	.short	0				// MinorSubsystemVersion
 	.long	0				// Win32VersionValue
 
-	.long	_end - efi_head			// SizeOfImage
+	.long	_end - _head			// SizeOfImage
 
 	// Everything before the kernel image is considered part of the header
 	.long	__efistub_stext_offset		// SizeOfHeaders
@@ -219,11 +218,13 @@ ENTRY(stext)
 	 * On return, the CPU will be ready for the MMU to be turned on and
 	 * the TCR will have been set.
 	 */
-	ldr	x27, =__mmap_switched		// address to jump to after
+	ldr	x27, 0f				// address to jump to after
 						// MMU has been enabled
 	adr_l	lr, __enable_mmu		// return (PIC) address
 	b	__cpu_setup			// initialise processor
 ENDPROC(stext)
+	.align	3
+0:	.quad	__mmap_switched - (_head - TEXT_OFFSET) + KIMAGE_VADDR
 
 /*
  * Preserve the arguments passed by the bootloader in x0 .. x3
@@ -391,7 +392,8 @@ __create_page_tables:
 	mov	x0, x26				// swapper_pg_dir
 	ldr	x5, =KIMAGE_VADDR
 	create_pgd_entry x0, x5, x3, x6
-	ldr	x6, =KERNEL_END			// __va(KERNEL_END)
+	ldr	w6, kernel_img_size
+	add	x6, x6, x5
 	mov	x3, x24				// phys offset
 	create_block_map x0, x7, x3, x5, x6
 
@@ -408,6 +410,9 @@ __create_page_tables:
 	mov	lr, x27
 	ret
 ENDPROC(__create_page_tables)
+
+kernel_img_size:
+	.long	_end - (_head - TEXT_OFFSET)
 	.ltorg
 
 /*
@@ -415,6 +420,10 @@ ENDPROC(__create_page_tables)
  */
 	.set	initial_sp, init_thread_union + THREAD_START_SP
 __mmap_switched:
+	adr_l	x8, vectors			// load VBAR_EL1 with virtual
+	msr	vbar_el1, x8			// vector table address
+	isb
+
 	// Clear BSS
 	adr_l	x0, __bss_start
 	mov	x1, xzr
@@ -601,13 +610,19 @@ ENTRY(secondary_startup)
 	adrp	x26, swapper_pg_dir
 	bl	__cpu_setup			// initialise processor
 
-	ldr	x21, =secondary_data
-	ldr	x27, =__secondary_switched	// address to jump to after enabling the MMU
+	ldr	x8, =KIMAGE_VADDR
+	ldr	w9, 0f
+	sub	x27, x8, w9, sxtw		// address to jump to after enabling the MMU
 	b	__enable_mmu
 ENDPROC(secondary_startup)
+0:	.long	(_text - TEXT_OFFSET) - __secondary_switched
 
 ENTRY(__secondary_switched)
-	ldr	x0, [x21]			// get secondary_data.stack
+	adr_l	x5, vectors
+	msr	vbar_el1, x5
+	isb
+
+	ldr_l	x0, secondary_data		// get secondary_data.stack
 	mov	sp, x0
 	and	x0, x0, #~(THREAD_SIZE - 1)
 	msr	sp_el0, x0			// save thread_info
@@ -632,8 +647,6 @@ __enable_mmu:
 	ubfx	x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
 	cmp	x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
 	b.ne	__no_granule_support
-	ldr	x5, =vectors
-	msr	vbar_el1, x5
 	msr	ttbr0_el1, x25			// load TTBR0
 	msr	ttbr1_el1, x26			// load TTBR1
 	isb
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 13/22] arm64: allow kernel Image to be loaded anywhere in physical memory
  2016-01-26 17:10 ` Ard Biesheuvel
  (?)
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

This relaxes the kernel Image placement requirements, so that it
may be placed at any 2 MB aligned offset in physical memory.

This is accomplished by ignoring PHYS_OFFSET when installing
memblocks, and accounting for the apparent virtual offset of
the kernel Image. As a result, virtual address references
below PAGE_OFFSET are correctly mapped onto physical references
into the kernel Image regardless of where it sits in memory.

Note that limiting memory using mem= is no longer unambiguous after this
change, since the kernel may now sit at the top of physical memory, and
clipping from the bottom rather than the top would discard any 32-bit
DMA-addressable memory first. To deal with this, the handling
of mem= is reimplemented to clip top down, but take special care not to
clip memory that covers the kernel image.

Since mem= should not be considered a production feature, a panic notifier
handler is installed that dumps the memory limit at panic time if one was
set.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 Documentation/arm64/booting.txt         | 20 ++--
 arch/arm64/include/asm/boot.h           |  6 ++
 arch/arm64/include/asm/kernel-pgtable.h | 11 +++
 arch/arm64/include/asm/kvm_asm.h        |  2 +-
 arch/arm64/include/asm/memory.h         | 15 +--
 arch/arm64/kernel/head.S                |  6 +-
 arch/arm64/kernel/image.h               | 13 ++-
 arch/arm64/mm/init.c                    | 96 +++++++++++++++++++-
 arch/arm64/mm/mmu.c                     |  3 +
 9 files changed, 150 insertions(+), 22 deletions(-)
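
[Editor's note, not part of the patch: once the Image's physical location is
decoupled from PHYS_OFFSET, address translation has to tell linear-map
addresses (at or above PAGE_OFFSET) apart from kernel-image addresses, and
kvm_ksym_ref() reaches the linear alias of an image symbol by way of its
physical address. A rough C rendering of the reworked macros, with the usual
globals passed in explicitly and all names hypothetical:]

	#include <stdint.h>

	/* __virt_to_phys(): linear-map VAs translate via PHYS_OFFSET,
	 * kernel-image VAs via the image's own virt/phys offset */
	static uint64_t virt_to_phys_sketch(uint64_t va, uint64_t page_offset,
					    uint64_t phys_offset,
					    uint64_t kimage_voffset)
	{
		if (va >= page_offset)
			return va - page_offset + phys_offset;
		return va - kimage_voffset;
	}

	/* kvm_ksym_ref(): linear-map alias of a kernel-image symbol */
	static uint64_t ksym_linear_alias(uint64_t kimg_va, uint64_t page_offset,
					  uint64_t phys_offset,
					  uint64_t kimage_voffset)
	{
		uint64_t pa = kimg_va - kimage_voffset;	/* image VA -> PA */

		return pa - phys_offset + page_offset;	/* PA -> linear VA */
	}

This is also why KIMAGE_VADDR drops out of kvm_ksym_ref(): the image's virtual
base alone no longer determines its physical address.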

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 701d39d3171a..67484067ce4f 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -109,7 +109,13 @@ Header notes:
 			1 - 4K
 			2 - 16K
 			3 - 64K
-  Bits 3-63:	Reserved.
+  Bit 3:	Kernel physical placement
+			0 - 2MB aligned base should be as close as possible
+			    to the base of DRAM, since memory below it is not
+			    accessible
+			1 - 2MB aligned base may be anywhere in physical
+			    memory
+  Bits 4-63:	Reserved.
 
 - When image_size is zero, a bootloader should attempt to keep as much
   memory as possible free for use by the kernel immediately after the
@@ -117,14 +123,14 @@ Header notes:
   depending on selected features, and is effectively unbound.
 
 The Image must be placed text_offset bytes from a 2MB aligned base
-address near the start of usable system RAM and called there. Memory
-below that base address is currently unusable by Linux, and therefore it
-is strongly recommended that this location is the start of system RAM.
-The region between the 2 MB aligned base address and the start of the
-image has no special significance to the kernel, and may be used for
-other purposes.
+address anywhere in usable system RAM and called there. The region
+between the 2 MB aligned base address and the start of the image has no
+special significance to the kernel, and may be used for other purposes.
 At least image_size bytes from the start of the image must be free for
 use by the kernel.
+NOTE: versions prior to v4.6 cannot make use of memory below the
+physical offset of the Image so it is recommended that the Image be
+placed as close as possible to the start of system RAM.
 
 Any memory described to the kernel (even that below the start of the
 image) which is not marked as reserved from the kernel (e.g., with a
diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
index 81151b67b26b..ebf2481889c3 100644
--- a/arch/arm64/include/asm/boot.h
+++ b/arch/arm64/include/asm/boot.h
@@ -11,4 +11,10 @@
 #define MIN_FDT_ALIGN		8
 #define MAX_FDT_SIZE		SZ_2M
 
+/*
+ * arm64 requires the kernel image to placed
+ * TEXT_OFFSET bytes beyond a 2 MB aligned base
+ */
+#define MIN_KIMG_ALIGN		SZ_2M
+
 #endif
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index a459714ee29e..33bd22956cba 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -79,5 +79,16 @@
 #define SWAPPER_MM_MMUFLAGS	(PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS)
 #endif
 
+/*
+ * To make optimal use of block mappings when laying out the linear mapping,
+ * round down the base of physical memory to a size that can be mapped
+ * efficiently, i.e., either PUD_SIZE (4k) or PMD_SIZE (64k), or a multiple that
+ * can be mapped using contiguous bits in the page tables: 32 * PMD_SIZE (16k)
+ */
+#ifdef CONFIG_ARM64_64K_PAGES
+#define ARM64_MEMSTART_ALIGN	SZ_512M
+#else
+#define ARM64_MEMSTART_ALIGN	SZ_1G
+#endif
 
 #endif	/* __ASM_KERNEL_PGTABLE_H */
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 30e0e04c9f16..054ac25e7c2e 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -26,7 +26,7 @@
 #define KVM_ARM64_DEBUG_DIRTY_SHIFT	0
 #define KVM_ARM64_DEBUG_DIRTY		(1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
 
-#define kvm_ksym_ref(sym)	((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
+#define kvm_ksym_ref(sym)		phys_to_virt((u64)&sym - kimage_voffset)
 
 #ifndef __ASSEMBLY__
 struct kvm;
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 4388651d1f0d..61005e7dd6cb 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -88,10 +88,10 @@
 #define __virt_to_phys(x) ({						\
 	phys_addr_t __x = (phys_addr_t)(x);				\
 	__x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) :	\
-			     (__x - KIMAGE_VADDR + PHYS_OFFSET); })
+			     (__x - kimage_voffset); })
 
 #define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
-#define __phys_to_kimg(x)	((unsigned long)((x) - PHYS_OFFSET + KIMAGE_VADDR))
+#define __phys_to_kimg(x)	((unsigned long)((x) + kimage_voffset))
 
 /*
  * Convert a page to/from a physical address
@@ -127,13 +127,14 @@ extern phys_addr_t		memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
 #define PHYS_OFFSET		({ memstart_addr; })
 
+/* the offset between the kernel virtual and physical mappings */
+extern u64			kimage_voffset;
+
 /*
- * The maximum physical address that the linear direct mapping
- * of system RAM can cover. (PAGE_OFFSET can be interpreted as
- * a 2's complement signed quantity and negated to derive the
- * maximum size of the linear mapping.)
+ * Allow all memory at the discovery stage. We will clip it later.
  */
-#define MAX_MEMBLOCK_ADDR	({ memstart_addr - PAGE_OFFSET - 1; })
+#define MIN_MEMBLOCK_ADDR	0
+#define MAX_MEMBLOCK_ADDR	U64_MAX
 
 /*
  * PFNs are used to describe any physical page; this means
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index ab4c8e93e31c..b4be53923942 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -437,7 +437,11 @@ __mmap_switched:
 	and	x4, x4, #~(THREAD_SIZE - 1)
 	msr	sp_el0, x4			// Save thread_info
 	str_l	x21, __fdt_pointer, x5		// Save FDT pointer
-	str_l	x24, memstart_addr, x6		// Save PHYS_OFFSET
+
+	ldr	x4, =KIMAGE_VADDR		// Save the offset between
+	sub	x4, x4, x24			// the kernel virtual and
+	str_l	x4, kimage_voffset, x5		// physical mappings
+
 	mov	x29, #0
 #ifdef CONFIG_KASAN
 	bl	kasan_early_init
diff --git a/arch/arm64/kernel/image.h b/arch/arm64/kernel/image.h
index b64c9b0a4492..ede0c32e38b3 100644
--- a/arch/arm64/kernel/image.h
+++ b/arch/arm64/kernel/image.h
@@ -48,15 +48,18 @@
 	sym##_hi32 = DATA_LE32((data) >> 32)
 
 #ifdef CONFIG_CPU_BIG_ENDIAN
-#define __HEAD_FLAG_BE	1
+#define __HEAD_FLAG_BE		1
 #else
-#define __HEAD_FLAG_BE	0
+#define __HEAD_FLAG_BE		0
 #endif
 
-#define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2)
+#define __HEAD_FLAG_PAGE_SIZE	((PAGE_SHIFT - 10) / 2)
 
-#define __HEAD_FLAGS	((__HEAD_FLAG_BE << 0) |	\
-			 (__HEAD_FLAG_PAGE_SIZE << 1))
+#define __HEAD_FLAG_PHYS_BASE	1
+
+#define __HEAD_FLAGS		((__HEAD_FLAG_BE << 0) |	\
+				 (__HEAD_FLAG_PAGE_SIZE << 1) |	\
+				 (__HEAD_FLAG_PHYS_BASE << 3))
 
 /*
  * These will output as part of the Image header, which should be little-endian
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 1d627cd8121c..02a9cfa84692 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -35,8 +35,10 @@
 #include <linux/efi.h>
 #include <linux/swiotlb.h>
 
+#include <asm/boot.h>
 #include <asm/fixmap.h>
 #include <asm/kasan.h>
+#include <asm/kernel-pgtable.h>
 #include <asm/memory.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
@@ -158,9 +160,76 @@ static int __init early_mem(char *p)
 }
 early_param("mem", early_mem);
 
+/*
+ * clip_mem_range() - remove memblock memory between @min and @max until
+ *                    we meet the limit in 'memory_limit'.
+ */
+static void __init clip_mem_range(u64 min, u64 max)
+{
+	u64 mem_size, to_remove;
+	int i;
+
+again:
+	mem_size = memblock_phys_mem_size();
+	if (mem_size <= memory_limit || max <= min)
+		return;
+
+	to_remove = mem_size - memory_limit;
+
+	for (i = memblock.memory.cnt - 1; i >= 0; i--) {
+		struct memblock_region *r = memblock.memory.regions + i;
+		u64 start = max(min, r->base);
+		u64 end = min(max, r->base + r->size);
+
+		if (start >= max || end <= min)
+			continue;
+
+		if (end > min) {
+			u64 size = min(to_remove, end - max(start, min));
+
+			memblock_remove(end - size, size);
+		} else {
+			memblock_remove(start, min(max - start, to_remove));
+		}
+		goto again;
+	}
+}
+
 void __init arm64_memblock_init(void)
 {
-	memblock_enforce_memory_limit(memory_limit);
+	const s64 linear_region_size = -(s64)PAGE_OFFSET;
+
+	/*
+	 * Select a suitable value for the base of physical memory.
+	 */
+	memstart_addr = round_down(memblock_start_of_DRAM(),
+				   ARM64_MEMSTART_ALIGN);
+
+	/*
+	 * Remove the memory that we will not be able to cover
+	 * with the linear mapping.
+	 */
+	memblock_remove(memstart_addr + linear_region_size, ULLONG_MAX);
+
+	if (memory_limit != (phys_addr_t)ULLONG_MAX) {
+		u64 kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
+		u64 kend = PAGE_ALIGN(__pa(_end));
+		u64 const sz_4g = 0x100000000UL;
+
+		/*
+		 * Clip memory in order of preference:
+		 * - above the kernel and above 4 GB
+		 * - between 4 GB and the start of the kernel (if the kernel
+		 *   is loaded high in memory)
+		 * - between the kernel and 4 GB (if the kernel is loaded
+		 *   low in memory)
+		 * - below 4 GB
+		 */
+		clip_mem_range(max(sz_4g, kend), ULLONG_MAX);
+		clip_mem_range(sz_4g, kbase);
+		clip_mem_range(kend, sz_4g);
+		clip_mem_range(0, min(kbase, sz_4g));
+	}
 
 	/*
 	 * Register the kernel text, kernel data, initrd, and initial
@@ -381,3 +450,28 @@ static int __init keepinitrd_setup(char *__unused)
 
 __setup("keepinitrd", keepinitrd_setup);
 #endif
+
+/*
+ * Dump out memory limit information on panic.
+ */
+static int dump_mem_limit(struct notifier_block *self, unsigned long v, void *p)
+{
+	if (memory_limit != (phys_addr_t)ULLONG_MAX) {
+		pr_emerg("Memory Limit: %llu MB\n", memory_limit >> 20);
+	} else {
+		pr_emerg("Memory Limit: none\n");
+	}
+	return 0;
+}
+
+static struct notifier_block mem_limit_notifier = {
+	.notifier_call = dump_mem_limit,
+};
+
+static int __init register_mem_limit_dumper(void)
+{
+	atomic_notifier_chain_register(&panic_notifier_list,
+				       &mem_limit_notifier);
+	return 0;
+}
+__initcall(register_mem_limit_dumper);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 4c4b15932963..8dda38378959 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -46,6 +46,9 @@
 
 u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
 
+u64 kimage_voffset __read_mostly;
+EXPORT_SYMBOL(kimage_voffset);
+
 /*
  * Empty_zero_page is a special page that is used for zero-initialized data
  * and COW.
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 13/22] arm64: allow kernel Image to be loaded anywhere in physical memory
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel

This relaxes the kernel Image placement requirements, so that it
may be placed at any 2 MB aligned offset in physical memory.

This is accomplished by ignoring PHYS_OFFSET when installing
memblocks, and accounting for the apparent virtual offset of
the kernel Image. As a result, virtual address references
below PAGE_OFFSET are correctly mapped onto physical references
into the kernel Image regardless of where it sits in memory.

Note that limiting memory using mem= is no longer unambiguous after this
change, since the kernel may now sit at the top of physical memory, and
clipping from the bottom rather than the top would discard any 32-bit
DMA-addressable memory first. To deal with this, the handling
of mem= is reimplemented to clip top down, but take special care not to
clip memory that covers the kernel image.

Since mem= should not be considered a production feature, a panic notifier
handler is installed that dumps the memory limit at panic time if one was
set.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 Documentation/arm64/booting.txt         | 20 ++--
 arch/arm64/include/asm/boot.h           |  6 ++
 arch/arm64/include/asm/kernel-pgtable.h | 11 +++
 arch/arm64/include/asm/kvm_asm.h        |  2 +-
 arch/arm64/include/asm/memory.h         | 15 +--
 arch/arm64/kernel/head.S                |  6 +-
 arch/arm64/kernel/image.h               | 13 ++-
 arch/arm64/mm/init.c                    | 96 +++++++++++++++++++-
 arch/arm64/mm/mmu.c                     |  3 +
 9 files changed, 150 insertions(+), 22 deletions(-)

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 701d39d3171a..67484067ce4f 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -109,7 +109,13 @@ Header notes:
 			1 - 4K
 			2 - 16K
 			3 - 64K
-  Bits 3-63:	Reserved.
+  Bit 3:	Kernel physical placement
+			0 - 2MB aligned base should be as close as possible
+			    to the base of DRAM, since memory below it is not
+			    accessible
+			1 - 2MB aligned base may be anywhere in physical
+			    memory
+  Bits 4-63:	Reserved.
 
 - When image_size is zero, a bootloader should attempt to keep as much
   memory as possible free for use by the kernel immediately after the
@@ -117,14 +123,14 @@ Header notes:
   depending on selected features, and is effectively unbound.
 
 The Image must be placed text_offset bytes from a 2MB aligned base
-address near the start of usable system RAM and called there. Memory
-below that base address is currently unusable by Linux, and therefore it
-is strongly recommended that this location is the start of system RAM.
-The region between the 2 MB aligned base address and the start of the
-image has no special significance to the kernel, and may be used for
-other purposes.
+address anywhere in usable system RAM and called there. The region
+between the 2 MB aligned base address and the start of the image has no
+special significance to the kernel, and may be used for other purposes.
 At least image_size bytes from the start of the image must be free for
 use by the kernel.
+NOTE: versions prior to v4.6 cannot make use of memory below the
+physical offset of the Image so it is recommended that the Image be
+placed as close as possible to the start of system RAM.
 
 Any memory described to the kernel (even that below the start of the
 image) which is not marked as reserved from the kernel (e.g., with a
diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
index 81151b67b26b..ebf2481889c3 100644
--- a/arch/arm64/include/asm/boot.h
+++ b/arch/arm64/include/asm/boot.h
@@ -11,4 +11,10 @@
 #define MIN_FDT_ALIGN		8
 #define MAX_FDT_SIZE		SZ_2M
 
+/*
+ * arm64 requires the kernel image to placed
+ * TEXT_OFFSET bytes beyond a 2 MB aligned base
+ */
+#define MIN_KIMG_ALIGN		SZ_2M
+
 #endif
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index a459714ee29e..33bd22956cba 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -79,5 +79,16 @@
 #define SWAPPER_MM_MMUFLAGS	(PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS)
 #endif
 
+/*
+ * To make optimal use of block mappings when laying out the linear mapping,
+ * round down the base of physical memory to a size that can be mapped
+ * efficiently, i.e., either PUD_SIZE (4k) or PMD_SIZE (64k), or a multiple that
+ * can be mapped using contiguous bits in the page tables: 32 * PMD_SIZE (16k)
+ */
+#ifdef CONFIG_ARM64_64K_PAGES
+#define ARM64_MEMSTART_ALIGN	SZ_512M
+#else
+#define ARM64_MEMSTART_ALIGN	SZ_1G
+#endif
 
 #endif	/* __ASM_KERNEL_PGTABLE_H */
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 30e0e04c9f16..054ac25e7c2e 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -26,7 +26,7 @@
 #define KVM_ARM64_DEBUG_DIRTY_SHIFT	0
 #define KVM_ARM64_DEBUG_DIRTY		(1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
 
-#define kvm_ksym_ref(sym)	((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
+#define kvm_ksym_ref(sym)		phys_to_virt((u64)&sym - kimage_voffset)
 
 #ifndef __ASSEMBLY__
 struct kvm;
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 4388651d1f0d..61005e7dd6cb 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -88,10 +88,10 @@
 #define __virt_to_phys(x) ({						\
 	phys_addr_t __x = (phys_addr_t)(x);				\
 	__x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) :	\
-			     (__x - KIMAGE_VADDR + PHYS_OFFSET); })
+			     (__x - kimage_voffset); })
 
 #define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
-#define __phys_to_kimg(x)	((unsigned long)((x) - PHYS_OFFSET + KIMAGE_VADDR))
+#define __phys_to_kimg(x)	((unsigned long)((x) + kimage_voffset))
 
 /*
  * Convert a page to/from a physical address
@@ -127,13 +127,14 @@ extern phys_addr_t		memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
 #define PHYS_OFFSET		({ memstart_addr; })
 
+/* the offset between the kernel virtual and physical mappings */
+extern u64			kimage_voffset;
+
 /*
- * The maximum physical address that the linear direct mapping
- * of system RAM can cover. (PAGE_OFFSET can be interpreted as
- * a 2's complement signed quantity and negated to derive the
- * maximum size of the linear mapping.)
+ * Allow all memory at the discovery stage. We will clip it later.
  */
-#define MAX_MEMBLOCK_ADDR	({ memstart_addr - PAGE_OFFSET - 1; })
+#define MIN_MEMBLOCK_ADDR	0
+#define MAX_MEMBLOCK_ADDR	U64_MAX
 
 /*
  * PFNs are used to describe any physical page; this means
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index ab4c8e93e31c..b4be53923942 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -437,7 +437,11 @@ __mmap_switched:
 	and	x4, x4, #~(THREAD_SIZE - 1)
 	msr	sp_el0, x4			// Save thread_info
 	str_l	x21, __fdt_pointer, x5		// Save FDT pointer
-	str_l	x24, memstart_addr, x6		// Save PHYS_OFFSET
+
+	ldr	x4, =KIMAGE_VADDR		// Save the offset between
+	sub	x4, x4, x24			// the kernel virtual and
+	str_l	x4, kimage_voffset, x5		// physical mappings
+
 	mov	x29, #0
 #ifdef CONFIG_KASAN
 	bl	kasan_early_init
diff --git a/arch/arm64/kernel/image.h b/arch/arm64/kernel/image.h
index b64c9b0a4492..ede0c32e38b3 100644
--- a/arch/arm64/kernel/image.h
+++ b/arch/arm64/kernel/image.h
@@ -48,15 +48,18 @@
 	sym##_hi32 = DATA_LE32((data) >> 32)
 
 #ifdef CONFIG_CPU_BIG_ENDIAN
-#define __HEAD_FLAG_BE	1
+#define __HEAD_FLAG_BE		1
 #else
-#define __HEAD_FLAG_BE	0
+#define __HEAD_FLAG_BE		0
 #endif
 
-#define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2)
+#define __HEAD_FLAG_PAGE_SIZE	((PAGE_SHIFT - 10) / 2)
 
-#define __HEAD_FLAGS	((__HEAD_FLAG_BE << 0) |	\
-			 (__HEAD_FLAG_PAGE_SIZE << 1))
+#define __HEAD_FLAG_PHYS_BASE	1
+
+#define __HEAD_FLAGS		((__HEAD_FLAG_BE << 0) |	\
+				 (__HEAD_FLAG_PAGE_SIZE << 1) |	\
+				 (__HEAD_FLAG_PHYS_BASE << 3))
 
 /*
  * These will output as part of the Image header, which should be little-endian
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 1d627cd8121c..02a9cfa84692 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -35,8 +35,10 @@
 #include <linux/efi.h>
 #include <linux/swiotlb.h>
 
+#include <asm/boot.h>
 #include <asm/fixmap.h>
 #include <asm/kasan.h>
+#include <asm/kernel-pgtable.h>
 #include <asm/memory.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
@@ -158,9 +160,76 @@ static int __init early_mem(char *p)
 }
 early_param("mem", early_mem);
 
+/*
+ * clip_mem_range() - remove memblock memory between @min and @max until
+ *                    we meet the limit in 'memory_limit'.
+ */
+static void __init clip_mem_range(u64 min, u64 max)
+{
+	u64 mem_size, to_remove;
+	int i;
+
+again:
+	mem_size = memblock_phys_mem_size();
+	if (mem_size <= memory_limit || max <= min)
+		return;
+
+	to_remove = mem_size - memory_limit;
+
+	for (i = memblock.memory.cnt - 1; i >= 0; i--) {
+		struct memblock_region *r = memblock.memory.regions + i;
+		u64 start = max(min, r->base);
+		u64 end = min(max, r->base + r->size);
+
+		if (start >= max || end <= min)
+			continue;
+
+		if (end > min) {
+			u64 size = min(to_remove, end - max(start, min));
+
+			memblock_remove(end - size, size);
+		} else {
+			memblock_remove(start, min(max - start, to_remove));
+		}
+		goto again;
+	}
+}
+
 void __init arm64_memblock_init(void)
 {
-	memblock_enforce_memory_limit(memory_limit);
+	const s64 linear_region_size = -(s64)PAGE_OFFSET;
+
+	/*
+	 * Select a suitable value for the base of physical memory.
+	 */
+	memstart_addr = round_down(memblock_start_of_DRAM(),
+				   ARM64_MEMSTART_ALIGN);
+
+	/*
+	 * Remove the memory that we will not be able to cover
+	 * with the linear mapping.
+	 */
+	memblock_remove(memstart_addr + linear_region_size, ULLONG_MAX);
+
+	if (memory_limit != (phys_addr_t)ULLONG_MAX) {
+		u64 kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
+		u64 kend = PAGE_ALIGN(__pa(_end));
+		u64 const sz_4g = 0x100000000UL;
+
+		/*
+		 * Clip memory in order of preference:
+		 * - above the kernel and above 4 GB
+		 * - between 4 GB and the start of the kernel (if the kernel
+		 *   is loaded high in memory)
+		 * - between the kernel and 4 GB (if the kernel is loaded
+		 *   low in memory)
+		 * - below 4 GB
+		 */
+		clip_mem_range(max(sz_4g, kend), ULLONG_MAX);
+		clip_mem_range(sz_4g, kbase);
+		clip_mem_range(kend, sz_4g);
+		clip_mem_range(0, min(kbase, sz_4g));
+	}
 
 	/*
 	 * Register the kernel text, kernel data, initrd, and initial
@@ -381,3 +450,28 @@ static int __init keepinitrd_setup(char *__unused)
 
 __setup("keepinitrd", keepinitrd_setup);
 #endif
+
+/*
+ * Dump out memory limit information on panic.
+ */
+static int dump_mem_limit(struct notifier_block *self, unsigned long v, void *p)
+{
+	if (memory_limit != (phys_addr_t)ULLONG_MAX) {
+		pr_emerg("Memory Limit: %llu MB\n", memory_limit >> 20);
+	} else {
+		pr_emerg("Memory Limit: none\n");
+	}
+	return 0;
+}
+
+static struct notifier_block mem_limit_notifier = {
+	.notifier_call = dump_mem_limit,
+};
+
+static int __init register_mem_limit_dumper(void)
+{
+	atomic_notifier_chain_register(&panic_notifier_list,
+				       &mem_limit_notifier);
+	return 0;
+}
+__initcall(register_mem_limit_dumper);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 4c4b15932963..8dda38378959 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -46,6 +46,9 @@
 
 u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
 
+u64 kimage_voffset __read_mostly;
+EXPORT_SYMBOL(kimage_voffset);
+
 /*
  * Empty_zero_page is a special page that is used for zero-initialized data
  * and COW.
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [kernel-hardening] [PATCH v4 13/22] arm64: allow kernel Image to be loaded anywhere in physical memory
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

This relaxes the kernel Image placement requirements, so that it
may be placed at any 2 MB aligned offset in physical memory.

This is accomplished by ignoring PHYS_OFFSET when installing
memblocks, and accounting for the apparent virtual offset of
the kernel Image. As a result, virtual address references
below PAGE_OFFSET are correctly mapped onto physical references
into the kernel Image regardless of where it sits in memory.
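
To make the distinction concrete, here is a stand-alone sketch (example
constants only, not the kernel's actual macros) of the translation rule that
__virt_to_phys implements after this change: linear-map addresses are
translated via PHYS_OFFSET, while kernel image addresses below PAGE_OFFSET use
the runtime constant kimage_voffset (kernel VA base minus kernel PA base).

  #include <stdint.h>

  #define PAGE_OFFSET 0xffffffc000000000ULL  /* example value (VA_BITS=39) */

  static uint64_t phys_offset;     /* base of DRAM, discovered at boot */
  static uint64_t kimage_voffset;  /* kernel VA base minus kernel PA base */

  static uint64_t virt_to_phys_sketch(uint64_t va)
  {
          if (va >= PAGE_OFFSET)                       /* linear mapping */
                  return va - PAGE_OFFSET + phys_offset;
          return va - kimage_voffset;                  /* kernel image mapping */
  }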

Note that limiting memory using mem= is not unambiguous anymore after
this change, considering that the kernel may be at the top of physical
memory, and clipping from the bottom rather than the top will discard
any 32-bit DMA addressable memory first. To deal with this, the handling
of mem= is reimplemented to clip top down, but take special care not to
clip memory that covers the kernel image.

Since mem= should not be considered a production feature, a panic notifier
handler is installed that dumps the memory limit at panic time if one was
set.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 Documentation/arm64/booting.txt         | 20 ++--
 arch/arm64/include/asm/boot.h           |  6 ++
 arch/arm64/include/asm/kernel-pgtable.h | 11 +++
 arch/arm64/include/asm/kvm_asm.h        |  2 +-
 arch/arm64/include/asm/memory.h         | 15 +--
 arch/arm64/kernel/head.S                |  6 +-
 arch/arm64/kernel/image.h               | 13 ++-
 arch/arm64/mm/init.c                    | 96 +++++++++++++++++++-
 arch/arm64/mm/mmu.c                     |  3 +
 9 files changed, 150 insertions(+), 22 deletions(-)

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 701d39d3171a..67484067ce4f 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -109,7 +109,13 @@ Header notes:
 			1 - 4K
 			2 - 16K
 			3 - 64K
-  Bits 3-63:	Reserved.
+  Bit 3:	Kernel physical placement
+			0 - 2MB aligned base should be as close as possible
+			    to the base of DRAM, since memory below it is not
+			    accessible
+			1 - 2MB aligned base may be anywhere in physical
+			    memory
+  Bits 4-63:	Reserved.
 
 - When image_size is zero, a bootloader should attempt to keep as much
   memory as possible free for use by the kernel immediately after the
@@ -117,14 +123,14 @@ Header notes:
   depending on selected features, and is effectively unbound.
 
 The Image must be placed text_offset bytes from a 2MB aligned base
-address near the start of usable system RAM and called there. Memory
-below that base address is currently unusable by Linux, and therefore it
-is strongly recommended that this location is the start of system RAM.
-The region between the 2 MB aligned base address and the start of the
-image has no special significance to the kernel, and may be used for
-other purposes.
+address anywhere in usable system RAM and called there. The region
+between the 2 MB aligned base address and the start of the image has no
+special significance to the kernel, and may be used for other purposes.
 At least image_size bytes from the start of the image must be free for
 use by the kernel.
+NOTE: versions prior to v4.6 cannot make use of memory below the
+physical offset of the Image, so it is recommended that the Image be
+placed as close as possible to the start of system RAM.
 
 Any memory described to the kernel (even that below the start of the
 image) which is not marked as reserved from the kernel (e.g., with a
diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
index 81151b67b26b..ebf2481889c3 100644
--- a/arch/arm64/include/asm/boot.h
+++ b/arch/arm64/include/asm/boot.h
@@ -11,4 +11,10 @@
 #define MIN_FDT_ALIGN		8
 #define MAX_FDT_SIZE		SZ_2M
 
+/*
+ * arm64 requires the kernel image to be placed
+ * TEXT_OFFSET bytes beyond a 2 MB aligned base
+ */
+#define MIN_KIMG_ALIGN		SZ_2M
+
 #endif
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index a459714ee29e..33bd22956cba 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -79,5 +79,16 @@
 #define SWAPPER_MM_MMUFLAGS	(PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS)
 #endif
 
+/*
+ * To make optimal use of block mappings when laying out the linear mapping,
+ * round down the base of physical memory to a size that can be mapped
+ * efficiently, i.e., either PUD_SIZE (4k granule) or PMD_SIZE (64k granule), or a multiple
+ * that can be mapped using contiguous bits in the page tables: 32 * PMD_SIZE (16k granule)
+ */
+#ifdef CONFIG_ARM64_64K_PAGES
+#define ARM64_MEMSTART_ALIGN	SZ_512M
+#else
+#define ARM64_MEMSTART_ALIGN	SZ_1G
+#endif
 
 #endif	/* __ASM_KERNEL_PGTABLE_H */
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 30e0e04c9f16..054ac25e7c2e 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -26,7 +26,7 @@
 #define KVM_ARM64_DEBUG_DIRTY_SHIFT	0
 #define KVM_ARM64_DEBUG_DIRTY		(1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
 
-#define kvm_ksym_ref(sym)	((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
+#define kvm_ksym_ref(sym)		phys_to_virt((u64)&sym - kimage_voffset)
 
 #ifndef __ASSEMBLY__
 struct kvm;
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 4388651d1f0d..61005e7dd6cb 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -88,10 +88,10 @@
 #define __virt_to_phys(x) ({						\
 	phys_addr_t __x = (phys_addr_t)(x);				\
 	__x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) :	\
-			     (__x - KIMAGE_VADDR + PHYS_OFFSET); })
+			     (__x - kimage_voffset); })
 
 #define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
-#define __phys_to_kimg(x)	((unsigned long)((x) - PHYS_OFFSET + KIMAGE_VADDR))
+#define __phys_to_kimg(x)	((unsigned long)((x) + kimage_voffset))
 
 /*
  * Convert a page to/from a physical address
@@ -127,13 +127,14 @@ extern phys_addr_t		memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
 #define PHYS_OFFSET		({ memstart_addr; })
 
+/* the offset between the kernel virtual and physical mappings */
+extern u64			kimage_voffset;
+
 /*
- * The maximum physical address that the linear direct mapping
- * of system RAM can cover. (PAGE_OFFSET can be interpreted as
- * a 2's complement signed quantity and negated to derive the
- * maximum size of the linear mapping.)
+ * Allow all memory at the discovery stage. We will clip it later.
  */
-#define MAX_MEMBLOCK_ADDR	({ memstart_addr - PAGE_OFFSET - 1; })
+#define MIN_MEMBLOCK_ADDR	0
+#define MAX_MEMBLOCK_ADDR	U64_MAX
 
 /*
  * PFNs are used to describe any physical page; this means
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index ab4c8e93e31c..b4be53923942 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -437,7 +437,11 @@ __mmap_switched:
 	and	x4, x4, #~(THREAD_SIZE - 1)
 	msr	sp_el0, x4			// Save thread_info
 	str_l	x21, __fdt_pointer, x5		// Save FDT pointer
-	str_l	x24, memstart_addr, x6		// Save PHYS_OFFSET
+
+	ldr	x4, =KIMAGE_VADDR		// Save the offset between
+	sub	x4, x4, x24			// the kernel virtual and
+	str_l	x4, kimage_voffset, x5		// physical mappings
+
 	mov	x29, #0
 #ifdef CONFIG_KASAN
 	bl	kasan_early_init
diff --git a/arch/arm64/kernel/image.h b/arch/arm64/kernel/image.h
index b64c9b0a4492..ede0c32e38b3 100644
--- a/arch/arm64/kernel/image.h
+++ b/arch/arm64/kernel/image.h
@@ -48,15 +48,18 @@
 	sym##_hi32 = DATA_LE32((data) >> 32)
 
 #ifdef CONFIG_CPU_BIG_ENDIAN
-#define __HEAD_FLAG_BE	1
+#define __HEAD_FLAG_BE		1
 #else
-#define __HEAD_FLAG_BE	0
+#define __HEAD_FLAG_BE		0
 #endif
 
-#define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2)
+#define __HEAD_FLAG_PAGE_SIZE	((PAGE_SHIFT - 10) / 2)
 
-#define __HEAD_FLAGS	((__HEAD_FLAG_BE << 0) |	\
-			 (__HEAD_FLAG_PAGE_SIZE << 1))
+#define __HEAD_FLAG_PHYS_BASE	1
+
+#define __HEAD_FLAGS		((__HEAD_FLAG_BE << 0) |	\
+				 (__HEAD_FLAG_PAGE_SIZE << 1) |	\
+				 (__HEAD_FLAG_PHYS_BASE << 3))
 
 /*
  * These will output as part of the Image header, which should be little-endian
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 1d627cd8121c..02a9cfa84692 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -35,8 +35,10 @@
 #include <linux/efi.h>
 #include <linux/swiotlb.h>
 
+#include <asm/boot.h>
 #include <asm/fixmap.h>
 #include <asm/kasan.h>
+#include <asm/kernel-pgtable.h>
 #include <asm/memory.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
@@ -158,9 +160,76 @@ static int __init early_mem(char *p)
 }
 early_param("mem", early_mem);
 
+/*
+ * clip_mem_range() - remove memblock memory between @min and @max until
+ *                    we meet the limit in 'memory_limit'.
+ */
+static void __init clip_mem_range(u64 min, u64 max)
+{
+	u64 mem_size, to_remove;
+	int i;
+
+again:
+	mem_size = memblock_phys_mem_size();
+	if (mem_size <= memory_limit || max <= min)
+		return;
+
+	to_remove = mem_size - memory_limit;
+
+	for (i = memblock.memory.cnt - 1; i >= 0; i--) {
+		struct memblock_region *r = memblock.memory.regions + i;
+		u64 start = max(min, r->base);
+		u64 end = min(max, r->base + r->size);
+
+		if (start >= max || end <= min)
+			continue;
+
+		if (end > min) {
+			u64 size = min(to_remove, end - max(start, min));
+
+			memblock_remove(end - size, size);
+		} else {
+			memblock_remove(start, min(max - start, to_remove));
+		}
+		goto again;
+	}
+}
+
 void __init arm64_memblock_init(void)
 {
-	memblock_enforce_memory_limit(memory_limit);
+	const s64 linear_region_size = -(s64)PAGE_OFFSET;
+
+	/*
+	 * Select a suitable value for the base of physical memory.
+	 */
+	memstart_addr = round_down(memblock_start_of_DRAM(),
+				   ARM64_MEMSTART_ALIGN);
+
+	/*
+	 * Remove the memory that we will not be able to cover
+	 * with the linear mapping.
+	 */
+	memblock_remove(memstart_addr + linear_region_size, ULLONG_MAX);
+
+	if (memory_limit != (phys_addr_t)ULLONG_MAX) {
+		u64 kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
+		u64 kend = PAGE_ALIGN(__pa(_end));
+		u64 const sz_4g = 0x100000000UL;
+
+		/*
+		 * Clip memory in order of preference:
+		 * - above the kernel and above 4 GB
+		 * - between 4 GB and the start of the kernel (if the kernel
+		 *   is loaded high in memory)
+		 * - between the kernel and 4 GB (if the kernel is loaded
+		 *   low in memory)
+		 * - below 4 GB
+		 */
+		clip_mem_range(max(sz_4g, kend), ULLONG_MAX);
+		clip_mem_range(sz_4g, kbase);
+		clip_mem_range(kend, sz_4g);
+		clip_mem_range(0, min(kbase, sz_4g));
+	}
 
 	/*
 	 * Register the kernel text, kernel data, initrd, and initial
@@ -381,3 +450,28 @@ static int __init keepinitrd_setup(char *__unused)
 
 __setup("keepinitrd", keepinitrd_setup);
 #endif
+
+/*
+ * Dump out memory limit information on panic.
+ */
+static int dump_mem_limit(struct notifier_block *self, unsigned long v, void *p)
+{
+	if (memory_limit != (phys_addr_t)ULLONG_MAX) {
+		pr_emerg("Memory Limit: %llu MB\n", memory_limit >> 20);
+	} else {
+		pr_emerg("Memory Limit: none\n");
+	}
+	return 0;
+}
+
+static struct notifier_block mem_limit_notifier = {
+	.notifier_call = dump_mem_limit,
+};
+
+static int __init register_mem_limit_dumper(void)
+{
+	atomic_notifier_chain_register(&panic_notifier_list,
+				       &mem_limit_notifier);
+	return 0;
+}
+__initcall(register_mem_limit_dumper);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 4c4b15932963..8dda38378959 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -46,6 +46,9 @@
 
 u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
 
+u64 kimage_voffset __read_mostly;
+EXPORT_SYMBOL(kimage_voffset);
+
 /*
  * Empty_zero_page is a special page that is used for zero-initialized data
  * and COW.
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 14/22] arm64: make asm/elf.h available to asm files
  2016-01-26 17:10 ` Ard Biesheuvel
  (?)
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

This reshuffles some code in asm/elf.h and puts a #ifndef __ASSEMBLY__
around its C definitions so that the CPP defines can be used in asm
source files as well.
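
The pattern being applied is the usual one for headers shared between C and
assembly; a minimal sketch with hypothetical names (not the actual asm/elf.h
contents) looks like this:

  #define EXAMPLE_RELOC_TYPE  275            /* plain CPP define: visible to .S files */

  #ifndef __ASSEMBLY__
  typedef unsigned long example_greg_t;      /* C-only: hidden from the assembler */
  #endif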

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/elf.h | 22 ++++++++++++--------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index faad6df49e5b..435f55952e1f 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -24,15 +24,6 @@
 #include <asm/ptrace.h>
 #include <asm/user.h>
 
-typedef unsigned long elf_greg_t;
-
-#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
-#define ELF_CORE_COPY_REGS(dest, regs)	\
-	*(struct user_pt_regs *)&(dest) = (regs)->user_regs;
-
-typedef elf_greg_t elf_gregset_t[ELF_NGREG];
-typedef struct user_fpsimd_state elf_fpregset_t;
-
 /*
  * AArch64 static relocation types.
  */
@@ -127,6 +118,17 @@ typedef struct user_fpsimd_state elf_fpregset_t;
  */
 #define ELF_ET_DYN_BASE	(2 * TASK_SIZE_64 / 3)
 
+#ifndef __ASSEMBLY__
+
+typedef unsigned long elf_greg_t;
+
+#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
+#define ELF_CORE_COPY_REGS(dest, regs)	\
+	*(struct user_pt_regs *)&(dest) = (regs)->user_regs;
+
+typedef elf_greg_t elf_gregset_t[ELF_NGREG];
+typedef struct user_fpsimd_state elf_fpregset_t;
+
 /*
  * When the program starts, a1 contains a pointer to a function to be
  * registered with atexit, as per the SVR4 ABI.  A value of 0 means we have no
@@ -186,4 +188,6 @@ extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
 
 #endif /* CONFIG_COMPAT */
 
+#endif /* !__ASSEMBLY__ */
+
 #endif
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 14/22] arm64: make asm/elf.h available to asm files
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel

This reshuffles some code in asm/elf.h and puts a #ifndef __ASSEMBLY__
around its C definitions so that the CPP defines can be used in asm
source files as well.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/elf.h | 22 ++++++++++++--------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index faad6df49e5b..435f55952e1f 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -24,15 +24,6 @@
 #include <asm/ptrace.h>
 #include <asm/user.h>
 
-typedef unsigned long elf_greg_t;
-
-#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
-#define ELF_CORE_COPY_REGS(dest, regs)	\
-	*(struct user_pt_regs *)&(dest) = (regs)->user_regs;
-
-typedef elf_greg_t elf_gregset_t[ELF_NGREG];
-typedef struct user_fpsimd_state elf_fpregset_t;
-
 /*
  * AArch64 static relocation types.
  */
@@ -127,6 +118,17 @@ typedef struct user_fpsimd_state elf_fpregset_t;
  */
 #define ELF_ET_DYN_BASE	(2 * TASK_SIZE_64 / 3)
 
+#ifndef __ASSEMBLY__
+
+typedef unsigned long elf_greg_t;
+
+#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
+#define ELF_CORE_COPY_REGS(dest, regs)	\
+	*(struct user_pt_regs *)&(dest) = (regs)->user_regs;
+
+typedef elf_greg_t elf_gregset_t[ELF_NGREG];
+typedef struct user_fpsimd_state elf_fpregset_t;
+
 /*
  * When the program starts, a1 contains a pointer to a function to be
  * registered with atexit, as per the SVR4 ABI.  A value of 0 means we have no
@@ -186,4 +188,6 @@ extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
 
 #endif /* CONFIG_COMPAT */
 
+#endif /* !__ASSEMBLY__ */
+
 #endif
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [kernel-hardening] [PATCH v4 14/22] arm64: make asm/elf.h available to asm files
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

This reshuffles some code in asm/elf.h and puts a #ifndef __ASSEMBLY__
around its C definitions so that the CPP defines can be used in asm
source files as well.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/elf.h | 22 ++++++++++++--------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index faad6df49e5b..435f55952e1f 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -24,15 +24,6 @@
 #include <asm/ptrace.h>
 #include <asm/user.h>
 
-typedef unsigned long elf_greg_t;
-
-#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
-#define ELF_CORE_COPY_REGS(dest, regs)	\
-	*(struct user_pt_regs *)&(dest) = (regs)->user_regs;
-
-typedef elf_greg_t elf_gregset_t[ELF_NGREG];
-typedef struct user_fpsimd_state elf_fpregset_t;
-
 /*
  * AArch64 static relocation types.
  */
@@ -127,6 +118,17 @@ typedef struct user_fpsimd_state elf_fpregset_t;
  */
 #define ELF_ET_DYN_BASE	(2 * TASK_SIZE_64 / 3)
 
+#ifndef __ASSEMBLY__
+
+typedef unsigned long elf_greg_t;
+
+#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
+#define ELF_CORE_COPY_REGS(dest, regs)	\
+	*(struct user_pt_regs *)&(dest) = (regs)->user_regs;
+
+typedef elf_greg_t elf_gregset_t[ELF_NGREG];
+typedef struct user_fpsimd_state elf_fpregset_t;
+
 /*
  * When the program starts, a1 contains a pointer to a function to be
  * registered with atexit, as per the SVR4 ABI.  A value of 0 means we have no
@@ -186,4 +188,6 @@ extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
 
 #endif /* CONFIG_COMPAT */
 
+#endif /* !__ASSEMBLY__ */
+
 #endif
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 15/22] scripts/sortextable: add support for ET_DYN binaries
  2016-01-26 17:10 ` Ard Biesheuvel
  (?)
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

Add support to scripts/sortextable for handling relocatable (PIE)
executables, whose ELF type is ET_DYN, not ET_EXEC. Other than adding
support for the new type, no changes are needed.
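
For reference, a stand-alone sketch (using the standard <elf.h> definitions
rather than sortextable's own endian-aware helpers) of the relaxed check:

  #include <elf.h>

  static int is_supported_type(const Elf64_Ehdr *ehdr)
  {
          /* accept both fixed-address (ET_EXEC) and relocatable/PIE (ET_DYN) images */
          return ehdr->e_type == ET_EXEC || ehdr->e_type == ET_DYN;
  }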

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 scripts/sortextable.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/scripts/sortextable.c b/scripts/sortextable.c
index af247c70fb66..19d83647846c 100644
--- a/scripts/sortextable.c
+++ b/scripts/sortextable.c
@@ -266,9 +266,9 @@ do_file(char const *const fname)
 		break;
 	}  /* end switch */
 	if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0
-	||  r2(&ehdr->e_type) != ET_EXEC
+	||  (r2(&ehdr->e_type) != ET_EXEC && r2(&ehdr->e_type) != ET_DYN)
 	||  ehdr->e_ident[EI_VERSION] != EV_CURRENT) {
-		fprintf(stderr, "unrecognized ET_EXEC file %s\n", fname);
+		fprintf(stderr, "unrecognized ET_EXEC/ET_DYN file %s\n", fname);
 		fail_file();
 	}
 
@@ -304,7 +304,7 @@ do_file(char const *const fname)
 		if (r2(&ehdr->e_ehsize) != sizeof(Elf32_Ehdr)
 		||  r2(&ehdr->e_shentsize) != sizeof(Elf32_Shdr)) {
 			fprintf(stderr,
-				"unrecognized ET_EXEC file: %s\n", fname);
+				"unrecognized ET_EXEC/ET_DYN file: %s\n", fname);
 			fail_file();
 		}
 		do32(ehdr, fname, custom_sort);
@@ -314,7 +314,7 @@ do_file(char const *const fname)
 		if (r2(&ghdr->e_ehsize) != sizeof(Elf64_Ehdr)
 		||  r2(&ghdr->e_shentsize) != sizeof(Elf64_Shdr)) {
 			fprintf(stderr,
-				"unrecognized ET_EXEC file: %s\n", fname);
+				"unrecognized ET_EXEC/ET_DYN file: %s\n", fname);
 			fail_file();
 		}
 		do64(ghdr, fname, custom_sort);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 15/22] scripts/sortextable: add support for ET_DYN binaries
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel

Add support to scripts/sortextable for handling relocatable (PIE)
executables, whose ELF type is ET_DYN, not ET_EXEC. Other than adding
support for the new type, no changes are needed.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 scripts/sortextable.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/scripts/sortextable.c b/scripts/sortextable.c
index af247c70fb66..19d83647846c 100644
--- a/scripts/sortextable.c
+++ b/scripts/sortextable.c
@@ -266,9 +266,9 @@ do_file(char const *const fname)
 		break;
 	}  /* end switch */
 	if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0
-	||  r2(&ehdr->e_type) != ET_EXEC
+	||  (r2(&ehdr->e_type) != ET_EXEC && r2(&ehdr->e_type) != ET_DYN)
 	||  ehdr->e_ident[EI_VERSION] != EV_CURRENT) {
-		fprintf(stderr, "unrecognized ET_EXEC file %s\n", fname);
+		fprintf(stderr, "unrecognized ET_EXEC/ET_DYN file %s\n", fname);
 		fail_file();
 	}
 
@@ -304,7 +304,7 @@ do_file(char const *const fname)
 		if (r2(&ehdr->e_ehsize) != sizeof(Elf32_Ehdr)
 		||  r2(&ehdr->e_shentsize) != sizeof(Elf32_Shdr)) {
 			fprintf(stderr,
-				"unrecognized ET_EXEC file: %s\n", fname);
+				"unrecognized ET_EXEC/ET_DYN file: %s\n", fname);
 			fail_file();
 		}
 		do32(ehdr, fname, custom_sort);
@@ -314,7 +314,7 @@ do_file(char const *const fname)
 		if (r2(&ghdr->e_ehsize) != sizeof(Elf64_Ehdr)
 		||  r2(&ghdr->e_shentsize) != sizeof(Elf64_Shdr)) {
 			fprintf(stderr,
-				"unrecognized ET_EXEC file: %s\n", fname);
+				"unrecognized ET_EXEC/ET_DYN file: %s\n", fname);
 			fail_file();
 		}
 		do64(ghdr, fname, custom_sort);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [kernel-hardening] [PATCH v4 15/22] scripts/sortextable: add support for ET_DYN binaries
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

Add support to scripts/sortextable for handling relocatable (PIE)
executables, whose ELF type is ET_DYN, not ET_EXEC. Other than adding
support for the new type, no changes are needed.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 scripts/sortextable.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/scripts/sortextable.c b/scripts/sortextable.c
index af247c70fb66..19d83647846c 100644
--- a/scripts/sortextable.c
+++ b/scripts/sortextable.c
@@ -266,9 +266,9 @@ do_file(char const *const fname)
 		break;
 	}  /* end switch */
 	if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0
-	||  r2(&ehdr->e_type) != ET_EXEC
+	||  (r2(&ehdr->e_type) != ET_EXEC && r2(&ehdr->e_type) != ET_DYN)
 	||  ehdr->e_ident[EI_VERSION] != EV_CURRENT) {
-		fprintf(stderr, "unrecognized ET_EXEC file %s\n", fname);
+		fprintf(stderr, "unrecognized ET_EXEC/ET_DYN file %s\n", fname);
 		fail_file();
 	}
 
@@ -304,7 +304,7 @@ do_file(char const *const fname)
 		if (r2(&ehdr->e_ehsize) != sizeof(Elf32_Ehdr)
 		||  r2(&ehdr->e_shentsize) != sizeof(Elf32_Shdr)) {
 			fprintf(stderr,
-				"unrecognized ET_EXEC file: %s\n", fname);
+				"unrecognized ET_EXEC/ET_DYN file: %s\n", fname);
 			fail_file();
 		}
 		do32(ehdr, fname, custom_sort);
@@ -314,7 +314,7 @@ do_file(char const *const fname)
 		if (r2(&ghdr->e_ehsize) != sizeof(Elf64_Ehdr)
 		||  r2(&ghdr->e_shentsize) != sizeof(Elf64_Shdr)) {
 			fprintf(stderr,
-				"unrecognized ET_EXEC file: %s\n", fname);
+				"unrecognized ET_EXEC/ET_DYN file: %s\n", fname);
 			fail_file();
 		}
 		do64(ghdr, fname, custom_sort);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 16/22] kallsyms: add support for relative offsets in kallsyms address table
  2016-01-26 17:10 ` Ard Biesheuvel
  (?)
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel, Heiko Carstens,
	Michael Ellerman, Ingo Molnar, H. Peter Anvin,
	Benjamin Herrenschmidt, Michal Marek, Rusty Russell,
	Andrew Morton

Similar to how relative extables are implemented, it is possible to emit
the kallsyms table in such a way that it contains offsets relative to some
anchor point in the kernel image rather than absolute addresses.

On 64-bit architectures, it cuts the size of the kallsyms address table in
half, since offsets between kernel symbols can typically be expressed in
32 bits.  This saves several hundreds of kilobytes of permanent .rodata on
average.  In addition, the kallsyms address table is no longer subject to
dynamic relocation when CONFIG_RELOCATABLE is in effect, so the relocation
work done after decompression now doesn't have to do relocation updates
for all these values.  This saves up to 24 bytes (i.e., the size of an
ELF64 RELA relocation table entry) per value, which easily adds up to a
couple of megabytes of uncompressed __init data on ppc64 or arm64.  Even
if these relocation entries typically compress well, the combined size
reduction of 2.8 MB uncompressed for a ppc64_defconfig build (of which 2.4
MB is __init data) results in a ~500 KB space saving in the compressed
image.

Since it is useful for some architectures (like x86) to retain the ability
to emit absolute values as well, this patch adds support for both, by
emitting absolute addresses as positive 32-bit values, and addresses
relative to the lowest encountered relative symbol as negative values,
which are subtracted from the runtime address of this base symbol to
produce the actual address.
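
As a stand-alone illustration of that encoding (assumed addresses, not actual
kernel symbol data), the round trip looks like this; the decode side mirrors
the kallsyms_sym_address() helper added below:

  #include <stdint.h>

  static uint64_t relative_base = 0xffff000008080000ULL; /* lowest relative symbol (example) */

  /* scripts/kallsyms.c side: emit a negative 32-bit entry for a relative symbol */
  static int32_t encode_relative(uint64_t addr)
  {
          return (int32_t)(-(int64_t)(addr - relative_base) - 1);
  }

  /* kernel/kallsyms.c side: positive entries are absolute, negative are relative */
  static uint64_t decode(int32_t entry)
  {
          if (entry >= 0)
                  return (uint64_t)entry;
          return relative_base - 1 - entry;
  }

  /* decode(encode_relative(addr)) == addr for any addr >= relative_base */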

Support for the above is enabled by default for all architectures except
IA-64, whose symbols are too far apart to capture in this manner.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
 init/Kconfig            | 16 ++++
 kernel/kallsyms.c       | 38 +++++++--
 scripts/kallsyms.c      | 88 +++++++++++++++++---
 scripts/link-vmlinux.sh |  4 +
 scripts/namespace.pl    |  2 +
 5 files changed, 129 insertions(+), 19 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 22320804fbaf..1cc72a068afc 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1420,6 +1420,22 @@ config KALLSYMS_ALL
 
 	   Say N unless you really need all symbols.
 
+config KALLSYMS_BASE_RELATIVE
+	bool
+	depends on KALLSYMS
+	default !IA64
+	help
+	  Instead of emitting them as absolute values in the native word size,
+	  emit the symbol references in the kallsyms table as 32-bit entries,
+	  each containing either an absolute value in the range [0, S32_MAX] or
+	  a relative value in the range [base, base + S32_MAX], where base is
+	  the lowest relative symbol address encountered in the image.
+
+	  On 64-bit builds, this reduces the size of the address table by 50%,
+	  but more importantly, it results in entries whose values are build
+	  time constants, and no relocation pass is required at runtime to fix
+	  up the entries based on the runtime load address of the kernel.
+
 config PRINTK
 	default y
 	bool "Enable support for printk" if EXPERT
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 5c5987f10819..10a8af9d5744 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -38,6 +38,7 @@
  * during the second link stage.
  */
 extern const unsigned long kallsyms_addresses[] __weak;
+extern const int kallsyms_offsets[] __weak;
 extern const u8 kallsyms_names[] __weak;
 
 /*
@@ -47,6 +48,9 @@ extern const u8 kallsyms_names[] __weak;
 extern const unsigned long kallsyms_num_syms
 __attribute__((weak, section(".rodata")));
 
+extern const unsigned long kallsyms_relative_base
+__attribute__((weak, section(".rodata")));
+
 extern const u8 kallsyms_token_table[] __weak;
 extern const u16 kallsyms_token_index[] __weak;
 
@@ -176,6 +180,19 @@ static unsigned int get_symbol_offset(unsigned long pos)
 	return name - kallsyms_names;
 }
 
+static unsigned long kallsyms_sym_address(int idx)
+{
+	if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
+		return kallsyms_addresses[idx];
+
+	/* positive offsets are absolute values */
+	if (kallsyms_offsets[idx] >= 0)
+		return kallsyms_offsets[idx];
+
+	/* negative offsets are relative to kallsyms_relative_base - 1 */
+	return kallsyms_relative_base - 1 - kallsyms_offsets[idx];
+}
+
 /* Lookup the address for this symbol. Returns 0 if not found. */
 unsigned long kallsyms_lookup_name(const char *name)
 {
@@ -187,7 +204,7 @@ unsigned long kallsyms_lookup_name(const char *name)
 		off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
 
 		if (strcmp(namebuf, name) == 0)
-			return kallsyms_addresses[i];
+			return kallsyms_sym_address(i);
 	}
 	return module_kallsyms_lookup_name(name);
 }
@@ -204,7 +221,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
 
 	for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
 		off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
-		ret = fn(data, namebuf, NULL, kallsyms_addresses[i]);
+		ret = fn(data, namebuf, NULL, kallsyms_sym_address(i));
 		if (ret != 0)
 			return ret;
 	}
@@ -220,7 +237,10 @@ static unsigned long get_symbol_pos(unsigned long addr,
 	unsigned long i, low, high, mid;
 
 	/* This kernel should never had been booted. */
-	BUG_ON(!kallsyms_addresses);
+	if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
+		BUG_ON(!kallsyms_addresses);
+	else
+		BUG_ON(!kallsyms_offsets);
 
 	/* Do a binary search on the sorted kallsyms_addresses array. */
 	low = 0;
@@ -228,7 +248,7 @@ static unsigned long get_symbol_pos(unsigned long addr,
 
 	while (high - low > 1) {
 		mid = low + (high - low) / 2;
-		if (kallsyms_addresses[mid] <= addr)
+		if (kallsyms_sym_address(mid) <= addr)
 			low = mid;
 		else
 			high = mid;
@@ -238,15 +258,15 @@ static unsigned long get_symbol_pos(unsigned long addr,
 	 * Search for the first aliased symbol. Aliased
 	 * symbols are symbols with the same address.
 	 */
-	while (low && kallsyms_addresses[low-1] == kallsyms_addresses[low])
+	while (low && kallsyms_sym_address(low-1) == kallsyms_sym_address(low))
 		--low;
 
-	symbol_start = kallsyms_addresses[low];
+	symbol_start = kallsyms_sym_address(low);
 
 	/* Search for next non-aliased symbol. */
 	for (i = low + 1; i < kallsyms_num_syms; i++) {
-		if (kallsyms_addresses[i] > symbol_start) {
-			symbol_end = kallsyms_addresses[i];
+		if (kallsyms_sym_address(i) > symbol_start) {
+			symbol_end = kallsyms_sym_address(i);
 			break;
 		}
 	}
@@ -470,7 +490,7 @@ static unsigned long get_ksymbol_core(struct kallsym_iter *iter)
 	unsigned off = iter->nameoff;
 
 	iter->module_name[0] = '\0';
-	iter->value = kallsyms_addresses[iter->pos];
+	iter->value = kallsyms_sym_address(iter->pos);
 
 	iter->type = kallsyms_get_symbol_type(off);
 
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index d39a1eeb080e..02473b71643b 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -22,6 +22,7 @@
 #include <stdlib.h>
 #include <string.h>
 #include <ctype.h>
+#include <limits.h>
 
 #ifndef ARRAY_SIZE
 #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof(arr[0]))
@@ -43,6 +44,7 @@ struct addr_range {
 };
 
 static unsigned long long _text;
+static unsigned long long relative_base;
 static struct addr_range text_ranges[] = {
 	{ "_stext",     "_etext"     },
 	{ "_sinittext", "_einittext" },
@@ -62,6 +64,7 @@ static int all_symbols = 0;
 static int absolute_percpu = 0;
 static char symbol_prefix_char = '\0';
 static unsigned long long kernel_start_addr = 0;
+static int base_relative = 0;
 
 int token_profit[0x10000];
 
@@ -75,7 +78,7 @@ static void usage(void)
 	fprintf(stderr, "Usage: kallsyms [--all-symbols] "
 			"[--symbol-prefix=<prefix char>] "
 			"[--page-offset=<CONFIG_PAGE_OFFSET>] "
-			"< in.map > out.S\n");
+			"[--base-relative] < in.map > out.S\n");
 	exit(1);
 }
 
@@ -205,6 +208,8 @@ static int symbol_valid(struct sym_entry *s)
 	 */
 	static char *special_symbols[] = {
 		"kallsyms_addresses",
+		"kallsyms_offsets",
+		"kallsyms_relative_base",
 		"kallsyms_num_syms",
 		"kallsyms_names",
 		"kallsyms_markers",
@@ -349,16 +354,47 @@ static void write_src(void)
 
 	printf("\t.section .rodata, \"a\"\n");
 
-	/* Provide proper symbols relocatability by their '_text'
-	 * relativeness.  The symbol names cannot be used to construct
-	 * normal symbol references as the list of symbols contains
-	 * symbols that are declared static and are private to their
-	 * .o files.  This prevents .tmp_kallsyms.o or any other
-	 * object from referencing them.
+	/* Provide proper symbols relocatability by their relativeness
+	 * to a fixed anchor point in the runtime image, either '_text'
+	 * for absolute address tables, in which case the linker will
+	 * emit the final addresses at build time. Otherwise, use the
+	 * offset relative to the lowest value encountered of all relative
+	 * symbols, and emit non-relocatable fixed offsets that will be fixed
+	 * up at runtime.
+	 *
+	 * The symbol names cannot be used to construct normal symbol
+	 * references as the list of symbols contains symbols that are
+	 * declared static and are private to their .o files.  This prevents
+	 * .tmp_kallsyms.o or any other object from referencing them.
 	 */
-	output_label("kallsyms_addresses");
+	if (!base_relative)
+		output_label("kallsyms_addresses");
+	else
+		output_label("kallsyms_offsets");
+
 	for (i = 0; i < table_cnt; i++) {
-		if (!symbol_absolute(&table[i])) {
+		if (base_relative) {
+			long long offset;
+
+			if (symbol_absolute(&table[i])) {
+				offset = table[i].addr;
+				if (offset < 0 || offset > INT_MAX) {
+					fprintf(stderr, "kallsyms failure: "
+						"absolute symbol value %#llx out of range in relative mode\n",
+						table[i].addr);
+					exit(EXIT_FAILURE);
+				}
+			} else {
+				offset = relative_base - table[i].addr - 1;
+				if (offset < INT_MIN || offset >= 0) {
+					fprintf(stderr, "kallsyms failure: "
+						"relative symbol value %#llx out of range in relative mode\n",
+						table[i].addr);
+					exit(EXIT_FAILURE);
+				}
+			}
+			printf("\t.long\t%#x\n", (int)offset);
+		} else if (!symbol_absolute(&table[i])) {
 			if (_text <= table[i].addr)
 				printf("\tPTR\t_text + %#llx\n",
 					table[i].addr - _text);
@@ -371,6 +407,12 @@ static void write_src(void)
 	}
 	printf("\n");
 
+	if (base_relative) {
+		output_label("kallsyms_relative_base");
+		printf("\tPTR\t_text - %#llx\n", _text - relative_base);
+		printf("\n");
+	}
+
 	output_label("kallsyms_num_syms");
 	printf("\tPTR\t%d\n", table_cnt);
 	printf("\n");
@@ -695,6 +737,28 @@ static void make_percpus_absolute(void)
 		}
 }
 
+/* find the minimum non-absolute symbol address */
+static void record_relative_base(void)
+{
+	unsigned int i;
+
+	if (kernel_start_addr > 0) {
+		/*
+		 * If the kernel start address was specified, use that as
+		 * the relative base rather than going through the table,
+		 * since it should be a reasonable default, and values below
+		 * it will be ignored anyway.
+		 */
+		relative_base = kernel_start_addr;
+	} else {
+		relative_base = -1ULL;
+		for (i = 0; i < table_cnt; i++)
+			if (!symbol_absolute(&table[i]) &&
+			    table[i].addr < relative_base)
+				relative_base = table[i].addr;
+	}
+}
+
 int main(int argc, char **argv)
 {
 	if (argc >= 2) {
@@ -713,7 +777,9 @@ int main(int argc, char **argv)
 			} else if (strncmp(argv[i], "--page-offset=", 14) == 0) {
 				const char *p = &argv[i][14];
 				kernel_start_addr = strtoull(p, NULL, 16);
-			} else
+			} else if (strcmp(argv[i], "--base-relative") == 0)
+				base_relative = 1;
+			else
 				usage();
 		}
 	} else if (argc != 1)
@@ -722,6 +788,8 @@ int main(int argc, char **argv)
 	read_map(stdin);
 	if (absolute_percpu)
 		make_percpus_absolute();
+	if (base_relative)
+		record_relative_base();
 	sort_symbols();
 	optimize_token_table();
 	write_src();
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index ba6c34ea5429..b58bf908b153 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -90,6 +90,10 @@ kallsyms()
 		kallsymopt="${kallsymopt} --absolute-percpu"
 	fi
 
+	if [ -n "${CONFIG_KALLSYMS_BASE_RELATIVE}" ]; then
+		kallsymopt="${kallsymopt} --base-relative"
+	fi
+
 	local aflags="${KBUILD_AFLAGS} ${KBUILD_AFLAGS_KERNEL}               \
 		      ${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS}"
 
diff --git a/scripts/namespace.pl b/scripts/namespace.pl
index a71be6b7cdec..9f3c9d47a4a5 100755
--- a/scripts/namespace.pl
+++ b/scripts/namespace.pl
@@ -117,6 +117,8 @@ my %nameexception = (
     'kallsyms_names'	=> 1,
     'kallsyms_num_syms'	=> 1,
     'kallsyms_addresses'=> 1,
+    'kallsyms_offsets'	=> 1,
+    'kallsyms_relative_base'=> 1,
     '__this_module'	=> 1,
     '_etext'		=> 1,
     '_edata'		=> 1,
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 16/22] kallsyms: add support for relative offsets in kallsyms address table
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel

Similar to how relative extables are implemented, it is possible to emit
the kallsyms table in such a way that it contains offsets relative to some
anchor point in the kernel image rather than absolute addresses.

On 64-bit architectures, it cuts the size of the kallsyms address table in
half, since offsets between kernel symbols can typically be expressed in
32 bits.  This saves several hundreds of kilobytes of permanent .rodata on
average.  In addition, the kallsyms address table is no longer subject to
dynamic relocation when CONFIG_RELOCATABLE is in effect, so the relocation
work done after decompression now doesn't have to do relocation updates
for all these values.  This saves up to 24 bytes (i.e., the size of an
ELF64 RELA relocation table entry) per value, which easily adds up to a
couple of megabytes of uncompressed __init data on ppc64 or arm64.  Even
if these relocation entries typically compress well, the combined size
reduction of 2.8 MB uncompressed for a ppc64_defconfig build (of which 2.4
MB is __init data) results in a ~500 KB space saving in the compressed
image.

Since it is useful for some architectures (like x86) to retain the ability
to emit absolute values as well, this patch adds support for both, by
emitting absolute addresses as positive 32-bit values, and addresses
relative to the lowest encountered relative symbol as negative values,
which are subtracted from the runtime address of this base symbol to
produce the actual address.

Support for the above is enabled by default for all architectures except
IA-64, whose symbols are too far apart to capture in this manner.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
 init/Kconfig            | 16 ++++
 kernel/kallsyms.c       | 38 +++++++--
 scripts/kallsyms.c      | 88 +++++++++++++++++---
 scripts/link-vmlinux.sh |  4 +
 scripts/namespace.pl    |  2 +
 5 files changed, 129 insertions(+), 19 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 22320804fbaf..1cc72a068afc 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1420,6 +1420,22 @@ config KALLSYMS_ALL
 
 	   Say N unless you really need all symbols.
 
+config KALLSYMS_BASE_RELATIVE
+	bool
+	depends on KALLSYMS
+	default !IA64
+	help
+	  Instead of emitting them as absolute values in the native word size,
+	  emit the symbol references in the kallsyms table as 32-bit entries,
+	  each containing either an absolute value in the range [0, S32_MAX] or
+	  a relative value in the range [base, base + S32_MAX], where base is
+	  the lowest relative symbol address encountered in the image.
+
+	  On 64-bit builds, this reduces the size of the address table by 50%,
+	  but more importantly, it results in entries whose values are build
+	  time constants, and no relocation pass is required at runtime to fix
+	  up the entries based on the runtime load address of the kernel.
+
 config PRINTK
 	default y
 	bool "Enable support for printk" if EXPERT
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 5c5987f10819..10a8af9d5744 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -38,6 +38,7 @@
  * during the second link stage.
  */
 extern const unsigned long kallsyms_addresses[] __weak;
+extern const int kallsyms_offsets[] __weak;
 extern const u8 kallsyms_names[] __weak;
 
 /*
@@ -47,6 +48,9 @@ extern const u8 kallsyms_names[] __weak;
 extern const unsigned long kallsyms_num_syms
 __attribute__((weak, section(".rodata")));
 
+extern const unsigned long kallsyms_relative_base
+__attribute__((weak, section(".rodata")));
+
 extern const u8 kallsyms_token_table[] __weak;
 extern const u16 kallsyms_token_index[] __weak;
 
@@ -176,6 +180,19 @@ static unsigned int get_symbol_offset(unsigned long pos)
 	return name - kallsyms_names;
 }
 
+static unsigned long kallsyms_sym_address(int idx)
+{
+	if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
+		return kallsyms_addresses[idx];
+
+	/* positive offsets are absolute values */
+	if (kallsyms_offsets[idx] >= 0)
+		return kallsyms_offsets[idx];
+
+	/* negative offsets are relative to kallsyms_relative_base - 1 */
+	return kallsyms_relative_base - 1 - kallsyms_offsets[idx];
+}
+
 /* Lookup the address for this symbol. Returns 0 if not found. */
 unsigned long kallsyms_lookup_name(const char *name)
 {
@@ -187,7 +204,7 @@ unsigned long kallsyms_lookup_name(const char *name)
 		off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
 
 		if (strcmp(namebuf, name) == 0)
-			return kallsyms_addresses[i];
+			return kallsyms_sym_address(i);
 	}
 	return module_kallsyms_lookup_name(name);
 }
@@ -204,7 +221,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
 
 	for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
 		off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
-		ret = fn(data, namebuf, NULL, kallsyms_addresses[i]);
+		ret = fn(data, namebuf, NULL, kallsyms_sym_address(i));
 		if (ret != 0)
 			return ret;
 	}
@@ -220,7 +237,10 @@ static unsigned long get_symbol_pos(unsigned long addr,
 	unsigned long i, low, high, mid;
 
 	/* This kernel should never had been booted. */
-	BUG_ON(!kallsyms_addresses);
+	if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
+		BUG_ON(!kallsyms_addresses);
+	else
+		BUG_ON(!kallsyms_offsets);
 
 	/* Do a binary search on the sorted kallsyms_addresses array. */
 	low = 0;
@@ -228,7 +248,7 @@ static unsigned long get_symbol_pos(unsigned long addr,
 
 	while (high - low > 1) {
 		mid = low + (high - low) / 2;
-		if (kallsyms_addresses[mid] <= addr)
+		if (kallsyms_sym_address(mid) <= addr)
 			low = mid;
 		else
 			high = mid;
@@ -238,15 +258,15 @@ static unsigned long get_symbol_pos(unsigned long addr,
 	 * Search for the first aliased symbol. Aliased
 	 * symbols are symbols with the same address.
 	 */
-	while (low && kallsyms_addresses[low-1] == kallsyms_addresses[low])
+	while (low && kallsyms_sym_address(low-1) == kallsyms_sym_address(low))
 		--low;
 
-	symbol_start = kallsyms_addresses[low];
+	symbol_start = kallsyms_sym_address(low);
 
 	/* Search for next non-aliased symbol. */
 	for (i = low + 1; i < kallsyms_num_syms; i++) {
-		if (kallsyms_addresses[i] > symbol_start) {
-			symbol_end = kallsyms_addresses[i];
+		if (kallsyms_sym_address(i) > symbol_start) {
+			symbol_end = kallsyms_sym_address(i);
 			break;
 		}
 	}
@@ -470,7 +490,7 @@ static unsigned long get_ksymbol_core(struct kallsym_iter *iter)
 	unsigned off = iter->nameoff;
 
 	iter->module_name[0] = '\0';
-	iter->value = kallsyms_addresses[iter->pos];
+	iter->value = kallsyms_sym_address(iter->pos);
 
 	iter->type = kallsyms_get_symbol_type(off);
 
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index d39a1eeb080e..02473b71643b 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -22,6 +22,7 @@
 #include <stdlib.h>
 #include <string.h>
 #include <ctype.h>
+#include <limits.h>
 
 #ifndef ARRAY_SIZE
 #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof(arr[0]))
@@ -43,6 +44,7 @@ struct addr_range {
 };
 
 static unsigned long long _text;
+static unsigned long long relative_base;
 static struct addr_range text_ranges[] = {
 	{ "_stext",     "_etext"     },
 	{ "_sinittext", "_einittext" },
@@ -62,6 +64,7 @@ static int all_symbols = 0;
 static int absolute_percpu = 0;
 static char symbol_prefix_char = '\0';
 static unsigned long long kernel_start_addr = 0;
+static int base_relative = 0;
 
 int token_profit[0x10000];
 
@@ -75,7 +78,7 @@ static void usage(void)
 	fprintf(stderr, "Usage: kallsyms [--all-symbols] "
 			"[--symbol-prefix=<prefix char>] "
 			"[--page-offset=<CONFIG_PAGE_OFFSET>] "
-			"< in.map > out.S\n");
+			"[--base-relative] < in.map > out.S\n");
 	exit(1);
 }
 
@@ -205,6 +208,8 @@ static int symbol_valid(struct sym_entry *s)
 	 */
 	static char *special_symbols[] = {
 		"kallsyms_addresses",
+		"kallsyms_offsets",
+		"kallsyms_relative_base",
 		"kallsyms_num_syms",
 		"kallsyms_names",
 		"kallsyms_markers",
@@ -349,16 +354,47 @@ static void write_src(void)
 
 	printf("\t.section .rodata, \"a\"\n");
 
-	/* Provide proper symbols relocatability by their '_text'
-	 * relativeness.  The symbol names cannot be used to construct
-	 * normal symbol references as the list of symbols contains
-	 * symbols that are declared static and are private to their
-	 * .o files.  This prevents .tmp_kallsyms.o or any other
-	 * object from referencing them.
+	/* Provide proper symbols relocatability by their relativeness
+	 * to a fixed anchor point in the runtime image, either '_text'
+	 * for absolute address tables, in which case the linker will
+	 * emit the final addresses at build time. Otherwise, use the
+	 * offset relative to the lowest value encountered of all relative
+	 * symbols, and emit non-relocatable fixed offsets that will be fixed
+	 * up at runtime.
+	 *
+	 * The symbol names cannot be used to construct normal symbol
+	 * references as the list of symbols contains symbols that are
+	 * declared static and are private to their .o files.  This prevents
+	 * .tmp_kallsyms.o or any other object from referencing them.
 	 */
-	output_label("kallsyms_addresses");
+	if (!base_relative)
+		output_label("kallsyms_addresses");
+	else
+		output_label("kallsyms_offsets");
+
 	for (i = 0; i < table_cnt; i++) {
-		if (!symbol_absolute(&table[i])) {
+		if (base_relative) {
+			long long offset;
+
+			if (symbol_absolute(&table[i])) {
+				offset = table[i].addr;
+				if (offset < 0 || offset > INT_MAX) {
+					fprintf(stderr, "kallsyms failure: "
+						"absolute symbol value %#llx out of range in relative mode\n",
+						table[i].addr);
+					exit(EXIT_FAILURE);
+				}
+			} else {
+				offset = relative_base - table[i].addr - 1;
+				if (offset < INT_MIN || offset >= 0) {
+					fprintf(stderr, "kallsyms failure: "
+						"relative symbol value %#llx out of range in relative mode\n",
+						table[i].addr);
+					exit(EXIT_FAILURE);
+				}
+			}
+			printf("\t.long\t%#x\n", (int)offset);
+		} else if (!symbol_absolute(&table[i])) {
 			if (_text <= table[i].addr)
 				printf("\tPTR\t_text + %#llx\n",
 					table[i].addr - _text);
@@ -371,6 +407,12 @@ static void write_src(void)
 	}
 	printf("\n");
 
+	if (base_relative) {
+		output_label("kallsyms_relative_base");
+		printf("\tPTR\t_text - %#llx\n", _text - relative_base);
+		printf("\n");
+	}
+
 	output_label("kallsyms_num_syms");
 	printf("\tPTR\t%d\n", table_cnt);
 	printf("\n");
@@ -695,6 +737,28 @@ static void make_percpus_absolute(void)
 		}
 }
 
+/* find the minimum non-absolute symbol address */
+static void record_relative_base(void)
+{
+	unsigned int i;
+
+	if (kernel_start_addr > 0) {
+		/*
+		 * If the kernel start address was specified, use that as
+		 * the relative base rather than going through the table,
+		 * since it should be a reasonable default, and values below
+		 * it will be ignored anyway.
+		 */
+		relative_base = kernel_start_addr;
+	} else {
+		relative_base = -1ULL;
+		for (i = 0; i < table_cnt; i++)
+			if (!symbol_absolute(&table[i]) &&
+			    table[i].addr < relative_base)
+				relative_base = table[i].addr;
+	}
+}
+
 int main(int argc, char **argv)
 {
 	if (argc >= 2) {
@@ -713,7 +777,9 @@ int main(int argc, char **argv)
 			} else if (strncmp(argv[i], "--page-offset=", 14) == 0) {
 				const char *p = &argv[i][14];
 				kernel_start_addr = strtoull(p, NULL, 16);
-			} else
+			} else if (strcmp(argv[i], "--base-relative") == 0)
+				base_relative = 1;
+			else
 				usage();
 		}
 	} else if (argc != 1)
@@ -722,6 +788,8 @@ int main(int argc, char **argv)
 	read_map(stdin);
 	if (absolute_percpu)
 		make_percpus_absolute();
+	if (base_relative)
+		record_relative_base();
 	sort_symbols();
 	optimize_token_table();
 	write_src();
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index ba6c34ea5429..b58bf908b153 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -90,6 +90,10 @@ kallsyms()
 		kallsymopt="${kallsymopt} --absolute-percpu"
 	fi
 
+	if [ -n "${CONFIG_KALLSYMS_BASE_RELATIVE}" ]; then
+		kallsymopt="${kallsymopt} --base-relative"
+	fi
+
 	local aflags="${KBUILD_AFLAGS} ${KBUILD_AFLAGS_KERNEL}               \
 		      ${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS}"
 
diff --git a/scripts/namespace.pl b/scripts/namespace.pl
index a71be6b7cdec..9f3c9d47a4a5 100755
--- a/scripts/namespace.pl
+++ b/scripts/namespace.pl
@@ -117,6 +117,8 @@ my %nameexception = (
     'kallsyms_names'	=> 1,
     'kallsyms_num_syms'	=> 1,
     'kallsyms_addresses'=> 1,
+    'kallsyms_offsets'	=> 1,
+    'kallsyms_relative_base'=> 1,
     '__this_module'	=> 1,
     '_etext'		=> 1,
     '_edata'		=> 1,
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [kernel-hardening] [PATCH v4 16/22] kallsyms: add support for relative offsets in kallsyms address table
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel, Heiko Carstens,
	Michael Ellerman, Ingo Molnar, H. Peter Anvin,
	Benjamin Herrenschmidt, Michal Marek, Rusty Russell,
	Andrew Morton

Similar to how relative extables are implemented, it is possible to emit
the kallsyms table in such a way that it contains offsets relative to some
anchor point in the kernel image rather than absolute addresses.

On 64-bit architectures, it cuts the size of the kallsyms address table in
half, since offsets between kernel symbols can typically be expressed in
32 bits.  This saves several hundred kilobytes of permanent .rodata on
average.  In addition, the kallsyms address table is no longer subject to
dynamic relocation when CONFIG_RELOCATABLE is in effect, so the relocation
work done after decompression now doesn't have to do relocation updates
for all these values.  This saves up to 24 bytes (i.e., the size of an
ELF64 RELA relocation table entry) per value, which easily adds up to a
couple of megabytes of uncompressed __init data on ppc64 or arm64.  Even
if these relocation entries typically compress well, the combined size
reduction of 2.8 MB uncompressed for a ppc64_defconfig build (of which 2.4
MB is __init data) results in a ~500 KB space saving in the compressed
image.
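
(As a rough cross-check of those numbers: a defconfig kernel carries on
the order of 100,000 kallsyms entries, and 100,000 x 24 bytes comes to
roughly 2.4 MB, matching the __init data figure quoted above.)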

Since it is useful for some architectures (like x86) to retain the ability
to emit absolute values as well, this patch adds support for both, by
emitting absolute addresses as positive 32-bit values, and addresses
relative to the lowest encountered relative symbol as negative values,
which are subtracted from the runtime address of this base symbol to
produce the actual address.
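
A minimal sketch of how a single 32-bit entry round-trips under this
scheme (illustrative only, not part of the patch; the runtime half
corresponds to the kallsyms_sym_address() helper added to
kernel/kallsyms.c below):

#include <stdint.h>
#include <stdbool.h>

/* build time (scripts/kallsyms.c): emit one 32-bit entry per symbol */
static int32_t kallsyms_encode(uint64_t addr, uint64_t base, bool absolute)
{
	/* absolute symbols keep their value; others become negative offsets */
	return absolute ? (int32_t)addr
			: (int32_t)((int64_t)base - (int64_t)addr - 1);
}

/* run time (kernel/kallsyms.c): rebase negative entries on the anchor */
static uint64_t kallsyms_decode(int32_t entry, uint64_t runtime_base)
{
	return entry >= 0 ? (uint64_t)entry : runtime_base - 1 - entry;
}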

Support for the above is enabled by default for all architectures except
IA-64, whose symbols are too far apart to capture in this manner.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
 init/Kconfig            | 16 ++++
 kernel/kallsyms.c       | 38 +++++++--
 scripts/kallsyms.c      | 88 +++++++++++++++++---
 scripts/link-vmlinux.sh |  4 +
 scripts/namespace.pl    |  2 +
 5 files changed, 129 insertions(+), 19 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 22320804fbaf..1cc72a068afc 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1420,6 +1420,22 @@ config KALLSYMS_ALL
 
 	   Say N unless you really need all symbols.
 
+config KALLSYMS_BASE_RELATIVE
+	bool
+	depends on KALLSYMS
+	default !IA64
+	help
+	  Instead of emitting them as absolute values in the native word size,
+	  emit the symbol references in the kallsyms table as 32-bit entries,
+	  each containing either an absolute value in the range [0, S32_MAX] or
+	  a relative value in the range [base, base + S32_MAX], where base is
+	  the lowest relative symbol address encountered in the image.
+
+	  On 64-bit builds, this reduces the size of the address table by 50%,
+	  but more importantly, it results in entries whose values are build
+	  time constants, and no relocation pass is required at runtime to fix
+	  up the entries based on the runtime load address of the kernel.
+
 config PRINTK
 	default y
 	bool "Enable support for printk" if EXPERT
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 5c5987f10819..10a8af9d5744 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -38,6 +38,7 @@
  * during the second link stage.
  */
 extern const unsigned long kallsyms_addresses[] __weak;
+extern const int kallsyms_offsets[] __weak;
 extern const u8 kallsyms_names[] __weak;
 
 /*
@@ -47,6 +48,9 @@ extern const u8 kallsyms_names[] __weak;
 extern const unsigned long kallsyms_num_syms
 __attribute__((weak, section(".rodata")));
 
+extern const unsigned long kallsyms_relative_base
+__attribute__((weak, section(".rodata")));
+
 extern const u8 kallsyms_token_table[] __weak;
 extern const u16 kallsyms_token_index[] __weak;
 
@@ -176,6 +180,19 @@ static unsigned int get_symbol_offset(unsigned long pos)
 	return name - kallsyms_names;
 }
 
+static unsigned long kallsyms_sym_address(int idx)
+{
+	if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
+		return kallsyms_addresses[idx];
+
+	/* positive offsets are absolute values */
+	if (kallsyms_offsets[idx] >= 0)
+		return kallsyms_offsets[idx];
+
+	/* negative offsets are relative to kallsyms_relative_base - 1 */
+	return kallsyms_relative_base - 1 - kallsyms_offsets[idx];
+}
+
 /* Lookup the address for this symbol. Returns 0 if not found. */
 unsigned long kallsyms_lookup_name(const char *name)
 {
@@ -187,7 +204,7 @@ unsigned long kallsyms_lookup_name(const char *name)
 		off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
 
 		if (strcmp(namebuf, name) == 0)
-			return kallsyms_addresses[i];
+			return kallsyms_sym_address(i);
 	}
 	return module_kallsyms_lookup_name(name);
 }
@@ -204,7 +221,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
 
 	for (i = 0, off = 0; i < kallsyms_num_syms; i++) {
 		off = kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf));
-		ret = fn(data, namebuf, NULL, kallsyms_addresses[i]);
+		ret = fn(data, namebuf, NULL, kallsyms_sym_address(i));
 		if (ret != 0)
 			return ret;
 	}
@@ -220,7 +237,10 @@ static unsigned long get_symbol_pos(unsigned long addr,
 	unsigned long i, low, high, mid;
 
 	/* This kernel should never had been booted. */
-	BUG_ON(!kallsyms_addresses);
+	if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
+		BUG_ON(!kallsyms_addresses);
+	else
+		BUG_ON(!kallsyms_offsets);
 
 	/* Do a binary search on the sorted kallsyms_addresses array. */
 	low = 0;
@@ -228,7 +248,7 @@ static unsigned long get_symbol_pos(unsigned long addr,
 
 	while (high - low > 1) {
 		mid = low + (high - low) / 2;
-		if (kallsyms_addresses[mid] <= addr)
+		if (kallsyms_sym_address(mid) <= addr)
 			low = mid;
 		else
 			high = mid;
@@ -238,15 +258,15 @@ static unsigned long get_symbol_pos(unsigned long addr,
 	 * Search for the first aliased symbol. Aliased
 	 * symbols are symbols with the same address.
 	 */
-	while (low && kallsyms_addresses[low-1] == kallsyms_addresses[low])
+	while (low && kallsyms_sym_address(low-1) == kallsyms_sym_address(low))
 		--low;
 
-	symbol_start = kallsyms_addresses[low];
+	symbol_start = kallsyms_sym_address(low);
 
 	/* Search for next non-aliased symbol. */
 	for (i = low + 1; i < kallsyms_num_syms; i++) {
-		if (kallsyms_addresses[i] > symbol_start) {
-			symbol_end = kallsyms_addresses[i];
+		if (kallsyms_sym_address(i) > symbol_start) {
+			symbol_end = kallsyms_sym_address(i);
 			break;
 		}
 	}
@@ -470,7 +490,7 @@ static unsigned long get_ksymbol_core(struct kallsym_iter *iter)
 	unsigned off = iter->nameoff;
 
 	iter->module_name[0] = '\0';
-	iter->value = kallsyms_addresses[iter->pos];
+	iter->value = kallsyms_sym_address(iter->pos);
 
 	iter->type = kallsyms_get_symbol_type(off);
 
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index d39a1eeb080e..02473b71643b 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -22,6 +22,7 @@
 #include <stdlib.h>
 #include <string.h>
 #include <ctype.h>
+#include <limits.h>
 
 #ifndef ARRAY_SIZE
 #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof(arr[0]))
@@ -43,6 +44,7 @@ struct addr_range {
 };
 
 static unsigned long long _text;
+static unsigned long long relative_base;
 static struct addr_range text_ranges[] = {
 	{ "_stext",     "_etext"     },
 	{ "_sinittext", "_einittext" },
@@ -62,6 +64,7 @@ static int all_symbols = 0;
 static int absolute_percpu = 0;
 static char symbol_prefix_char = '\0';
 static unsigned long long kernel_start_addr = 0;
+static int base_relative = 0;
 
 int token_profit[0x10000];
 
@@ -75,7 +78,7 @@ static void usage(void)
 	fprintf(stderr, "Usage: kallsyms [--all-symbols] "
 			"[--symbol-prefix=<prefix char>] "
 			"[--page-offset=<CONFIG_PAGE_OFFSET>] "
-			"< in.map > out.S\n");
+			"[--base-relative] < in.map > out.S\n");
 	exit(1);
 }
 
@@ -205,6 +208,8 @@ static int symbol_valid(struct sym_entry *s)
 	 */
 	static char *special_symbols[] = {
 		"kallsyms_addresses",
+		"kallsyms_offsets",
+		"kallsyms_relative_base",
 		"kallsyms_num_syms",
 		"kallsyms_names",
 		"kallsyms_markers",
@@ -349,16 +354,47 @@ static void write_src(void)
 
 	printf("\t.section .rodata, \"a\"\n");
 
-	/* Provide proper symbols relocatability by their '_text'
-	 * relativeness.  The symbol names cannot be used to construct
-	 * normal symbol references as the list of symbols contains
-	 * symbols that are declared static and are private to their
-	 * .o files.  This prevents .tmp_kallsyms.o or any other
-	 * object from referencing them.
+	/* Provide proper symbols relocatability by their relativeness
+	 * to a fixed anchor point in the runtime image, either '_text'
+	 * for absolute address tables, in which case the linker will
+	 * emit the final addresses at build time. Otherwise, use the
+	 * offset relative to the lowest value encountered of all relative
+	 * symbols, and emit non-relocatable fixed offsets that will be fixed
+	 * up at runtime.
+	 *
+	 * The symbol names cannot be used to construct normal symbol
+	 * references as the list of symbols contains symbols that are
+	 * declared static and are private to their .o files.  This prevents
+	 * .tmp_kallsyms.o or any other object from referencing them.
 	 */
-	output_label("kallsyms_addresses");
+	if (!base_relative)
+		output_label("kallsyms_addresses");
+	else
+		output_label("kallsyms_offsets");
+
 	for (i = 0; i < table_cnt; i++) {
-		if (!symbol_absolute(&table[i])) {
+		if (base_relative) {
+			long long offset;
+
+			if (symbol_absolute(&table[i])) {
+				offset = table[i].addr;
+				if (offset < 0 || offset > INT_MAX) {
+					fprintf(stderr, "kallsyms failure: "
+						"absolute symbol value %#llx out of range in relative mode\n",
+						table[i].addr);
+					exit(EXIT_FAILURE);
+				}
+			} else {
+				offset = relative_base - table[i].addr - 1;
+				if (offset < INT_MIN || offset >= 0) {
+					fprintf(stderr, "kallsyms failure: "
+						"relative symbol value %#llx out of range in relative mode\n",
+						table[i].addr);
+					exit(EXIT_FAILURE);
+				}
+			}
+			printf("\t.long\t%#x\n", (int)offset);
+		} else if (!symbol_absolute(&table[i])) {
 			if (_text <= table[i].addr)
 				printf("\tPTR\t_text + %#llx\n",
 					table[i].addr - _text);
@@ -371,6 +407,12 @@ static void write_src(void)
 	}
 	printf("\n");
 
+	if (base_relative) {
+		output_label("kallsyms_relative_base");
+		printf("\tPTR\t_text - %#llx\n", _text - relative_base);
+		printf("\n");
+	}
+
 	output_label("kallsyms_num_syms");
 	printf("\tPTR\t%d\n", table_cnt);
 	printf("\n");
@@ -695,6 +737,28 @@ static void make_percpus_absolute(void)
 		}
 }
 
+/* find the minimum non-absolute symbol address */
+static void record_relative_base(void)
+{
+	unsigned int i;
+
+	if (kernel_start_addr > 0) {
+		/*
+		 * If the kernel start address was specified, use that as
+		 * the relative base rather than going through the table,
+		 * since it should be a reasonable default, and values below
+		 * it will be ignored anyway.
+		 */
+		relative_base = kernel_start_addr;
+	} else {
+		relative_base = -1ULL;
+		for (i = 0; i < table_cnt; i++)
+			if (!symbol_absolute(&table[i]) &&
+			    table[i].addr < relative_base)
+				relative_base = table[i].addr;
+	}
+}
+
 int main(int argc, char **argv)
 {
 	if (argc >= 2) {
@@ -713,7 +777,9 @@ int main(int argc, char **argv)
 			} else if (strncmp(argv[i], "--page-offset=", 14) == 0) {
 				const char *p = &argv[i][14];
 				kernel_start_addr = strtoull(p, NULL, 16);
-			} else
+			} else if (strcmp(argv[i], "--base-relative") == 0)
+				base_relative = 1;
+			else
 				usage();
 		}
 	} else if (argc != 1)
@@ -722,6 +788,8 @@ int main(int argc, char **argv)
 	read_map(stdin);
 	if (absolute_percpu)
 		make_percpus_absolute();
+	if (base_relative)
+		record_relative_base();
 	sort_symbols();
 	optimize_token_table();
 	write_src();
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index ba6c34ea5429..b58bf908b153 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -90,6 +90,10 @@ kallsyms()
 		kallsymopt="${kallsymopt} --absolute-percpu"
 	fi
 
+	if [ -n "${CONFIG_KALLSYMS_BASE_RELATIVE}" ]; then
+		kallsymopt="${kallsymopt} --base-relative"
+	fi
+
 	local aflags="${KBUILD_AFLAGS} ${KBUILD_AFLAGS_KERNEL}               \
 		      ${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS}"
 
diff --git a/scripts/namespace.pl b/scripts/namespace.pl
index a71be6b7cdec..9f3c9d47a4a5 100755
--- a/scripts/namespace.pl
+++ b/scripts/namespace.pl
@@ -117,6 +117,8 @@ my %nameexception = (
     'kallsyms_names'	=> 1,
     'kallsyms_num_syms'	=> 1,
     'kallsyms_addresses'=> 1,
+    'kallsyms_offsets'	=> 1,
+    'kallsyms_relative_base'=> 1,
     '__this_module'	=> 1,
     '_etext'		=> 1,
     '_edata'		=> 1,
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 17/22] arm64: add support for building the kernel as a relocatable PIE binary
  2016-01-26 17:10 ` Ard Biesheuvel
  (?)
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

This implements CONFIG_RELOCATABLE, which links the final vmlinux image
with a dynamic relocation section, which allows the early boot code to
perform a relocation to a different virtual address at runtime.

This is a prerequisite for KASLR (CONFIG_RANDOMIZE_BASE).
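
For illustration, a rough C rendering of the relocation loop that the
head.S hunk below implements in assembly (the struct layouts and the
R_AARCH64_ABS64 value are taken from the generic ELF64/AArch64 ABI, not
from this patch, and the real code necessarily runs in assembly before
any C environment exists):

#include <stdint.h>

#define R_AARCH64_ABS64		257	/* from the AArch64 ELF ABI */
#define R_AARCH64_RELATIVE	1027	/* as added to asm/elf.h below */

struct elf64_rela {
	uint64_t r_offset;
	uint64_t r_info;		/* symbol index << 32 | type */
	int64_t  r_addend;
};

struct elf64_sym {
	uint32_t st_name;
	uint8_t  st_info, st_other;
	uint16_t st_shndx;
	uint64_t st_value;		/* the asm reads this at offset 8 */
	uint64_t st_size;
};

static void apply_relocations(const struct elf64_rela *r,
			      const struct elf64_rela *end,
			      const struct elf64_sym *symtab)
{
	for (; r < end; r++) {
		uint64_t *place = (uint64_t *)(uintptr_t)r->r_offset;
		uint32_t type = (uint32_t)r->r_info;

		if (type == R_AARCH64_RELATIVE)
			*place = r->r_addend;
		else if (type == R_AARCH64_ABS64)
			*place = symtab[r->r_info >> 32].st_value +
				 r->r_addend;
		/* anything else is unexpected here and simply skipped */
	}
}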

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/Kconfig              | 12 ++++++++
 arch/arm64/Makefile             |  4 +++
 arch/arm64/include/asm/elf.h    |  2 ++
 arch/arm64/kernel/head.S        | 32 ++++++++++++++++++++
 arch/arm64/kernel/vmlinux.lds.S | 16 ++++++++++
 5 files changed, 66 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 141f65ab0ed5..6aa86f86fd10 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -84,6 +84,7 @@ config ARM64
 	select IOMMU_DMA if IOMMU_SUPPORT
 	select IRQ_DOMAIN
 	select IRQ_FORCED_THREADING
+	select KALLSYMS_TEXT_RELATIVE
 	select MODULES_USE_ELF_RELA
 	select NO_BOOTMEM
 	select OF
@@ -762,6 +763,17 @@ config ARM64_MODULE_PLTS
 	select ARM64_MODULE_CMODEL_LARGE
 	select HAVE_MOD_ARCH_SPECIFIC
 
+config RELOCATABLE
+	bool
+	help
+	  This builds the kernel as a Position Independent Executable (PIE),
+	  which retains all relocation metadata required to relocate the
+	  kernel binary at runtime to a different virtual address than the
+	  address it was linked at.
+	  Since AArch64 uses the RELA relocation format, this requires a
+	  relocation pass at runtime even if the kernel is loaded at the
+	  same address it was linked at.
+
 endmenu
 
 menu "Boot options"
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index db462980c6be..c3eaa03f9020 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -15,6 +15,10 @@ CPPFLAGS_vmlinux.lds = -DTEXT_OFFSET=$(TEXT_OFFSET)
 OBJCOPYFLAGS	:=-O binary -R .note -R .note.gnu.build-id -R .comment -S
 GZFLAGS		:=-9
 
+ifneq ($(CONFIG_RELOCATABLE),)
+LDFLAGS_vmlinux		+= -pie
+endif
+
 KBUILD_DEFCONFIG := defconfig
 
 # Check for binutils support for specific extensions
diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index 435f55952e1f..24ed037f09fd 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -77,6 +77,8 @@
 #define R_AARCH64_MOVW_PREL_G2_NC	292
 #define R_AARCH64_MOVW_PREL_G3		293
 
+#define R_AARCH64_RELATIVE		1027
+
 /*
  * These are used to set parameters in the core dumps.
  */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index b4be53923942..92f9c26632f3 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -29,6 +29,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/cache.h>
 #include <asm/cputype.h>
+#include <asm/elf.h>
 #include <asm/kernel-pgtable.h>
 #include <asm/memory.h>
 #include <asm/pgtable-hwdef.h>
@@ -432,6 +433,37 @@ __mmap_switched:
 	bl	__pi_memset
 	dsb	ishst				// Make zero page visible to PTW
 
+#ifdef CONFIG_RELOCATABLE
+
+	/*
+	 * Iterate over each entry in the relocation table, and apply the
+	 * relocations in place.
+	 */
+	adr_l	x8, __dynsym_start		// start of symbol table
+	adr_l	x9, __reloc_start		// start of reloc table
+	adr_l	x10, __reloc_end		// end of reloc table
+
+0:	cmp	x9, x10
+	b.hs	2f
+	ldp	x11, x12, [x9], #24
+	ldr	x13, [x9, #-8]
+	cmp	w12, #R_AARCH64_RELATIVE
+	b.ne	1f
+	str	x13, [x11]
+	b	0b
+
+1:	cmp	w12, #R_AARCH64_ABS64
+	b.ne	0b
+	add	x12, x12, x12, lsl #1		// symtab offset: 24x top word
+	add	x12, x8, x12, lsr #(32 - 3)	// ... shifted into bottom word
+	ldr	x15, [x12, #8]			// Elf64_Sym::st_value
+	add	x15, x13, x15
+	str	x15, [x11]
+	b	0b
+
+2:
+#endif
+
 	adr_l	sp, initial_sp, x4
 	mov	x4, sp
 	and	x4, x4, #~(THREAD_SIZE - 1)
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 282e3e64a17e..e3f6cd740ea3 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -87,6 +87,7 @@ SECTIONS
 		EXIT_CALL
 		*(.discard)
 		*(.discard.*)
+		*(.interp .dynamic)
 	}
 
 	. = KIMAGE_VADDR + TEXT_OFFSET;
@@ -149,6 +150,21 @@ SECTIONS
 	.altinstr_replacement : {
 		*(.altinstr_replacement)
 	}
+	.rela : ALIGN(8) {
+		__reloc_start = .;
+		*(.rela .rela*)
+		__reloc_end = .;
+	}
+	.dynsym : ALIGN(8) {
+		__dynsym_start = .;
+		*(.dynsym)
+	}
+	.dynstr : {
+		*(.dynstr)
+	}
+	.hash : {
+		*(.hash)
+	}
 
 	. = ALIGN(PAGE_SIZE);
 	__init_end = .;
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 18/22] arm64: add support for kernel ASLR
  2016-01-26 17:10 ` Ard Biesheuvel
  (?)
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

This adds support for KASLR, based on entropy provided by
the bootloader in the /chosen/kaslr-seed DT property. Depending on the size
of the address space (VA_BITS) and the page size, the entropy in the
virtual displacement is up to 13 bits (16k/2 levels) and up to 25 bits (all
4 levels), with the sidenote that displacements that result in the kernel
image straddling a 1GB/32MB/512MB alignment boundary (for 4KB/16KB/64KB
granule kernels, respectively) are not allowed, and will be rounded up to
an acceptable value.
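
(For reference, the arithmetic behind those numbers: the kaslr.c code
added below masks the seed with ((1UL << (VA_BITS - 2)) - 1) & ~(SZ_2M - 1),
i.e. the displacement is a 2 MB aligned value below 2^(VA_BITS - 2),
giving VA_BITS - 23 bits of entropy: 36 - 23 = 13 for 16k pages with 2
translation levels, and 48 - 23 = 25 with all 4 levels in use.)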

The module region is randomized by choosing a page aligned 128 MB region
inside the interval [_etext - 128 MB, _stext + 128 MB). This gives between
10 and 14 bits of entropy (depending on page size), independently of the
kernel randomization, but still guarantees that modules are within the
range of relative branch and jump instructions (with the caveat that, since
the module region is shared with other uses of the vmalloc area, modules
may need to be loaded further away if the module region is exhausted).
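
(Likewise, from the code below: module_range = 128 MB - (_etext - _stext),
and the load offset is a PAGE_SIZE aligned fraction of that range taken
from the low 16 bits of the seed, so the entropy is roughly
log2(module_range / PAGE_SIZE), i.e. about 14 bits with 4 KB pages and
about 10 bits with 64 KB pages once a few MB of core text is subtracted.)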

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/Kconfig              |  14 ++
 arch/arm64/include/asm/memory.h |   5 +-
 arch/arm64/kernel/Makefile      |   1 +
 arch/arm64/kernel/head.S        |  59 ++++++-
 arch/arm64/kernel/kaslr.c       | 169 ++++++++++++++++++++
 arch/arm64/kernel/module.c      |   8 +-
 arch/arm64/kernel/setup.c       |  29 ++++
 arch/arm64/mm/mmu.c             |  33 ++--
 8 files changed, 298 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6aa86f86fd10..d7e31454d421 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -774,6 +774,20 @@ config RELOCATABLE
 	  relocation pass at runtime even if the kernel is loaded at the
 	  same address it was linked at.
 
+config RANDOMIZE_BASE
+	bool "Randomize the address of the kernel image"
+	select ARM64_MODULE_PLTS
+	select RELOCATABLE
+	help
+	  Randomizes the virtual address at which the kernel image is
+	  loaded, as a security feature that deters exploit attempts
+	  relying on knowledge of the location of kernel internals.
+
+	  It is the bootloader's job to provide entropy, by passing a
+	  random u64 value in /chosen/kaslr-seed at kernel entry.
+
+	  If unsure, say N.
+
 endmenu
 
 menu "Boot options"
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 61005e7dd6cb..083361531a61 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -52,7 +52,7 @@
 #define KIMAGE_VADDR		(MODULES_END)
 #define MODULES_END		(MODULES_VADDR + MODULES_VSIZE)
 #define MODULES_VADDR		(VA_START + KASAN_SHADOW_SIZE)
-#define MODULES_VSIZE		(SZ_64M)
+#define MODULES_VSIZE		(SZ_128M)
 #define PCI_IO_END		(PAGE_OFFSET - SZ_2M)
 #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
 #define FIXADDR_TOP		(PCI_IO_START - SZ_2M)
@@ -127,6 +127,9 @@ extern phys_addr_t		memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
 #define PHYS_OFFSET		({ memstart_addr; })
 
+/* the virtual base of the kernel image (minus TEXT_OFFSET) */
+extern u64			kimage_vaddr;
+
 /* the offset between the kernel virtual and physical mappings */
 extern u64			kimage_voffset;
 
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index e2f0a755beaa..c9aaecddb941 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -43,6 +43,7 @@ arm64-obj-$(CONFIG_PCI)			+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
 arm64-obj-$(CONFIG_ACPI)		+= acpi.o
 arm64-obj-$(CONFIG_PARAVIRT)		+= paravirt.o
+arm64-obj-$(CONFIG_RANDOMIZE_BASE)	+= kaslr.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 92f9c26632f3..8712a38c3de7 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -210,6 +210,7 @@ section_table:
 ENTRY(stext)
 	bl	preserve_boot_args
 	bl	el2_setup			// Drop to EL1, w20=cpu_boot_mode
+	mov	x23, xzr			// KASLR offset, defaults to 0
 	adrp	x24, __PHYS_OFFSET
 	bl	set_cpu_boot_mode_flag
 	bl	__create_page_tables		// x25=TTBR0, x26=TTBR1
@@ -313,7 +314,7 @@ ENDPROC(preserve_boot_args)
 __create_page_tables:
 	adrp	x25, idmap_pg_dir
 	adrp	x26, swapper_pg_dir
-	mov	x27, lr
+	mov	x28, lr
 
 	/*
 	 * Invalidate the idmap and swapper page tables to avoid potential
@@ -392,6 +393,7 @@ __create_page_tables:
 	 */
 	mov	x0, x26				// swapper_pg_dir
 	ldr	x5, =KIMAGE_VADDR
+	add	x5, x5, x23			// add KASLR displacement
 	create_pgd_entry x0, x5, x3, x6
 	ldr	w6, kernel_img_size
 	add	x6, x6, x5
@@ -408,8 +410,7 @@ __create_page_tables:
 	dmb	sy
 	bl	__inval_cache_range
 
-	mov	lr, x27
-	ret
+	ret	x28
 ENDPROC(__create_page_tables)
 
 kernel_img_size:
@@ -421,6 +422,7 @@ kernel_img_size:
  */
 	.set	initial_sp, init_thread_union + THREAD_START_SP
 __mmap_switched:
+	mov	x28, lr				// preserve LR
 	adr_l	x8, vectors			// load VBAR_EL1 with virtual
 	msr	vbar_el1, x8			// vector table address
 	isb
@@ -449,19 +451,26 @@ __mmap_switched:
 	ldr	x13, [x9, #-8]
 	cmp	w12, #R_AARCH64_RELATIVE
 	b.ne	1f
-	str	x13, [x11]
+	add	x13, x13, x23			// relocate
+	str	x13, [x11, x23]
 	b	0b
 
 1:	cmp	w12, #R_AARCH64_ABS64
 	b.ne	0b
 	add	x12, x12, x12, lsl #1		// symtab offset: 24x top word
 	add	x12, x8, x12, lsr #(32 - 3)	// ... shifted into bottom word
+	ldrsh	w14, [x12, #6]			// Elf64_Sym::st_shndx
 	ldr	x15, [x12, #8]			// Elf64_Sym::st_value
+	cmp	w14, #-0xf			// SHN_ABS (0xfff1) ?
+	add	x14, x15, x23			// relocate
+	csel	x15, x14, x15, ne
 	add	x15, x13, x15
-	str	x15, [x11]
+	str	x15, [x11, x23]
 	b	0b
 
-2:
+2:	adr_l	x8, kimage_vaddr		// make relocated kimage_vaddr
+	dc	cvac, x8			// value visible to secondaries
+	dsb	sy				// with MMU off
 #endif
 
 	adr_l	sp, initial_sp, x4
@@ -470,7 +479,7 @@ __mmap_switched:
 	msr	sp_el0, x4			// Save thread_info
 	str_l	x21, __fdt_pointer, x5		// Save FDT pointer
 
-	ldr	x4, =KIMAGE_VADDR		// Save the offset between
+	ldr_l	x4, kimage_vaddr		// Save the offset between
 	sub	x4, x4, x24			// the kernel virtual and
 	str_l	x4, kimage_voffset, x5		// physical mappings
 
@@ -478,6 +487,15 @@ __mmap_switched:
 #ifdef CONFIG_KASAN
 	bl	kasan_early_init
 #endif
+#ifdef CONFIG_RANDOMIZE_BASE
+	cbnz	x23, 0f				// already running randomized?
+	mov	x0, x21				// pass FDT address in x0
+	bl	kaslr_early_init		// parse FDT for KASLR options
+	cbz	x0, 0f				// KASLR disabled? just proceed
+	ret	x28				// we must enable KASLR, return
+						// to __enable_mmu()
+0:
+#endif
 	b	start_kernel
 ENDPROC(__mmap_switched)
 
@@ -486,6 +504,10 @@ ENDPROC(__mmap_switched)
  * hotplug and needs to have the same protections as the text region
  */
 	.section ".text","ax"
+
+ENTRY(kimage_vaddr)
+	.quad		_text - TEXT_OFFSET
+
 /*
  * If we're fortunate enough to boot at EL2, ensure that the world is
  * sane before dropping to EL1.
@@ -646,7 +668,7 @@ ENTRY(secondary_startup)
 	adrp	x26, swapper_pg_dir
 	bl	__cpu_setup			// initialise processor
 
-	ldr	x8, =KIMAGE_VADDR
+	ldr	x8, kimage_vaddr
 	ldr	w9, 0f
 	sub	x27, x8, w9, sxtw		// address to jump to after enabling the MMU
 	b	__enable_mmu
@@ -679,6 +701,7 @@ ENDPROC(__secondary_switched)
  */
 	.section	".idmap.text", "ax"
 __enable_mmu:
+	mrs	x18, sctlr_el1			// preserve old SCTLR_EL1 value
 	mrs	x1, ID_AA64MMFR0_EL1
 	ubfx	x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
 	cmp	x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
@@ -696,6 +719,26 @@ __enable_mmu:
 	ic	iallu
 	dsb	nsh
 	isb
+#ifdef CONFIG_RANDOMIZE_BASE
+	mov	x19, x0				// preserve new SCTLR_EL1 value
+	blr	x27
+
+	/*
+	 * If we return here, we have a KASLR displacement in x0 which we need
+	 * to record and take into account by discarding the current kernel
+	 * mapping and creating a new one.
+	 */
+	mov	x23, x0				// record the KASLR offset
+	msr	sctlr_el1, x18			// disable the MMU
+	isb
+	bl	__create_page_tables		// recreate kernel mapping
+
+	msr	sctlr_el1, x19			// re-enable the MMU
+	isb
+	ic	ialluis				// flush instructions fetched
+	isb					// via old mapping
+	add	x27, x27, x23			// relocated __mmap_switched
+#endif
 	br	x27
 ENDPROC(__enable_mmu)
 
diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c
new file mode 100644
index 000000000000..9ddb01f65a1a
--- /dev/null
+++ b/arch/arm64/kernel/kaslr.c
@@ -0,0 +1,169 @@
+/*
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/crc32.h>
+#include <linux/init.h>
+#include <linux/libfdt.h>
+#include <linux/mm_types.h>
+#include <linux/sched.h>
+#include <linux/types.h>
+
+#include <asm/fixmap.h>
+#include <asm/kernel-pgtable.h>
+#include <asm/memory.h>
+#include <asm/mmu.h>
+#include <asm/pgtable.h>
+#include <asm/sections.h>
+
+u32 module_load_offset;
+
+static __init u64 get_kaslr_seed(void *fdt)
+{
+	int node, len;
+	u64 *prop;
+	u64 ret;
+
+	node = fdt_path_offset(fdt, "/chosen");
+	if (node < 0)
+		return 0;
+
+	prop = fdt_getprop_w(fdt, node, "kaslr-seed", &len);
+	if (!prop || len != sizeof(u64))
+		return 0;
+
+	ret = fdt64_to_cpu(*prop);
+	*prop = 0;
+	return ret;
+}
+
+static __init const u8 *get_cmdline(void *fdt)
+{
+	static const u8 default_cmdline[] = CONFIG_CMDLINE;
+
+	if (!IS_ENABLED(CONFIG_CMDLINE_FORCE)) {
+		int node;
+		const u8 *prop;
+
+		node = fdt_path_offset(fdt, "/chosen");
+		if (node < 0)
+			goto out;
+
+		prop = fdt_getprop(fdt, node, "bootargs", NULL);
+		if (!prop)
+			goto out;
+		return prop;
+	}
+out:
+	return default_cmdline;
+}
+
+static u32 get_kernel_crc(void)
+{
+	u64 stack_start = (u64)&init_thread_union.stack;
+	u64 stack_end = stack_start + sizeof(init_thread_union.stack);
+	u32 crc;
+
+	crc = crc32_le(~0, _text, stack_start - (u64)_text);
+	crc = crc32_le(crc, (void *)stack_end, (u64)_edata - stack_end);
+
+	return crc;
+}
+
+extern void *__init __fixmap_remap_fdt(phys_addr_t dt_phys, int *size,
+				       pgprot_t prot);
+
+/*
+ * This routine will be executed with the kernel mapped at its default virtual
+ * address, and if it returns successfully, the kernel will be remapped, and
+ * start_kernel() will be executed from a randomized virtual offset. The
+ * relocation will result in all absolute references (e.g., static variables
+ * containing function pointers) to be reinitialized, and zero-initialized
+ * .bss variables will be reset to 0. However, other .data manipulations will
+ * persist across the change from the default mapping to the randomized mapping,
+ * and thus should not be performed before we have moved the kernel to its final
+ * address. This will be caught by the CRC check, and KASLR will be disabled if
+ * we catch any inadvertent modifications.
+ */
+u64 __init kaslr_early_init(u64 dt_phys)
+{
+	void *fdt;
+	u64 seed, offset, mask, module_range;
+	const u8 *cmdline, *str;
+	int size;
+	u32 crc;
+
+	/*
+	 * Record the CRC of the entire [_text, _edata] interval, except the
+	 * region we are using for the stack. If we detect any changes made
+	 * during the course of this function, we bail. This may seem a bit
+	 * drastic, but in this case, we have no way of guaranteeing we won't
+	 * corrupt anything by moving the kernel image before reentering it.
+	 */
+	crc = get_kernel_crc();
+
+	/*
+	 * Try to map the FDT early. If this fails, we simply bail,
+	 * and proceed with KASLR disabled. We will make another
+	 * attempt at mapping the FDT in setup_machine()
+	 */
+	early_fixmap_init();
+	fdt = __fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL);
+	if (!fdt)
+		return 0;
+
+	/*
+	 * Retrieve (and wipe) the seed from the FDT
+	 */
+	seed = get_kaslr_seed(fdt);
+	if (!seed)
+		return 0;
+
+	/*
+	 * Check if 'nokaslr' appears on the command line, and
+	 * return 0 if that is the case.
+	 */
+	cmdline = get_cmdline(fdt);
+	str = strstr(cmdline, "nokaslr");
+	if (str == cmdline || (str > cmdline && *(str - 1) == ' '))
+		return 0;
+
+	/* check if we made any inadvertent changes to the kernel text */
+	if (crc != get_kernel_crc())
+		return 0;
+
+	/*
+	 * OK, so we are proceeding with KASLR enabled. Calculate a suitable
+	 * kernel image offset from the seed. Let's place the kernel in the
+	 * lower half of the VMALLOC area (VA_BITS - 2).
+	 * Even if we could randomize at page granularity for 16k and 64k pages,
+	 * let's always round to 2 MB so we don't interfere with the ability to
+	 * map using contiguous PTEs
+	 */
+	mask = ((1UL << (VA_BITS - 2)) - 1) & ~(SZ_2M - 1);
+	offset = seed & mask;
+
+	/*
+	 * The kernel Image should not extend across a 1GB/32MB/512MB alignment
+	 * boundary (for 4KB/16KB/64KB granule kernels, respectively). If this
+	 * happens, increase the KASLR offset by the size of the kernel image.
+	 */
+	if ((((u64)_text + offset) >> SWAPPER_TABLE_SHIFT) !=
+	    (((u64)_end + offset) >> SWAPPER_TABLE_SHIFT))
+		offset = (offset + (u64)(_end - _text)) & mask;
+
+	/*
+	 * Randomize the module region, by setting module_load_offset to
+	 * a PAGE_SIZE multiple in the interval [0, module_range). This
+	 * ensures that the resulting region still covers [_stext, _etext],
+	 * and that all relative branches can be resolved without veneers.
+	 */
+	module_range = MODULES_VSIZE - (u64)(_etext - _stext);
+	module_load_offset = ((module_range * (u16)seed) >> 16) & PAGE_MASK;
+
+	return offset;
+}
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 84113d3e1df1..54702d456680 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -33,8 +33,14 @@
 void *module_alloc(unsigned long size)
 {
 	void *p;
+	u64 base = (u64)_etext - MODULES_VSIZE;
 
-	p = __vmalloc_node_range(size, MODULE_ALIGN, MODULES_VADDR, MODULES_END,
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+		extern u32 module_load_offset;
+		base += module_load_offset;
+	}
+
+	p = __vmalloc_node_range(size, MODULE_ALIGN, base, base + MODULES_VSIZE,
 				GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
 				NUMA_NO_NODE, __builtin_return_address(0));
 
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index cfed56f0ad26..42371f69def3 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -388,3 +388,32 @@ static int __init topology_init(void)
 	return 0;
 }
 subsys_initcall(topology_init);
+
+/*
+ * Dump out kernel offset information on panic.
+ */
+static int dump_kernel_offset(struct notifier_block *self, unsigned long v,
+			      void *p)
+{
+	u64 const kaslr_offset = kimage_vaddr - KIMAGE_VADDR;
+
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_offset > 0) {
+		pr_emerg("Kernel Offset: 0x%llx from 0x%lx\n",
+			 kaslr_offset, KIMAGE_VADDR);
+	} else {
+		pr_emerg("Kernel Offset: disabled\n");
+	}
+	return 0;
+}
+
+static struct notifier_block kernel_offset_notifier = {
+	.notifier_call = dump_kernel_offset
+};
+
+static int __init register_kernel_offset_dumper(void)
+{
+	atomic_notifier_chain_register(&panic_notifier_list,
+				       &kernel_offset_notifier);
+	return 0;
+}
+__initcall(register_kernel_offset_dumper);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 8dda38378959..5d7e0b801ab7 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -636,7 +636,8 @@ void __init early_fixmap_init(void)
 	unsigned long addr = FIXADDR_START;
 
 	pgd = pgd_offset_k(addr);
-	if (CONFIG_PGTABLE_LEVELS > 3 && !pgd_none(*pgd)) {
+	if (CONFIG_PGTABLE_LEVELS > 3 &&
+	    !(pgd_none(*pgd) || pgd_page_paddr(*pgd) == __pa(bm_pud))) {
 		/*
 		 * We only end up here if the kernel mapping and the fixmap
 		 * share the top level pgd entry, which should only happen on
@@ -693,11 +694,10 @@ void __set_fixmap(enum fixed_addresses idx,
 	}
 }
 
-void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
+void *__init __fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot)
 {
 	const u64 dt_virt_base = __fix_to_virt(FIX_FDT);
-	pgprot_t prot = PAGE_KERNEL_RO;
-	int size, offset;
+	int offset;
 	void *dt_virt;
 
 	/*
@@ -736,16 +736,29 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
 	if (fdt_check_header(dt_virt) != 0)
 		return NULL;
 
-	size = fdt_totalsize(dt_virt);
-	if (size > MAX_FDT_SIZE)
+	*size = fdt_totalsize(dt_virt);
+	if (*size > MAX_FDT_SIZE)
 		return NULL;
 
-	if (offset + size > SWAPPER_BLOCK_SIZE)
-		create_mapping(round_down(dt_phys, SWAPPER_BLOCK_SIZE), dt_virt_base,
-			       round_up(offset + size, SWAPPER_BLOCK_SIZE), prot);
+	if (offset + *size > SWAPPER_BLOCK_SIZE)
+		create_mapping(round_down(dt_phys, SWAPPER_BLOCK_SIZE),
+			       dt_virt_base,
+			       round_up(offset + *size, SWAPPER_BLOCK_SIZE),
+			       prot);
 
-	memblock_reserve(dt_phys, size);
+	return dt_virt;
+}
 
+void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
+{
+	void *dt_virt;
+	int size;
+
+	dt_virt = __fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL_RO);
+	if (!dt_virt)
+		return NULL;
+
+	memblock_reserve(dt_phys, size);
 	return dt_virt;
 }
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 18/22] arm64: add support for kernel ASLR
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel

This adds support for KASLR, based on entropy provided by
the bootloader in the /chosen/kaslr-seed DT property. Depending on the size
of the address space (VA_BITS) and the page size, the entropy in the
virtual displacement is up to 13 bits (16k/2 levels) and up to 25 bits (all
4 levels), with the sidenote that displacements that result in the kernel
image straddling a 1GB/32MB/512MB alignment boundary (for 4KB/16KB/64KB
granule kernels, respectively) are not allowed, and will be rounded up to
an acceptable value.

The module region is randomized by choosing a page aligned 128 MB region
inside the interval [_etext - 128 MB, _stext + 128 MB). This gives between
10 and 14 bits of entropy (depending on page size), independently of the
kernel randomization, but still guarantees that modules are within the
range of relative branch and jump instructions (with the caveat that, since
the module region is shared with other uses of the vmalloc area, modules
may need to be loaded further away if the module region is exhausted).

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/Kconfig              |  14 ++
 arch/arm64/include/asm/memory.h |   5 +-
 arch/arm64/kernel/Makefile      |   1 +
 arch/arm64/kernel/head.S        |  59 ++++++-
 arch/arm64/kernel/kaslr.c       | 169 ++++++++++++++++++++
 arch/arm64/kernel/module.c      |   8 +-
 arch/arm64/kernel/setup.c       |  29 ++++
 arch/arm64/mm/mmu.c             |  33 ++--
 8 files changed, 298 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6aa86f86fd10..d7e31454d421 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -774,6 +774,20 @@ config RELOCATABLE
 	  relocation pass at runtime even if the kernel is loaded at the
 	  same address it was linked at.
 
+config RANDOMIZE_BASE
+	bool "Randomize the address of the kernel image"
+	select ARM64_MODULE_PLTS
+	select RELOCATABLE
+	help
+	  Randomizes the virtual address at which the kernel image is
+	  loaded, as a security feature that deters exploit attempts
+	  relying on knowledge of the location of kernel internals.
+
+	  It is the bootloader's job to provide entropy, by passing a
+	  random u64 value in /chosen/kaslr-seed at kernel entry.
+
+	  If unsure, say N.
+
 endmenu
 
 menu "Boot options"
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 61005e7dd6cb..083361531a61 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -52,7 +52,7 @@
 #define KIMAGE_VADDR		(MODULES_END)
 #define MODULES_END		(MODULES_VADDR + MODULES_VSIZE)
 #define MODULES_VADDR		(VA_START + KASAN_SHADOW_SIZE)
-#define MODULES_VSIZE		(SZ_64M)
+#define MODULES_VSIZE		(SZ_128M)
 #define PCI_IO_END		(PAGE_OFFSET - SZ_2M)
 #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
 #define FIXADDR_TOP		(PCI_IO_START - SZ_2M)
@@ -127,6 +127,9 @@ extern phys_addr_t		memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
 #define PHYS_OFFSET		({ memstart_addr; })
 
+/* the virtual base of the kernel image (minus TEXT_OFFSET) */
+extern u64			kimage_vaddr;
+
 /* the offset between the kernel virtual and physical mappings */
 extern u64			kimage_voffset;
 
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index e2f0a755beaa..c9aaecddb941 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -43,6 +43,7 @@ arm64-obj-$(CONFIG_PCI)			+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
 arm64-obj-$(CONFIG_ACPI)		+= acpi.o
 arm64-obj-$(CONFIG_PARAVIRT)		+= paravirt.o
+arm64-obj-$(CONFIG_RANDOMIZE_BASE)	+= kaslr.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 92f9c26632f3..8712a38c3de7 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -210,6 +210,7 @@ section_table:
 ENTRY(stext)
 	bl	preserve_boot_args
 	bl	el2_setup			// Drop to EL1, w20=cpu_boot_mode
+	mov	x23, xzr			// KASLR offset, defaults to 0
 	adrp	x24, __PHYS_OFFSET
 	bl	set_cpu_boot_mode_flag
 	bl	__create_page_tables		// x25=TTBR0, x26=TTBR1
@@ -313,7 +314,7 @@ ENDPROC(preserve_boot_args)
 __create_page_tables:
 	adrp	x25, idmap_pg_dir
 	adrp	x26, swapper_pg_dir
-	mov	x27, lr
+	mov	x28, lr
 
 	/*
 	 * Invalidate the idmap and swapper page tables to avoid potential
@@ -392,6 +393,7 @@ __create_page_tables:
 	 */
 	mov	x0, x26				// swapper_pg_dir
 	ldr	x5, =KIMAGE_VADDR
+	add	x5, x5, x23			// add KASLR displacement
 	create_pgd_entry x0, x5, x3, x6
 	ldr	w6, kernel_img_size
 	add	x6, x6, x5
@@ -408,8 +410,7 @@ __create_page_tables:
 	dmb	sy
 	bl	__inval_cache_range
 
-	mov	lr, x27
-	ret
+	ret	x28
 ENDPROC(__create_page_tables)
 
 kernel_img_size:
@@ -421,6 +422,7 @@ kernel_img_size:
  */
 	.set	initial_sp, init_thread_union + THREAD_START_SP
 __mmap_switched:
+	mov	x28, lr				// preserve LR
 	adr_l	x8, vectors			// load VBAR_EL1 with virtual
 	msr	vbar_el1, x8			// vector table address
 	isb
@@ -449,19 +451,26 @@ __mmap_switched:
 	ldr	x13, [x9, #-8]
 	cmp	w12, #R_AARCH64_RELATIVE
 	b.ne	1f
-	str	x13, [x11]
+	add	x13, x13, x23			// relocate
+	str	x13, [x11, x23]
 	b	0b
 
 1:	cmp	w12, #R_AARCH64_ABS64
 	b.ne	0b
 	add	x12, x12, x12, lsl #1		// symtab offset: 24x top word
 	add	x12, x8, x12, lsr #(32 - 3)	// ... shifted into bottom word
+	ldrsh	w14, [x12, #6]			// Elf64_Sym::st_shndx
 	ldr	x15, [x12, #8]			// Elf64_Sym::st_value
+	cmp	w14, #-0xf			// SHN_ABS (0xfff1) ?
+	add	x14, x15, x23			// relocate
+	csel	x15, x14, x15, ne
 	add	x15, x13, x15
-	str	x15, [x11]
+	str	x15, [x11, x23]
 	b	0b
 
-2:
+2:	adr_l	x8, kimage_vaddr		// make relocated kimage_vaddr
+	dc	cvac, x8			// value visible to secondaries
+	dsb	sy				// with MMU off
 #endif
 
 	adr_l	sp, initial_sp, x4
@@ -470,7 +479,7 @@ __mmap_switched:
 	msr	sp_el0, x4			// Save thread_info
 	str_l	x21, __fdt_pointer, x5		// Save FDT pointer
 
-	ldr	x4, =KIMAGE_VADDR		// Save the offset between
+	ldr_l	x4, kimage_vaddr		// Save the offset between
 	sub	x4, x4, x24			// the kernel virtual and
 	str_l	x4, kimage_voffset, x5		// physical mappings
 
@@ -478,6 +487,15 @@ __mmap_switched:
 #ifdef CONFIG_KASAN
 	bl	kasan_early_init
 #endif
+#ifdef CONFIG_RANDOMIZE_BASE
+	cbnz	x23, 0f				// already running randomized?
+	mov	x0, x21				// pass FDT address in x0
+	bl	kaslr_early_init		// parse FDT for KASLR options
+	cbz	x0, 0f				// KASLR disabled? just proceed
+	ret	x28				// we must enable KASLR, return
+						// to __enable_mmu()
+0:
+#endif
 	b	start_kernel
 ENDPROC(__mmap_switched)
 
@@ -486,6 +504,10 @@ ENDPROC(__mmap_switched)
  * hotplug and needs to have the same protections as the text region
  */
 	.section ".text","ax"
+
+ENTRY(kimage_vaddr)
+	.quad		_text - TEXT_OFFSET
+
 /*
  * If we're fortunate enough to boot at EL2, ensure that the world is
  * sane before dropping to EL1.
@@ -646,7 +668,7 @@ ENTRY(secondary_startup)
 	adrp	x26, swapper_pg_dir
 	bl	__cpu_setup			// initialise processor
 
-	ldr	x8, =KIMAGE_VADDR
+	ldr	x8, kimage_vaddr
 	ldr	w9, 0f
 	sub	x27, x8, w9, sxtw		// address to jump to after enabling the MMU
 	b	__enable_mmu
@@ -679,6 +701,7 @@ ENDPROC(__secondary_switched)
  */
 	.section	".idmap.text", "ax"
 __enable_mmu:
+	mrs	x18, sctlr_el1			// preserve old SCTLR_EL1 value
 	mrs	x1, ID_AA64MMFR0_EL1
 	ubfx	x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
 	cmp	x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
@@ -696,6 +719,26 @@ __enable_mmu:
 	ic	iallu
 	dsb	nsh
 	isb
+#ifdef CONFIG_RANDOMIZE_BASE
+	mov	x19, x0				// preserve new SCTLR_EL1 value
+	blr	x27
+
+	/*
+	 * If we return here, we have a KASLR displacement in x0 which we need
+	 * to record and take into account by discarding the current kernel
+	 * mapping and creating a new one.
+	 */
+	mov	x23, x0				// record the KASLR offset
+	msr	sctlr_el1, x18			// disable the MMU
+	isb
+	bl	__create_page_tables		// recreate kernel mapping
+
+	msr	sctlr_el1, x19			// re-enable the MMU
+	isb
+	ic	ialluis				// flush instructions fetched
+	isb					// via old mapping
+	add	x27, x27, x23			// relocated __mmap_switched
+#endif
 	br	x27
 ENDPROC(__enable_mmu)
 
diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c
new file mode 100644
index 000000000000..9ddb01f65a1a
--- /dev/null
+++ b/arch/arm64/kernel/kaslr.c
@@ -0,0 +1,169 @@
+/*
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/crc32.h>
+#include <linux/init.h>
+#include <linux/libfdt.h>
+#include <linux/mm_types.h>
+#include <linux/sched.h>
+#include <linux/types.h>
+
+#include <asm/fixmap.h>
+#include <asm/kernel-pgtable.h>
+#include <asm/memory.h>
+#include <asm/mmu.h>
+#include <asm/pgtable.h>
+#include <asm/sections.h>
+
+u32 module_load_offset;
+
+static __init u64 get_kaslr_seed(void *fdt)
+{
+	int node, len;
+	u64 *prop;
+	u64 ret;
+
+	node = fdt_path_offset(fdt, "/chosen");
+	if (node < 0)
+		return 0;
+
+	prop = fdt_getprop_w(fdt, node, "kaslr-seed", &len);
+	if (!prop || len != sizeof(u64))
+		return 0;
+
+	ret = fdt64_to_cpu(*prop);
+	*prop = 0;
+	return ret;
+}
+
+static __init const u8 *get_cmdline(void *fdt)
+{
+	static const u8 default_cmdline[] = CONFIG_CMDLINE;
+
+	if (!IS_ENABLED(CONFIG_CMDLINE_FORCE)) {
+		int node;
+		const u8 *prop;
+
+		node = fdt_path_offset(fdt, "/chosen");
+		if (node < 0)
+			goto out;
+
+		prop = fdt_getprop(fdt, node, "bootargs", NULL);
+		if (!prop)
+			goto out;
+		return prop;
+	}
+out:
+	return default_cmdline;
+}
+
+static u32 get_kernel_crc(void)
+{
+	u64 stack_start = (u64)&init_thread_union.stack;
+	u64 stack_end = stack_start + sizeof(init_thread_union.stack);
+	u32 crc;
+
+	crc = crc32_le(~0, _text, stack_start - (u64)_text);
+	crc = crc32_le(crc, (void *)stack_end, (u64)_edata - stack_end);
+
+	return crc;
+}
+
+extern void *__init __fixmap_remap_fdt(phys_addr_t dt_phys, int *size,
+				       pgprot_t prot);
+
+/*
+ * This routine will be executed with the kernel mapped at its default virtual
+ * address, and if it returns successfully, the kernel will be remapped, and
+ * start_kernel() will be executed from a randomized virtual offset. The
+ * relocation will result in all absolute references (e.g., static variables
+ * containing function pointers) to be reinitialized, and zero-initialized
+ * .bss variables will be reset to 0. However, other .data manipulations will
+ * persist across the change from the default mapping to the randomized mapping,
+ * and thus should not be performed before we have moved the kernel to its final
+ * address. This will be caught by the CRC check, and KASLR will be disabled if
+ * we catch any inadvertent modifications.
+ */
+u64 __init kaslr_early_init(u64 dt_phys)
+{
+	void *fdt;
+	u64 seed, offset, mask, module_range;
+	const u8 *cmdline, *str;
+	int size;
+	u32 crc;
+
+	/*
+	 * Record the CRC of the entire [_text, _edata] interval, except the
+	 * region we are using for the stack. If we detect any changes made
+	 * during the course of this function, we bail. This may seem a bit
+	 * drastic, but in this case, we have no way of guaranteeing we won't
+	 * corrupt anything by moving the kernel image before reentering it.
+	 */
+	crc = get_kernel_crc();
+
+	/*
+	 * Try to map the FDT early. If this fails, we simply bail,
+	 * and proceed with KASLR disabled. We will make another
+	 * attempt at mapping the FDT in setup_machine()
+	 */
+	early_fixmap_init();
+	fdt = __fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL);
+	if (!fdt)
+		return 0;
+
+	/*
+	 * Retrieve (and wipe) the seed from the FDT
+	 */
+	seed = get_kaslr_seed(fdt);
+	if (!seed)
+		return 0;
+
+	/*
+	 * Check if 'nokaslr' appears on the command line, and
+	 * return 0 if that is the case.
+	 */
+	cmdline = get_cmdline(fdt);
+	str = strstr(cmdline, "nokaslr");
+	if (str == cmdline || (str > cmdline && *(str - 1) == ' '))
+		return 0;
+
+	/* check if we made any inadvertent changes to the kernel text */
+	if (crc != get_kernel_crc())
+		return 0;
+
+	/*
+	 * OK, so we are proceeding with KASLR enabled. Calculate a suitable
+	 * kernel image offset from the seed. Let's place the kernel in the
+	 * lower half of the VMALLOC area (VA_BITS - 2).
+	 * Even if we could randomize at page granularity for 16k and 64k pages,
+	 * let's always round to 2 MB so we don't interfere with the ability to
+	 * map using contiguous PTEs
+	 */
+	mask = ((1UL << (VA_BITS - 2)) - 1) & ~(SZ_2M - 1);
+	offset = seed & mask;
+
+	/*
+	 * The kernel Image should not extend across a 1GB/32MB/512MB alignment
+	 * boundary (for 4KB/16KB/64KB granule kernels, respectively). If this
+	 * happens, increase the KASLR offset by the size of the kernel image.
+	 */
+	if ((((u64)_text + offset) >> SWAPPER_TABLE_SHIFT) !=
+	    (((u64)_end + offset) >> SWAPPER_TABLE_SHIFT))
+		offset = (offset + (u64)(_end - _text)) & mask;
+
+	/*
+	 * Randomize the module region, by setting module_load_offset to
+	 * a PAGE_SIZE multiple in the interval [0, module_range). This
+	 * ensures that the resulting region still covers [_stext, _etext],
+	 * and that all relative branches can be resolved without veneers.
+	 */
+	module_range = MODULES_VSIZE - (u64)(_etext - _stext);
+	module_load_offset = ((module_range * (u16)seed) >> 16) & PAGE_MASK;
+
+	return offset;
+}
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 84113d3e1df1..54702d456680 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -33,8 +33,14 @@
 void *module_alloc(unsigned long size)
 {
 	void *p;
+	u64 base = (u64)_etext - MODULES_VSIZE;
 
-	p = __vmalloc_node_range(size, MODULE_ALIGN, MODULES_VADDR, MODULES_END,
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+		extern u32 module_load_offset;
+		base += module_load_offset;
+	}
+
+	p = __vmalloc_node_range(size, MODULE_ALIGN, base, base + MODULES_VSIZE,
 				GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
 				NUMA_NO_NODE, __builtin_return_address(0));
 
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index cfed56f0ad26..42371f69def3 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -388,3 +388,32 @@ static int __init topology_init(void)
 	return 0;
 }
 subsys_initcall(topology_init);
+
+/*
+ * Dump out kernel offset information on panic.
+ */
+static int dump_kernel_offset(struct notifier_block *self, unsigned long v,
+			      void *p)
+{
+	u64 const kaslr_offset = kimage_vaddr - KIMAGE_VADDR;
+
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_offset > 0) {
+		pr_emerg("Kernel Offset: 0x%llx from 0x%lx\n",
+			 kaslr_offset, KIMAGE_VADDR);
+	} else {
+		pr_emerg("Kernel Offset: disabled\n");
+	}
+	return 0;
+}
+
+static struct notifier_block kernel_offset_notifier = {
+	.notifier_call = dump_kernel_offset
+};
+
+static int __init register_kernel_offset_dumper(void)
+{
+	atomic_notifier_chain_register(&panic_notifier_list,
+				       &kernel_offset_notifier);
+	return 0;
+}
+__initcall(register_kernel_offset_dumper);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 8dda38378959..5d7e0b801ab7 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -636,7 +636,8 @@ void __init early_fixmap_init(void)
 	unsigned long addr = FIXADDR_START;
 
 	pgd = pgd_offset_k(addr);
-	if (CONFIG_PGTABLE_LEVELS > 3 && !pgd_none(*pgd)) {
+	if (CONFIG_PGTABLE_LEVELS > 3 &&
+	    !(pgd_none(*pgd) || pgd_page_paddr(*pgd) == __pa(bm_pud))) {
 		/*
 		 * We only end up here if the kernel mapping and the fixmap
 		 * share the top level pgd entry, which should only happen on
@@ -693,11 +694,10 @@ void __set_fixmap(enum fixed_addresses idx,
 	}
 }
 
-void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
+void *__init __fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot)
 {
 	const u64 dt_virt_base = __fix_to_virt(FIX_FDT);
-	pgprot_t prot = PAGE_KERNEL_RO;
-	int size, offset;
+	int offset;
 	void *dt_virt;
 
 	/*
@@ -736,16 +736,29 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
 	if (fdt_check_header(dt_virt) != 0)
 		return NULL;
 
-	size = fdt_totalsize(dt_virt);
-	if (size > MAX_FDT_SIZE)
+	*size = fdt_totalsize(dt_virt);
+	if (*size > MAX_FDT_SIZE)
 		return NULL;
 
-	if (offset + size > SWAPPER_BLOCK_SIZE)
-		create_mapping(round_down(dt_phys, SWAPPER_BLOCK_SIZE), dt_virt_base,
-			       round_up(offset + size, SWAPPER_BLOCK_SIZE), prot);
+	if (offset + *size > SWAPPER_BLOCK_SIZE)
+		create_mapping(round_down(dt_phys, SWAPPER_BLOCK_SIZE),
+			       dt_virt_base,
+			       round_up(offset + *size, SWAPPER_BLOCK_SIZE),
+			       prot);
 
-	memblock_reserve(dt_phys, size);
+	return dt_virt;
+}
 
+void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
+{
+	void *dt_virt;
+	int size;
+
+	dt_virt = __fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL_RO);
+	if (!dt_virt)
+		return NULL;
+
+	memblock_reserve(dt_phys, size);
 	return dt_virt;
 }
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [kernel-hardening] [PATCH v4 18/22] arm64: add support for kernel ASLR
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

This adds support for KASLR, based on entropy provided by the bootloader
in the /chosen/kaslr-seed DT property. Depending on the size of the
address space (VA_BITS) and the page size, the entropy in the virtual
displacement is up to 13 bits (16k/2 levels) and up to 25 bits (all
4 levels), with the caveat that displacements that result in the kernel
image straddling a 1GB/32MB/512MB alignment boundary (for 4KB/16KB/64KB
granule kernels, respectively) are not allowed, and will be rounded up to
an acceptable value.

The module region is randomized by choosing a page-aligned 128 MB region
inside the interval [_etext - 128 MB, _stext + 128 MB). This gives between
10 and 14 bits of entropy (depending on page size), independently of the
kernel randomization, but still guarantees that modules are within the
range of relative branch and jump instructions (with the caveat that, since
the module region is shared with other uses of the vmalloc area, modules
may need to be loaded further away if the module region is exhausted).
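
As an illustration, the module base calculation boils down to the
following (condensed from the kaslr.c and module.c hunks below; 'seed' is
the value taken from /chosen/kaslr-seed):

    /* usable randomization range, keeping [_stext, _etext] covered */
    u64 module_range = MODULES_VSIZE - (u64)(_etext - _stext);

    /* scale a 16-bit slice of the seed into [0, module_range) */
    u32 module_load_offset = ((module_range * (u16)seed) >> 16) & PAGE_MASK;

    /* module_alloc() then allocates from this page-aligned base upwards */
    u64 module_base = (u64)_etext - MODULES_VSIZE + module_load_offset;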

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/Kconfig              |  14 ++
 arch/arm64/include/asm/memory.h |   5 +-
 arch/arm64/kernel/Makefile      |   1 +
 arch/arm64/kernel/head.S        |  59 ++++++-
 arch/arm64/kernel/kaslr.c       | 169 ++++++++++++++++++++
 arch/arm64/kernel/module.c      |   8 +-
 arch/arm64/kernel/setup.c       |  29 ++++
 arch/arm64/mm/mmu.c             |  33 ++--
 8 files changed, 298 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6aa86f86fd10..d7e31454d421 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -774,6 +774,20 @@ config RELOCATABLE
 	  relocation pass at runtime even if the kernel is loaded at the
 	  same address it was linked at.
 
+config RANDOMIZE_BASE
+	bool "Randomize the address of the kernel image"
+	select ARM64_MODULE_PLTS
+	select RELOCATABLE
+	help
+	  Randomizes the virtual address at which the kernel image is
+	  loaded, as a security feature that deters exploit attempts
+	  relying on knowledge of the location of kernel internals.
+
+	  It is the bootloader's job to provide entropy, by passing a
+	  random u64 value in /chosen/kaslr-seed at kernel entry.
+
+	  If unsure, say N.
+
 endmenu
 
 menu "Boot options"
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 61005e7dd6cb..083361531a61 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -52,7 +52,7 @@
 #define KIMAGE_VADDR		(MODULES_END)
 #define MODULES_END		(MODULES_VADDR + MODULES_VSIZE)
 #define MODULES_VADDR		(VA_START + KASAN_SHADOW_SIZE)
-#define MODULES_VSIZE		(SZ_64M)
+#define MODULES_VSIZE		(SZ_128M)
 #define PCI_IO_END		(PAGE_OFFSET - SZ_2M)
 #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
 #define FIXADDR_TOP		(PCI_IO_START - SZ_2M)
@@ -127,6 +127,9 @@ extern phys_addr_t		memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
 #define PHYS_OFFSET		({ memstart_addr; })
 
+/* the virtual base of the kernel image (minus TEXT_OFFSET) */
+extern u64			kimage_vaddr;
+
 /* the offset between the kernel virtual and physical mappings */
 extern u64			kimage_voffset;
 
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index e2f0a755beaa..c9aaecddb941 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -43,6 +43,7 @@ arm64-obj-$(CONFIG_PCI)			+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
 arm64-obj-$(CONFIG_ACPI)		+= acpi.o
 arm64-obj-$(CONFIG_PARAVIRT)		+= paravirt.o
+arm64-obj-$(CONFIG_RANDOMIZE_BASE)	+= kaslr.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 92f9c26632f3..8712a38c3de7 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -210,6 +210,7 @@ section_table:
 ENTRY(stext)
 	bl	preserve_boot_args
 	bl	el2_setup			// Drop to EL1, w20=cpu_boot_mode
+	mov	x23, xzr			// KASLR offset, defaults to 0
 	adrp	x24, __PHYS_OFFSET
 	bl	set_cpu_boot_mode_flag
 	bl	__create_page_tables		// x25=TTBR0, x26=TTBR1
@@ -313,7 +314,7 @@ ENDPROC(preserve_boot_args)
 __create_page_tables:
 	adrp	x25, idmap_pg_dir
 	adrp	x26, swapper_pg_dir
-	mov	x27, lr
+	mov	x28, lr
 
 	/*
 	 * Invalidate the idmap and swapper page tables to avoid potential
@@ -392,6 +393,7 @@ __create_page_tables:
 	 */
 	mov	x0, x26				// swapper_pg_dir
 	ldr	x5, =KIMAGE_VADDR
+	add	x5, x5, x23			// add KASLR displacement
 	create_pgd_entry x0, x5, x3, x6
 	ldr	w6, kernel_img_size
 	add	x6, x6, x5
@@ -408,8 +410,7 @@ __create_page_tables:
 	dmb	sy
 	bl	__inval_cache_range
 
-	mov	lr, x27
-	ret
+	ret	x28
 ENDPROC(__create_page_tables)
 
 kernel_img_size:
@@ -421,6 +422,7 @@ kernel_img_size:
  */
 	.set	initial_sp, init_thread_union + THREAD_START_SP
 __mmap_switched:
+	mov	x28, lr				// preserve LR
 	adr_l	x8, vectors			// load VBAR_EL1 with virtual
 	msr	vbar_el1, x8			// vector table address
 	isb
@@ -449,19 +451,26 @@ __mmap_switched:
 	ldr	x13, [x9, #-8]
 	cmp	w12, #R_AARCH64_RELATIVE
 	b.ne	1f
-	str	x13, [x11]
+	add	x13, x13, x23			// relocate
+	str	x13, [x11, x23]
 	b	0b
 
 1:	cmp	w12, #R_AARCH64_ABS64
 	b.ne	0b
 	add	x12, x12, x12, lsl #1		// symtab offset: 24x top word
 	add	x12, x8, x12, lsr #(32 - 3)	// ... shifted into bottom word
+	ldrsh	w14, [x12, #6]			// Elf64_Sym::st_shndx
 	ldr	x15, [x12, #8]			// Elf64_Sym::st_value
+	cmp	w14, #-0xf			// SHN_ABS (0xfff1) ?
+	add	x14, x15, x23			// relocate
+	csel	x15, x14, x15, ne
 	add	x15, x13, x15
-	str	x15, [x11]
+	str	x15, [x11, x23]
 	b	0b
 
-2:
+2:	adr_l	x8, kimage_vaddr		// make relocated kimage_vaddr
+	dc	cvac, x8			// value visible to secondaries
+	dsb	sy				// with MMU off
 #endif
 
 	adr_l	sp, initial_sp, x4
@@ -470,7 +479,7 @@ __mmap_switched:
 	msr	sp_el0, x4			// Save thread_info
 	str_l	x21, __fdt_pointer, x5		// Save FDT pointer
 
-	ldr	x4, =KIMAGE_VADDR		// Save the offset between
+	ldr_l	x4, kimage_vaddr		// Save the offset between
 	sub	x4, x4, x24			// the kernel virtual and
 	str_l	x4, kimage_voffset, x5		// physical mappings
 
@@ -478,6 +487,15 @@ __mmap_switched:
 #ifdef CONFIG_KASAN
 	bl	kasan_early_init
 #endif
+#ifdef CONFIG_RANDOMIZE_BASE
+	cbnz	x23, 0f				// already running randomized?
+	mov	x0, x21				// pass FDT address in x0
+	bl	kaslr_early_init		// parse FDT for KASLR options
+	cbz	x0, 0f				// KASLR disabled? just proceed
+	ret	x28				// we must enable KASLR, return
+						// to __enable_mmu()
+0:
+#endif
 	b	start_kernel
 ENDPROC(__mmap_switched)
 
@@ -486,6 +504,10 @@ ENDPROC(__mmap_switched)
  * hotplug and needs to have the same protections as the text region
  */
 	.section ".text","ax"
+
+ENTRY(kimage_vaddr)
+	.quad		_text - TEXT_OFFSET
+
 /*
  * If we're fortunate enough to boot at EL2, ensure that the world is
  * sane before dropping to EL1.
@@ -646,7 +668,7 @@ ENTRY(secondary_startup)
 	adrp	x26, swapper_pg_dir
 	bl	__cpu_setup			// initialise processor
 
-	ldr	x8, =KIMAGE_VADDR
+	ldr	x8, kimage_vaddr
 	ldr	w9, 0f
 	sub	x27, x8, w9, sxtw		// address to jump to after enabling the MMU
 	b	__enable_mmu
@@ -679,6 +701,7 @@ ENDPROC(__secondary_switched)
  */
 	.section	".idmap.text", "ax"
 __enable_mmu:
+	mrs	x18, sctlr_el1			// preserve old SCTLR_EL1 value
 	mrs	x1, ID_AA64MMFR0_EL1
 	ubfx	x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
 	cmp	x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
@@ -696,6 +719,26 @@ __enable_mmu:
 	ic	iallu
 	dsb	nsh
 	isb
+#ifdef CONFIG_RANDOMIZE_BASE
+	mov	x19, x0				// preserve new SCTLR_EL1 value
+	blr	x27
+
+	/*
+	 * If we return here, we have a KASLR displacement in x0 which we need
+	 * to record and take into account by discarding the current kernel
+	 * mapping and creating a new one.
+	 */
+	mov	x23, x0				// record the KASLR offset
+	msr	sctlr_el1, x18			// disable the MMU
+	isb
+	bl	__create_page_tables		// recreate kernel mapping
+
+	msr	sctlr_el1, x19			// re-enable the MMU
+	isb
+	ic	ialluis				// flush instructions fetched
+	isb					// via old mapping
+	add	x27, x27, x23			// relocated __mmap_switched
+#endif
 	br	x27
 ENDPROC(__enable_mmu)
 
diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c
new file mode 100644
index 000000000000..9ddb01f65a1a
--- /dev/null
+++ b/arch/arm64/kernel/kaslr.c
@@ -0,0 +1,169 @@
+/*
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/crc32.h>
+#include <linux/init.h>
+#include <linux/libfdt.h>
+#include <linux/mm_types.h>
+#include <linux/sched.h>
+#include <linux/types.h>
+
+#include <asm/fixmap.h>
+#include <asm/kernel-pgtable.h>
+#include <asm/memory.h>
+#include <asm/mmu.h>
+#include <asm/pgtable.h>
+#include <asm/sections.h>
+
+u32 module_load_offset;
+
+static __init u64 get_kaslr_seed(void *fdt)
+{
+	int node, len;
+	u64 *prop;
+	u64 ret;
+
+	node = fdt_path_offset(fdt, "/chosen");
+	if (node < 0)
+		return 0;
+
+	prop = fdt_getprop_w(fdt, node, "kaslr-seed", &len);
+	if (!prop || len != sizeof(u64))
+		return 0;
+
+	ret = fdt64_to_cpu(*prop);
+	*prop = 0;
+	return ret;
+}
+
+static __init const u8 *get_cmdline(void *fdt)
+{
+	static const u8 default_cmdline[] = CONFIG_CMDLINE;
+
+	if (!IS_ENABLED(CONFIG_CMDLINE_FORCE)) {
+		int node;
+		const u8 *prop;
+
+		node = fdt_path_offset(fdt, "/chosen");
+		if (node < 0)
+			goto out;
+
+		prop = fdt_getprop(fdt, node, "bootargs", NULL);
+		if (!prop)
+			goto out;
+		return prop;
+	}
+out:
+	return default_cmdline;
+}
+
+static u32 get_kernel_crc(void)
+{
+	u64 stack_start = (u64)&init_thread_union.stack;
+	u64 stack_end = stack_start + sizeof(init_thread_union.stack);
+	u32 crc;
+
+	crc = crc32_le(~0, _text, stack_start - (u64)_text);
+	crc = crc32_le(crc, (void *)stack_end, (u64)_edata - stack_end);
+
+	return crc;
+}
+
+extern void *__init __fixmap_remap_fdt(phys_addr_t dt_phys, int *size,
+				       pgprot_t prot);
+
+/*
+ * This routine will be executed with the kernel mapped at its default virtual
+ * address, and if it returns successfully, the kernel will be remapped, and
+ * start_kernel() will be executed from a randomized virtual offset. The
+ * relocation will result in all absolute references (e.g., static variables
+ * containing function pointers) being reinitialized, and zero-initialized
+ * .bss variables will be reset to 0. However, other .data manipulations will
+ * persist across the change from the default mapping to the randomized mapping,
+ * and thus should not be performed before we have moved the kernel to its final
+ * address. This will be caught by the CRC check, and KASLR will be disabled if
+ * we catch any inadvertent modifications.
+ */
+u64 __init kaslr_early_init(u64 dt_phys)
+{
+	void *fdt;
+	u64 seed, offset, mask, module_range;
+	const u8 *cmdline, *str;
+	int size;
+	u32 crc;
+
+	/*
+	 * Record the CRC of the entire [_text, _edata] interval, except the
+	 * region we are using for the stack. If we detect any changes made
+	 * during the course of this function, we bail. This may seem a bit
+	 * drastic, but in this case, we have no way of guaranteeing we won't
+	 * corrupt anything by moving the kernel image before reentering it.
+	 */
+	crc = get_kernel_crc();
+
+	/*
+	 * Try to map the FDT early. If this fails, we simply bail,
+	 * and proceed with KASLR disabled. We will make another
+	 * attempt at mapping the FDT in setup_machine()
+	 */
+	early_fixmap_init();
+	fdt = __fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL);
+	if (!fdt)
+		return 0;
+
+	/*
+	 * Retrieve (and wipe) the seed from the FDT
+	 */
+	seed = get_kaslr_seed(fdt);
+	if (!seed)
+		return 0;
+
+	/*
+	 * Check if 'nokaslr' appears on the command line, and
+	 * return 0 if that is the case.
+	 */
+	cmdline = get_cmdline(fdt);
+	str = strstr(cmdline, "nokaslr");
+	if (str == cmdline || (str > cmdline && *(str - 1) == ' '))
+		return 0;
+
+	/* check if we made any inadvertent changes to the kernel text */
+	if (crc != get_kernel_crc())
+		return 0;
+
+	/*
+	 * OK, so we are proceeding with KASLR enabled. Calculate a suitable
+	 * kernel image offset from the seed. Let's place the kernel in the
+	 * lower half of the VMALLOC area (VA_BITS - 2).
+	 * Even if we could randomize at page granularity for 16k and 64k pages,
+	 * let's always round to 2 MB so we don't interfere with the ability to
+	 * map using contiguous PTEs
+	 */
+	mask = ((1UL << (VA_BITS - 2)) - 1) & ~(SZ_2M - 1);
+	offset = seed & mask;
+
+	/*
+	 * The kernel Image should not extend across a 1GB/32MB/512MB alignment
+	 * boundary (for 4KB/16KB/64KB granule kernels, respectively). If this
+	 * happens, increase the KASLR offset by the size of the kernel image.
+	 */
+	if ((((u64)_text + offset) >> SWAPPER_TABLE_SHIFT) !=
+	    (((u64)_end + offset) >> SWAPPER_TABLE_SHIFT))
+		offset = (offset + (u64)(_end - _text)) & mask;
+
+	/*
+	 * Randomize the module region, by setting module_load_offset to
+	 * a PAGE_SIZE multiple in the interval [0, module_range). This
+	 * ensures that the resulting region still covers [_stext, _etext],
+	 * and that all relative branches can be resolved without veneers.
+	 */
+	module_range = MODULES_VSIZE - (u64)(_etext - _stext);
+	module_load_offset = ((module_range * (u16)seed) >> 16) & PAGE_MASK;
+
+	return offset;
+}
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 84113d3e1df1..54702d456680 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -33,8 +33,14 @@
 void *module_alloc(unsigned long size)
 {
 	void *p;
+	u64 base = (u64)_etext - MODULES_VSIZE;
 
-	p = __vmalloc_node_range(size, MODULE_ALIGN, MODULES_VADDR, MODULES_END,
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+		extern u32 module_load_offset;
+		base += module_load_offset;
+	}
+
+	p = __vmalloc_node_range(size, MODULE_ALIGN, base, base + MODULES_VSIZE,
 				GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
 				NUMA_NO_NODE, __builtin_return_address(0));
 
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index cfed56f0ad26..42371f69def3 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -388,3 +388,32 @@ static int __init topology_init(void)
 	return 0;
 }
 subsys_initcall(topology_init);
+
+/*
+ * Dump out kernel offset information on panic.
+ */
+static int dump_kernel_offset(struct notifier_block *self, unsigned long v,
+			      void *p)
+{
+	u64 const kaslr_offset = kimage_vaddr - KIMAGE_VADDR;
+
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_offset > 0) {
+		pr_emerg("Kernel Offset: 0x%llx from 0x%lx\n",
+			 kaslr_offset, KIMAGE_VADDR);
+	} else {
+		pr_emerg("Kernel Offset: disabled\n");
+	}
+	return 0;
+}
+
+static struct notifier_block kernel_offset_notifier = {
+	.notifier_call = dump_kernel_offset
+};
+
+static int __init register_kernel_offset_dumper(void)
+{
+	atomic_notifier_chain_register(&panic_notifier_list,
+				       &kernel_offset_notifier);
+	return 0;
+}
+__initcall(register_kernel_offset_dumper);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 8dda38378959..5d7e0b801ab7 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -636,7 +636,8 @@ void __init early_fixmap_init(void)
 	unsigned long addr = FIXADDR_START;
 
 	pgd = pgd_offset_k(addr);
-	if (CONFIG_PGTABLE_LEVELS > 3 && !pgd_none(*pgd)) {
+	if (CONFIG_PGTABLE_LEVELS > 3 &&
+	    !(pgd_none(*pgd) || pgd_page_paddr(*pgd) == __pa(bm_pud))) {
 		/*
 		 * We only end up here if the kernel mapping and the fixmap
 		 * share the top level pgd entry, which should only happen on
@@ -693,11 +694,10 @@ void __set_fixmap(enum fixed_addresses idx,
 	}
 }
 
-void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
+void *__init __fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot)
 {
 	const u64 dt_virt_base = __fix_to_virt(FIX_FDT);
-	pgprot_t prot = PAGE_KERNEL_RO;
-	int size, offset;
+	int offset;
 	void *dt_virt;
 
 	/*
@@ -736,16 +736,29 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
 	if (fdt_check_header(dt_virt) != 0)
 		return NULL;
 
-	size = fdt_totalsize(dt_virt);
-	if (size > MAX_FDT_SIZE)
+	*size = fdt_totalsize(dt_virt);
+	if (*size > MAX_FDT_SIZE)
 		return NULL;
 
-	if (offset + size > SWAPPER_BLOCK_SIZE)
-		create_mapping(round_down(dt_phys, SWAPPER_BLOCK_SIZE), dt_virt_base,
-			       round_up(offset + size, SWAPPER_BLOCK_SIZE), prot);
+	if (offset + *size > SWAPPER_BLOCK_SIZE)
+		create_mapping(round_down(dt_phys, SWAPPER_BLOCK_SIZE),
+			       dt_virt_base,
+			       round_up(offset + *size, SWAPPER_BLOCK_SIZE),
+			       prot);
 
-	memblock_reserve(dt_phys, size);
+	return dt_virt;
+}
 
+void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
+{
+	void *dt_virt;
+	int size;
+
+	dt_virt = __fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL_RO);
+	if (!dt_virt)
+		return NULL;
+
+	memblock_reserve(dt_phys, size);
 	return dt_virt;
 }
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 19/22] efi: stub: implement efi_get_random_bytes() based on EFI_RNG_PROTOCOL
  2016-01-26 17:10 ` Ard Biesheuvel
  (?)
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

This exposes the firmware's implementation of EFI_RNG_PROTOCOL via a new
function efi_get_random_bytes().
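
A minimal usage sketch (hypothetical caller, error handling reduced to a
fallback; the prototype is the one added to efistub.h below):

    u64 seed = 0;
    efi_status_t status;

    /* ask the firmware's RNG protocol for sizeof(seed) random bytes */
    status = efi_get_random_bytes(sys_table, sizeof(seed), (u8 *)&seed);
    if (status != EFI_SUCCESS)
        seed = 0;    /* no EFI_RNG_PROTOCOL available: proceed unrandomized */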

Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 drivers/firmware/efi/libstub/Makefile  |  2 +-
 drivers/firmware/efi/libstub/efistub.h |  3 ++
 drivers/firmware/efi/libstub/random.c  | 35 ++++++++++++++++++++
 include/linux/efi.h                    |  5 ++-
 4 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index aaf9c0bab42e..ad077944aa0e 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -36,7 +36,7 @@ lib-$(CONFIG_EFI_ARMSTUB)	+= arm-stub.o fdt.o string.o \
 				   $(patsubst %.c,lib-%.o,$(arm-deps))
 
 lib-$(CONFIG_ARM)		+= arm32-stub.o
-lib-$(CONFIG_ARM64)		+= arm64-stub.o
+lib-$(CONFIG_ARM64)		+= arm64-stub.o random.o
 CFLAGS_arm64-stub.o 		:= -DTEXT_OFFSET=$(TEXT_OFFSET)
 
 #
diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/libstub/efistub.h
index 6b6548fda089..206b7252b9d1 100644
--- a/drivers/firmware/efi/libstub/efistub.h
+++ b/drivers/firmware/efi/libstub/efistub.h
@@ -43,4 +43,7 @@ void efi_get_virtmap(efi_memory_desc_t *memory_map, unsigned long map_size,
 		     unsigned long desc_size, efi_memory_desc_t *runtime_map,
 		     int *count);
 
+efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table,
+				  unsigned long size, u8 *out);
+
 #endif
diff --git a/drivers/firmware/efi/libstub/random.c b/drivers/firmware/efi/libstub/random.c
new file mode 100644
index 000000000000..97941ee5954f
--- /dev/null
+++ b/drivers/firmware/efi/libstub/random.c
@@ -0,0 +1,35 @@
+/*
+ * Copyright (C) 2016 Linaro Ltd;  <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include <linux/efi.h>
+#include <asm/efi.h>
+
+#include "efistub.h"
+
+struct efi_rng_protocol {
+	efi_status_t (*get_info)(struct efi_rng_protocol *,
+				 unsigned long *, efi_guid_t *);
+	efi_status_t (*get_rng)(struct efi_rng_protocol *,
+				efi_guid_t *, unsigned long, u8 *out);
+};
+
+efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table_arg,
+				  unsigned long size, u8 *out)
+{
+	efi_guid_t rng_proto = EFI_RNG_PROTOCOL_GUID;
+	efi_status_t status;
+	struct efi_rng_protocol *rng;
+
+	status = efi_call_early(locate_protocol, &rng_proto, NULL,
+				(void **)&rng);
+	if (status != EFI_SUCCESS)
+		return status;
+
+	return rng->get_rng(rng, NULL, size, out);
+}
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 569b5a866bb1..13783fdc9bdd 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -299,7 +299,7 @@ typedef struct {
 	void *open_protocol_information;
 	void *protocols_per_handle;
 	void *locate_handle_buffer;
-	void *locate_protocol;
+	efi_status_t (*locate_protocol)(efi_guid_t *, void *, void **);
 	void *install_multiple_protocol_interfaces;
 	void *uninstall_multiple_protocol_interfaces;
 	void *calculate_crc32;
@@ -599,6 +599,9 @@ void efi_native_runtime_setup(void);
 #define EFI_PROPERTIES_TABLE_GUID \
     EFI_GUID(  0x880aaca3, 0x4adc, 0x4a04, 0x90, 0x79, 0xb7, 0x47, 0x34, 0x08, 0x25, 0xe5 )
 
+#define EFI_RNG_PROTOCOL_GUID \
+    EFI_GUID(  0x3152bca5, 0xeade, 0x433d, 0x86, 0x2e, 0xc0, 0x1c, 0xdc, 0x29, 0x1f, 0x44 )
+
 typedef struct {
 	efi_guid_t guid;
 	u64 table;
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 20/22] efi: stub: add implementation of efi_random_alloc()
  2016-01-26 17:10 ` Ard Biesheuvel
  (?)
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

This implements efi_random_alloc(), which allocates a chunk of memory of
a certain size at a certain alignment, and uses the random_seed argument
it receives to randomize the address of the allocation.

This is implemented by iterating over the UEFI memory map, counting the
number of suitable slots (aligned offsets) within each region, and picking
a random number between 0 and 'number of slots - 1' to select the slot.
This should guarantee that each possible offset is equally likely to be
chosen.
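
As a worked example of the seed scaling (numbers purely illustrative): with
total_slots = 1000 and the low 16 bits of random_seed equal to 0x8000
(i.e. 32768), the chosen slot is

    target_slot = (1000 * 32768) >> 16 = 500

so the 16-bit slice of the seed is mapped uniformly onto [0, total_slots).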

Suggested-by: Kees Cook <keescook@chromium.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 drivers/firmware/efi/libstub/efistub.h |   4 +
 drivers/firmware/efi/libstub/random.c  | 100 ++++++++++++++++++++
 2 files changed, 104 insertions(+)

diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/libstub/efistub.h
index 206b7252b9d1..5ed3d3f38166 100644
--- a/drivers/firmware/efi/libstub/efistub.h
+++ b/drivers/firmware/efi/libstub/efistub.h
@@ -46,4 +46,8 @@ void efi_get_virtmap(efi_memory_desc_t *memory_map, unsigned long map_size,
 efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table,
 				  unsigned long size, u8 *out);
 
+efi_status_t efi_random_alloc(efi_system_table_t *sys_table_arg,
+			      unsigned long size, unsigned long align,
+			      unsigned long *addr, unsigned long random_seed);
+
 #endif
diff --git a/drivers/firmware/efi/libstub/random.c b/drivers/firmware/efi/libstub/random.c
index 97941ee5954f..b98346350230 100644
--- a/drivers/firmware/efi/libstub/random.c
+++ b/drivers/firmware/efi/libstub/random.c
@@ -33,3 +33,103 @@ efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table_arg,
 
 	return rng->get_rng(rng, NULL, size, out);
 }
+
+/*
+ * Return the number of slots covered by this entry, i.e., the number of
+ * addresses it covers that are suitably aligned and supply enough room
+ * for the allocation.
+ */
+static unsigned long get_entry_num_slots(efi_memory_desc_t *md,
+					 unsigned long size,
+					 unsigned long align)
+{
+	u64 start, end;
+
+	if (md->type != EFI_CONVENTIONAL_MEMORY)
+		return 0;
+
+	start = round_up(md->phys_addr, align);
+	end = round_down(md->phys_addr + md->num_pages * EFI_PAGE_SIZE - size,
+			 align);
+
+	if (start > end)
+		return 0;
+
+	return (end - start + 1) / align;
+}
+
+/*
+ * The UEFI memory descriptors have a virtual address field that is only used
+ * when installing the virtual mapping using SetVirtualAddressMap(). Since it
+ * is unused here, we can reuse it to keep track of each descriptor's slot
+ * count.
+ */
+#define MD_NUM_SLOTS(md)	((md)->virt_addr)
+
+efi_status_t efi_random_alloc(efi_system_table_t *sys_table_arg,
+			      unsigned long size,
+			      unsigned long align,
+			      unsigned long *addr,
+			      unsigned long random_seed)
+{
+	unsigned long map_size, desc_size, total_slots = 0, target_slot;
+	efi_status_t status = EFI_NOT_FOUND;
+	efi_memory_desc_t *memory_map;
+	int map_offset;
+
+	status = efi_get_memory_map(sys_table_arg, &memory_map, &map_size,
+				    &desc_size, NULL, NULL);
+	if (status != EFI_SUCCESS)
+		return status;
+
+	if (align < EFI_ALLOC_ALIGN)
+		align = EFI_ALLOC_ALIGN;
+
+	/* count the suitable slots in each memory map entry */
+	for (map_offset = 0; map_offset < map_size; map_offset += desc_size) {
+		efi_memory_desc_t *md = (void *)memory_map + map_offset;
+		unsigned long slots;
+
+		slots = get_entry_num_slots(md, size, align);
+		MD_NUM_SLOTS(md) = slots;
+		total_slots += slots;
+	}
+
+	/* find a random number between 0 and total_slots */
+	target_slot = (total_slots * (u16)random_seed) >> 16;
+
+	/*
+	 * target_slot is now a value in the range [0, total_slots), and so
+	 * it corresponds with exactly one of the suitable slots we recorded
+	 * when iterating over the memory map the first time around.
+	 *
+	 * So iterate over the memory map again, subtracting the number of
+	 * slots of each entry at each iteration, until we have found the entry
+	 * that covers our chosen slot. Use the residual value of target_slot
+	 * to calculate the randomly chosen address, and allocate it directly
+	 * using EFI_ALLOCATE_ADDRESS.
+	 */
+	for (map_offset = 0; map_offset < map_size; map_offset += desc_size) {
+		efi_memory_desc_t *md = (void *)memory_map + map_offset;
+		efi_physical_addr_t target;
+		unsigned long pages;
+
+		if (target_slot >= MD_NUM_SLOTS(md)) {
+			target_slot -= MD_NUM_SLOTS(md);
+			continue;
+		}
+
+		target = round_up(md->phys_addr, align) + target_slot * align;
+		pages = round_up(size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE;
+
+		status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
+					EFI_LOADER_DATA, pages, &target);
+		if (status == EFI_SUCCESS)
+			*addr = target;
+		break;
+	}
+
+	efi_call_early(free_pool, memory_map);
+
+	return status;
+}
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 21/22] efi: stub: use high allocation for converted command line
  2016-01-26 17:10 ` Ard Biesheuvel
  (?)
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

The command line processing will be moved before the allocation of the
kernel, so that the 'nokaslr' option controlling that allocation can be
detected in time. In preparation, move the converted command line higher
up in memory, to prevent it from interfering with the kernel itself.

Since x86 needs the address to fit in 32 bits, use UINT_MAX as the upper
bound there. Otherwise, use ULONG_MAX (i.e., no limit).
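
For illustration, any other architecture with a similar constraint could
cap the allocation the same way (hypothetical example; the generic code
falls back to ULONG_MAX when the macro is not defined):

    /* arch/<arch>/include/asm/efi.h */
    #define MAX_CMDLINE_ADDRESS	(SZ_512M - 1)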

Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/x86/include/asm/efi.h                     | 2 ++
 drivers/firmware/efi/libstub/efi-stub-helper.c | 7 ++++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 0010c78c4998..08b1f2f6ea50 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -25,6 +25,8 @@
 #define EFI32_LOADER_SIGNATURE	"EL32"
 #define EFI64_LOADER_SIGNATURE	"EL64"
 
+#define MAX_CMDLINE_ADDRESS	UINT_MAX
+
 #ifdef CONFIG_X86_32
 
 
diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c
index f07d4a67fa76..29ed2f9b218c 100644
--- a/drivers/firmware/efi/libstub/efi-stub-helper.c
+++ b/drivers/firmware/efi/libstub/efi-stub-helper.c
@@ -649,6 +649,10 @@ static u8 *efi_utf16_to_utf8(u8 *dst, const u16 *src, int n)
 	return dst;
 }
 
+#ifndef MAX_CMDLINE_ADDRESS
+#define MAX_CMDLINE_ADDRESS	ULONG_MAX
+#endif
+
 /*
  * Convert the unicode UEFI command line to ASCII to pass to kernel.
  * Size of memory allocated return in *cmd_line_len.
@@ -684,7 +688,8 @@ char *efi_convert_cmdline(efi_system_table_t *sys_table_arg,
 
 	options_bytes++;	/* NUL termination */
 
-	status = efi_low_alloc(sys_table_arg, options_bytes, 0, &cmdline_addr);
+	status = efi_high_alloc(sys_table_arg, options_bytes, 0,
+				&cmdline_addr, MAX_CMDLINE_ADDRESS);
 	if (status != EFI_SUCCESS)
 		return NULL;
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 22/22] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
  2016-01-26 17:10 ` Ard Biesheuvel
  (?)
@ 2016-01-26 17:10   ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

Since arm64 does not use a decompressor that could supply an execution
environment in which it is feasible to provide a source of randomness,
the arm64 KASLR kernel depends on the bootloader to supply some random
bits in the /chosen/kaslr-seed DT property upon kernel entry.

On UEFI systems, we can use the EFI_RNG_PROTOCOL, if implemented by the
firmware, to obtain some random bits. At the same time, use it to
randomize the offset of the kernel Image in physical memory.
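
For context, a minimal sketch (helper name and error handling are
illustrative, not taken from this patch) of how a consumer could read
the seed back out of the FDT with libfdt; only the node and property
names are dictated by the series:

  /* assumes libfdt (<libfdt.h>) is available */
  static u64 get_kaslr_seed(void *fdt)
  {
  	int node, len;
  	const fdt64_t *prop;

  	node = fdt_path_offset(fdt, "/chosen");
  	if (node < 0)
  		return 0;

  	prop = fdt_getprop(fdt, node, "kaslr-seed", &len);
  	if (!prop || len != sizeof(*prop))
  		return 0;

  	return fdt64_to_cpu(*prop);
  }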

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/Kconfig                        |  5 ++
 drivers/firmware/efi/libstub/arm-stub.c   | 40 ++++++----
 drivers/firmware/efi/libstub/arm64-stub.c | 78 ++++++++++++++------
 drivers/firmware/efi/libstub/fdt.c        |  9 +++
 4 files changed, 97 insertions(+), 35 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index d7e31454d421..c6b5f71996e0 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -786,6 +786,11 @@ config RANDOMIZE_BASE
 	  It is the bootloader's job to provide entropy, by passing a
 	  random u64 value in /chosen/kaslr-seed at kernel entry.
 
+	  When booting via the UEFI stub, it will invoke the firmware's
+	  EFI_RNG_PROTOCOL implementation (if available) to supply entropy
+	  to the kernel proper. In addition, it will randomise the physical
+	  location of the kernel Image as well.
+
 	  If unsure, say N.
 
 endmenu
diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c
index 3397902e4040..4deb3e7faa0e 100644
--- a/drivers/firmware/efi/libstub/arm-stub.c
+++ b/drivers/firmware/efi/libstub/arm-stub.c
@@ -18,6 +18,8 @@
 
 #include "efistub.h"
 
+bool __nokaslr;
+
 static int efi_secureboot_enabled(efi_system_table_t *sys_table_arg)
 {
 	static efi_guid_t const var_guid = EFI_GLOBAL_VARIABLE_GUID;
@@ -207,14 +209,6 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
 		pr_efi_err(sys_table, "Failed to find DRAM base\n");
 		goto fail;
 	}
-	status = handle_kernel_image(sys_table, image_addr, &image_size,
-				     &reserve_addr,
-				     &reserve_size,
-				     dram_base, image);
-	if (status != EFI_SUCCESS) {
-		pr_efi_err(sys_table, "Failed to relocate kernel\n");
-		goto fail;
-	}
 
 	/*
 	 * Get the command line from EFI, using the LOADED_IMAGE
@@ -224,7 +218,28 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
 	cmdline_ptr = efi_convert_cmdline(sys_table, image, &cmdline_size);
 	if (!cmdline_ptr) {
 		pr_efi_err(sys_table, "getting command line via LOADED_IMAGE_PROTOCOL\n");
-		goto fail_free_image;
+		goto fail;
+	}
+
+	/* check whether 'nokaslr' was passed on the command line */
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+		static const u8 default_cmdline[] = CONFIG_CMDLINE;
+		const u8 *str, *cmdline = cmdline_ptr;
+
+		if (IS_ENABLED(CONFIG_CMDLINE_FORCE))
+			cmdline = default_cmdline;
+		str = strstr(cmdline, "nokaslr");
+		if (str == cmdline || (str > cmdline && *(str - 1) == ' '))
+			__nokaslr = true;
+	}
+
+	status = handle_kernel_image(sys_table, image_addr, &image_size,
+				     &reserve_addr,
+				     &reserve_size,
+				     dram_base, image);
+	if (status != EFI_SUCCESS) {
+		pr_efi_err(sys_table, "Failed to relocate kernel\n");
+		goto fail_free_cmdline;
 	}
 
 	status = efi_parse_options(cmdline_ptr);
@@ -244,7 +259,7 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
 
 		if (status != EFI_SUCCESS) {
 			pr_efi_err(sys_table, "Failed to load device tree!\n");
-			goto fail_free_cmdline;
+			goto fail_free_image;
 		}
 	}
 
@@ -286,12 +301,11 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
 	efi_free(sys_table, initrd_size, initrd_addr);
 	efi_free(sys_table, fdt_size, fdt_addr);
 
-fail_free_cmdline:
-	efi_free(sys_table, cmdline_size, (unsigned long)cmdline_ptr);
-
 fail_free_image:
 	efi_free(sys_table, image_size, *image_addr);
 	efi_free(sys_table, reserve_size, reserve_addr);
+fail_free_cmdline:
+	efi_free(sys_table, cmdline_size, (unsigned long)cmdline_ptr);
 fail:
 	return EFI_ERROR;
 }
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index 78dfbd34b6bf..e0e6b74fef8f 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -13,6 +13,10 @@
 #include <asm/efi.h>
 #include <asm/sections.h>
 
+#include "efistub.h"
+
+extern bool __nokaslr;
+
 efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 					unsigned long *image_addr,
 					unsigned long *image_size,
@@ -23,26 +27,52 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 {
 	efi_status_t status;
 	unsigned long kernel_size, kernel_memsize = 0;
-	unsigned long nr_pages;
 	void *old_image_addr = (void *)*image_addr;
 	unsigned long preferred_offset;
+	u64 phys_seed = 0;
+
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+		if (!__nokaslr) {
+			status = efi_get_random_bytes(sys_table_arg,
+						      sizeof(phys_seed),
+						      (u8 *)&phys_seed);
+			if (status == EFI_NOT_FOUND) {
+				pr_efi(sys_table_arg, "EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
+			} else if (status != EFI_SUCCESS) {
+				pr_efi_err(sys_table_arg, "efi_get_random_bytes() failed\n");
+				return status;
+			}
+		} else {
+			pr_efi(sys_table_arg, "KASLR disabled on kernel command line\n");
+		}
+	}
 
 	/*
 	 * The preferred offset of the kernel Image is TEXT_OFFSET bytes beyond
 	 * a 2 MB aligned base, which itself may be lower than dram_base, as
 	 * long as the resulting offset equals or exceeds it.
 	 */
-	preferred_offset = round_down(dram_base, SZ_2M) + TEXT_OFFSET;
+	preferred_offset = round_down(dram_base, MIN_KIMG_ALIGN) + TEXT_OFFSET;
 	if (preferred_offset < dram_base)
-		preferred_offset += SZ_2M;
+		preferred_offset += MIN_KIMG_ALIGN;
 
-	/* Relocate the image, if required. */
 	kernel_size = _edata - _text;
-	if (*image_addr != preferred_offset) {
-		kernel_memsize = kernel_size + (_end - _edata);
+	kernel_memsize = kernel_size + (_end - _edata);
+
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && phys_seed != 0) {
+		/*
+		 * If KASLR is enabled, and we have some randomness available,
+		 * locate the kernel at a randomized offset in physical memory.
+		 */
+		*reserve_size = kernel_memsize + TEXT_OFFSET;
+		status = efi_random_alloc(sys_table_arg, *reserve_size,
+					  MIN_KIMG_ALIGN, reserve_addr,
+					  phys_seed);
 
+		*image_addr = *reserve_addr + TEXT_OFFSET;
+	} else {
 		/*
-		 * First, try a straight allocation at the preferred offset.
+		 * Else, try a straight allocation at the preferred offset.
 		 * This will work around the issue where, if dram_base == 0x0,
 		 * efi_low_alloc() refuses to allocate at 0x0 (to prevent the
 		 * address of the allocation to be mistaken for a FAIL return
@@ -52,27 +82,31 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 		 * Mustang), we can still place the kernel at the address
 		 * 'dram_base + TEXT_OFFSET'.
 		 */
+		if (*image_addr == preferred_offset)
+			return EFI_SUCCESS;
+
 		*image_addr = *reserve_addr = preferred_offset;
-		nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
-			   EFI_PAGE_SIZE;
+		*reserve_size = round_up(kernel_memsize, EFI_ALLOC_ALIGN);
+
 		status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
-					EFI_LOADER_DATA, nr_pages,
+					EFI_LOADER_DATA,
+					*reserve_size / EFI_PAGE_SIZE,
 					(efi_physical_addr_t *)reserve_addr);
-		if (status != EFI_SUCCESS) {
-			kernel_memsize += TEXT_OFFSET;
-			status = efi_low_alloc(sys_table_arg, kernel_memsize,
-					       SZ_2M, reserve_addr);
+	}
 
-			if (status != EFI_SUCCESS) {
-				pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
-				return status;
-			}
-			*image_addr = *reserve_addr + TEXT_OFFSET;
+	if (status != EFI_SUCCESS) {
+		*reserve_size = kernel_memsize + TEXT_OFFSET;
+		status = efi_low_alloc(sys_table_arg, *reserve_size,
+				       MIN_KIMG_ALIGN, reserve_addr);
+
+		if (status != EFI_SUCCESS) {
+			pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
+			*reserve_size = 0;
+			return status;
 		}
-		memcpy((void *)*image_addr, old_image_addr, kernel_size);
-		*reserve_size = kernel_memsize;
+		*image_addr = *reserve_addr + TEXT_OFFSET;
 	}
-
+	memcpy((void *)*image_addr, old_image_addr, kernel_size);
 
 	return EFI_SUCCESS;
 }
diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
index cf7b7d46302a..04c9302b0ef1 100644
--- a/drivers/firmware/efi/libstub/fdt.c
+++ b/drivers/firmware/efi/libstub/fdt.c
@@ -147,6 +147,15 @@ efi_status_t update_fdt(efi_system_table_t *sys_table, void *orig_fdt,
 	if (status)
 		goto fdt_set_fail;
 
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+		status = efi_get_random_bytes(sys_table, sizeof(fdt_val64),
+					      (u8 *)&fdt_val64);
+		if (status == EFI_SUCCESS)
+			status = fdt_setprop(fdt, node, "kaslr-seed",
+					     &fdt_val64, sizeof(fdt_val64));
+		else if (status != EFI_NOT_FOUND)
+			goto fdt_set_fail;
+	}
 	return EFI_SUCCESS;
 
 fdt_set_fail:
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v4 22/22] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel

Since arm64 does not use a decompressor that could supply an execution
environment in which it is feasible to provide a source of randomness,
the arm64 KASLR kernel depends on the bootloader to supply some random
bits in the /chosen/kaslr-seed DT property upon kernel entry.

On UEFI systems, we can use the EFI_RNG_PROTOCOL, if implemented by the
firmware, to obtain some random bits. At the same time, use it to
randomize the offset of the kernel Image in physical memory.
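
The 'nokaslr' detection added to efi_entry() in the arm-stub.c hunk below
only honours the option when it appears at the start of the command line
or directly after a space. Expressed as a stand-alone helper (the helper
name is illustrative and an explicit NULL check is added; the patch
open-codes the same test):

  static bool cmdline_has_nokaslr(const char *cmdline)
  {
  	const char *str = strstr(cmdline, "nokaslr");

  	/* accept "nokaslr" at offset 0 or preceded by a space */
  	return str && (str == cmdline || *(str - 1) == ' ');
  }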

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/Kconfig                        |  5 ++
 drivers/firmware/efi/libstub/arm-stub.c   | 40 ++++++----
 drivers/firmware/efi/libstub/arm64-stub.c | 78 ++++++++++++++------
 drivers/firmware/efi/libstub/fdt.c        |  9 +++
 4 files changed, 97 insertions(+), 35 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index d7e31454d421..c6b5f71996e0 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -786,6 +786,11 @@ config RANDOMIZE_BASE
 	  It is the bootloader's job to provide entropy, by passing a
 	  random u64 value in /chosen/kaslr-seed at kernel entry.
 
+	  When booting via the UEFI stub, it will invoke the firmware's
+	  EFI_RNG_PROTOCOL implementation (if available) to supply entropy
+	  to the kernel proper. In addition, it will randomise the physical
+	  location of the kernel Image as well.
+
 	  If unsure, say N.
 
 endmenu
diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c
index 3397902e4040..4deb3e7faa0e 100644
--- a/drivers/firmware/efi/libstub/arm-stub.c
+++ b/drivers/firmware/efi/libstub/arm-stub.c
@@ -18,6 +18,8 @@
 
 #include "efistub.h"
 
+bool __nokaslr;
+
 static int efi_secureboot_enabled(efi_system_table_t *sys_table_arg)
 {
 	static efi_guid_t const var_guid = EFI_GLOBAL_VARIABLE_GUID;
@@ -207,14 +209,6 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
 		pr_efi_err(sys_table, "Failed to find DRAM base\n");
 		goto fail;
 	}
-	status = handle_kernel_image(sys_table, image_addr, &image_size,
-				     &reserve_addr,
-				     &reserve_size,
-				     dram_base, image);
-	if (status != EFI_SUCCESS) {
-		pr_efi_err(sys_table, "Failed to relocate kernel\n");
-		goto fail;
-	}
 
 	/*
 	 * Get the command line from EFI, using the LOADED_IMAGE
@@ -224,7 +218,28 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
 	cmdline_ptr = efi_convert_cmdline(sys_table, image, &cmdline_size);
 	if (!cmdline_ptr) {
 		pr_efi_err(sys_table, "getting command line via LOADED_IMAGE_PROTOCOL\n");
-		goto fail_free_image;
+		goto fail;
+	}
+
+	/* check whether 'nokaslr' was passed on the command line */
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+		static const u8 default_cmdline[] = CONFIG_CMDLINE;
+		const u8 *str, *cmdline = cmdline_ptr;
+
+		if (IS_ENABLED(CONFIG_CMDLINE_FORCE))
+			cmdline = default_cmdline;
+		str = strstr(cmdline, "nokaslr");
+		if (str == cmdline || (str > cmdline && *(str - 1) == ' '))
+			__nokaslr = true;
+	}
+
+	status = handle_kernel_image(sys_table, image_addr, &image_size,
+				     &reserve_addr,
+				     &reserve_size,
+				     dram_base, image);
+	if (status != EFI_SUCCESS) {
+		pr_efi_err(sys_table, "Failed to relocate kernel\n");
+		goto fail_free_cmdline;
 	}
 
 	status = efi_parse_options(cmdline_ptr);
@@ -244,7 +259,7 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
 
 		if (status != EFI_SUCCESS) {
 			pr_efi_err(sys_table, "Failed to load device tree!\n");
-			goto fail_free_cmdline;
+			goto fail_free_image;
 		}
 	}
 
@@ -286,12 +301,11 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
 	efi_free(sys_table, initrd_size, initrd_addr);
 	efi_free(sys_table, fdt_size, fdt_addr);
 
-fail_free_cmdline:
-	efi_free(sys_table, cmdline_size, (unsigned long)cmdline_ptr);
-
 fail_free_image:
 	efi_free(sys_table, image_size, *image_addr);
 	efi_free(sys_table, reserve_size, reserve_addr);
+fail_free_cmdline:
+	efi_free(sys_table, cmdline_size, (unsigned long)cmdline_ptr);
 fail:
 	return EFI_ERROR;
 }
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index 78dfbd34b6bf..e0e6b74fef8f 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -13,6 +13,10 @@
 #include <asm/efi.h>
 #include <asm/sections.h>
 
+#include "efistub.h"
+
+extern bool __nokaslr;
+
 efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 					unsigned long *image_addr,
 					unsigned long *image_size,
@@ -23,26 +27,52 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 {
 	efi_status_t status;
 	unsigned long kernel_size, kernel_memsize = 0;
-	unsigned long nr_pages;
 	void *old_image_addr = (void *)*image_addr;
 	unsigned long preferred_offset;
+	u64 phys_seed = 0;
+
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+		if (!__nokaslr) {
+			status = efi_get_random_bytes(sys_table_arg,
+						      sizeof(phys_seed),
+						      (u8 *)&phys_seed);
+			if (status == EFI_NOT_FOUND) {
+				pr_efi(sys_table_arg, "EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
+			} else if (status != EFI_SUCCESS) {
+				pr_efi_err(sys_table_arg, "efi_get_random_bytes() failed\n");
+				return status;
+			}
+		} else {
+			pr_efi(sys_table_arg, "KASLR disabled on kernel command line\n");
+		}
+	}
 
 	/*
 	 * The preferred offset of the kernel Image is TEXT_OFFSET bytes beyond
 	 * a 2 MB aligned base, which itself may be lower than dram_base, as
 	 * long as the resulting offset equals or exceeds it.
 	 */
-	preferred_offset = round_down(dram_base, SZ_2M) + TEXT_OFFSET;
+	preferred_offset = round_down(dram_base, MIN_KIMG_ALIGN) + TEXT_OFFSET;
 	if (preferred_offset < dram_base)
-		preferred_offset += SZ_2M;
+		preferred_offset += MIN_KIMG_ALIGN;
 
-	/* Relocate the image, if required. */
 	kernel_size = _edata - _text;
-	if (*image_addr != preferred_offset) {
-		kernel_memsize = kernel_size + (_end - _edata);
+	kernel_memsize = kernel_size + (_end - _edata);
+
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && phys_seed != 0) {
+		/*
+		 * If KASLR is enabled, and we have some randomness available,
+		 * locate the kernel at a randomized offset in physical memory.
+		 */
+		*reserve_size = kernel_memsize + TEXT_OFFSET;
+		status = efi_random_alloc(sys_table_arg, *reserve_size,
+					  MIN_KIMG_ALIGN, reserve_addr,
+					  phys_seed);
 
+		*image_addr = *reserve_addr + TEXT_OFFSET;
+	} else {
 		/*
-		 * First, try a straight allocation at the preferred offset.
+		 * Else, try a straight allocation at the preferred offset.
 		 * This will work around the issue where, if dram_base == 0x0,
 		 * efi_low_alloc() refuses to allocate at 0x0 (to prevent the
 		 * address of the allocation to be mistaken for a FAIL return
@@ -52,27 +82,31 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 		 * Mustang), we can still place the kernel at the address
 		 * 'dram_base + TEXT_OFFSET'.
 		 */
+		if (*image_addr == preferred_offset)
+			return EFI_SUCCESS;
+
 		*image_addr = *reserve_addr = preferred_offset;
-		nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
-			   EFI_PAGE_SIZE;
+		*reserve_size = round_up(kernel_memsize, EFI_ALLOC_ALIGN);
+
 		status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
-					EFI_LOADER_DATA, nr_pages,
+					EFI_LOADER_DATA,
+					*reserve_size / EFI_PAGE_SIZE,
 					(efi_physical_addr_t *)reserve_addr);
-		if (status != EFI_SUCCESS) {
-			kernel_memsize += TEXT_OFFSET;
-			status = efi_low_alloc(sys_table_arg, kernel_memsize,
-					       SZ_2M, reserve_addr);
+	}
 
-			if (status != EFI_SUCCESS) {
-				pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
-				return status;
-			}
-			*image_addr = *reserve_addr + TEXT_OFFSET;
+	if (status != EFI_SUCCESS) {
+		*reserve_size = kernel_memsize + TEXT_OFFSET;
+		status = efi_low_alloc(sys_table_arg, *reserve_size,
+				       MIN_KIMG_ALIGN, reserve_addr);
+
+		if (status != EFI_SUCCESS) {
+			pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
+			*reserve_size = 0;
+			return status;
 		}
-		memcpy((void *)*image_addr, old_image_addr, kernel_size);
-		*reserve_size = kernel_memsize;
+		*image_addr = *reserve_addr + TEXT_OFFSET;
 	}
-
+	memcpy((void *)*image_addr, old_image_addr, kernel_size);
 
 	return EFI_SUCCESS;
 }
diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
index cf7b7d46302a..04c9302b0ef1 100644
--- a/drivers/firmware/efi/libstub/fdt.c
+++ b/drivers/firmware/efi/libstub/fdt.c
@@ -147,6 +147,15 @@ efi_status_t update_fdt(efi_system_table_t *sys_table, void *orig_fdt,
 	if (status)
 		goto fdt_set_fail;
 
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+		status = efi_get_random_bytes(sys_table, sizeof(fdt_val64),
+					      (u8 *)&fdt_val64);
+		if (status == EFI_SUCCESS)
+			status = fdt_setprop(fdt, node, "kaslr-seed",
+					     &fdt_val64, sizeof(fdt_val64));
+		else if (status != EFI_NOT_FOUND)
+			goto fdt_set_fail;
+	}
 	return EFI_SUCCESS;
 
 fdt_set_fail:
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [kernel-hardening] [PATCH v4 22/22] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
@ 2016-01-26 17:10   ` Ard Biesheuvel
  0 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-26 17:10 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott, matt, Ard Biesheuvel

Since arm64 does not use a decompressor that could supply an execution
environment in which it is feasible to provide a source of randomness,
the arm64 KASLR kernel depends on the bootloader to supply some random
bits in the /chosen/kaslr-seed DT property upon kernel entry.

On UEFI systems, we can use the EFI_RNG_PROTOCOL, if implemented by the
firmware, to obtain some random bits. At the same time, use it to
randomize the offset of the kernel Image in physical memory.
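
To make the placement rule in the arm64-stub.c hunk below concrete, a
worked example (assuming MIN_KIMG_ALIGN is 2 MiB and the default arm64
TEXT_OFFSET of 0x80000; the DRAM base value is made up):

  dram_base                              = 0x80100000
  round_down(dram_base, MIN_KIMG_ALIGN)  = 0x80000000
  + TEXT_OFFSET                          = 0x80080000   (below dram_base)
  + MIN_KIMG_ALIGN                       = 0x80280000   -> preferred_offset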

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/Kconfig                        |  5 ++
 drivers/firmware/efi/libstub/arm-stub.c   | 40 ++++++----
 drivers/firmware/efi/libstub/arm64-stub.c | 78 ++++++++++++++------
 drivers/firmware/efi/libstub/fdt.c        |  9 +++
 4 files changed, 97 insertions(+), 35 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index d7e31454d421..c6b5f71996e0 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -786,6 +786,11 @@ config RANDOMIZE_BASE
 	  It is the bootloader's job to provide entropy, by passing a
 	  random u64 value in /chosen/kaslr-seed at kernel entry.
 
+	  When booting via the UEFI stub, it will invoke the firmware's
+	  EFI_RNG_PROTOCOL implementation (if available) to supply entropy
+	  to the kernel proper. In addition, it will randomise the physical
+	  location of the kernel Image as well.
+
 	  If unsure, say N.
 
 endmenu
diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c
index 3397902e4040..4deb3e7faa0e 100644
--- a/drivers/firmware/efi/libstub/arm-stub.c
+++ b/drivers/firmware/efi/libstub/arm-stub.c
@@ -18,6 +18,8 @@
 
 #include "efistub.h"
 
+bool __nokaslr;
+
 static int efi_secureboot_enabled(efi_system_table_t *sys_table_arg)
 {
 	static efi_guid_t const var_guid = EFI_GLOBAL_VARIABLE_GUID;
@@ -207,14 +209,6 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
 		pr_efi_err(sys_table, "Failed to find DRAM base\n");
 		goto fail;
 	}
-	status = handle_kernel_image(sys_table, image_addr, &image_size,
-				     &reserve_addr,
-				     &reserve_size,
-				     dram_base, image);
-	if (status != EFI_SUCCESS) {
-		pr_efi_err(sys_table, "Failed to relocate kernel\n");
-		goto fail;
-	}
 
 	/*
 	 * Get the command line from EFI, using the LOADED_IMAGE
@@ -224,7 +218,28 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
 	cmdline_ptr = efi_convert_cmdline(sys_table, image, &cmdline_size);
 	if (!cmdline_ptr) {
 		pr_efi_err(sys_table, "getting command line via LOADED_IMAGE_PROTOCOL\n");
-		goto fail_free_image;
+		goto fail;
+	}
+
+	/* check whether 'nokaslr' was passed on the command line */
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+		static const u8 default_cmdline[] = CONFIG_CMDLINE;
+		const u8 *str, *cmdline = cmdline_ptr;
+
+		if (IS_ENABLED(CONFIG_CMDLINE_FORCE))
+			cmdline = default_cmdline;
+		str = strstr(cmdline, "nokaslr");
+		if (str == cmdline || (str > cmdline && *(str - 1) == ' '))
+			__nokaslr = true;
+	}
+
+	status = handle_kernel_image(sys_table, image_addr, &image_size,
+				     &reserve_addr,
+				     &reserve_size,
+				     dram_base, image);
+	if (status != EFI_SUCCESS) {
+		pr_efi_err(sys_table, "Failed to relocate kernel\n");
+		goto fail_free_cmdline;
 	}
 
 	status = efi_parse_options(cmdline_ptr);
@@ -244,7 +259,7 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
 
 		if (status != EFI_SUCCESS) {
 			pr_efi_err(sys_table, "Failed to load device tree!\n");
-			goto fail_free_cmdline;
+			goto fail_free_image;
 		}
 	}
 
@@ -286,12 +301,11 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table,
 	efi_free(sys_table, initrd_size, initrd_addr);
 	efi_free(sys_table, fdt_size, fdt_addr);
 
-fail_free_cmdline:
-	efi_free(sys_table, cmdline_size, (unsigned long)cmdline_ptr);
-
 fail_free_image:
 	efi_free(sys_table, image_size, *image_addr);
 	efi_free(sys_table, reserve_size, reserve_addr);
+fail_free_cmdline:
+	efi_free(sys_table, cmdline_size, (unsigned long)cmdline_ptr);
 fail:
 	return EFI_ERROR;
 }
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index 78dfbd34b6bf..e0e6b74fef8f 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -13,6 +13,10 @@
 #include <asm/efi.h>
 #include <asm/sections.h>
 
+#include "efistub.h"
+
+extern bool __nokaslr;
+
 efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 					unsigned long *image_addr,
 					unsigned long *image_size,
@@ -23,26 +27,52 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 {
 	efi_status_t status;
 	unsigned long kernel_size, kernel_memsize = 0;
-	unsigned long nr_pages;
 	void *old_image_addr = (void *)*image_addr;
 	unsigned long preferred_offset;
+	u64 phys_seed = 0;
+
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+		if (!__nokaslr) {
+			status = efi_get_random_bytes(sys_table_arg,
+						      sizeof(phys_seed),
+						      (u8 *)&phys_seed);
+			if (status == EFI_NOT_FOUND) {
+				pr_efi(sys_table_arg, "EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
+			} else if (status != EFI_SUCCESS) {
+				pr_efi_err(sys_table_arg, "efi_get_random_bytes() failed\n");
+				return status;
+			}
+		} else {
+			pr_efi(sys_table_arg, "KASLR disabled on kernel command line\n");
+		}
+	}
 
 	/*
 	 * The preferred offset of the kernel Image is TEXT_OFFSET bytes beyond
 	 * a 2 MB aligned base, which itself may be lower than dram_base, as
 	 * long as the resulting offset equals or exceeds it.
 	 */
-	preferred_offset = round_down(dram_base, SZ_2M) + TEXT_OFFSET;
+	preferred_offset = round_down(dram_base, MIN_KIMG_ALIGN) + TEXT_OFFSET;
 	if (preferred_offset < dram_base)
-		preferred_offset += SZ_2M;
+		preferred_offset += MIN_KIMG_ALIGN;
 
-	/* Relocate the image, if required. */
 	kernel_size = _edata - _text;
-	if (*image_addr != preferred_offset) {
-		kernel_memsize = kernel_size + (_end - _edata);
+	kernel_memsize = kernel_size + (_end - _edata);
+
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && phys_seed != 0) {
+		/*
+		 * If KASLR is enabled, and we have some randomness available,
+		 * locate the kernel at a randomized offset in physical memory.
+		 */
+		*reserve_size = kernel_memsize + TEXT_OFFSET;
+		status = efi_random_alloc(sys_table_arg, *reserve_size,
+					  MIN_KIMG_ALIGN, reserve_addr,
+					  phys_seed);
 
+		*image_addr = *reserve_addr + TEXT_OFFSET;
+	} else {
 		/*
-		 * First, try a straight allocation at the preferred offset.
+		 * Else, try a straight allocation at the preferred offset.
 		 * This will work around the issue where, if dram_base == 0x0,
 		 * efi_low_alloc() refuses to allocate at 0x0 (to prevent the
 		 * address of the allocation to be mistaken for a FAIL return
@@ -52,27 +82,31 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 		 * Mustang), we can still place the kernel at the address
 		 * 'dram_base + TEXT_OFFSET'.
 		 */
+		if (*image_addr == preferred_offset)
+			return EFI_SUCCESS;
+
 		*image_addr = *reserve_addr = preferred_offset;
-		nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
-			   EFI_PAGE_SIZE;
+		*reserve_size = round_up(kernel_memsize, EFI_ALLOC_ALIGN);
+
 		status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
-					EFI_LOADER_DATA, nr_pages,
+					EFI_LOADER_DATA,
+					*reserve_size / EFI_PAGE_SIZE,
 					(efi_physical_addr_t *)reserve_addr);
-		if (status != EFI_SUCCESS) {
-			kernel_memsize += TEXT_OFFSET;
-			status = efi_low_alloc(sys_table_arg, kernel_memsize,
-					       SZ_2M, reserve_addr);
+	}
 
-			if (status != EFI_SUCCESS) {
-				pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
-				return status;
-			}
-			*image_addr = *reserve_addr + TEXT_OFFSET;
+	if (status != EFI_SUCCESS) {
+		*reserve_size = kernel_memsize + TEXT_OFFSET;
+		status = efi_low_alloc(sys_table_arg, *reserve_size,
+				       MIN_KIMG_ALIGN, reserve_addr);
+
+		if (status != EFI_SUCCESS) {
+			pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
+			*reserve_size = 0;
+			return status;
 		}
-		memcpy((void *)*image_addr, old_image_addr, kernel_size);
-		*reserve_size = kernel_memsize;
+		*image_addr = *reserve_addr + TEXT_OFFSET;
 	}
-
+	memcpy((void *)*image_addr, old_image_addr, kernel_size);
 
 	return EFI_SUCCESS;
 }
diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
index cf7b7d46302a..04c9302b0ef1 100644
--- a/drivers/firmware/efi/libstub/fdt.c
+++ b/drivers/firmware/efi/libstub/fdt.c
@@ -147,6 +147,15 @@ efi_status_t update_fdt(efi_system_table_t *sys_table, void *orig_fdt,
 	if (status)
 		goto fdt_set_fail;
 
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+		status = efi_get_random_bytes(sys_table, sizeof(fdt_val64),
+					      (u8 *)&fdt_val64);
+		if (status == EFI_SUCCESS)
+			status = fdt_setprop(fdt, node, "kaslr-seed",
+					     &fdt_val64, sizeof(fdt_val64));
+		else if (status != EFI_NOT_FOUND)
+			goto fdt_set_fail;
+	}
 	return EFI_SUCCESS;
 
 fdt_set_fail:
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH v4 14/22] arm64: make asm/elf.h available to asm files
  2016-01-26 17:10   ` Ard Biesheuvel
  (?)
@ 2016-01-26 17:42     ` Mark Rutland
  -1 siblings, 0 replies; 93+ messages in thread
From: Mark Rutland @ 2016-01-26 17:42 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	leif.lindholm, keescook, linux-kernel, stuart.yoder,
	bhupesh.sharma, arnd, marc.zyngier, christoffer.dall, labbott,
	matt

On Tue, Jan 26, 2016 at 06:10:41PM +0100, Ard Biesheuvel wrote:
> This reshuffles some code in asm/elf.h and puts a #ifndef __ASSEMBLY__
> around its C definitions so that the CPP defines can be used in asm
> source files as well.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  arch/arm64/include/asm/elf.h | 22 ++++++++++++--------
>  1 file changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
> index faad6df49e5b..435f55952e1f 100644
> --- a/arch/arm64/include/asm/elf.h
> +++ b/arch/arm64/include/asm/elf.h
> @@ -24,15 +24,6 @@
>  #include <asm/ptrace.h>
>  #include <asm/user.h>
>  
> -typedef unsigned long elf_greg_t;
> -
> -#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
> -#define ELF_CORE_COPY_REGS(dest, regs)	\
> -	*(struct user_pt_regs *)&(dest) = (regs)->user_regs;
> -
> -typedef elf_greg_t elf_gregset_t[ELF_NGREG];
> -typedef struct user_fpsimd_state elf_fpregset_t;
> -
>  /*
>   * AArch64 static relocation types.
>   */
> @@ -127,6 +118,17 @@ typedef struct user_fpsimd_state elf_fpregset_t;
>   */
>  #define ELF_ET_DYN_BASE	(2 * TASK_SIZE_64 / 3)
>  
> +#ifndef __ASSEMBLY__
> +
> +typedef unsigned long elf_greg_t;
> +
> +#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
> +#define ELF_CORE_COPY_REGS(dest, regs)	\
> +	*(struct user_pt_regs *)&(dest) = (regs)->user_regs;
> +
> +typedef elf_greg_t elf_gregset_t[ELF_NGREG];
> +typedef struct user_fpsimd_state elf_fpregset_t;
> +
>  /*
>   * When the program starts, a1 contains a pointer to a function to be
>   * registered with atexit, as per the SVR4 ABI.  A value of 0 means we have no
> @@ -186,4 +188,6 @@ extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
>  
>  #endif /* CONFIG_COMPAT */
>  
> +#endif /* !__ASSEMBLY__ */
> +
>  #endif
> -- 
> 2.5.0
> 

^ permalink raw reply	[flat|nested] 93+ messages in thread

* [PATCH v4 14/22] arm64: make asm/elf.h available to asm files
@ 2016-01-26 17:42     ` Mark Rutland
  0 siblings, 0 replies; 93+ messages in thread
From: Mark Rutland @ 2016-01-26 17:42 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 26, 2016 at 06:10:41PM +0100, Ard Biesheuvel wrote:
> This reshuffles some code in asm/elf.h and puts a #ifndef __ASSEMBLY__
> around its C definitions so that the CPP defines can be used in asm
> source files as well.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  arch/arm64/include/asm/elf.h | 22 ++++++++++++--------
>  1 file changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
> index faad6df49e5b..435f55952e1f 100644
> --- a/arch/arm64/include/asm/elf.h
> +++ b/arch/arm64/include/asm/elf.h
> @@ -24,15 +24,6 @@
>  #include <asm/ptrace.h>
>  #include <asm/user.h>
>  
> -typedef unsigned long elf_greg_t;
> -
> -#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
> -#define ELF_CORE_COPY_REGS(dest, regs)	\
> -	*(struct user_pt_regs *)&(dest) = (regs)->user_regs;
> -
> -typedef elf_greg_t elf_gregset_t[ELF_NGREG];
> -typedef struct user_fpsimd_state elf_fpregset_t;
> -
>  /*
>   * AArch64 static relocation types.
>   */
> @@ -127,6 +118,17 @@ typedef struct user_fpsimd_state elf_fpregset_t;
>   */
>  #define ELF_ET_DYN_BASE	(2 * TASK_SIZE_64 / 3)
>  
> +#ifndef __ASSEMBLY__
> +
> +typedef unsigned long elf_greg_t;
> +
> +#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
> +#define ELF_CORE_COPY_REGS(dest, regs)	\
> +	*(struct user_pt_regs *)&(dest) = (regs)->user_regs;
> +
> +typedef elf_greg_t elf_gregset_t[ELF_NGREG];
> +typedef struct user_fpsimd_state elf_fpregset_t;
> +
>  /*
>   * When the program starts, a1 contains a pointer to a function to be
>   * registered with atexit, as per the SVR4 ABI.  A value of 0 means we have no
> @@ -186,4 +188,6 @@ extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
>  
>  #endif /* CONFIG_COMPAT */
>  
> +#endif /* !__ASSEMBLY__ */
> +
>  #endif
> -- 
> 2.5.0
> 

^ permalink raw reply	[flat|nested] 93+ messages in thread

* [kernel-hardening] Re: [PATCH v4 14/22] arm64: make asm/elf.h available to asm files
@ 2016-01-26 17:42     ` Mark Rutland
  0 siblings, 0 replies; 93+ messages in thread
From: Mark Rutland @ 2016-01-26 17:42 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	leif.lindholm, keescook, linux-kernel, stuart.yoder,
	bhupesh.sharma, arnd, marc.zyngier, christoffer.dall, labbott,
	matt

On Tue, Jan 26, 2016 at 06:10:41PM +0100, Ard Biesheuvel wrote:
> This reshuffles some code in asm/elf.h and puts a #ifndef __ASSEMBLY__
> around its C definitions so that the CPP defines can be used in asm
> source files as well.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  arch/arm64/include/asm/elf.h | 22 ++++++++++++--------
>  1 file changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
> index faad6df49e5b..435f55952e1f 100644
> --- a/arch/arm64/include/asm/elf.h
> +++ b/arch/arm64/include/asm/elf.h
> @@ -24,15 +24,6 @@
>  #include <asm/ptrace.h>
>  #include <asm/user.h>
>  
> -typedef unsigned long elf_greg_t;
> -
> -#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
> -#define ELF_CORE_COPY_REGS(dest, regs)	\
> -	*(struct user_pt_regs *)&(dest) = (regs)->user_regs;
> -
> -typedef elf_greg_t elf_gregset_t[ELF_NGREG];
> -typedef struct user_fpsimd_state elf_fpregset_t;
> -
>  /*
>   * AArch64 static relocation types.
>   */
> @@ -127,6 +118,17 @@ typedef struct user_fpsimd_state elf_fpregset_t;
>   */
>  #define ELF_ET_DYN_BASE	(2 * TASK_SIZE_64 / 3)
>  
> +#ifndef __ASSEMBLY__
> +
> +typedef unsigned long elf_greg_t;
> +
> +#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
> +#define ELF_CORE_COPY_REGS(dest, regs)	\
> +	*(struct user_pt_regs *)&(dest) = (regs)->user_regs;
> +
> +typedef elf_greg_t elf_gregset_t[ELF_NGREG];
> +typedef struct user_fpsimd_state elf_fpregset_t;
> +
>  /*
>   * When the program starts, a1 contains a pointer to a function to be
>   * registered with atexit, as per the SVR4 ABI.  A value of 0 means we have no
> @@ -186,4 +188,6 @@ extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
>  
>  #endif /* CONFIG_COMPAT */
>  
> +#endif /* !__ASSEMBLY__ */
> +
>  #endif
> -- 
> 2.5.0
> 

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v4 05/22] arm64: kvm: deal with kernel symbols outside of linear mapping
  2016-01-26 17:10   ` Ard Biesheuvel
  (?)
@ 2016-01-26 17:45     ` Marc Zyngier
  -1 siblings, 0 replies; 93+ messages in thread
From: Marc Zyngier @ 2016-01-26 17:45 UTC (permalink / raw)
  To: Ard Biesheuvel, linux-arm-kernel, kernel-hardening, will.deacon,
	catalin.marinas, mark.rutland, leif.lindholm, keescook,
	linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, christoffer.dall, labbott, matt

On 26/01/16 17:10, Ard Biesheuvel wrote:
> KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
> the HYP mapping at EL2. Before we can move the kernel virtual mapping
> out of the linear mapping, we have to make sure that references to kernel
> symbols that are accessed via the HYP mapping are translated to their
> linear equivalent.
> 
> Reviewed-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 93+ messages in thread

* [PATCH v4 05/22] arm64: kvm: deal with kernel symbols outside of linear mapping
@ 2016-01-26 17:45     ` Marc Zyngier
  0 siblings, 0 replies; 93+ messages in thread
From: Marc Zyngier @ 2016-01-26 17:45 UTC (permalink / raw)
  To: linux-arm-kernel

On 26/01/16 17:10, Ard Biesheuvel wrote:
> KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
> the HYP mapping at EL2. Before we can move the kernel virtual mapping
> out of the linear mapping, we have to make sure that references to kernel
> symbols that are accessed via the HYP mapping are translated to their
> linear equivalent.
> 
> Reviewed-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 93+ messages in thread

* [kernel-hardening] Re: [PATCH v4 05/22] arm64: kvm: deal with kernel symbols outside of linear mapping
@ 2016-01-26 17:45     ` Marc Zyngier
  0 siblings, 0 replies; 93+ messages in thread
From: Marc Zyngier @ 2016-01-26 17:45 UTC (permalink / raw)
  To: Ard Biesheuvel, linux-arm-kernel, kernel-hardening, will.deacon,
	catalin.marinas, mark.rutland, leif.lindholm, keescook,
	linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, christoffer.dall, labbott, matt

On 26/01/16 17:10, Ard Biesheuvel wrote:
> KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
> the HYP mapping at EL2. Before we can move the kernel virtual mapping
> out of the linear mapping, we have to make sure that references to kernel
> symbols that are accessed via the HYP mapping are translated to their
> linear equivalent.
> 
> Reviewed-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v4 15/22] scripts/sortextable: add support for ET_DYN binaries
  2016-01-26 17:10   ` Ard Biesheuvel
  (?)
@ 2016-01-26 23:25     ` Kees Cook
  -1 siblings, 0 replies; 93+ messages in thread
From: Kees Cook @ 2016-01-26 23:25 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, LKML, stuart.yoder, bhupesh.sharma,
	Arnd Bergmann, Marc Zyngier, Christoffer Dall, Laura Abbott,
	Matt Fleming

On Tue, Jan 26, 2016 at 9:10 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> Add support to scripts/sortextable for handling relocatable (PIE)
> executables, whose ELF type is ET_DYN, not ET_EXEC. Other than adding
> support for the new type, no changes are needed.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Reviewed-by: Kees Cook <keescook@chromium.org>

Something I can actually test! :)

-Kees

> ---
>  scripts/sortextable.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/scripts/sortextable.c b/scripts/sortextable.c
> index af247c70fb66..19d83647846c 100644
> --- a/scripts/sortextable.c
> +++ b/scripts/sortextable.c
> @@ -266,9 +266,9 @@ do_file(char const *const fname)
>                 break;
>         }  /* end switch */
>         if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0
> -       ||  r2(&ehdr->e_type) != ET_EXEC
> +       ||  (r2(&ehdr->e_type) != ET_EXEC && r2(&ehdr->e_type) != ET_DYN)
>         ||  ehdr->e_ident[EI_VERSION] != EV_CURRENT) {
> -               fprintf(stderr, "unrecognized ET_EXEC file %s\n", fname);
> +               fprintf(stderr, "unrecognized ET_EXEC/ET_DYN file %s\n", fname);
>                 fail_file();
>         }
>
> @@ -304,7 +304,7 @@ do_file(char const *const fname)
>                 if (r2(&ehdr->e_ehsize) != sizeof(Elf32_Ehdr)
>                 ||  r2(&ehdr->e_shentsize) != sizeof(Elf32_Shdr)) {
>                         fprintf(stderr,
> -                               "unrecognized ET_EXEC file: %s\n", fname);
> +                               "unrecognized ET_EXEC/ET_DYN file: %s\n", fname);
>                         fail_file();
>                 }
>                 do32(ehdr, fname, custom_sort);
> @@ -314,7 +314,7 @@ do_file(char const *const fname)
>                 if (r2(&ghdr->e_ehsize) != sizeof(Elf64_Ehdr)
>                 ||  r2(&ghdr->e_shentsize) != sizeof(Elf64_Shdr)) {
>                         fprintf(stderr,
> -                               "unrecognized ET_EXEC file: %s\n", fname);
> +                               "unrecognized ET_EXEC/ET_DYN file: %s\n", fname);
>                         fail_file();
>                 }
>                 do64(ghdr, fname, custom_sort);
> --
> 2.5.0
>



-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 93+ messages in thread

* [PATCH v4 15/22] scripts/sortextable: add support for ET_DYN binaries
@ 2016-01-26 23:25     ` Kees Cook
  0 siblings, 0 replies; 93+ messages in thread
From: Kees Cook @ 2016-01-26 23:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 26, 2016 at 9:10 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> Add support to scripts/sortextable for handling relocatable (PIE)
> executables, whose ELF type is ET_DYN, not ET_EXEC. Other than adding
> support for the new type, no changes are needed.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Reviewed-by: Kees Cook <keescook@chromium.org>

Something I can actually test! :)

-Kees

> ---
>  scripts/sortextable.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/scripts/sortextable.c b/scripts/sortextable.c
> index af247c70fb66..19d83647846c 100644
> --- a/scripts/sortextable.c
> +++ b/scripts/sortextable.c
> @@ -266,9 +266,9 @@ do_file(char const *const fname)
>                 break;
>         }  /* end switch */
>         if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0
> -       ||  r2(&ehdr->e_type) != ET_EXEC
> +       ||  (r2(&ehdr->e_type) != ET_EXEC && r2(&ehdr->e_type) != ET_DYN)
>         ||  ehdr->e_ident[EI_VERSION] != EV_CURRENT) {
> -               fprintf(stderr, "unrecognized ET_EXEC file %s\n", fname);
> +               fprintf(stderr, "unrecognized ET_EXEC/ET_DYN file %s\n", fname);
>                 fail_file();
>         }
>
> @@ -304,7 +304,7 @@ do_file(char const *const fname)
>                 if (r2(&ehdr->e_ehsize) != sizeof(Elf32_Ehdr)
>                 ||  r2(&ehdr->e_shentsize) != sizeof(Elf32_Shdr)) {
>                         fprintf(stderr,
> -                               "unrecognized ET_EXEC file: %s\n", fname);
> +                               "unrecognized ET_EXEC/ET_DYN file: %s\n", fname);
>                         fail_file();
>                 }
>                 do32(ehdr, fname, custom_sort);
> @@ -314,7 +314,7 @@ do_file(char const *const fname)
>                 if (r2(&ghdr->e_ehsize) != sizeof(Elf64_Ehdr)
>                 ||  r2(&ghdr->e_shentsize) != sizeof(Elf64_Shdr)) {
>                         fprintf(stderr,
> -                               "unrecognized ET_EXEC file: %s\n", fname);
> +                               "unrecognized ET_EXEC/ET_DYN file: %s\n", fname);
>                         fail_file();
>                 }
>                 do64(ghdr, fname, custom_sort);
> --
> 2.5.0
>



-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 93+ messages in thread

* [kernel-hardening] Re: [PATCH v4 15/22] scripts/sortextable: add support for ET_DYN binaries
@ 2016-01-26 23:25     ` Kees Cook
  0 siblings, 0 replies; 93+ messages in thread
From: Kees Cook @ 2016-01-26 23:25 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, LKML, stuart.yoder, bhupesh.sharma,
	Arnd Bergmann, Marc Zyngier, Christoffer Dall, Laura Abbott,
	Matt Fleming

On Tue, Jan 26, 2016 at 9:10 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> Add support to scripts/sortextable for handling relocatable (PIE)
> executables, whose ELF type is ET_DYN, not ET_EXEC. Other than adding
> support for the new type, no changes are needed.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Reviewed-by: Kees Cook <keescook@chromium.org>

Something I can actually test! :)

-Kees

> ---
>  scripts/sortextable.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/scripts/sortextable.c b/scripts/sortextable.c
> index af247c70fb66..19d83647846c 100644
> --- a/scripts/sortextable.c
> +++ b/scripts/sortextable.c
> @@ -266,9 +266,9 @@ do_file(char const *const fname)
>                 break;
>         }  /* end switch */
>         if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0
> -       ||  r2(&ehdr->e_type) != ET_EXEC
> +       ||  (r2(&ehdr->e_type) != ET_EXEC && r2(&ehdr->e_type) != ET_DYN)
>         ||  ehdr->e_ident[EI_VERSION] != EV_CURRENT) {
> -               fprintf(stderr, "unrecognized ET_EXEC file %s\n", fname);
> +               fprintf(stderr, "unrecognized ET_EXEC/ET_DYN file %s\n", fname);
>                 fail_file();
>         }
>
> @@ -304,7 +304,7 @@ do_file(char const *const fname)
>                 if (r2(&ehdr->e_ehsize) != sizeof(Elf32_Ehdr)
>                 ||  r2(&ehdr->e_shentsize) != sizeof(Elf32_Shdr)) {
>                         fprintf(stderr,
> -                               "unrecognized ET_EXEC file: %s\n", fname);
> +                               "unrecognized ET_EXEC/ET_DYN file: %s\n", fname);
>                         fail_file();
>                 }
>                 do32(ehdr, fname, custom_sort);
> @@ -314,7 +314,7 @@ do_file(char const *const fname)
>                 if (r2(&ghdr->e_ehsize) != sizeof(Elf64_Ehdr)
>                 ||  r2(&ghdr->e_shentsize) != sizeof(Elf64_Shdr)) {
>                         fprintf(stderr,
> -                               "unrecognized ET_EXEC file: %s\n", fname);
> +                               "unrecognized ET_EXEC/ET_DYN file: %s\n", fname);
>                         fail_file();
>                 }
>                 do64(ghdr, fname, custom_sort);
> --
> 2.5.0
>



-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v4 20/22] efi: stub: add implementation of efi_random_alloc()
  2016-01-26 17:10   ` Ard Biesheuvel
  (?)
@ 2016-01-26 23:52     ` Kees Cook
  -1 siblings, 0 replies; 93+ messages in thread
From: Kees Cook @ 2016-01-26 23:52 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, LKML, stuart.yoder, bhupesh.sharma,
	Arnd Bergmann, Marc Zyngier, Christoffer Dall, Laura Abbott,
	Matt Fleming

On Tue, Jan 26, 2016 at 9:10 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> This implements efi_random_alloc(), which allocates a chunk of memory of
> a certain size at a certain alignment, and uses the random_seed argument
> it receives to randomize the address of the allocation.
>
> This is implemented by iterating over the UEFI memory map, counting the
> number of suitable slots (aligned offsets) within each region, and picking
> a random number between 0 and 'number of slots - 1' to select the slot.
> This should guarantee that each possible offset is equally likely to be chosen.
>
> Suggested-by: Kees Cook <keescook@chromium.org>
> Cc: Matt Fleming <matt@codeblueprint.co.uk>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Reviewed-by: Kees Cook <keescook@chromium.org>

(When a third arch implements kASLR, we probably want to merge this
and the x86 logic into some kind of reusable code...)

-Kees
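
As a worked illustration of the slot selection in the quoted patch (a
sketch with made-up numbers, not part of the review):

  /* e.g. total_slots = 1000, low 16 bits of the seed = 0xc000 (49152) */
  target_slot = (total_slots * (u16)random_seed) >> 16;
  /*          = (1000 * 49152) >> 16 = 750, i.e. within [0, total_slots) */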

> ---
>  drivers/firmware/efi/libstub/efistub.h |   4 +
>  drivers/firmware/efi/libstub/random.c  | 100 ++++++++++++++++++++
>  2 files changed, 104 insertions(+)
>
> diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/libstub/efistub.h
> index 206b7252b9d1..5ed3d3f38166 100644
> --- a/drivers/firmware/efi/libstub/efistub.h
> +++ b/drivers/firmware/efi/libstub/efistub.h
> @@ -46,4 +46,8 @@ void efi_get_virtmap(efi_memory_desc_t *memory_map, unsigned long map_size,
>  efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table,
>                                   unsigned long size, u8 *out);
>
> +efi_status_t efi_random_alloc(efi_system_table_t *sys_table_arg,
> +                             unsigned long size, unsigned long align,
> +                             unsigned long *addr, unsigned long random_seed);
> +
>  #endif
> diff --git a/drivers/firmware/efi/libstub/random.c b/drivers/firmware/efi/libstub/random.c
> index 97941ee5954f..b98346350230 100644
> --- a/drivers/firmware/efi/libstub/random.c
> +++ b/drivers/firmware/efi/libstub/random.c
> @@ -33,3 +33,103 @@ efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table_arg,
>
>         return rng->get_rng(rng, NULL, size, out);
>  }
> +
> +/*
> + * Return the number of slots covered by this entry, i.e., the number of
> + * addresses it covers that are suitably aligned and supply enough room
> + * for the allocation.
> + */
> +static unsigned long get_entry_num_slots(efi_memory_desc_t *md,
> +                                        unsigned long size,
> +                                        unsigned long align)
> +{
> +       u64 start, end;
> +
> +       if (md->type != EFI_CONVENTIONAL_MEMORY)
> +               return 0;
> +
> +       start = round_up(md->phys_addr, align);
> +       end = round_down(md->phys_addr + md->num_pages * EFI_PAGE_SIZE - size,
> +                        align);
> +
> +       if (start > end)
> +               return 0;
> +
> +       return (end - start + 1) / align;
> +}
> +
> +/*
> + * The UEFI memory descriptors have a virtual address field that is only used
> + * when installing the virtual mapping using SetVirtualAddressMap(). Since it
> + * is unused here, we can reuse it to keep track of each descriptor's slot
> + * count.
> + */
> +#define MD_NUM_SLOTS(md)       ((md)->virt_addr)
> +
> +efi_status_t efi_random_alloc(efi_system_table_t *sys_table_arg,
> +                             unsigned long size,
> +                             unsigned long align,
> +                             unsigned long *addr,
> +                             unsigned long random_seed)
> +{
> +       unsigned long map_size, desc_size, total_slots = 0, target_slot;
> +       efi_status_t status = EFI_NOT_FOUND;
> +       efi_memory_desc_t *memory_map;
> +       int map_offset;
> +
> +       status = efi_get_memory_map(sys_table_arg, &memory_map, &map_size,
> +                                   &desc_size, NULL, NULL);
> +       if (status != EFI_SUCCESS)
> +               return status;
> +
> +       if (align < EFI_ALLOC_ALIGN)
> +               align = EFI_ALLOC_ALIGN;
> +
> +       /* count the suitable slots in each memory map entry */
> +       for (map_offset = 0; map_offset < map_size; map_offset += desc_size) {
> +               efi_memory_desc_t *md = (void *)memory_map + map_offset;
> +               unsigned long slots;
> +
> +               slots = get_entry_num_slots(md, size, align);
> +               MD_NUM_SLOTS(md) = slots;
> +               total_slots += slots;
> +       }
> +
> +       /* find a random number between 0 and total_slots */
> +       target_slot = (total_slots * (u16)random_seed) >> 16;
> +
> +       /*
> +        * target_slot is now a value in the range [0, total_slots), and so
> +        * it corresponds with exactly one of the suitable slots we recorded
> +        * when iterating over the memory map the first time around.
> +        *
> +        * So iterate over the memory map again, subtracting the number of
> +        * slots of each entry at each iteration, until we have found the entry
> +        * that covers our chosen slot. Use the residual value of target_slot
> +        * to calculate the randomly chosen address, and allocate it directly
> +        * using EFI_ALLOCATE_ADDRESS.
> +        */
> +       for (map_offset = 0; map_offset < map_size; map_offset += desc_size) {
> +               efi_memory_desc_t *md = (void *)memory_map + map_offset;
> +               efi_physical_addr_t target;
> +               unsigned long pages;
> +
> +               if (target_slot >= MD_NUM_SLOTS(md)) {
> +                       target_slot -= MD_NUM_SLOTS(md);
> +                       continue;
> +               }
> +
> +               target = round_up(md->phys_addr, align) + target_slot * align;
> +               pages = round_up(size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE;
> +
> +               status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
> +                                       EFI_LOADER_DATA, pages, &target);
> +               if (status == EFI_SUCCESS)
> +                       *addr = target;
> +               break;
> +       }
> +
> +       efi_call_early(free_pool, memory_map);
> +
> +       return status;
> +}
> --
> 2.5.0
>



-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 93+ messages in thread

* [PATCH v4 20/22] efi: stub: add implementation of efi_random_alloc()
@ 2016-01-26 23:52     ` Kees Cook
  0 siblings, 0 replies; 93+ messages in thread
From: Kees Cook @ 2016-01-26 23:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 26, 2016 at 9:10 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> This implements efi_random_alloc(), which allocates a chunk of memory of
> a certain size at a certain alignment, and uses the random_seed argument
> it receives to randomize the address of the allocation.
>
> This is implemented by iterating over the UEFI memory map, counting the
> number of suitable slots (aligned offsets) within each region, and picking
> a random number between 0 and 'number of slots - 1' to select the slot.
> This should guarantee that each possible offset is equally likely to be chosen.
>
> Suggested-by: Kees Cook <keescook@chromium.org>
> Cc: Matt Fleming <matt@codeblueprint.co.uk>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Reviewed-by: Kees Cook <keescook@chromium.org>

(When a third arch implements kASLR, we probably want to merge this
and the x86 logic into some kind of reusable code...)

-Kees

> ---
>  drivers/firmware/efi/libstub/efistub.h |   4 +
>  drivers/firmware/efi/libstub/random.c  | 100 ++++++++++++++++++++
>  2 files changed, 104 insertions(+)
>
> diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/libstub/efistub.h
> index 206b7252b9d1..5ed3d3f38166 100644
> --- a/drivers/firmware/efi/libstub/efistub.h
> +++ b/drivers/firmware/efi/libstub/efistub.h
> @@ -46,4 +46,8 @@ void efi_get_virtmap(efi_memory_desc_t *memory_map, unsigned long map_size,
>  efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table,
>                                   unsigned long size, u8 *out);
>
> +efi_status_t efi_random_alloc(efi_system_table_t *sys_table_arg,
> +                             unsigned long size, unsigned long align,
> +                             unsigned long *addr, unsigned long random_seed);
> +
>  #endif
> diff --git a/drivers/firmware/efi/libstub/random.c b/drivers/firmware/efi/libstub/random.c
> index 97941ee5954f..b98346350230 100644
> --- a/drivers/firmware/efi/libstub/random.c
> +++ b/drivers/firmware/efi/libstub/random.c
> @@ -33,3 +33,103 @@ efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table_arg,
>
>         return rng->get_rng(rng, NULL, size, out);
>  }
> +
> +/*
> + * Return the number of slots covered by this entry, i.e., the number of
> + * addresses it covers that are suitably aligned and supply enough room
> + * for the allocation.
> + */
> +static unsigned long get_entry_num_slots(efi_memory_desc_t *md,
> +                                        unsigned long size,
> +                                        unsigned long align)
> +{
> +       u64 start, end;
> +
> +       if (md->type != EFI_CONVENTIONAL_MEMORY)
> +               return 0;
> +
> +       start = round_up(md->phys_addr, align);
> +       end = round_down(md->phys_addr + md->num_pages * EFI_PAGE_SIZE - size,
> +                        align);
> +
> +       if (start > end)
> +               return 0;
> +
> +       return (end - start + 1) / align;
> +}
> +
> +/*
> + * The UEFI memory descriptors have a virtual address field that is only used
> + * when installing the virtual mapping using SetVirtualAddressMap(). Since it
> + * is unused here, we can reuse it to keep track of each descriptor's slot
> + * count.
> + */
> +#define MD_NUM_SLOTS(md)       ((md)->virt_addr)
> +
> +efi_status_t efi_random_alloc(efi_system_table_t *sys_table_arg,
> +                             unsigned long size,
> +                             unsigned long align,
> +                             unsigned long *addr,
> +                             unsigned long random_seed)
> +{
> +       unsigned long map_size, desc_size, total_slots = 0, target_slot;
> +       efi_status_t status = EFI_NOT_FOUND;
> +       efi_memory_desc_t *memory_map;
> +       int map_offset;
> +
> +       status = efi_get_memory_map(sys_table_arg, &memory_map, &map_size,
> +                                   &desc_size, NULL, NULL);
> +       if (status != EFI_SUCCESS)
> +               return status;
> +
> +       if (align < EFI_ALLOC_ALIGN)
> +               align = EFI_ALLOC_ALIGN;
> +
> +       /* count the suitable slots in each memory map entry */
> +       for (map_offset = 0; map_offset < map_size; map_offset += desc_size) {
> +               efi_memory_desc_t *md = (void *)memory_map + map_offset;
> +               unsigned long slots;
> +
> +               slots = get_entry_num_slots(md, size, align);
> +               MD_NUM_SLOTS(md) = slots;
> +               total_slots += slots;
> +       }
> +
> +       /* find a random number between 0 and total_slots */
> +       target_slot = (total_slots * (u16)random_seed) >> 16;
> +
> +       /*
> +        * target_slot is now a value in the range [0, total_slots), and so
> +        * it corresponds with exactly one of the suitable slots we recorded
> +        * when iterating over the memory map the first time around.
> +        *
> +        * So iterate over the memory map again, subtracting the number of
> +        * slots of each entry at each iteration, until we have found the entry
> +        * that covers our chosen slot. Use the residual value of target_slot
> +        * to calculate the randomly chosen address, and allocate it directly
> +        * using EFI_ALLOCATE_ADDRESS.
> +        */
> +       for (map_offset = 0; map_offset < map_size; map_offset += desc_size) {
> +               efi_memory_desc_t *md = (void *)memory_map + map_offset;
> +               efi_physical_addr_t target;
> +               unsigned long pages;
> +
> +               if (target_slot >= MD_NUM_SLOTS(md)) {
> +                       target_slot -= MD_NUM_SLOTS(md);
> +                       continue;
> +               }
> +
> +               target = round_up(md->phys_addr, align) + target_slot * align;
> +               pages = round_up(size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE;
> +
> +               status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
> +                                       EFI_LOADER_DATA, pages, &target);
> +               if (status == EFI_SUCCESS)
> +                       *addr = target;
> +               break;
> +       }
> +
> +       efi_call_early(free_pool, memory_map);
> +
> +       return status;
> +}
> --
> 2.5.0
>



-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 93+ messages in thread

* [kernel-hardening] Re: [PATCH v4 20/22] efi: stub: add implementation of efi_random_alloc()
@ 2016-01-26 23:52     ` Kees Cook
  0 siblings, 0 replies; 93+ messages in thread
From: Kees Cook @ 2016-01-26 23:52 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, LKML, stuart.yoder, bhupesh.sharma,
	Arnd Bergmann, Marc Zyngier, Christoffer Dall, Laura Abbott,
	Matt Fleming

On Tue, Jan 26, 2016 at 9:10 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> This implements efi_random_alloc(), which allocates a chunk of memory of
> a certain size at a certain alignment, and uses the random_seed argument
> it receives to randomize the address of the allocation.
>
> This is implemented by iterating over the UEFI memory map, counting the
> number of suitable slots (aligned offsets) within each region, and picking
> a random number between 0 and 'number of slots - 1' to select the slot,
> This should guarantee that each possible offset is chosen equally likely.
>
> Suggested-by: Kees Cook <keescook@chromium.org>
> Cc: Matt Fleming <matt@codeblueprint.co.uk>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Reviewed-by: Kees Cook <keescook@chromium.org>

(When a third arch implements kASLR, we probably want to merge this
and the x86 logic into some kind of reusable code...)

-Kees

> ---
>  drivers/firmware/efi/libstub/efistub.h |   4 +
>  drivers/firmware/efi/libstub/random.c  | 100 ++++++++++++++++++++
>  2 files changed, 104 insertions(+)
>
> diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/libstub/efistub.h
> index 206b7252b9d1..5ed3d3f38166 100644
> --- a/drivers/firmware/efi/libstub/efistub.h
> +++ b/drivers/firmware/efi/libstub/efistub.h
> @@ -46,4 +46,8 @@ void efi_get_virtmap(efi_memory_desc_t *memory_map, unsigned long map_size,
>  efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table,
>                                   unsigned long size, u8 *out);
>
> +efi_status_t efi_random_alloc(efi_system_table_t *sys_table_arg,
> +                             unsigned long size, unsigned long align,
> +                             unsigned long *addr, unsigned long random_seed);
> +
>  #endif
> diff --git a/drivers/firmware/efi/libstub/random.c b/drivers/firmware/efi/libstub/random.c
> index 97941ee5954f..b98346350230 100644
> --- a/drivers/firmware/efi/libstub/random.c
> +++ b/drivers/firmware/efi/libstub/random.c
> @@ -33,3 +33,103 @@ efi_status_t efi_get_random_bytes(efi_system_table_t *sys_table_arg,
>
>         return rng->get_rng(rng, NULL, size, out);
>  }
> +
> +/*
> + * Return the number of slots covered by this entry, i.e., the number of
> + * addresses it covers that are suitably aligned and supply enough room
> + * for the allocation.
> + */
> +static unsigned long get_entry_num_slots(efi_memory_desc_t *md,
> +                                        unsigned long size,
> +                                        unsigned long align)
> +{
> +       u64 start, end;
> +
> +       if (md->type != EFI_CONVENTIONAL_MEMORY)
> +               return 0;
> +
> +       start = round_up(md->phys_addr, align);
> +       end = round_down(md->phys_addr + md->num_pages * EFI_PAGE_SIZE - size,
> +                        align);
> +
> +       if (start > end)
> +               return 0;
> +
> +       return (end - start + 1) / align;
> +}
> +
> +/*
> + * The UEFI memory descriptors have a virtual address field that is only used
> + * when installing the virtual mapping using SetVirtualAddressMap(). Since it
> + * is unused here, we can reuse it to keep track of each descriptor's slot
> + * count.
> + */
> +#define MD_NUM_SLOTS(md)       ((md)->virt_addr)
> +
> +efi_status_t efi_random_alloc(efi_system_table_t *sys_table_arg,
> +                             unsigned long size,
> +                             unsigned long align,
> +                             unsigned long *addr,
> +                             unsigned long random_seed)
> +{
> +       unsigned long map_size, desc_size, total_slots = 0, target_slot;
> +       efi_status_t status = EFI_NOT_FOUND;
> +       efi_memory_desc_t *memory_map;
> +       int map_offset;
> +
> +       status = efi_get_memory_map(sys_table_arg, &memory_map, &map_size,
> +                                   &desc_size, NULL, NULL);
> +       if (status != EFI_SUCCESS)
> +               return status;
> +
> +       if (align < EFI_ALLOC_ALIGN)
> +               align = EFI_ALLOC_ALIGN;
> +
> +       /* count the suitable slots in each memory map entry */
> +       for (map_offset = 0; map_offset < map_size; map_offset += desc_size) {
> +               efi_memory_desc_t *md = (void *)memory_map + map_offset;
> +               unsigned long slots;
> +
> +               slots = get_entry_num_slots(md, size, align);
> +               MD_NUM_SLOTS(md) = slots;
> +               total_slots += slots;
> +       }
> +
> +       /* find a random number between 0 and total_slots */
> +       target_slot = (total_slots * (u16)random_seed) >> 16;
> +
> +       /*
> +        * target_slot is now a value in the range [0, total_slots), and so
> +        * it corresponds with exactly one of the suitable slots we recorded
> +        * when iterating over the memory map the first time around.
> +        *
> +        * So iterate over the memory map again, subtracting the number of
> +        * slots of each entry at each iteration, until we have found the entry
> +        * that covers our chosen slot. Use the residual value of target_slot
> +        * to calculate the randomly chosen address, and allocate it directly
> +        * using EFI_ALLOCATE_ADDRESS.
> +        */
> +       for (map_offset = 0; map_offset < map_size; map_offset += desc_size) {
> +               efi_memory_desc_t *md = (void *)memory_map + map_offset;
> +               efi_physical_addr_t target;
> +               unsigned long pages;
> +
> +               if (target_slot >= MD_NUM_SLOTS(md)) {
> +                       target_slot -= MD_NUM_SLOTS(md);
> +                       continue;
> +               }
> +
> +               target = round_up(md->phys_addr, align) + target_slot * align;
> +               pages = round_up(size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE;
> +
> +               status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
> +                                       EFI_LOADER_DATA, pages, &target);
> +               if (status == EFI_SUCCESS)
> +                       *addr = target;
> +               break;
> +       }
> +
> +       efi_call_early(free_pool, memory_map);
> +
> +       return status;
> +}
> --
> 2.5.0
>



-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v4 20/22] efi: stub: add implementation of efi_random_alloc()
  2016-01-26 17:10   ` Ard Biesheuvel
  (?)
@ 2016-01-29 15:39     ` Matt Fleming
  -1 siblings, 0 replies; 93+ messages in thread
From: Matt Fleming @ 2016-01-29 15:39 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel,
	stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott

On Tue, 26 Jan, at 06:10:47PM, Ard Biesheuvel wrote:
> +
> +/*
> + * The UEFI memory descriptors have a virtual address field that is only used
> + * when installing the virtual mapping using SetVirtualAddressMap(). Since it
> + * is unused here, we can reuse it to keep track of each descriptor's slot
> + * count.
> + */
> +#define MD_NUM_SLOTS(md)	((md)->virt_addr)
> +
> +efi_status_t efi_random_alloc(efi_system_table_t *sys_table_arg,
> +			      unsigned long size,
> +			      unsigned long align,
> +			      unsigned long *addr,
> +			      unsigned long random_seed)
> +{
> +	unsigned long map_size, desc_size, total_slots = 0, target_slot;
> +	efi_status_t status = EFI_NOT_FOUND;
> +	efi_memory_desc_t *memory_map;
> +	int map_offset;

The assignment of 'status' is not needed since it gets assigned a new
value immediately.
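
A minimal sketch of that cleanup, using only the declarations from the
quoted hunk (nothing else needs to change, since status is written by
efi_get_memory_map() before it is ever read):

	unsigned long map_size, desc_size, total_slots = 0, target_slot;
	efi_status_t status;		/* assigned by efi_get_memory_map() */
	efi_memory_desc_t *memory_map;
	int map_offset;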

Otherwise looks good.

Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v4 22/22] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
  2016-01-26 17:10   ` Ard Biesheuvel
  (?)
@ 2016-01-29 15:57     ` Matt Fleming
  -1 siblings, 0 replies; 93+ messages in thread
From: Matt Fleming @ 2016-01-29 15:57 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel,
	stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, labbott

On Tue, 26 Jan, at 06:10:49PM, Ard Biesheuvel wrote:
> Since arm64 does not use a decompressor that could supply an execution
> environment in which to obtain a source of randomness, the arm64 KASLR
> kernel depends on the bootloader to supply some random bits in the
> /chosen/kaslr-seed DT property upon kernel entry.
> 
> On UEFI systems, we can use the EFI_RNG_PROTOCOL, if supplied, to obtain
> some random bits. At the same time, use it to randomize the offset of the
> kernel Image in physical memory.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/Kconfig                        |  5 ++
>  drivers/firmware/efi/libstub/arm-stub.c   | 40 ++++++----
>  drivers/firmware/efi/libstub/arm64-stub.c | 78 ++++++++++++++------
>  drivers/firmware/efi/libstub/fdt.c        |  9 +++
>  4 files changed, 97 insertions(+), 35 deletions(-)
 
[...]

> diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
> index cf7b7d46302a..04c9302b0ef1 100644
> --- a/drivers/firmware/efi/libstub/fdt.c
> +++ b/drivers/firmware/efi/libstub/fdt.c
> @@ -147,6 +147,15 @@ efi_status_t update_fdt(efi_system_table_t *sys_table, void *orig_fdt,
>  	if (status)
>  		goto fdt_set_fail;
>  
> +	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
> +		status = efi_get_random_bytes(sys_table, sizeof(fdt_val64),
> +					      (u8 *)&fdt_val64);
> +		if (status == EFI_SUCCESS)
> +			status = fdt_setprop(fdt, node, "kaslr-seed",
> +					     &fdt_val64, sizeof(fdt_val64));
> +		else if (status != EFI_NOT_FOUND)
> +			goto fdt_set_fail;
> +	}
>  	return EFI_SUCCESS;
>  
>  fdt_set_fail:

I think you want to handle the case where fdt_setprop() fails. With
this new code you'll silently return EFI_SUCCESS even if you fail to
set "kaslr-seed".

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v4 00/22] arm64: implement support for KASLR
  2016-01-26 17:10 ` Ard Biesheuvel
  (?)
@ 2016-01-29 18:26   ` Catalin Marinas
  -1 siblings, 0 replies; 93+ messages in thread
From: Catalin Marinas @ 2016-01-29 18:26 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, mark.rutland,
	leif.lindholm, keescook, linux-kernel, matt, arnd,
	bhupesh.sharma, stuart.yoder, marc.zyngier, christoffer.dall,
	labbott

Hi Ard,

On Tue, Jan 26, 2016 at 06:10:27PM +0100, Ard Biesheuvel wrote:
> Code can be found here:
> git://git.linaro.org/people/ard.biesheuvel/linux-arm.git arm64-kaslr-v4a
> https://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-kaslr-v4a

The overall series looks fine but I'd like more time to review the KASLR
part together with the module PLT stuff.

So could you please split this series into 2-3 parts for easy merging
(possibly without the KASLR part, depending on how the review goes)? It
looks like a mix of features in random order like huge-vmap, relative
extable, kernel memory layout changes, kernel load address, PIE and
KASLR. So something like:

1. relative extable
2. huge-vmap
3. kernel memory layout changes (moving kernel to the base of vmalloc
range)
4. allow kernel loading at different phys offsets
5. module PLTs, relocations, PIE, relative kallsyms, KASLR
6. efi_get_random_bytes

1-4 can be in the same branch but I currently find it hard to
cherry-pick the non-PIE/non-KASLR patches without conflicts.

Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v4 00/22] arm64: implement support for KASLR
  2016-01-29 18:26   ` Catalin Marinas
  (?)
@ 2016-01-29 18:49     ` Ard Biesheuvel
  -1 siblings, 0 replies; 93+ messages in thread
From: Ard Biesheuvel @ 2016-01-29 18:49 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Mark Rutland,
	Leif Lindholm, Kees Cook, linux-kernel, Matt Fleming,
	Arnd Bergmann, Sharma Bhupesh, Stuart Yoder, Marc Zyngier,
	Christoffer Dall, Laura Abbott

On 29 January 2016 at 19:26, Catalin Marinas <catalin.marinas@arm.com> wrote:
> Hi Ard,
>
> On Tue, Jan 26, 2016 at 06:10:27PM +0100, Ard Biesheuvel wrote:
>> Code can be found here:
>> git://git.linaro.org/people/ard.biesheuvel/linux-arm.git arm64-kaslr-v4a
>> https://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-kaslr-v4a
>
> The overall series looks fine but I'd like more time to review the KASLR
> part together with the module PLT stuff.
>
> So could you please split this series in 2-3 parts for easy merging
> (possibly without the KASLR part, it depends on how the review goes)?

Sure

> It
> looks like a mix of features in random order like huge-vmap, relative
> extable, kernel memory layout changes, kernel load address, PIE and
> KASLR. So something like:
>
> 1. relative extable

This has been picked up by akpm in the meantime.

> 2. huge-vmap

This is a single patch which is completely independent. I don't expect
it to conflict when applied in isolation.

> 3. kernel memory layout changes (moving kernel to the base of vmalloc
> range)
> 4. allow kernel loading at different phys offsets
> 5. module PLTs, relocations, PIE, relative kallsyms, KASLR

Relative kallsyms has also been picked up by akpm

> 6. efi_get_random_bytes
>
> 1-4 can be in the same branch but I currently find it hard to
> cherry-pick the non-PIE/non-KASLR patches without conflicts.
>

I think splitting this into 3 logically coherent series that apply in
sequence is feasible. I will look into this on Monday.

-- 
Ard.

^ permalink raw reply	[flat|nested] 93+ messages in thread

end of thread, other threads:[~2016-01-29 18:49 UTC | newest]

Thread overview: 93+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-26 17:10 [PATCH v4 00/22] arm64: implement support for KASLR Ard Biesheuvel
2016-01-26 17:10 ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10 ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 01/22] of/fdt: make memblock minimum physical address arch configurable Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 02/22] arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 03/22] arm64: pgtable: implement static [pte|pmd|pud]_offset variants Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 04/22] arm64: decouple early fixmap init from linear mapping Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 05/22] arm64: kvm: deal with kernel symbols outside of " Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:45   ` Marc Zyngier
2016-01-26 17:45     ` [kernel-hardening] " Marc Zyngier
2016-01-26 17:45     ` Marc Zyngier
2016-01-26 17:10 ` [PATCH v4 06/22] arm64: add support for ioremap() block mappings Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 07/22] arm64: move kernel image to base of vmalloc area Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 08/22] arm64: add support for module PLTs Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 09/22] extable: add support for relative extables to search and sort routines Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 10/22] arm64: switch to relative exception tables Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 11/22] arm64: avoid R_AARCH64_ABS64 relocations for Image header fields Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 12/22] arm64: avoid dynamic relocations in early boot code Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 13/22] arm64: allow kernel Image to be loaded anywhere in physical memory Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 14/22] arm64: make asm/elf.h available to asm files Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:42   ` Mark Rutland
2016-01-26 17:42     ` [kernel-hardening] " Mark Rutland
2016-01-26 17:42     ` Mark Rutland
2016-01-26 17:10 ` [PATCH v4 15/22] scripts/sortextable: add support for ET_DYN binaries Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 23:25   ` Kees Cook
2016-01-26 23:25     ` [kernel-hardening] " Kees Cook
2016-01-26 23:25     ` Kees Cook
2016-01-26 17:10 ` [PATCH v4 16/22] kallsyms: add support for relative offsets in kallsyms address table Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 17/22] arm64: add support for building the kernel as a relocatable PIE binary Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 18/22] arm64: add support for kernel ASLR Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 19/22] efi: stub: implement efi_get_random_bytes() based on EFI_RNG_PROTOCOL Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 20/22] efi: stub: add implementation of efi_random_alloc() Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 23:52   ` Kees Cook
2016-01-26 23:52     ` [kernel-hardening] " Kees Cook
2016-01-26 23:52     ` Kees Cook
2016-01-29 15:39   ` Matt Fleming
2016-01-29 15:39     ` [kernel-hardening] " Matt Fleming
2016-01-29 15:39     ` Matt Fleming
2016-01-26 17:10 ` [PATCH v4 21/22] efi: stub: use high allocation for converted command line Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-26 17:10 ` [PATCH v4 22/22] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness Ard Biesheuvel
2016-01-26 17:10   ` [kernel-hardening] " Ard Biesheuvel
2016-01-26 17:10   ` Ard Biesheuvel
2016-01-29 15:57   ` Matt Fleming
2016-01-29 15:57     ` [kernel-hardening] " Matt Fleming
2016-01-29 15:57     ` Matt Fleming
2016-01-29 18:26 ` [PATCH v4 00/22] arm64: implement support for KASLR Catalin Marinas
2016-01-29 18:26   ` [kernel-hardening] " Catalin Marinas
2016-01-29 18:26   ` Catalin Marinas
2016-01-29 18:49   ` Ard Biesheuvel
2016-01-29 18:49     ` [kernel-hardening] " Ard Biesheuvel
2016-01-29 18:49     ` Ard Biesheuvel
