* [PATCH v2 00/13] arm64: implement support for KASLR
@ 2015-12-30 15:25 ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:25 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

This series implements KASLR for arm64, by building the kernel as a PIE
executable that can relocate itself at runtime, and moving it to a random
offset in the vmalloc area. This v2 also implements physical randomization,
i.e., it allows the kernel to deal with being loaded at any physical offset
(modulo the required alignment), and invokes the EFI_RNG_PROTOCOL from the
UEFI stub to obtain random bits and perform the actual randomization of the
physical load address.
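
Conceptually, the runtime self-relocation boils down to walking the RELA
section emitted by the linker and applying the R_AARCH64_RELATIVE entries,
shifted by the load-time displacement. In illustrative C (the series performs
the equivalent in early assembly in head.S; the section symbol names below are
made up for the example):

	extern Elf64_Rela __rela_start[], __rela_end[];	/* illustrative names */

	static void apply_relocations(u64 delta)	/* runtime VA - link-time VA */
	{
		Elf64_Rela *r;

		for (r = __rela_start; r < __rela_end; r++)
			if (ELF64_R_TYPE(r->r_info) == R_AARCH64_RELATIVE)
				*(u64 *)(r->r_offset + delta) = r->r_addend + delta;
	}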

Changes since v1/RFC:
- This series now implements fully independent virtual and physical address
  randomization at load time. I have recycled some patches from this series:
  http://thread.gmane.org/gmane.linux.ports.arm.kernel/455151, and updated the
  final UEFI stub patch to randomize the physical address as well.
- Added a patch to deal with the way KVM on arm64 makes assumptions about the
  relation between kernel symbols and the linear mapping (on which the HYP
  mapping is based), as these assumptions cease to be valid once we move the
  kernel Image out of the linear mapping.
- Updated the module PLT patch so it works on BE kernels as well.
- Moved the constant Image header values to head.S, and updated the linker
  script to provide the kernel size using an R_AARCH64_ABS32 relocation rather
  than an R_AARCH64_ABS64 relocation, since the former is always resolved at
  build time. This allows me to get rid of the post-build perl script to swab
  header values on BE kernels.
- Minor style tweaks.

Notes:
- These patches apply on top of Mark Rutland's pagetable rework series:
  http://thread.gmane.org/gmane.linux.ports.arm.kernel/462438
- The arm64 Image is uncompressed by default, and the Elf64_Rela format uses
  24 bytes per relocation entry (see the layout sketch after this list). This
  results in considerable bloat, i.e., a couple of MBs worth of relocation
  data in an .init section. However, no build-time postprocessing is required;
  we rely fully on the toolchain to produce the image.
- We have to rely on the bootloader to supply some randomness in register x1
  upon kernel entry. Since we have no decompressor, it is simply not feasible
  to collect randomness in the head.S code path before mapping the kernel and
  enabling the MMU.
- The EFI_RNG_PROTOCOL that is invoked in patch #13 to supply randomness on
  UEFI systems is not universally available. A QEMU/KVM firmware image that
  implements a pseudo-random version is available here:
  http://people.linaro.org/~ard.biesheuvel/QEMU_EFI.fd.aarch64-rng.bz2
  (requires access to PMCCNTR_EL0 and support for AES instructions)
  See below for instructions on how to run the pseudo-random version on real
  hardware.
- Only mildly tested. Help appreciated.
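
For reference, this is the Elf64_Rela layout (as defined by the ELF-64 spec
and <elf.h>) that the note above refers to, three 64-bit fields and hence
24 bytes per entry:

	typedef struct {
		Elf64_Addr	r_offset;	/* address of the location to patch */
		Elf64_Xword	r_info;		/* symbol index and relocation type */
		Elf64_Sxword	r_addend;	/* constant addend */
	} Elf64_Rela;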

Code can be found here:
git://git.linaro.org/people/ard.biesheuvel/linux-arm.git arm64-kaslr-v2
https://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-kaslr-v2

Patch #1 updates the OF code to allow the minimum memblock physical address to
be overridden by the arch.

Patch #2 introduces KIMAGE_VADDR as the base of the kernel virtual region.

Patch #3 memblock_reserve()'s the .bss, swapper_pg_dir and idmap_pg_dir
individually.

Patch #4 rewrites early_fixmap_init() so it does not rely on the linear mapping
(i.e., the use of phys_to_virt() is avoided).

Patch #5 updates KVM on arm64 so it can deal with kernel symbols whose addresses
are not covered by the linear mapping.

Patch #6 moves the kernel virtual mapping to the vmalloc area, along with the
module region which is kept right below it, as before.

Patch #7 adds support for PLTs in modules so that relative branches can be
resolved via a PLT if the target is out of range.
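
For reference, a module PLT entry is just a small trampoline that builds the
branch target in a scratch register and jumps to it. Roughly (illustrative
layout, not necessarily the exact sequence the patch emits):

	struct plt_entry {
		__le32	mov0;	/* movn x16, #...          */
		__le32	mov1;	/* movk x16, #..., lsl #16 */
		__le32	mov2;	/* movk x16, #..., lsl #32 */
		__le32	br;	/* br   x16                */
	};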

Patch #8 moves to the x86 version of the extable implementation so that it no
longer contains absolute addresses that require fixing up at relocation time,
but uses relative offsets instead.
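
The x86 scheme referred to here stores 32-bit self-relative offsets instead of
64-bit absolute addresses, which makes the entries position independent (and
halves their size). Schematically (field and helper names are indicative):

	struct exception_table_entry {
		int insn;	/* offset of the faulting insn, relative to this field */
		int fixup;	/* offset of the fixup code, relative to this field    */
	};

	#define ex_insn_addr(e)		((unsigned long)&(e)->insn + (e)->insn)
	#define ex_fixup_addr(e)	((unsigned long)&(e)->fixup + (e)->fixup)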

Patch #9 reverts some changes to the Image header population code so we no
longer depend on the linker to populate the header fields. This is necessary
since the R_AARCH64_ABS64 relocations that are emitted for these fields are
not resolved at build time for PIE executables.

Patch #10 updates the code in head.S that needs to execute before relocation to
avoid the use of values that are subject to dynamic relocation. These values
will not be populated in PIE executables.

Patch #11 allows the kernel Image to be loaded anywhere in physical memory, by
decoupling PHYS_OFFSET from the base of the kernel image.

Patch #12 implements the core KASLR, by taking randomness supplied in register x1
and using it to move the kernel inside the vmalloc area.
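
Purely as an illustration of the idea (this is not the code from the patch,
and the helper below is made up for the example): the seed is reduced to an
offset that keeps the required alignment (2 MB in this sketch) and leaves room
for the image inside the target region:

	u64 kaslr_offset(u64 seed, u64 region_size, u64 image_size)
	{
		return (seed % (region_size - image_size)) & ~(u64)(SZ_2M - 1);
	}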

Patch #13 adds an invocation of the EFI_RNG_PROTOCOL to supply randomness to the
kernel proper.
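
In rough pseudo-C (the stub uses its own call wrappers, and the names below
only approximate the UEFI interface rather than the exact code in the patch),
the stub side amounts to:

	struct efi_rng_protocol *rng;
	efi_guid_t rng_guid = EFI_RNG_PROTOCOL_GUID;
	u64 seed = 0;

	if (sys_table->boottime->locate_protocol(&rng_guid, NULL,
						 (void **)&rng) == EFI_SUCCESS)
		rng->get_rng(rng, NULL, sizeof(seed), &seed);
	/* on success, 'seed' feeds the physical and virtual randomization;
	   on failure the stub simply boots without KASLR */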

Ard Biesheuvel (13):
  of/fdt: make memblock minimum physical address arch configurable
  arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
  arm64: use more granular reservations for static page table
    allocations
  arm64: decouple early fixmap init from linear mapping
  arm64: kvm: deal with kernel symbols outside of linear mapping
  arm64: move kernel image to base of vmalloc area
  arm64: add support for module PLTs
  arm64: use relative references in exception tables
  arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
  arm64: avoid dynamic relocations in early boot code
  arm64: allow kernel Image to be loaded anywhere in physical memory
  arm64: add support for relocatable kernel
  arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness

 Documentation/arm64/booting.txt           |  15 ++-
 arch/arm/include/asm/kvm_asm.h            |   2 +
 arch/arm/include/asm/kvm_mmu.h            |   2 +
 arch/arm/kvm/arm.c                        |   9 +-
 arch/arm/kvm/mmu.c                        |  12 +-
 arch/arm64/Kconfig                        |  18 +++
 arch/arm64/Makefile                       |  10 +-
 arch/arm64/include/asm/assembler.h        |  17 ++-
 arch/arm64/include/asm/boot.h             |   5 +
 arch/arm64/include/asm/compiler.h         |   2 +
 arch/arm64/include/asm/futex.h            |   4 +-
 arch/arm64/include/asm/kasan.h            |  17 +--
 arch/arm64/include/asm/kernel-pgtable.h   |   5 +-
 arch/arm64/include/asm/kvm_asm.h          |  21 +--
 arch/arm64/include/asm/kvm_mmu.h          |   2 +
 arch/arm64/include/asm/memory.h           |  37 ++++--
 arch/arm64/include/asm/module.h           |  11 ++
 arch/arm64/include/asm/pgtable.h          |   7 -
 arch/arm64/include/asm/uaccess.h          |  16 +--
 arch/arm64/include/asm/virt.h             |   4 -
 arch/arm64/kernel/Makefile                |   1 +
 arch/arm64/kernel/armv8_deprecated.c      |   4 +-
 arch/arm64/kernel/efi-entry.S             |   9 +-
 arch/arm64/kernel/head.S                  | 133 ++++++++++++++++---
 arch/arm64/kernel/image.h                 |  37 ++----
 arch/arm64/kernel/module-plts.c           | 137 ++++++++++++++++++++
 arch/arm64/kernel/module.c                |   7 +
 arch/arm64/kernel/module.lds              |   4 +
 arch/arm64/kernel/setup.c                 |  15 ++-
 arch/arm64/kernel/vmlinux.lds.S           |  29 +++--
 arch/arm64/kvm/debug.c                    |   4 +-
 arch/arm64/mm/dump.c                      |  12 +-
 arch/arm64/mm/extable.c                   | 102 ++++++++++++++-
 arch/arm64/mm/init.c                      |  75 +++++++++--
 arch/arm64/mm/mmu.c                       | 132 +++++++------------
 drivers/firmware/efi/libstub/arm-stub.c   |   1 -
 drivers/firmware/efi/libstub/arm64-stub.c | 134 ++++++++++++++++---
 drivers/of/fdt.c                          |   5 +-
 include/linux/efi.h                       |   5 +-
 scripts/sortextable.c                     |   6 +-
 virt/kvm/arm/vgic-v3.c                    |   2 +-
 41 files changed, 813 insertions(+), 257 deletions(-)
 create mode 100644 arch/arm64/kernel/module-plts.c
 create mode 100644 arch/arm64/kernel/module.lds


EFI_RNG_PROTOCOL on real hardware
=================================

To test whether your UEFI implements the EFI_RNG_PROTOCOL, download the
following executable and run it from the UEFI Shell:
http://people.linaro.org/~ard.biesheuvel/RngTest.efi

FS0:\> rngtest
UEFI RNG Protocol Testing :
----------------------------
 -- Locate UEFI RNG Protocol : [Fail - Status = Not Found]

If your UEFI does not implement the EFI_RNG_PROTOCOL, you can download and
install the pseudo-random version, which takes the generic timer and
PMCCNTR_EL0 values and permutes them using a couple of rounds of AES:
http://people.linaro.org/~ard.biesheuvel/RngDxe.efi

NOTE: not for production!! This is a quick and dirty hack to test the KASLR
code, and is not suitable for anything else.

FS0:\> rngdxe 
FS0:\> rngtest
UEFI RNG Protocol Testing :
----------------------------
 -- Locate UEFI RNG Protocol : [Pass]
 -- Call RNG->GetInfo() interface : 
     >> Supported RNG Algorithm (Count = 2) : 
          0) 44F0DE6E-4D8C-4045-A8C7-4DD168856B9E
          1) E43176D7-B6E8-4827-B784-7FFDC4B68561
 -- Call RNG->GetRNG() interface : 
     >> RNG with default algorithm : [Pass]
     >> RNG with SP800-90-HMAC-256 : [Fail - Status = Unsupported]
     >> RNG with SP800-90-Hash-256 : [Fail - Status = Unsupported]
     >> RNG with SP800-90-CTR-256 : [Pass]
     >> RNG with X9.31-3DES : [Fail - Status = Unsupported]
     >> RNG with X9.31-AES : [Fail - Status = Unsupported]
     >> RNG with RAW Entropy : [Pass]
 -- Random Number Generation Test with default RNG Algorithm (20 Rounds): 
          01) - 27
          02) - 61E8
          03) - 496FD8
          04) - DDD793BF
          05) - B6C37C8E23
          06) - 4D183C604A96
          07) - 9363311DB61298
          08) - 5715A7294F4E436E
          09) - F0D4D7BAA0DD52318E
          10) - C88C6EBCF4C0474D87C3
          11) - B5594602B482A643932172
          12) - CA7573F704B2089B726B9CF1
          13) - A93E9451CB533DCFBA87B97C33
          14) - 45AA7B83DB6044F7BBAB031F0D24
          15) - 3DD7A4D61F34ADCB400B5976730DCF
          16) - 4DD168D21FAB8F59708330D6A9BEB021
          17) - 4BBB225E61C465F174254159467E65939F
          18) - 030A156C9616337A20070941E702827DA8E1
          19) - AB0FC11C9A4E225011382A9D164D9D55CA2B64
          20) - 72B9B4735DC445E5DA6AF88DE965B7E87CB9A23C



* [PATCH v2 01/13] of/fdt: make memblock minimum physical address arch configurable
  2015-12-30 15:25 ` Ard Biesheuvel
@ 2015-12-30 15:26   ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

By default, early_init_dt_add_memory_arch() ignores memory below
the base of the kernel image since it won't be addressable via the
linear mapping. However, this is no longer appropriate once we
decouple the kernel text mapping from the linear mapping, so
architectures may want to drop the low limit entirely. Allow the
minimum to be overridden by defining MIN_MEMBLOCK_ADDR.
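
For example (illustrative only, and not part of this patch), an architecture
whose kernel text mapping no longer starts at PAGE_OFFSET could drop the lower
limit entirely by defining, in whichever of its headers fdt.c ends up pulling
in:

	#define MIN_MEMBLOCK_ADDR	0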

Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 drivers/of/fdt.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index d2430298a309..0455564f8cbc 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -971,13 +971,16 @@ int __init early_init_dt_scan_chosen(unsigned long node, const char *uname,
 }
 
 #ifdef CONFIG_HAVE_MEMBLOCK
+#ifndef MIN_MEMBLOCK_ADDR
+#define MIN_MEMBLOCK_ADDR	__pa(PAGE_OFFSET)
+#endif
 #ifndef MAX_MEMBLOCK_ADDR
 #define MAX_MEMBLOCK_ADDR	((phys_addr_t)~0)
 #endif
 
 void __init __weak early_init_dt_add_memory_arch(u64 base, u64 size)
 {
-	const u64 phys_offset = __pa(PAGE_OFFSET);
+	const u64 phys_offset = MIN_MEMBLOCK_ADDR;
 
 	if (!PAGE_ALIGNED(base)) {
 		if (size < PAGE_SIZE - (base & ~PAGE_MASK)) {
-- 
2.5.0



* [PATCH v2 02/13] arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
  2015-12-30 15:25 ` Ard Biesheuvel
@ 2015-12-30 15:26   ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

This introduces the preprocessor symbol KIMAGE_VADDR, which will serve as
the symbolic virtual base of the kernel region, i.e., the kernel's virtual
offset will be KIMAGE_VADDR + TEXT_OFFSET. For now, we define it as being
equal to PAGE_OFFSET, but in the future, it will be moved below PAGE_OFFSET
once we move the kernel virtual mapping out of the linear mapping.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/memory.h | 10 ++++++++--
 arch/arm64/kernel/head.S        |  2 +-
 arch/arm64/kernel/vmlinux.lds.S |  4 ++--
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 853953cd1f08..bea9631b34a8 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -51,7 +51,8 @@
 #define VA_BITS			(CONFIG_ARM64_VA_BITS)
 #define VA_START		(UL(0xffffffffffffffff) << VA_BITS)
 #define PAGE_OFFSET		(UL(0xffffffffffffffff) << (VA_BITS - 1))
-#define MODULES_END		(PAGE_OFFSET)
+#define KIMAGE_VADDR		(PAGE_OFFSET)
+#define MODULES_END		(KIMAGE_VADDR)
 #define MODULES_VADDR		(MODULES_END - SZ_64M)
 #define PCI_IO_END		(MODULES_VADDR - SZ_2M)
 #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
@@ -75,8 +76,13 @@
  * private definitions which should NOT be used outside memory.h
  * files.  Use virt_to_phys/phys_to_virt/__pa/__va instead.
  */
-#define __virt_to_phys(x)	(((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
+#define __virt_to_phys(x) ({						\
+	phys_addr_t __x = (phys_addr_t)(x);				\
+	__x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) :	\
+			     (__x - KIMAGE_VADDR + PHYS_OFFSET); })
+
 #define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
+#define __phys_to_kimg(x)	((unsigned long)((x) - PHYS_OFFSET + KIMAGE_VADDR))
 
 /*
  * Convert a page to/from a physical address
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 23cfc08fc8ba..6434c844a0e4 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -389,7 +389,7 @@ __create_page_tables:
 	 * Map the kernel image (starting with PHYS_OFFSET).
 	 */
 	mov	x0, x26				// swapper_pg_dir
-	mov	x5, #PAGE_OFFSET
+	ldr	x5, =KIMAGE_VADDR
 	create_pgd_entry x0, x5, x3, x6
 	ldr	x6, =KERNEL_END			// __va(KERNEL_END)
 	mov	x3, x24				// phys offset
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 7de6c39858a5..ced0dedcabcc 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -88,7 +88,7 @@ SECTIONS
 		*(.discard.*)
 	}
 
-	. = PAGE_OFFSET + TEXT_OFFSET;
+	. = KIMAGE_VADDR + TEXT_OFFSET;
 
 	.head.text : {
 		_text = .;
@@ -185,4 +185,4 @@ ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
 /*
  * If padding is applied before .head.text, virt<->phys conversions will fail.
  */
-ASSERT(_text == (PAGE_OFFSET + TEXT_OFFSET), "HEAD is misaligned")
+ASSERT(_text == (KIMAGE_VADDR + TEXT_OFFSET), "HEAD is misaligned")
-- 
2.5.0



* [PATCH v2 03/13] arm64: use more granular reservations for static page table allocations
  2015-12-30 15:25 ` Ard Biesheuvel
@ 2015-12-30 15:26   ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

Before introducing new statically allocated page tables and increasing
their alignment in subsequent patches, update the reservation logic
so that only the pages that are actually in use end up reserved with
memblock.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/mm/init.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 6bacba847923..8e678d05ad84 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -36,6 +36,7 @@
 #include <linux/swiotlb.h>
 
 #include <asm/fixmap.h>
+#include <asm/kernel-pgtable.h>
 #include <asm/memory.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
@@ -165,11 +166,13 @@ void __init arm64_memblock_init(void)
 	 * Register the kernel text, kernel data, initrd, and initial
 	 * pagetables with memblock.
 	 */
-	memblock_reserve(__pa(_text), _end - _text);
+	memblock_reserve(__pa(_text), __bss_stop - _text);
 #ifdef CONFIG_BLK_DEV_INITRD
 	if (initrd_start)
 		memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
 #endif
+	memblock_reserve(__pa(idmap_pg_dir), IDMAP_DIR_SIZE);
+	memblock_reserve(__pa(swapper_pg_dir), SWAPPER_DIR_SIZE);
 
 	early_init_fdt_scan_reserved_mem();
 
-- 
2.5.0



* [PATCH v2 04/13] arm64: decouple early fixmap init from linear mapping
  2015-12-30 15:25 ` Ard Biesheuvel
@ 2015-12-30 15:26   ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

Since the early fixmap page tables are populated using pages that are
part of the static footprint of the kernel, they are covered by the
initial kernel mapping, and we can refer to them without using __va/__pa
translations, which are tied to the linear mapping.

Instead, let's introduce __phys_to_kimg, which will be tied to the kernel
virtual mapping, regardless of whether or not it intersects with the linear
mapping. This will allow us to move the kernel out of the linear mapping in
a subsequent patch.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/compiler.h |  2 +
 arch/arm64/kernel/vmlinux.lds.S   | 12 +--
 arch/arm64/mm/mmu.c               | 85 ++++++++------------
 3 files changed, 42 insertions(+), 57 deletions(-)

diff --git a/arch/arm64/include/asm/compiler.h b/arch/arm64/include/asm/compiler.h
index ee35fd0f2236..dd342af63673 100644
--- a/arch/arm64/include/asm/compiler.h
+++ b/arch/arm64/include/asm/compiler.h
@@ -27,4 +27,6 @@
  */
 #define __asmeq(x, y)  ".ifnc " x "," y " ; .err ; .endif\n\t"
 
+#define __pgdir		__attribute__((section(".pgdir"),aligned(PAGE_SIZE)))
+
 #endif	/* __ASM_COMPILER_H */
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index ced0dedcabcc..363c2f529951 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -160,11 +160,13 @@ SECTIONS
 
 	BSS_SECTION(0, 0, 0)
 
-	. = ALIGN(PAGE_SIZE);
-	idmap_pg_dir = .;
-	. += IDMAP_DIR_SIZE;
-	swapper_pg_dir = .;
-	. += SWAPPER_DIR_SIZE;
+	.pgdir (NOLOAD) : ALIGN(PAGE_SIZE) {
+		idmap_pg_dir = .;
+		. += IDMAP_DIR_SIZE;
+		swapper_pg_dir = .;
+		. += SWAPPER_DIR_SIZE;
+		*(.pgdir)
+	}
 
 	_end = .;
 
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 50b1de8e127b..a78fc5a882da 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -540,39 +540,11 @@ void vmemmap_free(unsigned long start, unsigned long end)
 }
 #endif	/* CONFIG_SPARSEMEM_VMEMMAP */
 
-static pte_t bm_pte[PTRS_PER_PTE] __page_aligned_bss;
-#if CONFIG_PGTABLE_LEVELS > 2
-static pmd_t bm_pmd[PTRS_PER_PMD] __page_aligned_bss;
-#endif
-#if CONFIG_PGTABLE_LEVELS > 3
-static pud_t bm_pud[PTRS_PER_PUD] __page_aligned_bss;
-#endif
-
-static inline pud_t * fixmap_pud(unsigned long addr)
-{
-	pgd_t *pgd = pgd_offset_k(addr);
-
-	BUG_ON(pgd_none(*pgd) || pgd_bad(*pgd));
-
-	return pud_offset(pgd, addr);
-}
-
-static inline pmd_t * fixmap_pmd(unsigned long addr)
-{
-	pud_t *pud = fixmap_pud(addr);
-
-	BUG_ON(pud_none(*pud) || pud_bad(*pud));
+static pte_t *__fixmap_pte;
 
-	return pmd_offset(pud, addr);
-}
-
-static inline pte_t * fixmap_pte(unsigned long addr)
+static inline pte_t *fixmap_pte(unsigned long addr)
 {
-	pmd_t *pmd = fixmap_pmd(addr);
-
-	BUG_ON(pmd_none(*pmd) || pmd_bad(*pmd));
-
-	return pte_offset_kernel(pmd, addr);
+	return __fixmap_pte + pte_index(addr);
 }
 
 void __init early_fixmap_init(void)
@@ -583,33 +555,42 @@ void __init early_fixmap_init(void)
 	unsigned long addr = FIXADDR_START;
 
 	pgd = pgd_offset_k(addr);
-	pgd_populate(&init_mm, pgd, bm_pud);
-	pud = pud_offset(pgd, addr);
-	pud_populate(&init_mm, pud, bm_pmd);
-	pmd = pmd_offset(pud, addr);
-	pmd_populate_kernel(&init_mm, pmd, bm_pte);
+#if CONFIG_PGTABLE_LEVELS > 3
+	if (pgd_none(*pgd)) {
+		static pud_t bm_pud[PTRS_PER_PUD] __pgdir;
+
+		pgd_populate(&init_mm, pgd, bm_pud);
+		memblock_reserve(__pa(bm_pud), sizeof(bm_pud));
+	}
+	pud = (pud_t *)__phys_to_kimg(pud_offset_phys(pgd, addr));
+#else
+	pud = (pud_t *)pgd;
+#endif
+#if CONFIG_PGTABLE_LEVELS > 2
+	if (pud_none(*pud)) {
+		static pmd_t bm_pmd[PTRS_PER_PMD] __pgdir;
+
+		pud_populate(&init_mm, pud, bm_pmd);
+		memblock_reserve(__pa(bm_pmd), sizeof(bm_pmd));
+	}
+	pmd = (pmd_t *)__phys_to_kimg(pmd_offset_phys(pud, addr));
+#else
+	pmd = (pmd_t *)pud;
+#endif
+	if (pmd_none(*pmd)) {
+		static pte_t bm_pte[PTRS_PER_PTE] __pgdir;
+
+		pmd_populate_kernel(&init_mm, pmd, bm_pte);
+		memblock_reserve(__pa(bm_pte), sizeof(bm_pte));
+	}
+	__fixmap_pte = (pte_t *)__phys_to_kimg(pmd_page_paddr(*pmd));
 
 	/*
 	 * The boot-ioremap range spans multiple pmds, for which
-	 * we are not preparted:
+	 * we are not prepared:
 	 */
 	BUILD_BUG_ON((__fix_to_virt(FIX_BTMAP_BEGIN) >> PMD_SHIFT)
 		     != (__fix_to_virt(FIX_BTMAP_END) >> PMD_SHIFT));
-
-	if ((pmd != fixmap_pmd(fix_to_virt(FIX_BTMAP_BEGIN)))
-	     || pmd != fixmap_pmd(fix_to_virt(FIX_BTMAP_END))) {
-		WARN_ON(1);
-		pr_warn("pmd %p != %p, %p\n",
-			pmd, fixmap_pmd(fix_to_virt(FIX_BTMAP_BEGIN)),
-			fixmap_pmd(fix_to_virt(FIX_BTMAP_END)));
-		pr_warn("fix_to_virt(FIX_BTMAP_BEGIN): %08lx\n",
-			fix_to_virt(FIX_BTMAP_BEGIN));
-		pr_warn("fix_to_virt(FIX_BTMAP_END):   %08lx\n",
-			fix_to_virt(FIX_BTMAP_END));
-
-		pr_warn("FIX_BTMAP_END:       %d\n", FIX_BTMAP_END);
-		pr_warn("FIX_BTMAP_BEGIN:     %d\n", FIX_BTMAP_BEGIN);
-	}
 }
 
 void __set_fixmap(enum fixed_addresses idx,
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 156+ messages in thread

* [PATCH v2 05/13] arm64: kvm: deal with kernel symbols outside of linear mapping
@ 2015-12-30 15:26   ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
the HYP mapping at EL2. Before we can move the kernel virtual mapping
out of the linear mapping, we have to make sure that references to kernel
symbols that are accessed via the HYP mapping are translated to their
linear equivalent.

To prevent inadvertent direct references from sneaking in later, change
the type of all extern declarations of HYP kernel symbols to the opaque
'struct kvm_ksym', which does not decay to a pointer type the way char
arrays and function references do. This is not bulletproof, but it at least
forces the user to take the address explicitly rather than reference the
symbol directly.
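
As a small illustration (not part of the patch) of how the two halves fit
together: the incomplete type only permits taking the symbol's address,
and kvm_ksym_ref() then rebases that kernel-image address onto the linear
mapping, which covers the same physical pages and is what the HYP mapping
is derived from:

    struct kvm_ksym;                      /* opaque, incomplete type */
    extern struct kvm_ksym __kvm_flush_vm_context;

    /*
     * __kvm_flush_vm_context               -> build error if used as a value
     * &__kvm_flush_vm_context              -> address in the kernel image mapping
     * kvm_ksym_ref(__kvm_flush_vm_context)
     *     == (void *)&__kvm_flush_vm_context - KIMAGE_VADDR + PAGE_OFFSET
     *                                       -> linear-map alias of the same
     *                                          physical address
     */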

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm/include/asm/kvm_asm.h   |  2 ++
 arch/arm/include/asm/kvm_mmu.h   |  2 ++
 arch/arm/kvm/arm.c               |  9 +++++----
 arch/arm/kvm/mmu.c               | 12 +++++------
 arch/arm64/include/asm/kvm_asm.h | 21 +++++++++++---------
 arch/arm64/include/asm/kvm_mmu.h |  2 ++
 arch/arm64/include/asm/virt.h    |  4 ----
 arch/arm64/kernel/vmlinux.lds.S  |  4 ++--
 arch/arm64/kvm/debug.c           |  4 +++-
 virt/kvm/arm/vgic-v3.c           |  2 +-
 10 files changed, 34 insertions(+), 28 deletions(-)

diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index 194c91b610ff..484ffdf7c70b 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -99,6 +99,8 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
 extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
 
 extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
+
+extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
 #endif
 
 #endif /* __ARM_KVM_ASM_H__ */
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 405aa1883307..412b363f79e9 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -30,6 +30,8 @@
 #define HYP_PAGE_OFFSET		PAGE_OFFSET
 #define KERN_TO_HYP(kva)	(kva)
 
+#define kvm_ksym_ref(kva)	(kva)
+
 /*
  * Our virtual mapping for the boot-time MMU-enable code. Must be
  * shared across all the page-tables. Conveniently, we use the vectors
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index e06fd299de08..014b542ea658 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -427,7 +427,7 @@ static void update_vttbr(struct kvm *kvm)
 		 * shareable domain to make sure all data structures are
 		 * clean.
 		 */
-		kvm_call_hyp(__kvm_flush_vm_context);
+		kvm_call_hyp(kvm_ksym_ref(__kvm_flush_vm_context));
 	}
 
 	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
@@ -600,7 +600,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		__kvm_guest_enter();
 		vcpu->mode = IN_GUEST_MODE;
 
-		ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
+		ret = kvm_call_hyp(kvm_ksym_ref(__kvm_vcpu_run), vcpu);
 
 		vcpu->mode = OUTSIDE_GUEST_MODE;
 		/*
@@ -969,7 +969,7 @@ static void cpu_init_hyp_mode(void *dummy)
 	pgd_ptr = kvm_mmu_get_httbr();
 	stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
 	hyp_stack_ptr = stack_page + PAGE_SIZE;
-	vector_ptr = (unsigned long)__kvm_hyp_vector;
+	vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
 
 	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
 
@@ -1061,7 +1061,8 @@ static int init_hyp_mode(void)
 	/*
 	 * Map the Hyp-code called directly from the host
 	 */
-	err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
+	err = create_hyp_mappings(kvm_ksym_ref(__kvm_hyp_code_start),
+				  kvm_ksym_ref(__kvm_hyp_code_end));
 	if (err) {
 		kvm_err("Cannot map world-switch code\n");
 		goto out_free_mappings;
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 7dace909d5cf..7c448b943e3a 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -31,8 +31,6 @@
 
 #include "trace.h"
 
-extern char  __hyp_idmap_text_start[], __hyp_idmap_text_end[];
-
 static pgd_t *boot_hyp_pgd;
 static pgd_t *hyp_pgd;
 static pgd_t *merged_hyp_pgd;
@@ -63,7 +61,7 @@ static bool memslot_is_logging(struct kvm_memory_slot *memslot)
  */
 void kvm_flush_remote_tlbs(struct kvm *kvm)
 {
-	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
+	kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid), kvm);
 }
 
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
@@ -75,7 +73,7 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 	 * anything there.
 	 */
 	if (kvm)
-		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
+		kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid_ipa), kvm, ipa);
 }
 
 /*
@@ -1647,9 +1645,9 @@ int kvm_mmu_init(void)
 {
 	int err;
 
-	hyp_idmap_start = kvm_virt_to_phys(__hyp_idmap_text_start);
-	hyp_idmap_end = kvm_virt_to_phys(__hyp_idmap_text_end);
-	hyp_idmap_vector = kvm_virt_to_phys(__kvm_hyp_init);
+	hyp_idmap_start = kvm_virt_to_phys(&__hyp_idmap_text_start);
+	hyp_idmap_end = kvm_virt_to_phys(&__hyp_idmap_text_end);
+	hyp_idmap_vector = kvm_virt_to_phys(&__kvm_hyp_init);
 
 	/*
 	 * We rely on the linker script to ensure at build time that the HYP
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 5e377101f919..830402f847e0 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -105,24 +105,27 @@
 #ifndef __ASSEMBLY__
 struct kvm;
 struct kvm_vcpu;
+struct kvm_ksym;
 
 extern char __kvm_hyp_init[];
 extern char __kvm_hyp_init_end[];
 
-extern char __kvm_hyp_vector[];
+extern struct kvm_ksym __kvm_hyp_vector;
 
-#define	__kvm_hyp_code_start	__hyp_text_start
-#define	__kvm_hyp_code_end	__hyp_text_end
+extern struct kvm_ksym __kvm_hyp_code_start;
+extern struct kvm_ksym __kvm_hyp_code_end;
 
-extern void __kvm_flush_vm_context(void);
-extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
-extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
+extern struct kvm_ksym __kvm_flush_vm_context;
+extern struct kvm_ksym __kvm_tlb_flush_vmid_ipa;
+extern struct kvm_ksym __kvm_tlb_flush_vmid;
 
-extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
+extern struct kvm_ksym __kvm_vcpu_run;
 
-extern u64 __vgic_v3_get_ich_vtr_el2(void);
+extern struct kvm_ksym __hyp_idmap_text_start, __hyp_idmap_text_end;
 
-extern u32 __kvm_get_mdcr_el2(void);
+extern struct kvm_ksym __vgic_v3_get_ich_vtr_el2;
+
+extern struct kvm_ksym __kvm_get_mdcr_el2;
 
 #endif
 
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 61505676d085..0899026a2821 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -73,6 +73,8 @@
 
 #define KERN_TO_HYP(kva)	((unsigned long)kva - PAGE_OFFSET + HYP_PAGE_OFFSET)
 
+#define kvm_ksym_ref(sym)	((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
+
 /*
  * We currently only support a 40bit IPA.
  */
diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 7a5df5252dd7..215ad4649dd7 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -50,10 +50,6 @@ static inline bool is_hyp_mode_mismatched(void)
 	return __boot_cpu_mode[0] != __boot_cpu_mode[1];
 }
 
-/* The section containing the hypervisor text */
-extern char __hyp_text_start[];
-extern char __hyp_text_end[];
-
 #endif /* __ASSEMBLY__ */
 
 #endif /* ! __ASM__VIRT_H */
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 363c2f529951..f935f082188d 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -35,9 +35,9 @@ jiffies = jiffies_64;
 	VMLINUX_SYMBOL(__hyp_idmap_text_start) = .;	\
 	*(.hyp.idmap.text)				\
 	VMLINUX_SYMBOL(__hyp_idmap_text_end) = .;	\
-	VMLINUX_SYMBOL(__hyp_text_start) = .;		\
+	VMLINUX_SYMBOL(__kvm_hyp_code_start) = .;	\
 	*(.hyp.text)					\
-	VMLINUX_SYMBOL(__hyp_text_end) = .;
+	VMLINUX_SYMBOL(__kvm_hyp_code_end) = .;
 
 #define IDMAP_TEXT					\
 	. = ALIGN(SZ_4K);				\
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index 47e5f0feaee8..99e5a403af4e 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -24,6 +24,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_arm.h>
 #include <asm/kvm_emulate.h>
+#include <asm/kvm_mmu.h>
 
 #include "trace.h"
 
@@ -72,7 +73,8 @@ static void restore_guest_debug_regs(struct kvm_vcpu *vcpu)
 
 void kvm_arm_init_debug(void)
 {
-	__this_cpu_write(mdcr_el2, kvm_call_hyp(__kvm_get_mdcr_el2));
+	__this_cpu_write(mdcr_el2,
+			 kvm_call_hyp(kvm_ksym_ref(__kvm_get_mdcr_el2)));
 }
 
 /**
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index 487d6357b7e7..58f5a6521307 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -247,7 +247,7 @@ int vgic_v3_probe(struct device_node *vgic_node,
 		goto out;
 	}
 
-	ich_vtr_el2 = kvm_call_hyp(__vgic_v3_get_ich_vtr_el2);
+	ich_vtr_el2 = kvm_call_hyp(kvm_ksym_ref(__vgic_v3_get_ich_vtr_el2));
 
 	/*
 	 * The ListRegs field is 5 bits, but there is a architectural
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 156+ messages in thread

* [PATCH v2 06/13] arm64: move kernel image to base of vmalloc area
@ 2015-12-30 15:26   ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

This moves the module area to right before the vmalloc area, and
moves the kernel image to the base of the vmalloc area. This is
an intermediate step towards implementing kASLR, where the kernel
image can be located anywhere in the vmalloc area.
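
For illustration (assuming a 48-bit VA configuration with KASAN disabled;
other configurations shift these values), the definitions in this patch
yield roughly the following layout at the bottom of the kernel VA space:

    VA_START        0xffff000000000000
    MODULES_VADDR   0xffff000000000000   (= VA_START)
    MODULES_END     0xffff000004000000   (MODULES_VADDR + 64 MB)
    KIMAGE_VADDR    0xffff000004000000   (= MODULES_END, kernel image base)
    VMALLOC_START   0xffff000004000000   (= MODULES_END)
    PAGE_OFFSET     0xffff800000000000   (start of the linear mapping)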

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/kasan.h          | 17 ++++-----
 arch/arm64/include/asm/kernel-pgtable.h |  5 +--
 arch/arm64/include/asm/memory.h         | 17 ++++++---
 arch/arm64/include/asm/pgtable.h        |  7 ----
 arch/arm64/kernel/setup.c               | 13 +++++++
 arch/arm64/mm/dump.c                    | 12 +++----
 arch/arm64/mm/init.c                    | 20 +++++------
 arch/arm64/mm/mmu.c                     | 37 ++------------------
 8 files changed, 56 insertions(+), 72 deletions(-)

diff --git a/arch/arm64/include/asm/kasan.h b/arch/arm64/include/asm/kasan.h
index 2774fa384c47..476d56e0f04c 100644
--- a/arch/arm64/include/asm/kasan.h
+++ b/arch/arm64/include/asm/kasan.h
@@ -1,19 +1,16 @@
 #ifndef __ASM_KASAN_H
 #define __ASM_KASAN_H
 
-#ifndef __ASSEMBLY__
-
 #ifdef CONFIG_KASAN
 
 #include <linux/linkage.h>
-#include <asm/memory.h>
 
 /*
  * KASAN_SHADOW_START: beginning of the kernel virtual addresses.
  * KASAN_SHADOW_END: KASAN_SHADOW_START + 1/8 of kernel virtual addresses.
  */
-#define KASAN_SHADOW_START      (VA_START)
-#define KASAN_SHADOW_END        (KASAN_SHADOW_START + (1UL << (VA_BITS - 3)))
+#define KASAN_SHADOW_START	(VA_START)
+#define KASAN_SHADOW_END	(KASAN_SHADOW_START + (_AC(1, UL) << (VA_BITS - 3)))
 
 /*
  * This value is used to map an address to the corresponding shadow
@@ -25,14 +22,18 @@
  * should satisfy the following equation:
  *      KASAN_SHADOW_OFFSET = KASAN_SHADOW_END - (1ULL << 61)
  */
-#define KASAN_SHADOW_OFFSET     (KASAN_SHADOW_END - (1ULL << (64 - 3)))
+#define KASAN_SHADOW_OFFSET	(KASAN_SHADOW_END - (_AC(1, ULL) << (64 - 3)))
 
+#ifndef __ASSEMBLY__
 void kasan_init(void);
 asmlinkage void kasan_early_init(void);
+#endif
 
 #else
+
+#ifndef __ASSEMBLY__
 static inline void kasan_init(void) { }
 #endif
 
-#endif
-#endif
+#endif /* CONFIG_KASAN */
+#endif /* __ASM_KASAN_H */
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index a459714ee29e..daa8a7b9917a 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -70,8 +70,9 @@
 /*
  * Initial memory map attributes.
  */
-#define SWAPPER_PTE_FLAGS	(PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
-#define SWAPPER_PMD_FLAGS	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+#define SWAPPER_PTE_FLAGS	(PTE_TYPE_PAGE | PTE_AF | PTE_SHARED | PTE_UXN)
+#define SWAPPER_PMD_FLAGS	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S | \
+				 PMD_SECT_UXN)
 
 #if ARM64_SWAPPER_USES_SECTION_MAPS
 #define SWAPPER_MM_MMUFLAGS	(PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index bea9631b34a8..1dcbf142d36c 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -51,14 +51,23 @@
 #define VA_BITS			(CONFIG_ARM64_VA_BITS)
 #define VA_START		(UL(0xffffffffffffffff) << VA_BITS)
 #define PAGE_OFFSET		(UL(0xffffffffffffffff) << (VA_BITS - 1))
-#define KIMAGE_VADDR		(PAGE_OFFSET)
-#define MODULES_END		(KIMAGE_VADDR)
-#define MODULES_VADDR		(MODULES_END - SZ_64M)
-#define PCI_IO_END		(MODULES_VADDR - SZ_2M)
+#define PCI_IO_END		(PAGE_OFFSET - SZ_2M)
 #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
 #define FIXADDR_TOP		(PCI_IO_START - SZ_2M)
 #define TASK_SIZE_64		(UL(1) << VA_BITS)
 
+#ifndef CONFIG_KASAN
+#define MODULES_VADDR		(VA_START)
+#else
+#include <asm/kasan.h>
+#define MODULES_VADDR		(KASAN_SHADOW_END)
+#endif
+
+#define MODULES_END		(MODULES_VADDR + SZ_64M)
+
+#define KIMAGE_VADDR		(MODULES_END)
+#define VMALLOC_START		(MODULES_END)
+
 #ifdef CONFIG_COMPAT
 #define TASK_SIZE_32		UL(0x100000000)
 #define TASK_SIZE		(test_thread_flag(TIF_32BIT) ? \
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 0664468466fb..93203a6b9574 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -42,13 +42,6 @@
  */
 #define VMEMMAP_SIZE		ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
 
-#ifndef CONFIG_KASAN
-#define VMALLOC_START		(VA_START)
-#else
-#include <asm/kasan.h>
-#define VMALLOC_START		(KASAN_SHADOW_END + SZ_64K)
-#endif
-
 #define VMALLOC_END		(PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
 
 #define vmemmap			((struct page *)(VMALLOC_END + SZ_64K))
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index cfed56f0ad26..96177a7c0f05 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -53,6 +53,7 @@
 #include <asm/cpufeature.h>
 #include <asm/cpu_ops.h>
 #include <asm/kasan.h>
+#include <asm/kernel-pgtable.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
 #include <asm/smp_plat.h>
@@ -291,6 +292,18 @@ u64 __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = INVALID_HWID };
 
 void __init setup_arch(char **cmdline_p)
 {
+	static struct vm_struct vmlinux_vm __initdata = {
+		.addr		= (void *)KIMAGE_VADDR,
+		.size		= 0,
+		.flags		= VM_IOREMAP,
+		.caller		= setup_arch,
+	};
+
+	vmlinux_vm.size = round_up((unsigned long)_end - KIMAGE_VADDR,
+				   1 << SWAPPER_BLOCK_SHIFT);
+	vmlinux_vm.phys_addr = __pa(KIMAGE_VADDR);
+	vm_area_add_early(&vmlinux_vm);
+
 	pr_info("Boot CPU: AArch64 Processor [%08x]\n", read_cpuid_id());
 
 	sprintf(init_utsname()->machine, ELF_PLATFORM);
diff --git a/arch/arm64/mm/dump.c b/arch/arm64/mm/dump.c
index 5a22a119a74c..e83ffb00560c 100644
--- a/arch/arm64/mm/dump.c
+++ b/arch/arm64/mm/dump.c
@@ -35,7 +35,9 @@ struct addr_marker {
 };
 
 enum address_markers_idx {
-	VMALLOC_START_NR = 0,
+	MODULES_START_NR = 0,
+	MODULES_END_NR,
+	VMALLOC_START_NR,
 	VMALLOC_END_NR,
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 	VMEMMAP_START_NR,
@@ -45,12 +47,12 @@ enum address_markers_idx {
 	FIXADDR_END_NR,
 	PCI_START_NR,
 	PCI_END_NR,
-	MODULES_START_NR,
-	MODUELS_END_NR,
 	KERNEL_SPACE_NR,
 };
 
 static struct addr_marker address_markers[] = {
+	{ MODULES_VADDR,	"Modules start" },
+	{ MODULES_END,		"Modules end" },
 	{ VMALLOC_START,	"vmalloc() Area" },
 	{ VMALLOC_END,		"vmalloc() End" },
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
@@ -61,9 +63,7 @@ static struct addr_marker address_markers[] = {
 	{ FIXADDR_TOP,		"Fixmap end" },
 	{ PCI_IO_START,		"PCI I/O start" },
 	{ PCI_IO_END,		"PCI I/O end" },
-	{ MODULES_VADDR,	"Modules start" },
-	{ MODULES_END,		"Modules end" },
-	{ PAGE_OFFSET,		"Kernel Mapping" },
+	{ PAGE_OFFSET,		"Linear Mapping" },
 	{ -1,			NULL },
 };
 
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 8e678d05ad84..2cfc9c54bf51 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -305,22 +305,26 @@ void __init mem_init(void)
 #ifdef CONFIG_KASAN
 		  "    kasan   : 0x%16lx - 0x%16lx   (%6ld GB)\n"
 #endif
+		  "    modules : 0x%16lx - 0x%16lx   (%6ld MB)\n"
 		  "    vmalloc : 0x%16lx - 0x%16lx   (%6ld GB)\n"
+		  "      .init : 0x%p" " - 0x%p" "   (%6ld KB)\n"
+		  "      .text : 0x%p" " - 0x%p" "   (%6ld KB)\n"
+		  "      .data : 0x%p" " - 0x%p" "   (%6ld KB)\n"
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 		  "    vmemmap : 0x%16lx - 0x%16lx   (%6ld GB maximum)\n"
 		  "              0x%16lx - 0x%16lx   (%6ld MB actual)\n"
 #endif
 		  "    fixed   : 0x%16lx - 0x%16lx   (%6ld KB)\n"
 		  "    PCI I/O : 0x%16lx - 0x%16lx   (%6ld MB)\n"
-		  "    modules : 0x%16lx - 0x%16lx   (%6ld MB)\n"
-		  "    memory  : 0x%16lx - 0x%16lx   (%6ld MB)\n"
-		  "      .init : 0x%p" " - 0x%p" "   (%6ld KB)\n"
-		  "      .text : 0x%p" " - 0x%p" "   (%6ld KB)\n"
-		  "      .data : 0x%p" " - 0x%p" "   (%6ld KB)\n",
+		  "    memory  : 0x%16lx - 0x%16lx   (%6ld MB)\n",
 #ifdef CONFIG_KASAN
 		  MLG(KASAN_SHADOW_START, KASAN_SHADOW_END),
 #endif
+		  MLM(MODULES_VADDR, MODULES_END),
 		  MLG(VMALLOC_START, VMALLOC_END),
+		  MLK_ROUNDUP(__init_begin, __init_end),
+		  MLK_ROUNDUP(_text, _etext),
+		  MLK_ROUNDUP(_sdata, _edata),
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 		  MLG((unsigned long)vmemmap,
 		      (unsigned long)vmemmap + VMEMMAP_SIZE),
@@ -329,11 +333,7 @@ void __init mem_init(void)
 #endif
 		  MLK(FIXADDR_START, FIXADDR_TOP),
 		  MLM(PCI_IO_START, PCI_IO_END),
-		  MLM(MODULES_VADDR, MODULES_END),
-		  MLM(PAGE_OFFSET, (unsigned long)high_memory),
-		  MLK_ROUNDUP(__init_begin, __init_end),
-		  MLK_ROUNDUP(_text, _etext),
-		  MLK_ROUNDUP(_sdata, _edata));
+		  MLM(PAGE_OFFSET, (unsigned long)high_memory));
 
 #undef MLK
 #undef MLM
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index a78fc5a882da..6275d183c005 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -322,40 +322,6 @@ static void create_mapping_late(phys_addr_t phys, unsigned long virt,
 	__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, late_alloc);
 }
 
-static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end)
-{
-
-	unsigned long kernel_start = __pa(_stext);
-	unsigned long kernel_end = __pa(_end);
-
-	/*
-	 * The kernel itself is mapped at page granularity. Map all other
-	 * memory, making sure we don't overwrite the existing kernel mappings.
-	 */
-
-	/* No overlap with the kernel. */
-	if (end < kernel_start || start >= kernel_end) {
-		__create_pgd_mapping(pgd, start, __phys_to_virt(start),
-				     end - start, PAGE_KERNEL, early_alloc);
-		return;
-	}
-
-	/*
-	 * This block overlaps the kernel mapping. Map the portion(s) which
-	 * don't overlap.
-	 */
-	if (start < kernel_start)
-		__create_pgd_mapping(pgd, start,
-				     __phys_to_virt(start),
-				     kernel_start - start, PAGE_KERNEL,
-				     early_alloc);
-	if (kernel_end < end)
-		__create_pgd_mapping(pgd, kernel_end,
-				     __phys_to_virt(kernel_end),
-				     end - kernel_end, PAGE_KERNEL,
-				     early_alloc);
-}
-
 static void __init map_mem(pgd_t *pgd)
 {
 	struct memblock_region *reg;
@@ -370,7 +336,8 @@ static void __init map_mem(pgd_t *pgd)
 		if (memblock_is_nomap(reg))
 			continue;
 
-		__map_memblock(pgd, start, end);
+		__create_pgd_mapping(pgd, start, __phys_to_virt(start),
+				     end - start, PAGE_KERNEL, early_alloc);
 	}
 }
 
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 156+ messages in thread

* [PATCH v2 06/13] arm64: move kernel image to base of vmalloc area
@ 2015-12-30 15:26   ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel

This moves the module area to right before the vmalloc area, and
moves the kernel image to the base of the vmalloc area. This is
an intermediate step towards implementing kASLR, where the kernel
image can be located anywhere in the vmalloc area.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/kasan.h          | 17 ++++-----
 arch/arm64/include/asm/kernel-pgtable.h |  5 +--
 arch/arm64/include/asm/memory.h         | 17 ++++++---
 arch/arm64/include/asm/pgtable.h        |  7 ----
 arch/arm64/kernel/setup.c               | 13 +++++++
 arch/arm64/mm/dump.c                    | 12 +++----
 arch/arm64/mm/init.c                    | 20 +++++------
 arch/arm64/mm/mmu.c                     | 37 ++------------------
 8 files changed, 56 insertions(+), 72 deletions(-)

diff --git a/arch/arm64/include/asm/kasan.h b/arch/arm64/include/asm/kasan.h
index 2774fa384c47..476d56e0f04c 100644
--- a/arch/arm64/include/asm/kasan.h
+++ b/arch/arm64/include/asm/kasan.h
@@ -1,19 +1,16 @@
 #ifndef __ASM_KASAN_H
 #define __ASM_KASAN_H
 
-#ifndef __ASSEMBLY__
-
 #ifdef CONFIG_KASAN
 
 #include <linux/linkage.h>
-#include <asm/memory.h>
 
 /*
  * KASAN_SHADOW_START: beginning of the kernel virtual addresses.
  * KASAN_SHADOW_END: KASAN_SHADOW_START + 1/8 of kernel virtual addresses.
  */
-#define KASAN_SHADOW_START      (VA_START)
-#define KASAN_SHADOW_END        (KASAN_SHADOW_START + (1UL << (VA_BITS - 3)))
+#define KASAN_SHADOW_START	(VA_START)
+#define KASAN_SHADOW_END	(KASAN_SHADOW_START + (_AC(1, UL) << (VA_BITS - 3)))
 
 /*
  * This value is used to map an address to the corresponding shadow
@@ -25,14 +22,18 @@
  * should satisfy the following equation:
  *      KASAN_SHADOW_OFFSET = KASAN_SHADOW_END - (1ULL << 61)
  */
-#define KASAN_SHADOW_OFFSET     (KASAN_SHADOW_END - (1ULL << (64 - 3)))
+#define KASAN_SHADOW_OFFSET	(KASAN_SHADOW_END - (_AC(1, ULL) << (64 - 3)))
 
+#ifndef __ASSEMBLY__
 void kasan_init(void);
 asmlinkage void kasan_early_init(void);
+#endif
 
 #else
+
+#ifndef __ASSEMBLY__
 static inline void kasan_init(void) { }
 #endif
 
-#endif
-#endif
+#endif /* CONFIG_KASAN */
+#endif /* __ASM_KASAN_H */
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index a459714ee29e..daa8a7b9917a 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -70,8 +70,9 @@
 /*
  * Initial memory map attributes.
  */
-#define SWAPPER_PTE_FLAGS	(PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
-#define SWAPPER_PMD_FLAGS	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+#define SWAPPER_PTE_FLAGS	(PTE_TYPE_PAGE | PTE_AF | PTE_SHARED | PTE_UXN)
+#define SWAPPER_PMD_FLAGS	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S | \
+				 PMD_SECT_UXN)
 
 #if ARM64_SWAPPER_USES_SECTION_MAPS
 #define SWAPPER_MM_MMUFLAGS	(PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index bea9631b34a8..1dcbf142d36c 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -51,14 +51,23 @@
 #define VA_BITS			(CONFIG_ARM64_VA_BITS)
 #define VA_START		(UL(0xffffffffffffffff) << VA_BITS)
 #define PAGE_OFFSET		(UL(0xffffffffffffffff) << (VA_BITS - 1))
-#define KIMAGE_VADDR		(PAGE_OFFSET)
-#define MODULES_END		(KIMAGE_VADDR)
-#define MODULES_VADDR		(MODULES_END - SZ_64M)
-#define PCI_IO_END		(MODULES_VADDR - SZ_2M)
+#define PCI_IO_END		(PAGE_OFFSET - SZ_2M)
 #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
 #define FIXADDR_TOP		(PCI_IO_START - SZ_2M)
 #define TASK_SIZE_64		(UL(1) << VA_BITS)
 
+#ifndef CONFIG_KASAN
+#define MODULES_VADDR		(VA_START)
+#else
+#include <asm/kasan.h>
+#define MODULES_VADDR		(KASAN_SHADOW_END)
+#endif
+
+#define MODULES_END		(MODULES_VADDR + SZ_64M)
+
+#define KIMAGE_VADDR		(MODULES_END)
+#define VMALLOC_START		(MODULES_END)
+
 #ifdef CONFIG_COMPAT
 #define TASK_SIZE_32		UL(0x100000000)
 #define TASK_SIZE		(test_thread_flag(TIF_32BIT) ? \
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 0664468466fb..93203a6b9574 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -42,13 +42,6 @@
  */
 #define VMEMMAP_SIZE		ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
 
-#ifndef CONFIG_KASAN
-#define VMALLOC_START		(VA_START)
-#else
-#include <asm/kasan.h>
-#define VMALLOC_START		(KASAN_SHADOW_END + SZ_64K)
-#endif
-
 #define VMALLOC_END		(PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
 
 #define vmemmap			((struct page *)(VMALLOC_END + SZ_64K))
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index cfed56f0ad26..96177a7c0f05 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -53,6 +53,7 @@
 #include <asm/cpufeature.h>
 #include <asm/cpu_ops.h>
 #include <asm/kasan.h>
+#include <asm/kernel-pgtable.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
 #include <asm/smp_plat.h>
@@ -291,6 +292,18 @@ u64 __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = INVALID_HWID };
 
 void __init setup_arch(char **cmdline_p)
 {
+	static struct vm_struct vmlinux_vm __initdata = {
+		.addr		= (void *)KIMAGE_VADDR,
+		.size		= 0,
+		.flags		= VM_IOREMAP,
+		.caller		= setup_arch,
+	};
+
+	vmlinux_vm.size = round_up((unsigned long)_end - KIMAGE_VADDR,
+				   1 << SWAPPER_BLOCK_SHIFT);
+	vmlinux_vm.phys_addr = __pa(KIMAGE_VADDR);
+	vm_area_add_early(&vmlinux_vm);
+
 	pr_info("Boot CPU: AArch64 Processor [%08x]\n", read_cpuid_id());
 
 	sprintf(init_utsname()->machine, ELF_PLATFORM);
diff --git a/arch/arm64/mm/dump.c b/arch/arm64/mm/dump.c
index 5a22a119a74c..e83ffb00560c 100644
--- a/arch/arm64/mm/dump.c
+++ b/arch/arm64/mm/dump.c
@@ -35,7 +35,9 @@ struct addr_marker {
 };
 
 enum address_markers_idx {
-	VMALLOC_START_NR = 0,
+	MODULES_START_NR = 0,
+	MODULES_END_NR,
+	VMALLOC_START_NR,
 	VMALLOC_END_NR,
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 	VMEMMAP_START_NR,
@@ -45,12 +47,12 @@ enum address_markers_idx {
 	FIXADDR_END_NR,
 	PCI_START_NR,
 	PCI_END_NR,
-	MODULES_START_NR,
-	MODUELS_END_NR,
 	KERNEL_SPACE_NR,
 };
 
 static struct addr_marker address_markers[] = {
+	{ MODULES_VADDR,	"Modules start" },
+	{ MODULES_END,		"Modules end" },
 	{ VMALLOC_START,	"vmalloc() Area" },
 	{ VMALLOC_END,		"vmalloc() End" },
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
@@ -61,9 +63,7 @@ static struct addr_marker address_markers[] = {
 	{ FIXADDR_TOP,		"Fixmap end" },
 	{ PCI_IO_START,		"PCI I/O start" },
 	{ PCI_IO_END,		"PCI I/O end" },
-	{ MODULES_VADDR,	"Modules start" },
-	{ MODULES_END,		"Modules end" },
-	{ PAGE_OFFSET,		"Kernel Mapping" },
+	{ PAGE_OFFSET,		"Linear Mapping" },
 	{ -1,			NULL },
 };
 
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 8e678d05ad84..2cfc9c54bf51 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -305,22 +305,26 @@ void __init mem_init(void)
 #ifdef CONFIG_KASAN
 		  "    kasan   : 0x%16lx - 0x%16lx   (%6ld GB)\n"
 #endif
+		  "    modules : 0x%16lx - 0x%16lx   (%6ld MB)\n"
 		  "    vmalloc : 0x%16lx - 0x%16lx   (%6ld GB)\n"
+		  "      .init : 0x%p" " - 0x%p" "   (%6ld KB)\n"
+		  "      .text : 0x%p" " - 0x%p" "   (%6ld KB)\n"
+		  "      .data : 0x%p" " - 0x%p" "   (%6ld KB)\n"
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 		  "    vmemmap : 0x%16lx - 0x%16lx   (%6ld GB maximum)\n"
 		  "              0x%16lx - 0x%16lx   (%6ld MB actual)\n"
 #endif
 		  "    fixed   : 0x%16lx - 0x%16lx   (%6ld KB)\n"
 		  "    PCI I/O : 0x%16lx - 0x%16lx   (%6ld MB)\n"
-		  "    modules : 0x%16lx - 0x%16lx   (%6ld MB)\n"
-		  "    memory  : 0x%16lx - 0x%16lx   (%6ld MB)\n"
-		  "      .init : 0x%p" " - 0x%p" "   (%6ld KB)\n"
-		  "      .text : 0x%p" " - 0x%p" "   (%6ld KB)\n"
-		  "      .data : 0x%p" " - 0x%p" "   (%6ld KB)\n",
+		  "    memory  : 0x%16lx - 0x%16lx   (%6ld MB)\n",
 #ifdef CONFIG_KASAN
 		  MLG(KASAN_SHADOW_START, KASAN_SHADOW_END),
 #endif
+		  MLM(MODULES_VADDR, MODULES_END),
 		  MLG(VMALLOC_START, VMALLOC_END),
+		  MLK_ROUNDUP(__init_begin, __init_end),
+		  MLK_ROUNDUP(_text, _etext),
+		  MLK_ROUNDUP(_sdata, _edata),
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 		  MLG((unsigned long)vmemmap,
 		      (unsigned long)vmemmap + VMEMMAP_SIZE),
@@ -329,11 +333,7 @@ void __init mem_init(void)
 #endif
 		  MLK(FIXADDR_START, FIXADDR_TOP),
 		  MLM(PCI_IO_START, PCI_IO_END),
-		  MLM(MODULES_VADDR, MODULES_END),
-		  MLM(PAGE_OFFSET, (unsigned long)high_memory),
-		  MLK_ROUNDUP(__init_begin, __init_end),
-		  MLK_ROUNDUP(_text, _etext),
-		  MLK_ROUNDUP(_sdata, _edata));
+		  MLM(PAGE_OFFSET, (unsigned long)high_memory));
 
 #undef MLK
 #undef MLM
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index a78fc5a882da..6275d183c005 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -322,40 +322,6 @@ static void create_mapping_late(phys_addr_t phys, unsigned long virt,
 	__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, late_alloc);
 }
 
-static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end)
-{
-
-	unsigned long kernel_start = __pa(_stext);
-	unsigned long kernel_end = __pa(_end);
-
-	/*
-	 * The kernel itself is mapped at page granularity. Map all other
-	 * memory, making sure we don't overwrite the existing kernel mappings.
-	 */
-
-	/* No overlap with the kernel. */
-	if (end < kernel_start || start >= kernel_end) {
-		__create_pgd_mapping(pgd, start, __phys_to_virt(start),
-				     end - start, PAGE_KERNEL, early_alloc);
-		return;
-	}
-
-	/*
-	 * This block overlaps the kernel mapping. Map the portion(s) which
-	 * don't overlap.
-	 */
-	if (start < kernel_start)
-		__create_pgd_mapping(pgd, start,
-				     __phys_to_virt(start),
-				     kernel_start - start, PAGE_KERNEL,
-				     early_alloc);
-	if (kernel_end < end)
-		__create_pgd_mapping(pgd, kernel_end,
-				     __phys_to_virt(kernel_end),
-				     end - kernel_end, PAGE_KERNEL,
-				     early_alloc);
-}
-
 static void __init map_mem(pgd_t *pgd)
 {
 	struct memblock_region *reg;
@@ -370,7 +336,8 @@ static void __init map_mem(pgd_t *pgd)
 		if (memblock_is_nomap(reg))
 			continue;
 
-		__map_memblock(pgd, start, end);
+		__create_pgd_mapping(pgd, start, __phys_to_virt(start),
+				     end - start, PAGE_KERNEL, early_alloc);
 	}
 }
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 156+ messages in thread

* [PATCH v2 07/13] arm64: add support for module PLTs
  2015-12-30 15:25 ` Ard Biesheuvel
  (?)
@ 2015-12-30 15:26   ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

This adds support for emitting PLTs at module load time for relative
branches that are out of range. This is a prerequisite for KASLR,
which may place the kernel and the modules anywhere in the vmalloc
area, making it likely that branch target offsets exceed the maximum
range of +/- 128 MB.
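
As a rough illustration (not part of the patch; the helper and constant
names below are invented here): B and BL carry a signed 26-bit word
offset, so a relative branch can only reach targets within roughly
128 MB of the branch itself.

/*
 * Hedged sketch of the reachability check that makes veneers necessary:
 * a signed 26-bit word offset covers byte displacements in
 * [-2^27, 2^27 - 4].
 */
#include <stdbool.h>
#include <stdint.h>

#define AARCH64_BRANCH_RANGE	(1LL << 27)	/* +/- 128 MB */

static bool branch_reaches(uint64_t place, uint64_t target)
{
	int64_t disp = (int64_t)(target - place);

	return disp >= -AARCH64_BRANCH_RANGE &&
	       disp <= AARCH64_BRANCH_RANGE - 4;
}

When the displacement does not fit, the apply_relocate_add() hunk below
redirects the branch to a PLT entry instead; the movn/movk/movk/br
sequence in such an entry can materialize a full kernel virtual address
in x16 (e.g. for a hypothetical target 0xffff000012345678: movn x16,
#0xa987; movk x16, #0x1234, lsl #16; movk x16, #0x0, lsl #32; br x16 --
the all-ones top bits left by movn already match the upper half of the
address space).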

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/Kconfig              |   4 +
 arch/arm64/Makefile             |   4 +
 arch/arm64/include/asm/module.h |  11 ++
 arch/arm64/kernel/Makefile      |   1 +
 arch/arm64/kernel/module-plts.c | 137 ++++++++++++++++++++
 arch/arm64/kernel/module.c      |   7 +
 arch/arm64/kernel/module.lds    |   4 +
 7 files changed, 168 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 871f21783866..827e78f33944 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -704,6 +704,10 @@ config ARM64_LSE_ATOMICS
 
 endmenu
 
+config ARM64_MODULE_PLTS
+	bool
+	select HAVE_MOD_ARCH_SPECIFIC
+
 endmenu
 
 menu "Boot options"
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index cd822d8454c0..d4654830e536 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -45,6 +45,10 @@ ifeq ($(CONFIG_ARM64_ERRATUM_843419), y)
 KBUILD_CFLAGS_MODULE	+= -mcmodel=large
 endif
 
+ifeq ($(CONFIG_ARM64_MODULE_PLTS),y)
+KBUILD_LDFLAGS_MODULE	+= -T $(srctree)/arch/arm64/kernel/module.lds
+endif
+
 # Default value
 head-y		:= arch/arm64/kernel/head.o
 
diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
index e80e232b730e..7b8cd3dc9d8e 100644
--- a/arch/arm64/include/asm/module.h
+++ b/arch/arm64/include/asm/module.h
@@ -20,4 +20,15 @@
 
 #define MODULE_ARCH_VERMAGIC	"aarch64"
 
+#ifdef CONFIG_ARM64_MODULE_PLTS
+struct mod_arch_specific {
+	struct elf64_shdr	*core_plt;
+	struct elf64_shdr	*init_plt;
+	int			core_plt_count;
+	int			init_plt_count;
+};
+#endif
+
+u64 get_module_plt(struct module *mod, void *loc, u64 val);
+
 #endif /* __ASM_MODULE_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 474691f8b13a..f42b0fff607f 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -30,6 +30,7 @@ arm64-obj-$(CONFIG_COMPAT)		+= sys32.o kuser32.o signal32.o 	\
 					   ../../arm/kernel/opcodes.o
 arm64-obj-$(CONFIG_FUNCTION_TRACER)	+= ftrace.o entry-ftrace.o
 arm64-obj-$(CONFIG_MODULES)		+= arm64ksyms.o module.o
+arm64-obj-$(CONFIG_ARM64_MODULE_PLTS)	+= module-plts.o
 arm64-obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o perf_callchain.o
 arm64-obj-$(CONFIG_HW_PERF_EVENTS)	+= perf_event.o
 arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= hw_breakpoint.o
diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
new file mode 100644
index 000000000000..4a8ef9ea01ee
--- /dev/null
+++ b/arch/arm64/kernel/module-plts.c
@@ -0,0 +1,137 @@
+/*
+ * Copyright (C) 2014-2015 Linaro Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/elf.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+struct plt_entry {
+	__le32	mov0;	/* movn	x16, #0x....			*/
+	__le32	mov1;	/* movk	x16, #0x...., lsl #16		*/
+	__le32	mov2;	/* movk	x16, #0x...., lsl #32		*/
+	__le32	br;	/* br	x16				*/
+} __aligned(8);
+
+static bool in_init(const struct module *mod, void *addr)
+{
+	return (u64)addr - (u64)mod->module_init < mod->init_size;
+}
+
+u64 get_module_plt(struct module *mod, void *loc, u64 val)
+{
+	struct plt_entry entry = {
+		cpu_to_le32(0x92800010 | (((~val      ) & 0xffff)) << 5),
+		cpu_to_le32(0xf2a00010 | ((( val >> 16) & 0xffff)) << 5),
+		cpu_to_le32(0xf2c00010 | ((( val >> 32) & 0xffff)) << 5),
+		cpu_to_le32(0xd61f0200)
+	}, *plt;
+	int i, *count;
+
+	if (in_init(mod, loc)) {
+		plt = (struct plt_entry *)mod->arch.init_plt->sh_addr;
+		count = &mod->arch.init_plt_count;
+	} else {
+		plt = (struct plt_entry *)mod->arch.core_plt->sh_addr;
+		count = &mod->arch.core_plt_count;
+	}
+
+	/* Look for an existing entry pointing to 'val' */
+	for (i = 0; i < *count; i++)
+		if (plt[i].mov0 == entry.mov0 &&
+		    plt[i].mov1 == entry.mov1 &&
+		    plt[i].mov2 == entry.mov2)
+			return (u64)&plt[i];
+
+	i = (*count)++;
+	plt[i] = entry;
+	return (u64)&plt[i];
+}
+
+static int duplicate_rel(Elf64_Addr base, const Elf64_Rela *rela, int num)
+{
+	int i;
+
+	for (i = 0; i < num; i++) {
+		if (rela[i].r_info == rela[num].r_info &&
+		    rela[i].r_addend == rela[num].r_addend)
+			return 1;
+	}
+	return 0;
+}
+
+/* Count how many PLT entries we may need */
+static unsigned int count_plts(Elf64_Addr base, const Elf64_Rela *rela, int num)
+{
+	unsigned int ret = 0;
+	int i;
+
+	/*
+	 * Sure, this is order(n^2), but it's usually short, and not
+	 * time critical
+	 */
+	for (i = 0; i < num; i++)
+		switch (ELF64_R_TYPE(rela[i].r_info)) {
+		case R_AARCH64_JUMP26:
+		case R_AARCH64_CALL26:
+			if (!duplicate_rel(base, rela, i))
+				ret++;
+			break;
+		}
+	return ret;
+}
+
+int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
+			      char *secstrings, struct module *mod)
+{
+	unsigned long core_plts = 0, init_plts = 0;
+	Elf64_Shdr *s, *sechdrs_end = sechdrs + ehdr->e_shnum;
+
+	/*
+	 * To store the PLTs, we expand the .text section for core module code
+	 * and the .init.text section for initialization code.
+	 */
+	for (s = sechdrs; s < sechdrs_end; ++s)
+		if (strcmp(".core.plt", secstrings + s->sh_name) == 0)
+			mod->arch.core_plt = s;
+		else if (strcmp(".init.plt", secstrings + s->sh_name) == 0)
+			mod->arch.init_plt = s;
+
+	if (!mod->arch.core_plt || !mod->arch.init_plt) {
+		pr_err("%s: sections missing\n", mod->name);
+		return -ENOEXEC;
+	}
+
+	for (s = sechdrs + 1; s < sechdrs_end; ++s) {
+		const Elf64_Rela *rels = (void *)ehdr + s->sh_offset;
+		int numrels = s->sh_size / sizeof(Elf64_Rela);
+		Elf64_Shdr *dstsec = sechdrs + s->sh_info;
+
+		if (s->sh_type != SHT_RELA)
+			continue;
+
+		if (strstr(secstrings + s->sh_name, ".init"))
+			init_plts += count_plts(dstsec->sh_addr, rels, numrels);
+		else
+			core_plts += count_plts(dstsec->sh_addr, rels, numrels);
+	}
+
+	mod->arch.core_plt->sh_type = SHT_NOBITS;
+	mod->arch.core_plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+	mod->arch.core_plt->sh_addralign = L1_CACHE_BYTES;
+	mod->arch.core_plt->sh_size = core_plts * sizeof(struct plt_entry);
+	mod->arch.core_plt_count = 0;
+
+	mod->arch.init_plt->sh_type = SHT_NOBITS;
+	mod->arch.init_plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+	mod->arch.init_plt->sh_addralign = L1_CACHE_BYTES;
+	mod->arch.init_plt->sh_size = init_plts * sizeof(struct plt_entry);
+	mod->arch.init_plt_count = 0;
+	pr_debug("%s: core.plt=%lld, init.plt=%lld\n", __func__,
+		 mod->arch.core_plt->sh_size, mod->arch.init_plt->sh_size);
+	return 0;
+}
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index f4bc779e62e8..14880d77ec6f 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -388,6 +388,13 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 		case R_AARCH64_CALL26:
 			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 26,
 					     AARCH64_INSN_IMM_26);
+
+			if (IS_ENABLED(CONFIG_ARM64_MODULE_PLTS) &&
+			    ovf == -ERANGE) {
+				val = get_module_plt(me, loc, val);
+				ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2,
+						     26, AARCH64_INSN_IMM_26);
+			}
 			break;
 
 		default:
diff --git a/arch/arm64/kernel/module.lds b/arch/arm64/kernel/module.lds
new file mode 100644
index 000000000000..3682fa107918
--- /dev/null
+++ b/arch/arm64/kernel/module.lds
@@ -0,0 +1,4 @@
+SECTIONS {
+        .core.plt : { BYTE(0) }
+        .init.plt : { BYTE(0) }
+}
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 156+ messages in thread

* [PATCH v2 08/13] arm64: use relative references in exception tables
  2015-12-30 15:25 ` Ard Biesheuvel
  (?)
@ 2015-12-30 15:26   ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

Instead of using absolute addresses for both the exception location and
the fixup, use offsets relative to the exception table entry values. This
is a prerequisite for KASLR, since absolute exception table entries are
subject to dynamic relocation, which is incompatible with the sorting of
the exception table that occurs at build time.
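
To see why the relative form is immune to relocation, here is a sketch
(the struct and helper names are invented, modelled on the
ex_insn_addr() helper added in this patch): each field stores the
distance from its own location to the address it describes, and both
ends of that distance move by the same delta when the image is shifted.

#include <stdint.h>

struct rel_extable_entry {
	int32_t insn;	/* faulting insn address - address of this field */
	int32_t fixup;	/* fixup code address    - address of this field */
};

static uintptr_t rel_insn_addr(const struct rel_extable_entry *e)
{
	/* Both terms shift by the same amount under KASLR. */
	return (uintptr_t)&e->insn + e->insn;
}

With 64-bit absolute entries, by contrast, each slot would pick up a
dynamic relocation once the kernel is built as a PIE, and the sort done
at build time by scripts/sortextable would operate on values that only
become final at boot.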

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/Kconfig                   |   1 +
 arch/arm64/include/asm/assembler.h   |   2 +-
 arch/arm64/include/asm/futex.h       |   4 +-
 arch/arm64/include/asm/uaccess.h     |  16 +--
 arch/arm64/kernel/armv8_deprecated.c |   4 +-
 arch/arm64/mm/extable.c              | 102 +++++++++++++++++++-
 scripts/sortextable.c                |   2 +-
 7 files changed, 116 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 827e78f33944..54eeab140bca 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -7,6 +7,7 @@ config ARM64
 	select ARCH_HAS_ELF_RANDOMIZE
 	select ARCH_HAS_GCOV_PROFILE_ALL
 	select ARCH_HAS_SG_CHAIN
+	select ARCH_HAS_SORT_EXTABLE
 	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_USE_CMPXCHG_LOCKREF
 	select ARCH_SUPPORTS_ATOMIC_RMW
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 12eff928ef8b..8094d50f05bc 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -98,7 +98,7 @@
 9999:	x;					\
 	.section __ex_table,"a";		\
 	.align	3;				\
-	.quad	9999b,l;			\
+	.long	(9999b - .), (l - .);		\
 	.previous
 
 /*
diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h
index 007a69fc4f40..35e73e255ad3 100644
--- a/arch/arm64/include/asm/futex.h
+++ b/arch/arm64/include/asm/futex.h
@@ -44,7 +44,7 @@
 "	.popsection\n"							\
 "	.pushsection __ex_table,\"a\"\n"				\
 "	.align	3\n"							\
-"	.quad	1b, 4b, 2b, 4b\n"					\
+"	.long	(1b - .), (4b - .), (2b - .), (4b - .)\n"		\
 "	.popsection\n"							\
 	ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,		\
 		    CONFIG_ARM64_PAN)					\
@@ -135,7 +135,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 "	.popsection\n"
 "	.pushsection __ex_table,\"a\"\n"
 "	.align	3\n"
-"	.quad	1b, 4b, 2b, 4b\n"
+"	.long	(1b - .), (4b - .), (2b - .), (4b - .)\n"
 "	.popsection\n"
 	: "+r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp)
 	: "r" (oldval), "r" (newval), "Ir" (-EFAULT)
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index b2ede967fe7d..064efe4b0063 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -36,11 +36,11 @@
 #define VERIFY_WRITE 1
 
 /*
- * The exception table consists of pairs of addresses: the first is the
- * address of an instruction that is allowed to fault, and the second is
- * the address at which the program should continue.  No registers are
- * modified, so it is entirely up to the continuation code to figure out
- * what to do.
+ * The exception table consists of pairs of relative offsets: the first
+ * is the relative offset to an instruction that is allowed to fault,
+ * and the second is the relative offset at which the program should
+ * continue. No registers are modified, so it is entirely up to the
+ * continuation code to figure out what to do.
  *
  * All the routines below use bits of fixup code that are out of line
  * with the main instruction path.  This means when everything is well,
@@ -50,7 +50,7 @@
 
 struct exception_table_entry
 {
-	unsigned long insn, fixup;
+	int insn, fixup;
 };
 
 extern int fixup_exception(struct pt_regs *regs);
@@ -125,7 +125,7 @@ static inline void set_fs(mm_segment_t fs)
 	"	.previous\n"						\
 	"	.section __ex_table,\"a\"\n"				\
 	"	.align	3\n"						\
-	"	.quad	1b, 3b\n"					\
+	"	.long	(1b - .), (3b - .)\n"				\
 	"	.previous"						\
 	: "+r" (err), "=&r" (x)						\
 	: "r" (addr), "i" (-EFAULT))
@@ -192,7 +192,7 @@ do {									\
 	"	.previous\n"						\
 	"	.section __ex_table,\"a\"\n"				\
 	"	.align	3\n"						\
-	"	.quad	1b, 3b\n"					\
+	"	.long	(1b - .), (3b - .)\n"				\
 	"	.previous"						\
 	: "+r" (err)							\
 	: "r" (x), "r" (addr), "i" (-EFAULT))
diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c
index 937f5e58a4d3..8f21b1363387 100644
--- a/arch/arm64/kernel/armv8_deprecated.c
+++ b/arch/arm64/kernel/armv8_deprecated.c
@@ -299,8 +299,8 @@ static void register_insn_emulation_sysctl(struct ctl_table *table)
 	"	.popsection"					\
 	"	.pushsection	 __ex_table,\"a\"\n"		\
 	"	.align		3\n"				\
-	"	.quad		0b, 4b\n"			\
-	"	.quad		1b, 4b\n"			\
+	"	.long		(0b - .), (4b - .)\n"		\
+	"	.long		(1b - .), (4b - .)\n"		\
 	"	.popsection\n"					\
 	ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,	\
 		CONFIG_ARM64_PAN)				\
diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
index 79444279ba8c..d803e3e5d3da 100644
--- a/arch/arm64/mm/extable.c
+++ b/arch/arm64/mm/extable.c
@@ -3,15 +3,115 @@
  */
 
 #include <linux/module.h>
+#include <linux/sort.h>
 #include <linux/uaccess.h>
 
+static unsigned long ex_insn_addr(const struct exception_table_entry *x)
+{
+	return (unsigned long)&x->insn + x->insn;
+}
+
+static unsigned long ex_fixup_addr(const struct exception_table_entry *x)
+{
+	return (unsigned long)&x->fixup + x->fixup;
+}
+
 int fixup_exception(struct pt_regs *regs)
 {
 	const struct exception_table_entry *fixup;
 
 	fixup = search_exception_tables(instruction_pointer(regs));
 	if (fixup)
-		regs->pc = fixup->fixup;
+		regs->pc = ex_fixup_addr(fixup);
 
 	return fixup != NULL;
 }
+
+/*
+ * Search one exception table for an entry corresponding to the
+ * given instruction address, and return the address of the entry,
+ * or NULL if none is found.
+ * We use a binary search, and thus we assume that the table is
+ * already sorted.
+ */
+const struct exception_table_entry *
+search_extable(const struct exception_table_entry *first,
+	       const struct exception_table_entry *last,
+	       unsigned long value)
+{
+	while (first <= last) {
+		const struct exception_table_entry *mid;
+		unsigned long addr;
+
+		mid = ((last - first) >> 1) + first;
+		addr = ex_insn_addr(mid);
+		if (addr < value)
+			first = mid + 1;
+		else if (addr > value)
+			last = mid - 1;
+		else
+			return mid;
+        }
+        return NULL;
+}
+
+static int cmp_ex(const void *a, const void *b)
+{
+	const struct exception_table_entry *x = a, *y = b;
+
+	return x->insn - y->insn;
+}
+
+/*
+ * The exception table needs to be sorted so that the binary
+ * search that we use to find entries in it works properly.
+ * This is used both for the kernel exception table and for
+ * the exception tables of modules that get loaded.
+ *
+ */
+void sort_extable(struct exception_table_entry *start,
+		  struct exception_table_entry *finish)
+{
+	struct exception_table_entry *p;
+	int i;
+
+	/* Convert all entries to being relative to the start of the section */
+	i = 0;
+	for (p = start; p < finish; p++) {
+		p->insn += i;
+		i += 4;
+		p->fixup += i;
+		i += 4;
+	}
+
+	sort(start, finish - start, sizeof(struct exception_table_entry),
+	     cmp_ex, NULL);
+
+	/* Denormalize all entries */
+	i = 0;
+	for (p = start; p < finish; p++) {
+		p->insn -= i;
+		i += 4;
+		p->fixup -= i;
+		i += 4;
+	}
+}
+
+#ifdef CONFIG_MODULES
+/*
+ * If the exception table is sorted, any entries referring to the module init
+ * will be at the beginning or the end.
+ */
+void trim_init_extable(struct module *m)
+{
+	/* trim the beginning */
+	while (m->num_exentries && within_module_init(m->extable[0].insn, m)) {
+		m->extable++;
+		m->num_exentries--;
+	}
+	/* trim the end */
+	while (m->num_exentries &&
+		within_module_init(m->extable[m->num_exentries-1].insn, m))
+		m->num_exentries--;
+}
+#endif /* CONFIG_MODULES */
diff --git a/scripts/sortextable.c b/scripts/sortextable.c
index c2423d913b46..af247c70fb66 100644
--- a/scripts/sortextable.c
+++ b/scripts/sortextable.c
@@ -282,12 +282,12 @@ do_file(char const *const fname)
 	case EM_386:
 	case EM_X86_64:
 	case EM_S390:
+	case EM_AARCH64:
 		custom_sort = sort_relative_table;
 		break;
 	case EM_ARCOMPACT:
 	case EM_ARCV2:
 	case EM_ARM:
-	case EM_AARCH64:
 	case EM_MICROBLAZE:
 	case EM_MIPS:
 	case EM_XTENSA:
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 156+ messages in thread

* [kernel-hardening] [PATCH v2 08/13] arm64: use relative references in exception tables
@ 2015-12-30 15:26   ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

Instead of using absolute addresses for both the exception location and
the fixup, use offsets relative to the exception table entry values. This
is a prerequisite for KASLR, since absolute exception table entries are
subject to dynamic relocation, which is incompatible with the sorting of
the exception table that occurs at build time.
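
As a stand-alone illustration (not part of the patch, with made-up names
and a trivial main()), the property this relies on can be shown in plain
C: an entry stores the delta between its own location and the target, so
entry and target can be moved together, by KASLR or by the build-time
sort of the whole table, without invalidating the stored value:

  #include <stdio.h>

  struct entry { int insn; };     /* offset from the field to the target */

  static char target;             /* stand-in for an instruction that may fault */
  static struct entry e;          /* stand-in for one exception table entry */

  int main(void)
  {
          /* encode: store a delta, as ".long (1b - .)" does at build time */
          e.insn = (int)((unsigned long)&target - (unsigned long)&e.insn);

          /* decode: the same arithmetic ex_insn_addr() uses in the hunk below */
          unsigned long addr = (unsigned long)&e.insn + (long)e.insn;

          printf("%d\n", addr == (unsigned long)&target);   /* prints 1 */
          return 0;
  }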

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/Kconfig                   |   1 +
 arch/arm64/include/asm/assembler.h   |   2 +-
 arch/arm64/include/asm/futex.h       |   4 +-
 arch/arm64/include/asm/uaccess.h     |  16 +--
 arch/arm64/kernel/armv8_deprecated.c |   4 +-
 arch/arm64/mm/extable.c              | 102 +++++++++++++++++++-
 scripts/sortextable.c                |   2 +-
 7 files changed, 116 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 827e78f33944..54eeab140bca 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -7,6 +7,7 @@ config ARM64
 	select ARCH_HAS_ELF_RANDOMIZE
 	select ARCH_HAS_GCOV_PROFILE_ALL
 	select ARCH_HAS_SG_CHAIN
+	select ARCH_HAS_SORT_EXTABLE
 	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_USE_CMPXCHG_LOCKREF
 	select ARCH_SUPPORTS_ATOMIC_RMW
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 12eff928ef8b..8094d50f05bc 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -98,7 +98,7 @@
 9999:	x;					\
 	.section __ex_table,"a";		\
 	.align	3;				\
-	.quad	9999b,l;			\
+	.long	(9999b - .), (l - .);		\
 	.previous
 
 /*
diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h
index 007a69fc4f40..35e73e255ad3 100644
--- a/arch/arm64/include/asm/futex.h
+++ b/arch/arm64/include/asm/futex.h
@@ -44,7 +44,7 @@
 "	.popsection\n"							\
 "	.pushsection __ex_table,\"a\"\n"				\
 "	.align	3\n"							\
-"	.quad	1b, 4b, 2b, 4b\n"					\
+"	.long	(1b - .), (4b - .), (2b - .), (4b - .)\n"		\
 "	.popsection\n"							\
 	ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,		\
 		    CONFIG_ARM64_PAN)					\
@@ -135,7 +135,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 "	.popsection\n"
 "	.pushsection __ex_table,\"a\"\n"
 "	.align	3\n"
-"	.quad	1b, 4b, 2b, 4b\n"
+"	.long	(1b - .), (4b - .), (2b - .), (4b - .)\n"
 "	.popsection\n"
 	: "+r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp)
 	: "r" (oldval), "r" (newval), "Ir" (-EFAULT)
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index b2ede967fe7d..064efe4b0063 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -36,11 +36,11 @@
 #define VERIFY_WRITE 1
 
 /*
- * The exception table consists of pairs of addresses: the first is the
- * address of an instruction that is allowed to fault, and the second is
- * the address at which the program should continue.  No registers are
- * modified, so it is entirely up to the continuation code to figure out
- * what to do.
+ * The exception table consists of pairs of relative offsets: the first
+ * is the relative offset to an instruction that is allowed to fault,
+ * and the second is the relative offset at which the program should
+ * continue. No registers are modified, so it is entirely up to the
+ * continuation code to figure out what to do.
  *
  * All the routines below use bits of fixup code that are out of line
  * with the main instruction path.  This means when everything is well,
@@ -50,7 +50,7 @@
 
 struct exception_table_entry
 {
-	unsigned long insn, fixup;
+	int insn, fixup;
 };
 
 extern int fixup_exception(struct pt_regs *regs);
@@ -125,7 +125,7 @@ static inline void set_fs(mm_segment_t fs)
 	"	.previous\n"						\
 	"	.section __ex_table,\"a\"\n"				\
 	"	.align	3\n"						\
-	"	.quad	1b, 3b\n"					\
+	"	.long	(1b - .), (3b - .)\n"				\
 	"	.previous"						\
 	: "+r" (err), "=&r" (x)						\
 	: "r" (addr), "i" (-EFAULT))
@@ -192,7 +192,7 @@ do {									\
 	"	.previous\n"						\
 	"	.section __ex_table,\"a\"\n"				\
 	"	.align	3\n"						\
-	"	.quad	1b, 3b\n"					\
+	"	.long	(1b - .), (3b - .)\n"				\
 	"	.previous"						\
 	: "+r" (err)							\
 	: "r" (x), "r" (addr), "i" (-EFAULT))
diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c
index 937f5e58a4d3..8f21b1363387 100644
--- a/arch/arm64/kernel/armv8_deprecated.c
+++ b/arch/arm64/kernel/armv8_deprecated.c
@@ -299,8 +299,8 @@ static void register_insn_emulation_sysctl(struct ctl_table *table)
 	"	.popsection"					\
 	"	.pushsection	 __ex_table,\"a\"\n"		\
 	"	.align		3\n"				\
-	"	.quad		0b, 4b\n"			\
-	"	.quad		1b, 4b\n"			\
+	"	.long		(0b - .), (4b - .)\n"		\
+	"	.long		(1b - .), (4b - .)\n"		\
 	"	.popsection\n"					\
 	ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,	\
 		CONFIG_ARM64_PAN)				\
diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
index 79444279ba8c..d803e3e5d3da 100644
--- a/arch/arm64/mm/extable.c
+++ b/arch/arm64/mm/extable.c
@@ -3,15 +3,115 @@
  */
 
 #include <linux/module.h>
+#include <linux/sort.h>
 #include <linux/uaccess.h>
 
+static unsigned long ex_insn_addr(const struct exception_table_entry *x)
+{
+	return (unsigned long)&x->insn + x->insn;
+}
+
+static unsigned long ex_fixup_addr(const struct exception_table_entry *x)
+{
+	return (unsigned long)&x->fixup + x->fixup;
+}
+
 int fixup_exception(struct pt_regs *regs)
 {
 	const struct exception_table_entry *fixup;
 
 	fixup = search_exception_tables(instruction_pointer(regs));
 	if (fixup)
-		regs->pc = fixup->fixup;
+		regs->pc = ex_fixup_addr(fixup);
 
 	return fixup != NULL;
 }
+
+/*
+ * Search one exception table for an entry corresponding to the
+ * given instruction address, and return the address of the entry,
+ * or NULL if none is found.
+ * We use a binary search, and thus we assume that the table is
+ * already sorted.
+ */
+const struct exception_table_entry *
+search_extable(const struct exception_table_entry *first,
+	       const struct exception_table_entry *last,
+	       unsigned long value)
+{
+	while (first <= last) {
+		const struct exception_table_entry *mid;
+		unsigned long addr;
+
+		mid = ((last - first) >> 1) + first;
+		addr = ex_insn_addr(mid);
+		if (addr < value)
+			first = mid + 1;
+		else if (addr > value)
+			last = mid - 1;
+		else
+			return mid;
+        }
+        return NULL;
+}
+
+static int cmp_ex(const void *a, const void *b)
+{
+	const struct exception_table_entry *x = a, *y = b;
+
+	return x->insn - y->insn;
+}
+
+/*
+ * The exception table needs to be sorted so that the binary
+ * search that we use to find entries in it works properly.
+ * This is used both for the kernel exception table and for
+ * the exception tables of modules that get loaded.
+ *
+ */
+void sort_extable(struct exception_table_entry *start,
+		  struct exception_table_entry *finish)
+{
+	struct exception_table_entry *p;
+	int i;
+
+	/* Convert all entries to being relative to the start of the section */
+	i = 0;
+	for (p = start; p < finish; p++) {
+		p->insn += i;
+		i += 4;
+		p->fixup += i;
+		i += 4;
+	}
+
+	sort(start, finish - start, sizeof(struct exception_table_entry),
+	     cmp_ex, NULL);
+
+	/* Denormalize all entries */
+	i = 0;
+	for (p = start; p < finish; p++) {
+		p->insn -= i;
+		i += 4;
+		p->fixup -= i;
+		i += 4;
+	}
+}
+
+#ifdef CONFIG_MODULES
+/*
+ * If the exception table is sorted, any referring to the module init
+ * will be at the beginning or the end.
+ */
+void trim_init_extable(struct module *m)
+{
+	/* trim the beginning */
+	while (m->num_exentries && within_module_init(m->extable[0].insn, m)) {
+		m->extable++;
+		m->num_exentries--;
+	}
+	/* trim the end */
+	while (m->num_exentries &&
+		within_module_init(m->extable[m->num_exentries-1].insn, m))
+		m->num_exentries--;
+}
+#endif /* CONFIG_MODULES */
diff --git a/scripts/sortextable.c b/scripts/sortextable.c
index c2423d913b46..af247c70fb66 100644
--- a/scripts/sortextable.c
+++ b/scripts/sortextable.c
@@ -282,12 +282,12 @@ do_file(char const *const fname)
 	case EM_386:
 	case EM_X86_64:
 	case EM_S390:
+	case EM_AARCH64:
 		custom_sort = sort_relative_table;
 		break;
 	case EM_ARCOMPACT:
 	case EM_ARCV2:
 	case EM_ARM:
-	case EM_AARCH64:
 	case EM_MICROBLAZE:
 	case EM_MIPS:
 	case EM_XTENSA:
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 156+ messages in thread

* [PATCH v2 09/13] arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
  2015-12-30 15:25 ` Ard Biesheuvel
@ 2015-12-30 15:26   ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

Unfortunately, the current way of using the linker to emit build time
constants into the Image header will no longer work once we switch to
the use of PIE executables. The reason is that such constants are emitted
into the binary using R_AARCH64_ABS64 relocations, which we will resolve
at runtime, not at build time, and the places targeted by those
relocations will contain zeroes before that.

So move back to assembly-time constants or R_AARCH64_ABS32 relocations
(which, interestingly enough, do get resolved at build time).
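
To make the endian handling concrete (this snippet is not part of the
patch and the size value is invented), the DATA_LE32() swap added to
image.h below simply pre-swaps the bytes on a big-endian build, so the
value that ends up in the Image header is little-endian either way:

  #include <stdio.h>

  #define DATA_LE32(data)                        \
          ((((data) & 0x000000ff) << 24) |       \
           (((data) & 0x0000ff00) << 8)  |       \
           (((data) & 0x00ff0000) >> 8)  |       \
           (((data) & 0xff000000) >> 24))

  int main(void)
  {
          unsigned int kernel_size = 0x00a2c000;  /* made-up _end - _text */

          /* prints 00c0a200: the same bytes, emitted in reverse order */
          printf("%08x\n", DATA_LE32(kernel_size));
          return 0;
  }

In the patch itself the swap is evaluated by the linker on the symbol
assignment, not at run time; the C version above only demonstrates the
arithmetic.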

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/assembler.h | 15 ++++++++
 arch/arm64/kernel/head.S           | 17 +++++++--
 arch/arm64/kernel/image.h          | 37 ++++++--------------
 3 files changed, 40 insertions(+), 29 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 8094d50f05bc..39bf158255d7 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -204,4 +204,19 @@ lr	.req	x30		// link register
 	.size	__pi_##x, . - x;	\
 	ENDPROC(x)
 
+	.macro	le16, val
+	.byte	\val & 0xff
+	.byte	(\val >> 8) & 0xff
+	.endm
+
+	.macro	le32, val
+	le16	\val
+	le16	\val >> 16
+	.endm
+
+	.macro	le64, val
+	le32	\val
+	le32	\val >> 32
+	.endm
+
 #endif	/* __ASM_ASSEMBLER_H */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 6434c844a0e4..ccbb1bd46026 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -51,6 +51,17 @@
 #define KERNEL_START	_text
 #define KERNEL_END	_end
 
+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define __HEAD_FLAG_BE	1
+#else
+#define __HEAD_FLAG_BE	0
+#endif
+
+#define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2)
+
+#define __HEAD_FLAGS	((__HEAD_FLAG_BE << 0) |	\
+			 (__HEAD_FLAG_PAGE_SIZE << 1))
+
 /*
  * Kernel startup entry point.
  * ---------------------------
@@ -83,9 +94,9 @@ efi_head:
 	b	stext				// branch to kernel start, magic
 	.long	0				// reserved
 #endif
-	.quad	_kernel_offset_le		// Image load offset from start of RAM, little-endian
-	.quad	_kernel_size_le			// Effective size of kernel image, little-endian
-	.quad	_kernel_flags_le		// Informative flags, little-endian
+	le64	TEXT_OFFSET			// Image load offset from start of RAM, little-endian
+	.long	_kernel_size_le, 0		// Effective size of kernel image, little-endian
+	le64	__HEAD_FLAGS			// Informative flags, little-endian
 	.quad	0				// reserved
 	.quad	0				// reserved
 	.quad	0				// reserved
diff --git a/arch/arm64/kernel/image.h b/arch/arm64/kernel/image.h
index bc2abb8b1599..bb6b0e69d0a4 100644
--- a/arch/arm64/kernel/image.h
+++ b/arch/arm64/kernel/image.h
@@ -26,41 +26,26 @@
  * There aren't any ELF relocations we can use to endian-swap values known only
  * at link time (e.g. the subtraction of two symbol addresses), so we must get
  * the linker to endian-swap certain values before emitting them.
+ * Note that this will not work for 64-bit values: these are resolved using
+ * R_AARCH64_ABS64 relocations, which are fixed up at runtime rather than at
+ * build time when building the PIE executable (for KASLR).
  */
 #ifdef CONFIG_CPU_BIG_ENDIAN
-#define DATA_LE64(data)					\
-	((((data) & 0x00000000000000ff) << 56) |	\
-	 (((data) & 0x000000000000ff00) << 40) |	\
-	 (((data) & 0x0000000000ff0000) << 24) |	\
-	 (((data) & 0x00000000ff000000) << 8)  |	\
-	 (((data) & 0x000000ff00000000) >> 8)  |	\
-	 (((data) & 0x0000ff0000000000) >> 24) |	\
-	 (((data) & 0x00ff000000000000) >> 40) |	\
-	 (((data) & 0xff00000000000000) >> 56))
+#define DATA_LE32(data)				\
+	((((data) & 0x000000ff) << 24) |	\
+	 (((data) & 0x0000ff00) << 8)  |	\
+	 (((data) & 0x00ff0000) >> 8)  |	\
+	 (((data) & 0xff000000) >> 24))
 #else
-#define DATA_LE64(data) ((data) & 0xffffffffffffffff)
+#define DATA_LE32(data) ((data) & 0xffffffff)
 #endif
 
-#ifdef CONFIG_CPU_BIG_ENDIAN
-#define __HEAD_FLAG_BE	1
-#else
-#define __HEAD_FLAG_BE	0
-#endif
-
-#define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2)
-
-#define __HEAD_FLAGS	((__HEAD_FLAG_BE << 0) |	\
-			 (__HEAD_FLAG_PAGE_SIZE << 1))
-
 /*
  * These will output as part of the Image header, which should be little-endian
- * regardless of the endianness of the kernel. While constant values could be
- * endian swapped in head.S, all are done here for consistency.
+ * regardless of the endianness of the kernel.
  */
 #define HEAD_SYMBOLS						\
-	_kernel_size_le		= DATA_LE64(_end - _text);	\
-	_kernel_offset_le	= DATA_LE64(TEXT_OFFSET);	\
-	_kernel_flags_le	= DATA_LE64(__HEAD_FLAGS);
+	_kernel_size_le		= DATA_LE32(_end - _text);
 
 #ifdef CONFIG_EFI
 
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 156+ messages in thread

* [PATCH v2 10/13] arm64: avoid dynamic relocations in early boot code
  2015-12-30 15:25 ` Ard Biesheuvel
@ 2015-12-30 15:26   ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

Before implementing KASLR for arm64 by building a self-relocating PIE
executable, we have to ensure that values we use before the relocation
routine is executed are not subject to dynamic relocation themselves.
This applies not only to virtual addresses, but also to values that are
supplied by the linker at build time and relocated using R_AARCH64_ABS64
relocations.

So instead, use assembly-time constants, or force the use of static
relocations by folding the constants into the instructions.
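
As an illustration of the assembly-time constant approach (this sketch is
not part of the patch and all values are invented), the new literal in
the head.S hunk below, __mmap_switched - (_head - TEXT_OFFSET) +
KIMAGE_VADDR, is a symbol difference plus constants, so the toolchain can
resolve it without a dynamic relocation. Assuming, as elsewhere in this
series, that the image is linked at KIMAGE_VADDR + TEXT_OFFSET, the C
below shows that the arithmetic reproduces the link-time virtual address
of __mmap_switched:

  #include <stdio.h>

  int main(void)
  {
          /* invented example values */
          unsigned long kimage_vaddr  = 0xffffff8000000000UL;
          unsigned long text_offset   = 0x80000;
          unsigned long head          = kimage_vaddr + text_offset; /* link address of _head */
          unsigned long mmap_switched = head + 0x1234;              /* some offset into the image */

          unsigned long va = mmap_switched - (head - text_offset) + kimage_vaddr;

          printf("%d\n", va == mmap_switched);    /* prints 1 */
          return 0;
  }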

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/efi-entry.S |  2 +-
 arch/arm64/kernel/head.S      | 39 +++++++++++++-------
 2 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
index a773db92908b..f82036e02485 100644
--- a/arch/arm64/kernel/efi-entry.S
+++ b/arch/arm64/kernel/efi-entry.S
@@ -61,7 +61,7 @@ ENTRY(entry)
 	 */
 	mov	x20, x0		// DTB address
 	ldr	x0, [sp, #16]	// relocated _text address
-	ldr	x21, =stext_offset
+	movz	x21, #:abs_g0:stext_offset
 	add	x21, x0, x21
 
 	/*
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index ccbb1bd46026..1230fa93fd8c 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -78,12 +78,11 @@
  * in the entry routines.
  */
 	__HEAD
-
+_head:
 	/*
 	 * DO NOT MODIFY. Image header expected by Linux boot-loaders.
 	 */
 #ifdef CONFIG_EFI
-efi_head:
 	/*
 	 * This add instruction has no meaningful effect except that
 	 * its opcode forms the magic "MZ" signature required by UEFI.
@@ -105,14 +104,14 @@ efi_head:
 	.byte	0x4d
 	.byte	0x64
 #ifdef CONFIG_EFI
-	.long	pe_header - efi_head		// Offset to the PE header.
+	.long	pe_header - _head		// Offset to the PE header.
 #else
 	.word	0				// reserved
 #endif
 
 #ifdef CONFIG_EFI
 	.globl	__efistub_stext_offset
-	.set	__efistub_stext_offset, stext - efi_head
+	.set	__efistub_stext_offset, stext - _head
 	.align 3
 pe_header:
 	.ascii	"PE"
@@ -135,7 +134,7 @@ optional_header:
 	.long	_end - stext			// SizeOfCode
 	.long	0				// SizeOfInitializedData
 	.long	0				// SizeOfUninitializedData
-	.long	__efistub_entry - efi_head	// AddressOfEntryPoint
+	.long	__efistub_entry - _head		// AddressOfEntryPoint
 	.long	__efistub_stext_offset		// BaseOfCode
 
 extra_header_fields:
@@ -150,7 +149,7 @@ extra_header_fields:
 	.short	0				// MinorSubsystemVersion
 	.long	0				// Win32VersionValue
 
-	.long	_end - efi_head			// SizeOfImage
+	.long	_end - _head			// SizeOfImage
 
 	// Everything before the kernel image is considered part of the header
 	.long	__efistub_stext_offset		// SizeOfHeaders
@@ -230,11 +229,13 @@ ENTRY(stext)
 	 * On return, the CPU will be ready for the MMU to be turned on and
 	 * the TCR will have been set.
 	 */
-	ldr	x27, =__mmap_switched		// address to jump to after
+	ldr	x27, 0f				// address to jump to after
 						// MMU has been enabled
 	adr_l	lr, __enable_mmu		// return (PIC) address
 	b	__cpu_setup			// initialise processor
 ENDPROC(stext)
+	.align	3
+0:	.quad	__mmap_switched - (_head - TEXT_OFFSET) + KIMAGE_VADDR
 
 /*
  * Preserve the arguments passed by the bootloader in x0 .. x3
@@ -402,7 +403,8 @@ __create_page_tables:
 	mov	x0, x26				// swapper_pg_dir
 	ldr	x5, =KIMAGE_VADDR
 	create_pgd_entry x0, x5, x3, x6
-	ldr	x6, =KERNEL_END			// __va(KERNEL_END)
+	ldr	w6, kernel_img_size
+	add	x6, x6, x5
 	mov	x3, x24				// phys offset
 	create_block_map x0, x7, x3, x5, x6
 
@@ -419,6 +421,9 @@ __create_page_tables:
 	mov	lr, x27
 	ret
 ENDPROC(__create_page_tables)
+
+kernel_img_size:
+	.long	_end - (_head - TEXT_OFFSET)
 	.ltorg
 
 /*
@@ -426,6 +431,10 @@ ENDPROC(__create_page_tables)
  */
 	.set	initial_sp, init_thread_union + THREAD_START_SP
 __mmap_switched:
+	adr_l	x8, vectors			// load VBAR_EL1 with
+	msr	vbar_el1, x8			// relocated address
+	isb
+
 	adr_l	x6, __bss_start
 	adr_l	x7, __bss_stop
 
@@ -609,13 +618,19 @@ ENTRY(secondary_startup)
 	adrp	x26, swapper_pg_dir
 	bl	__cpu_setup			// initialise processor
 
-	ldr	x21, =secondary_data
-	ldr	x27, =__secondary_switched	// address to jump to after enabling the MMU
+	ldr	x8, =KIMAGE_VADDR
+	ldr	w9, 0f
+	sub	x27, x8, w9, sxtw		// address to jump to after enabling the MMU
 	b	__enable_mmu
 ENDPROC(secondary_startup)
+0:	.long	(_text - TEXT_OFFSET) - __secondary_switched
 
 ENTRY(__secondary_switched)
-	ldr	x0, [x21]			// get secondary_data.stack
+	adr_l	x5, vectors
+	msr	vbar_el1, x5
+	isb
+
+	ldr_l	x0, secondary_data		// get secondary_data.stack
 	mov	sp, x0
 	mov	x29, #0
 	b	secondary_start_kernel
@@ -638,8 +653,6 @@ __enable_mmu:
 	ubfx	x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
 	cmp	x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
 	b.ne	__no_granule_support
-	ldr	x5, =vectors
-	msr	vbar_el1, x5
 	msr	ttbr0_el1, x25			// load TTBR0
 	msr	ttbr1_el1, x26			// load TTBR1
 	isb
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 156+ messages in thread

* [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
  2015-12-30 15:25 ` Ard Biesheuvel
@ 2015-12-30 15:26   ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

This relaxes the kernel Image placement requirements, so that it
may be placed at any 2 MB aligned offset in physical memory.

This is accomplished by ignoring PHYS_OFFSET when installing
memblocks, and accounting for the apparent virtual offset of
the kernel Image. As a result, virtual address references
below PAGE_OFFSET are correctly mapped onto physical references
into the kernel Image regardless of where it sits in memory.
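
A simplified sketch of the resulting translation (not part of the patch;
all addresses are invented) may help: linear-map addresses are converted
using PHYS_OFFSET/PAGE_OFFSET as before, while kernel-image addresses,
which live below PAGE_OFFSET, are converted using the new kimage_voffset
variable, mirroring the __virt_to_phys() hunk in memory.h below:

  #include <stdio.h>

  #define PAGE_OFFSET  0xffffffc000000000UL  /* start of the linear mapping (39-bit VA example) */

  static unsigned long phys_offset;          /* physical base of the linear mapping */
  static unsigned long kimage_voffset;       /* kernel image VA minus its PA */

  static unsigned long virt_to_phys(unsigned long va)
  {
          return va >= PAGE_OFFSET ? va - PAGE_OFFSET + phys_offset
                                   : va - kimage_voffset;
  }

  int main(void)
  {
          phys_offset    = 0x80000000UL;                            /* DRAM starts at 2 GB */
          kimage_voffset = 0xffffff8008080000UL - 0xa0080000UL;     /* image loaded high in RAM */

          printf("%lx\n", virt_to_phys(PAGE_OFFSET + 0x1000));      /* 80001000 */
          printf("%lx\n", virt_to_phys(0xffffff8008080000UL));      /* a0080000 */
          return 0;
  }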

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 Documentation/arm64/booting.txt  | 12 ++---
 arch/arm64/include/asm/boot.h    |  5 ++
 arch/arm64/include/asm/kvm_mmu.h |  2 +-
 arch/arm64/include/asm/memory.h  | 15 +++---
 arch/arm64/kernel/head.S         |  6 ++-
 arch/arm64/mm/init.c             | 50 +++++++++++++++++++-
 arch/arm64/mm/mmu.c              | 12 +++++
 7 files changed, 86 insertions(+), 16 deletions(-)

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 701d39d3171a..03e02ebc1b0c 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -117,14 +117,14 @@ Header notes:
   depending on selected features, and is effectively unbound.
 
 The Image must be placed text_offset bytes from a 2MB aligned base
-address near the start of usable system RAM and called there. Memory
-below that base address is currently unusable by Linux, and therefore it
-is strongly recommended that this location is the start of system RAM.
-The region between the 2 MB aligned base address and the start of the
-image has no special significance to the kernel, and may be used for
-other purposes.
+address anywhere in usable system RAM and called there. The region
+between the 2 MB aligned base address and the start of the image has no
+special significance to the kernel, and may be used for other purposes.
 At least image_size bytes from the start of the image must be free for
 use by the kernel.
+NOTE: versions prior to v4.6 cannot make use of memory below the
+physical offset of the Image so it is recommended that the Image be
+placed as close as possible to the start of system RAM.
 
 Any memory described to the kernel (even that below the start of the
 image) which is not marked as reserved from the kernel (e.g., with a
diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
index 81151b67b26b..984cb0fa61ce 100644
--- a/arch/arm64/include/asm/boot.h
+++ b/arch/arm64/include/asm/boot.h
@@ -11,4 +11,9 @@
 #define MIN_FDT_ALIGN		8
 #define MAX_FDT_SIZE		SZ_2M
 
+/*
+ * arm64 requires the kernel image to be 2 MB aligned
+ */
+#define MIN_KIMG_ALIGN         SZ_2M
+
 #endif
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 0899026a2821..7e9516365b76 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -73,7 +73,7 @@
 
 #define KERN_TO_HYP(kva)	((unsigned long)kva - PAGE_OFFSET + HYP_PAGE_OFFSET)
 
-#define kvm_ksym_ref(sym)	((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
+#define kvm_ksym_ref(sym)	phys_to_virt((u64)&sym - kimage_voffset)
 
 /*
  * We currently only support a 40bit IPA.
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 1dcbf142d36c..557228658666 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -88,10 +88,10 @@
 #define __virt_to_phys(x) ({						\
 	phys_addr_t __x = (phys_addr_t)(x);				\
 	__x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) :	\
-			     (__x - KIMAGE_VADDR + PHYS_OFFSET); })
+			     (__x - kimage_voffset); })
 
 #define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
-#define __phys_to_kimg(x)	((unsigned long)((x) - PHYS_OFFSET + KIMAGE_VADDR))
+#define __phys_to_kimg(x)	((unsigned long)((x) + kimage_voffset))
 
 /*
  * Convert a page to/from a physical address
@@ -121,13 +121,14 @@ extern phys_addr_t		memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
 #define PHYS_OFFSET		({ memstart_addr; })
 
+/* the offset between the kernel virtual and physical mappings */
+extern u64			kimage_voffset;
+
 /*
- * The maximum physical address that the linear direct mapping
- * of system RAM can cover. (PAGE_OFFSET can be interpreted as
- * a 2's complement signed quantity and negated to derive the
- * maximum size of the linear mapping.)
+ * Allow all memory at the discovery stage. We will clip it later.
  */
-#define MAX_MEMBLOCK_ADDR	({ memstart_addr - PAGE_OFFSET - 1; })
+#define MIN_MEMBLOCK_ADDR	0
+#define MAX_MEMBLOCK_ADDR	U64_MAX
 
 /*
  * PFNs are used to describe any physical page; this means
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 1230fa93fd8c..01a33e42ed70 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -445,7 +445,11 @@ __mmap_switched:
 2:
 	adr_l	sp, initial_sp, x4
 	str_l	x21, __fdt_pointer, x5		// Save FDT pointer
-	str_l	x24, memstart_addr, x6		// Save PHYS_OFFSET
+
+	ldr	x0, =KIMAGE_VADDR		// Save the offset between
+	sub	x24, x0, x24			// the kernel virtual and
+	str_l	x24, kimage_voffset, x0		// physical mappings
+
 	mov	x29, #0
 #ifdef CONFIG_KASAN
 	bl	kasan_early_init
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 2cfc9c54bf51..6aafe15c7754 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -35,6 +35,7 @@
 #include <linux/efi.h>
 #include <linux/swiotlb.h>
 
+#include <asm/boot.h>
 #include <asm/fixmap.h>
 #include <asm/kernel-pgtable.h>
 #include <asm/memory.h>
@@ -158,9 +159,56 @@ static int __init early_mem(char *p)
 }
 early_param("mem", early_mem);
 
+static void __init enforce_memory_limit(void)
+{
+	const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
+	u64 to_remove = memblock_phys_mem_size() - memory_limit;
+	phys_addr_t max_addr = 0;
+	struct memblock_region *r;
+
+	if (memory_limit == (phys_addr_t)ULLONG_MAX)
+		return;
+
+	/*
+	 * The kernel may be high up in physical memory, so try to apply the
+	 * limit below the kernel first, and only let the generic handling
+	 * take over if it turns out we haven't clipped enough memory yet.
+	 */
+	for_each_memblock(memory, r) {
+		if (r->base + r->size > kbase) {
+			u64 rem = min(to_remove, kbase - r->base);
+
+			max_addr = r->base + rem;
+			to_remove -= rem;
+			break;
+		}
+		if (to_remove <= r->size) {
+			max_addr = r->base + to_remove;
+			to_remove = 0;
+			break;
+		}
+		to_remove -= r->size;
+	}
+
+	memblock_remove(0, max_addr);
+
+	if (to_remove)
+		memblock_enforce_memory_limit(memory_limit);
+}
+
 void __init arm64_memblock_init(void)
 {
-	memblock_enforce_memory_limit(memory_limit);
+	/*
+	 * Remove the memory that we will not be able to cover
+	 * with the linear mapping.
+	 */
+	const s64 linear_region_size = -(s64)PAGE_OFFSET;
+
+	memblock_remove(round_down(memblock_start_of_DRAM(),
+				   1 << SWAPPER_TABLE_SHIFT) +
+			linear_region_size, ULLONG_MAX);
+
+	enforce_memory_limit();
 
 	/*
 	 * Register the kernel text, kernel data, initrd, and initial
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 6275d183c005..10067385e40f 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -44,6 +44,9 @@
 
 u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
 
+u64 kimage_voffset __read_mostly;
+EXPORT_SYMBOL(kimage_voffset);
+
 /*
  * Empty_zero_page is a special page that is used for zero-initialized data
  * and COW.
@@ -326,6 +329,15 @@ static void __init map_mem(pgd_t *pgd)
 {
 	struct memblock_region *reg;
 
+	/*
+	 * Select a suitable value for the base of physical memory.
+	 * This should be equal to or below the lowest usable physical
+	 * memory address, and aligned to PUD/PMD size so that we can map
+	 * it efficiently.
+	 */
+	memstart_addr = round_down(memblock_start_of_DRAM(),
+				   1 << SWAPPER_TABLE_SHIFT);
+
 	/* map all the memory banks */
 	for_each_memblock(memory, reg) {
 		phys_addr_t start = reg->base;
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 156+ messages in thread

* [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
@ 2015-12-30 15:26   ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel

This relaxes the kernel Image placement requirements, so that it
may be placed at any 2 MB aligned offset in physical memory.

This is accomplished by ignoring PHYS_OFFSET when installing
memblocks, and accounting for the apparent virtual offset of
the kernel Image. As a result, virtual address references
below PAGE_OFFSET are correctly mapped onto physical references
into the kernel Image regardless of where it sits in memory.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 Documentation/arm64/booting.txt  | 12 ++---
 arch/arm64/include/asm/boot.h    |  5 ++
 arch/arm64/include/asm/kvm_mmu.h |  2 +-
 arch/arm64/include/asm/memory.h  | 15 +++---
 arch/arm64/kernel/head.S         |  6 ++-
 arch/arm64/mm/init.c             | 50 +++++++++++++++++++-
 arch/arm64/mm/mmu.c              | 12 +++++
 7 files changed, 86 insertions(+), 16 deletions(-)

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 701d39d3171a..03e02ebc1b0c 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -117,14 +117,14 @@ Header notes:
   depending on selected features, and is effectively unbound.
 
 The Image must be placed text_offset bytes from a 2MB aligned base
-address near the start of usable system RAM and called there. Memory
-below that base address is currently unusable by Linux, and therefore it
-is strongly recommended that this location is the start of system RAM.
-The region between the 2 MB aligned base address and the start of the
-image has no special significance to the kernel, and may be used for
-other purposes.
+address anywhere in usable system RAM and called there. The region
+between the 2 MB aligned base address and the start of the image has no
+special significance to the kernel, and may be used for other purposes.
 At least image_size bytes from the start of the image must be free for
 use by the kernel.
+NOTE: versions prior to v4.6 cannot make use of memory below the
+physical offset of the Image so it is recommended that the Image be
+placed as close as possible to the start of system RAM.
 
 Any memory described to the kernel (even that below the start of the
 image) which is not marked as reserved from the kernel (e.g., with a
diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
index 81151b67b26b..984cb0fa61ce 100644
--- a/arch/arm64/include/asm/boot.h
+++ b/arch/arm64/include/asm/boot.h
@@ -11,4 +11,9 @@
 #define MIN_FDT_ALIGN		8
 #define MAX_FDT_SIZE		SZ_2M
 
+/*
+ * arm64 requires the kernel image to be 2 MB aligned
+ */
+#define MIN_KIMG_ALIGN         SZ_2M
+
 #endif
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 0899026a2821..7e9516365b76 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -73,7 +73,7 @@
 
 #define KERN_TO_HYP(kva)	((unsigned long)kva - PAGE_OFFSET + HYP_PAGE_OFFSET)
 
-#define kvm_ksym_ref(sym)	((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
+#define kvm_ksym_ref(sym)	phys_to_virt((u64)&sym - kimage_voffset)
 
 /*
  * We currently only support a 40bit IPA.
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 1dcbf142d36c..557228658666 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -88,10 +88,10 @@
 #define __virt_to_phys(x) ({						\
 	phys_addr_t __x = (phys_addr_t)(x);				\
 	__x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) :	\
-			     (__x - KIMAGE_VADDR + PHYS_OFFSET); })
+			     (__x - kimage_voffset); })
 
 #define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
-#define __phys_to_kimg(x)	((unsigned long)((x) - PHYS_OFFSET + KIMAGE_VADDR))
+#define __phys_to_kimg(x)	((unsigned long)((x) + kimage_voffset))
 
 /*
  * Convert a page to/from a physical address
@@ -121,13 +121,14 @@ extern phys_addr_t		memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
 #define PHYS_OFFSET		({ memstart_addr; })
 
+/* the offset between the kernel virtual and physical mappings */
+extern u64			kimage_voffset;
+
 /*
- * The maximum physical address that the linear direct mapping
- * of system RAM can cover. (PAGE_OFFSET can be interpreted as
- * a 2's complement signed quantity and negated to derive the
- * maximum size of the linear mapping.)
+ * Allow all memory at the discovery stage. We will clip it later.
  */
-#define MAX_MEMBLOCK_ADDR	({ memstart_addr - PAGE_OFFSET - 1; })
+#define MIN_MEMBLOCK_ADDR	0
+#define MAX_MEMBLOCK_ADDR	U64_MAX
 
 /*
  * PFNs are used to describe any physical page; this means
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 1230fa93fd8c..01a33e42ed70 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -445,7 +445,11 @@ __mmap_switched:
 2:
 	adr_l	sp, initial_sp, x4
 	str_l	x21, __fdt_pointer, x5		// Save FDT pointer
-	str_l	x24, memstart_addr, x6		// Save PHYS_OFFSET
+
+	ldr	x0, =KIMAGE_VADDR		// Save the offset between
+	sub	x24, x0, x24			// the kernel virtual and
+	str_l	x24, kimage_voffset, x0		// physical mappings
+
 	mov	x29, #0
 #ifdef CONFIG_KASAN
 	bl	kasan_early_init
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 2cfc9c54bf51..6aafe15c7754 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -35,6 +35,7 @@
 #include <linux/efi.h>
 #include <linux/swiotlb.h>
 
+#include <asm/boot.h>
 #include <asm/fixmap.h>
 #include <asm/kernel-pgtable.h>
 #include <asm/memory.h>
@@ -158,9 +159,56 @@ static int __init early_mem(char *p)
 }
 early_param("mem", early_mem);
 
+static void __init enforce_memory_limit(void)
+{
+	const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
+	u64 to_remove = memblock_phys_mem_size() - memory_limit;
+	phys_addr_t max_addr = 0;
+	struct memblock_region *r;
+
+	if (memory_limit == (phys_addr_t)ULLONG_MAX)
+		return;
+
+	/*
+	 * The kernel may be high up in physical memory, so try to apply the
+	 * limit below the kernel first, and only let the generic handling
+	 * take over if it turns out we haven't clipped enough memory yet.
+	 */
+	for_each_memblock(memory, r) {
+		if (r->base + r->size > kbase) {
+			u64 rem = min(to_remove, kbase - r->base);
+
+			max_addr = r->base + rem;
+			to_remove -= rem;
+			break;
+		}
+		if (to_remove <= r->size) {
+			max_addr = r->base + to_remove;
+			to_remove = 0;
+			break;
+		}
+		to_remove -= r->size;
+	}
+
+	memblock_remove(0, max_addr);
+
+	if (to_remove)
+		memblock_enforce_memory_limit(memory_limit);
+}
+
 void __init arm64_memblock_init(void)
 {
-	memblock_enforce_memory_limit(memory_limit);
+	/*
+	 * Remove the memory that we will not be able to cover
+	 * with the linear mapping.
+	 */
+	const s64 linear_region_size = -(s64)PAGE_OFFSET;
+
+	memblock_remove(round_down(memblock_start_of_DRAM(),
+				   1 << SWAPPER_TABLE_SHIFT) +
+			linear_region_size, ULLONG_MAX);
+
+	enforce_memory_limit();
 
 	/*
 	 * Register the kernel text, kernel data, initrd, and initial
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 6275d183c005..10067385e40f 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -44,6 +44,9 @@
 
 u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
 
+u64 kimage_voffset __read_mostly;
+EXPORT_SYMBOL(kimage_voffset);
+
 /*
  * Empty_zero_page is a special page that is used for zero-initialized data
  * and COW.
@@ -326,6 +329,15 @@ static void __init map_mem(pgd_t *pgd)
 {
 	struct memblock_region *reg;
 
+	/*
+	 * Select a suitable value for the base of physical memory.
+	 * This should be equal to or below the lowest usable physical
+	 * memory address, and aligned to PUD/PMD size so that we can map
+	 * it efficiently.
+	 */
+	memstart_addr = round_down(memblock_start_of_DRAM(),
+				   1 << SWAPPER_TABLE_SHIFT);
+
 	/* map all the memory banks */
 	for_each_memblock(memory, reg) {
 		phys_addr_t start = reg->base;
-- 
2.5.0

* [kernel-hardening] [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
@ 2015-12-30 15:26   ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

This relaxes the kernel Image placement requirements, so that it
may be placed at any 2 MB aligned offset in physical memory.

This is accomplished by ignoring PHYS_OFFSET when installing
memblocks, and accounting for the apparent virtual offset of
the kernel Image. As a result, virtual address references
below PAGE_OFFSET are correctly mapped onto physical references
into the kernel Image regardless of where it sits in memory.
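
As a purely illustrative aside (not part of the patch), the effect on
address translation can be pictured with the following C sketch; the
helper name is invented, and the real logic is the __virt_to_phys()
macro updated in asm/memory.h below:

	/* hypothetical helper mirroring the updated __virt_to_phys() */
	static inline phys_addr_t example_virt_to_phys(u64 va)
	{
		if (va >= PAGE_OFFSET)		/* linear mapping of RAM */
			return va - PAGE_OFFSET + PHYS_OFFSET;
		/* kernel image reference: apply the recorded image offset */
		return va - kimage_voffset;
	}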

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 Documentation/arm64/booting.txt  | 12 ++---
 arch/arm64/include/asm/boot.h    |  5 ++
 arch/arm64/include/asm/kvm_mmu.h |  2 +-
 arch/arm64/include/asm/memory.h  | 15 +++---
 arch/arm64/kernel/head.S         |  6 ++-
 arch/arm64/mm/init.c             | 50 +++++++++++++++++++-
 arch/arm64/mm/mmu.c              | 12 +++++
 7 files changed, 86 insertions(+), 16 deletions(-)

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 701d39d3171a..03e02ebc1b0c 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -117,14 +117,14 @@ Header notes:
   depending on selected features, and is effectively unbound.
 
 The Image must be placed text_offset bytes from a 2MB aligned base
-address near the start of usable system RAM and called there. Memory
-below that base address is currently unusable by Linux, and therefore it
-is strongly recommended that this location is the start of system RAM.
-The region between the 2 MB aligned base address and the start of the
-image has no special significance to the kernel, and may be used for
-other purposes.
+address anywhere in usable system RAM and called there. The region
+between the 2 MB aligned base address and the start of the image has no
+special significance to the kernel, and may be used for other purposes.
 At least image_size bytes from the start of the image must be free for
 use by the kernel.
+NOTE: versions prior to v4.6 cannot make use of memory below the
+physical offset of the Image so it is recommended that the Image be
+placed as close as possible to the start of system RAM.
 
 Any memory described to the kernel (even that below the start of the
 image) which is not marked as reserved from the kernel (e.g., with a
diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
index 81151b67b26b..984cb0fa61ce 100644
--- a/arch/arm64/include/asm/boot.h
+++ b/arch/arm64/include/asm/boot.h
@@ -11,4 +11,9 @@
 #define MIN_FDT_ALIGN		8
 #define MAX_FDT_SIZE		SZ_2M
 
+/*
+ * arm64 requires the kernel image to be 2 MB aligned
+ */
+#define MIN_KIMG_ALIGN         SZ_2M
+
 #endif
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 0899026a2821..7e9516365b76 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -73,7 +73,7 @@
 
 #define KERN_TO_HYP(kva)	((unsigned long)kva - PAGE_OFFSET + HYP_PAGE_OFFSET)
 
-#define kvm_ksym_ref(sym)	((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
+#define kvm_ksym_ref(sym)	phys_to_virt((u64)&sym - kimage_voffset)
 
 /*
  * We currently only support a 40bit IPA.
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 1dcbf142d36c..557228658666 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -88,10 +88,10 @@
 #define __virt_to_phys(x) ({						\
 	phys_addr_t __x = (phys_addr_t)(x);				\
 	__x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) :	\
-			     (__x - KIMAGE_VADDR + PHYS_OFFSET); })
+			     (__x - kimage_voffset); })
 
 #define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
-#define __phys_to_kimg(x)	((unsigned long)((x) - PHYS_OFFSET + KIMAGE_VADDR))
+#define __phys_to_kimg(x)	((unsigned long)((x) + kimage_voffset))
 
 /*
  * Convert a page to/from a physical address
@@ -121,13 +121,14 @@ extern phys_addr_t		memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
 #define PHYS_OFFSET		({ memstart_addr; })
 
+/* the offset between the kernel virtual and physical mappings */
+extern u64			kimage_voffset;
+
 /*
- * The maximum physical address that the linear direct mapping
- * of system RAM can cover. (PAGE_OFFSET can be interpreted as
- * a 2's complement signed quantity and negated to derive the
- * maximum size of the linear mapping.)
+ * Allow all memory at the discovery stage. We will clip it later.
  */
-#define MAX_MEMBLOCK_ADDR	({ memstart_addr - PAGE_OFFSET - 1; })
+#define MIN_MEMBLOCK_ADDR	0
+#define MAX_MEMBLOCK_ADDR	U64_MAX
 
 /*
  * PFNs are used to describe any physical page; this means
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 1230fa93fd8c..01a33e42ed70 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -445,7 +445,11 @@ __mmap_switched:
 2:
 	adr_l	sp, initial_sp, x4
 	str_l	x21, __fdt_pointer, x5		// Save FDT pointer
-	str_l	x24, memstart_addr, x6		// Save PHYS_OFFSET
+
+	ldr	x0, =KIMAGE_VADDR		// Save the offset between
+	sub	x24, x0, x24			// the kernel virtual and
+	str_l	x24, kimage_voffset, x0		// physical mappings
+
 	mov	x29, #0
 #ifdef CONFIG_KASAN
 	bl	kasan_early_init
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 2cfc9c54bf51..6aafe15c7754 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -35,6 +35,7 @@
 #include <linux/efi.h>
 #include <linux/swiotlb.h>
 
+#include <asm/boot.h>
 #include <asm/fixmap.h>
 #include <asm/kernel-pgtable.h>
 #include <asm/memory.h>
@@ -158,9 +159,56 @@ static int __init early_mem(char *p)
 }
 early_param("mem", early_mem);
 
+static void __init enforce_memory_limit(void)
+{
+	const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
+	u64 to_remove = memblock_phys_mem_size() - memory_limit;
+	phys_addr_t max_addr = 0;
+	struct memblock_region *r;
+
+	if (memory_limit == (phys_addr_t)ULLONG_MAX)
+		return;
+
+	/*
+	 * The kernel may be high up in physical memory, so try to apply the
+	 * limit below the kernel first, and only let the generic handling
+	 * take over if it turns out we haven't clipped enough memory yet.
+	 */
+	for_each_memblock(memory, r) {
+		if (r->base + r->size > kbase) {
+			u64 rem = min(to_remove, kbase - r->base);
+
+			max_addr = r->base + rem;
+			to_remove -= rem;
+			break;
+		}
+		if (to_remove <= r->size) {
+			max_addr = r->base + to_remove;
+			to_remove = 0;
+			break;
+		}
+		to_remove -= r->size;
+	}
+
+	memblock_remove(0, max_addr);
+
+	if (to_remove)
+		memblock_enforce_memory_limit(memory_limit);
+}
+
 void __init arm64_memblock_init(void)
 {
-	memblock_enforce_memory_limit(memory_limit);
+	/*
+	 * Remove the memory that we will not be able to cover
+	 * with the linear mapping.
+	 */
+	const s64 linear_region_size = -(s64)PAGE_OFFSET;
+
+	memblock_remove(round_down(memblock_start_of_DRAM(),
+				   1 << SWAPPER_TABLE_SHIFT) +
+			linear_region_size, ULLONG_MAX);
+
+	enforce_memory_limit();
 
 	/*
 	 * Register the kernel text, kernel data, initrd, and initial
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 6275d183c005..10067385e40f 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -44,6 +44,9 @@
 
 u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
 
+u64 kimage_voffset __read_mostly;
+EXPORT_SYMBOL(kimage_voffset);
+
 /*
  * Empty_zero_page is a special page that is used for zero-initialized data
  * and COW.
@@ -326,6 +329,15 @@ static void __init map_mem(pgd_t *pgd)
 {
 	struct memblock_region *reg;
 
+	/*
+	 * Select a suitable value for the base of physical memory.
+	 * This should be equal to or below the lowest usable physical
+	 * memory address, and aligned to PUD/PMD size so that we can map
+	 * it efficiently.
+	 */
+	memstart_addr = round_down(memblock_start_of_DRAM(),
+				   1 << SWAPPER_TABLE_SHIFT);
+
 	/* map all the memory banks */
 	for_each_memblock(memory, reg) {
 		phys_addr_t start = reg->base;
-- 
2.5.0

* [PATCH v2 12/13] arm64: add support for relocatable kernel
  2015-12-30 15:25 ` Ard Biesheuvel
@ 2015-12-30 15:26   ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

This adds support for runtime relocation of the kernel Image, by
building it as a PIE (ET_DYN) executable and applying the dynamic
relocations in the early boot code.
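
For readers who prefer C over assembly, the R_AARCH64_RELATIVE half of
the fixup loop added to head.S below is roughly equivalent to the sketch
under these assumptions: the struct and function names are invented, and
the R_AARCH64_ABS64 case (which additionally indexes the dynamic symbol
table) is omitted:

	/* illustrative only -- the real loop is written in assembly */
	struct example_rela { u64 r_offset; u64 r_info; s64 r_addend; };

	static void example_apply_relative(struct example_rela *r,
					   struct example_rela *end,
					   u64 kaslr_offset)
	{
		for (; r < end; r++) {
			if ((u32)r->r_info != 0x403)	/* R_AARCH64_RELATIVE */
				continue;
			/* patch the 64-bit word at the displaced target */
			*(u64 *)(r->r_offset + kaslr_offset) =
					r->r_addend + kaslr_offset;
		}
	}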

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 Documentation/arm64/booting.txt |  3 +-
 arch/arm64/Kconfig              | 13 ++++
 arch/arm64/Makefile             |  6 +-
 arch/arm64/include/asm/memory.h |  3 +
 arch/arm64/kernel/head.S        | 75 +++++++++++++++++++-
 arch/arm64/kernel/setup.c       | 22 +++---
 arch/arm64/kernel/vmlinux.lds.S |  9 +++
 scripts/sortextable.c           |  4 +-
 8 files changed, 117 insertions(+), 18 deletions(-)

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 03e02ebc1b0c..b17181eb4a43 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -109,7 +109,8 @@ Header notes:
 			1 - 4K
 			2 - 16K
 			3 - 64K
-  Bits 3-63:	Reserved.
+  Bit 3:	Relocatable kernel.
+  Bits 4-63:	Reserved.
 
 - When image_size is zero, a bootloader should attempt to keep as much
   memory as possible free for use by the kernel immediately after the
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 54eeab140bca..f458fb9e0dce 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -363,6 +363,7 @@ config ARM64_ERRATUM_843419
 	bool "Cortex-A53: 843419: A load or store might access an incorrect address"
 	depends on MODULES
 	default y
+	select ARM64_MODULE_CMODEL_LARGE
 	help
 	  This option builds kernel modules using the large memory model in
 	  order to avoid the use of the ADRP instruction, which can cause
@@ -709,6 +710,18 @@ config ARM64_MODULE_PLTS
 	bool
 	select HAVE_MOD_ARCH_SPECIFIC
 
+config ARM64_MODULE_CMODEL_LARGE
+	bool
+
+config ARM64_RELOCATABLE_KERNEL
+	bool "Kernel address space layout randomization (KASLR)"
+	select ARM64_MODULE_PLTS
+	select ARM64_MODULE_CMODEL_LARGE
+	help
+	  This feature randomizes the virtual address of the kernel image, to
+	  harden against exploits that rely on knowledge about the absolute
+	  addresses of certain kernel data structures.
+
 endmenu
 
 menu "Boot options"
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index d4654830e536..75dc477d45f5 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -15,6 +15,10 @@ CPPFLAGS_vmlinux.lds = -DTEXT_OFFSET=$(TEXT_OFFSET)
 OBJCOPYFLAGS	:=-O binary -R .note -R .note.gnu.build-id -R .comment -S
 GZFLAGS		:=-9
 
+ifneq ($(CONFIG_ARM64_RELOCATABLE_KERNEL),)
+LDFLAGS_vmlinux		+= -pie
+endif
+
 KBUILD_DEFCONFIG := defconfig
 
 # Check for binutils support for specific extensions
@@ -41,7 +45,7 @@ endif
 
 CHECKFLAGS	+= -D__aarch64__
 
-ifeq ($(CONFIG_ARM64_ERRATUM_843419), y)
+ifeq ($(CONFIG_ARM64_MODULE_CMODEL_LARGE), y)
 KBUILD_CFLAGS_MODULE	+= -mcmodel=large
 endif
 
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 557228658666..afab3e669e19 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -121,6 +121,9 @@ extern phys_addr_t		memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
 #define PHYS_OFFSET		({ memstart_addr; })
 
+/* the virtual base of the kernel image (minus TEXT_OFFSET) */
+extern u64			kimage_vaddr;
+
 /* the offset between the kernel virtual and physical mappings */
 extern u64			kimage_voffset;
 
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 01a33e42ed70..ab582ee58b58 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -59,8 +59,15 @@
 
 #define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2)
 
+#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
+#define __HEAD_FLAG_RELOC	1
+#else
+#define __HEAD_FLAG_RELOC	0
+#endif
+
 #define __HEAD_FLAGS	((__HEAD_FLAG_BE << 0) |	\
-			 (__HEAD_FLAG_PAGE_SIZE << 1))
+			 (__HEAD_FLAG_PAGE_SIZE << 1) |	\
+			 (__HEAD_FLAG_RELOC << 3))
 
 /*
  * Kernel startup entry point.
@@ -231,6 +238,9 @@ ENTRY(stext)
 	 */
 	ldr	x27, 0f				// address to jump to after
 						// MMU has been enabled
+#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
+	add	x27, x27, x23			// add KASLR displacement
+#endif
 	adr_l	lr, __enable_mmu		// return (PIC) address
 	b	__cpu_setup			// initialise processor
 ENDPROC(stext)
@@ -243,6 +253,16 @@ ENDPROC(stext)
 preserve_boot_args:
 	mov	x21, x0				// x21=FDT
 
+#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
+	/*
+	 * Mask off the bits of the random value supplied in x1 so it can serve
+	 * as a KASLR displacement value which will move the kernel image to a
+	 * random offset in the lower half of the VMALLOC area.
+	 */
+	mov	x23, #(1 << (VA_BITS - 2)) - 1
+	and	x23, x23, x1, lsl #SWAPPER_BLOCK_SHIFT
+#endif
+
 	adr_l	x0, boot_args			// record the contents of
 	stp	x21, x1, [x0]			// x0 .. x3 at kernel entry
 	stp	x2, x3, [x0, #16]
@@ -402,6 +422,9 @@ __create_page_tables:
 	 */
 	mov	x0, x26				// swapper_pg_dir
 	ldr	x5, =KIMAGE_VADDR
+#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
+	add	x5, x5, x23			// add KASLR displacement
+#endif
 	create_pgd_entry x0, x5, x3, x6
 	ldr	w6, kernel_img_size
 	add	x6, x6, x5
@@ -443,10 +466,52 @@ __mmap_switched:
 	str	xzr, [x6], #8			// Clear BSS
 	b	1b
 2:
+
+#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
+
+#define R_AARCH64_RELATIVE	0x403
+#define R_AARCH64_ABS64		0x101
+
+	/*
+	 * Iterate over each entry in the relocation table, and apply the
+	 * relocations in place.
+	 */
+	adr_l	x8, __dynsym_start		// start of symbol table
+	adr_l	x9, __reloc_start		// start of reloc table
+	adr_l	x10, __reloc_end		// end of reloc table
+
+0:	cmp	x9, x10
+	b.hs	2f
+	ldp	x11, x12, [x9], #24
+	ldr	x13, [x9, #-8]
+	cmp	w12, #R_AARCH64_RELATIVE
+	b.ne	1f
+	add	x13, x13, x23			// relocate
+	str	x13, [x11, x23]
+	b	0b
+
+1:	cmp	w12, #R_AARCH64_ABS64
+	b.ne	0b
+	add	x12, x12, x12, lsl #1		// symtab offset: 24x top word
+	add	x12, x8, x12, lsr #(32 - 3)	// ... shifted into bottom word
+	ldrsh	w14, [x12, #6]			// Elf64_Sym::st_shndx
+	ldr	x15, [x12, #8]			// Elf64_Sym::st_value
+	cmp	w14, #-0xf			// SHN_ABS (0xfff1) ?
+	add	x14, x15, x23			// relocate
+	csel	x15, x14, x15, ne
+	add	x15, x13, x15
+	str	x15, [x11, x23]
+	b	0b
+
+2:	adr_l	x8, kimage_vaddr		// make relocated kimage_vaddr
+	dc	cvac, x8			// value visible to secondaries
+	dsb	sy				// with MMU off
+#endif
+
 	adr_l	sp, initial_sp, x4
 	str_l	x21, __fdt_pointer, x5		// Save FDT pointer
 
-	ldr	x0, =KIMAGE_VADDR		// Save the offset between
+	ldr_l	x0, kimage_vaddr		// Save the offset between
 	sub	x24, x0, x24			// the kernel virtual and
 	str_l	x24, kimage_voffset, x0		// physical mappings
 
@@ -462,6 +527,10 @@ ENDPROC(__mmap_switched)
  * hotplug and needs to have the same protections as the text region
  */
 	.section ".text","ax"
+
+ENTRY(kimage_vaddr)
+	.quad		_text - TEXT_OFFSET
+
 /*
  * If we're fortunate enough to boot at EL2, ensure that the world is
  * sane before dropping to EL1.
@@ -622,7 +691,7 @@ ENTRY(secondary_startup)
 	adrp	x26, swapper_pg_dir
 	bl	__cpu_setup			// initialise processor
 
-	ldr	x8, =KIMAGE_VADDR
+	ldr	x8, kimage_vaddr
 	ldr	w9, 0f
 	sub	x27, x8, w9, sxtw		// address to jump to after enabling the MMU
 	b	__enable_mmu
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 96177a7c0f05..2faee6042e99 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -292,16 +292,15 @@ u64 __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = INVALID_HWID };
 
 void __init setup_arch(char **cmdline_p)
 {
-	static struct vm_struct vmlinux_vm __initdata = {
-		.addr		= (void *)KIMAGE_VADDR,
-		.size		= 0,
-		.flags		= VM_IOREMAP,
-		.caller		= setup_arch,
-	};
-
-	vmlinux_vm.size = round_up((unsigned long)_end - KIMAGE_VADDR,
-				   1 << SWAPPER_BLOCK_SHIFT);
-	vmlinux_vm.phys_addr = __pa(KIMAGE_VADDR);
+	static struct vm_struct vmlinux_vm __initdata;
+
+	vmlinux_vm.addr = (void *)kimage_vaddr;
+	vmlinux_vm.size = round_up((u64)_end - kimage_vaddr,
+				   SWAPPER_BLOCK_SIZE);
+	vmlinux_vm.phys_addr = __pa(kimage_vaddr);
+	vmlinux_vm.flags = VM_IOREMAP;
+	vmlinux_vm.caller = setup_arch;
+
 	vm_area_add_early(&vmlinux_vm);
 
 	pr_info("Boot CPU: AArch64 Processor [%08x]\n", read_cpuid_id());
@@ -367,7 +366,8 @@ void __init setup_arch(char **cmdline_p)
 	conswitchp = &dummy_con;
 #endif
 #endif
-	if (boot_args[1] || boot_args[2] || boot_args[3]) {
+	if ((!IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && boot_args[1]) ||
+	    boot_args[2] || boot_args[3]) {
 		pr_err("WARNING: x1-x3 nonzero in violation of boot protocol:\n"
 			"\tx1: %016llx\n\tx2: %016llx\n\tx3: %016llx\n"
 			"This indicates a broken bootloader or old kernel\n",
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index f935f082188d..cc1486039338 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -148,6 +148,15 @@ SECTIONS
 	.altinstr_replacement : {
 		*(.altinstr_replacement)
 	}
+	.rela : ALIGN(8) {
+		__reloc_start = .;
+		*(.rela .rela*)
+		__reloc_end = .;
+	}
+	.dynsym : ALIGN(8) {
+		__dynsym_start = .;
+		*(.dynsym)
+	}
 
 	. = ALIGN(PAGE_SIZE);
 	__init_end = .;
diff --git a/scripts/sortextable.c b/scripts/sortextable.c
index af247c70fb66..5ecbedefdb0f 100644
--- a/scripts/sortextable.c
+++ b/scripts/sortextable.c
@@ -266,9 +266,9 @@ do_file(char const *const fname)
 		break;
 	}  /* end switch */
 	if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0
-	||  r2(&ehdr->e_type) != ET_EXEC
+	|| (r2(&ehdr->e_type) != ET_EXEC && r2(&ehdr->e_type) != ET_DYN)
 	||  ehdr->e_ident[EI_VERSION] != EV_CURRENT) {
-		fprintf(stderr, "unrecognized ET_EXEC file %s\n", fname);
+		fprintf(stderr, "unrecognized ET_EXEC/ET_DYN file %s\n", fname);
 		fail_file();
 	}
 
-- 
2.5.0


* [PATCH v2 12/13] arm64: add support for relocatable kernel
@ 2015-12-30 15:26   ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel

This adds support for runtime relocation of the kernel Image, by
building it as a PIE (ET_DYN) executable and applying the dynamic
relocations in the early boot code.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 Documentation/arm64/booting.txt |  3 +-
 arch/arm64/Kconfig              | 13 ++++
 arch/arm64/Makefile             |  6 +-
 arch/arm64/include/asm/memory.h |  3 +
 arch/arm64/kernel/head.S        | 75 +++++++++++++++++++-
 arch/arm64/kernel/setup.c       | 22 +++---
 arch/arm64/kernel/vmlinux.lds.S |  9 +++
 scripts/sortextable.c           |  4 +-
 8 files changed, 117 insertions(+), 18 deletions(-)

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 03e02ebc1b0c..b17181eb4a43 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -109,7 +109,8 @@ Header notes:
 			1 - 4K
 			2 - 16K
 			3 - 64K
-  Bits 3-63:	Reserved.
+  Bit 3:	Relocatable kernel.
+  Bits 4-63:	Reserved.
 
 - When image_size is zero, a bootloader should attempt to keep as much
   memory as possible free for use by the kernel immediately after the
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 54eeab140bca..f458fb9e0dce 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -363,6 +363,7 @@ config ARM64_ERRATUM_843419
 	bool "Cortex-A53: 843419: A load or store might access an incorrect address"
 	depends on MODULES
 	default y
+	select ARM64_MODULE_CMODEL_LARGE
 	help
 	  This option builds kernel modules using the large memory model in
 	  order to avoid the use of the ADRP instruction, which can cause
@@ -709,6 +710,18 @@ config ARM64_MODULE_PLTS
 	bool
 	select HAVE_MOD_ARCH_SPECIFIC
 
+config ARM64_MODULE_CMODEL_LARGE
+	bool
+
+config ARM64_RELOCATABLE_KERNEL
+	bool "Kernel address space layout randomization (KASLR)"
+	select ARM64_MODULE_PLTS
+	select ARM64_MODULE_CMODEL_LARGE
+	help
+	  This feature randomizes the virtual address of the kernel image, to
+	  harden against exploits that rely on knowledge about the absolute
+	  addresses of certain kernel data structures.
+
 endmenu
 
 menu "Boot options"
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index d4654830e536..75dc477d45f5 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -15,6 +15,10 @@ CPPFLAGS_vmlinux.lds = -DTEXT_OFFSET=$(TEXT_OFFSET)
 OBJCOPYFLAGS	:=-O binary -R .note -R .note.gnu.build-id -R .comment -S
 GZFLAGS		:=-9
 
+ifneq ($(CONFIG_ARM64_RELOCATABLE_KERNEL),)
+LDFLAGS_vmlinux		+= -pie
+endif
+
 KBUILD_DEFCONFIG := defconfig
 
 # Check for binutils support for specific extensions
@@ -41,7 +45,7 @@ endif
 
 CHECKFLAGS	+= -D__aarch64__
 
-ifeq ($(CONFIG_ARM64_ERRATUM_843419), y)
+ifeq ($(CONFIG_ARM64_MODULE_CMODEL_LARGE), y)
 KBUILD_CFLAGS_MODULE	+= -mcmodel=large
 endif
 
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 557228658666..afab3e669e19 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -121,6 +121,9 @@ extern phys_addr_t		memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
 #define PHYS_OFFSET		({ memstart_addr; })
 
+/* the virtual base of the kernel image (minus TEXT_OFFSET) */
+extern u64			kimage_vaddr;
+
 /* the offset between the kernel virtual and physical mappings */
 extern u64			kimage_voffset;
 
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 01a33e42ed70..ab582ee58b58 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -59,8 +59,15 @@
 
 #define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2)
 
+#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
+#define __HEAD_FLAG_RELOC	1
+#else
+#define __HEAD_FLAG_RELOC	0
+#endif
+
 #define __HEAD_FLAGS	((__HEAD_FLAG_BE << 0) |	\
-			 (__HEAD_FLAG_PAGE_SIZE << 1))
+			 (__HEAD_FLAG_PAGE_SIZE << 1) |	\
+			 (__HEAD_FLAG_RELOC << 3))
 
 /*
  * Kernel startup entry point.
@@ -231,6 +238,9 @@ ENTRY(stext)
 	 */
 	ldr	x27, 0f				// address to jump to after
 						// MMU has been enabled
+#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
+	add	x27, x27, x23			// add KASLR displacement
+#endif
 	adr_l	lr, __enable_mmu		// return (PIC) address
 	b	__cpu_setup			// initialise processor
 ENDPROC(stext)
@@ -243,6 +253,16 @@ ENDPROC(stext)
 preserve_boot_args:
 	mov	x21, x0				// x21=FDT
 
+#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
+	/*
+	 * Mask off the bits of the random value supplied in x1 so it can serve
+	 * as a KASLR displacement value which will move the kernel image to a
+	 * random offset in the lower half of the VMALLOC area.
+	 */
+	mov	x23, #(1 << (VA_BITS - 2)) - 1
+	and	x23, x23, x1, lsl #SWAPPER_BLOCK_SHIFT
+#endif
+
 	adr_l	x0, boot_args			// record the contents of
 	stp	x21, x1, [x0]			// x0 .. x3 at kernel entry
 	stp	x2, x3, [x0, #16]
@@ -402,6 +422,9 @@ __create_page_tables:
 	 */
 	mov	x0, x26				// swapper_pg_dir
 	ldr	x5, =KIMAGE_VADDR
+#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
+	add	x5, x5, x23			// add KASLR displacement
+#endif
 	create_pgd_entry x0, x5, x3, x6
 	ldr	w6, kernel_img_size
 	add	x6, x6, x5
@@ -443,10 +466,52 @@ __mmap_switched:
 	str	xzr, [x6], #8			// Clear BSS
 	b	1b
 2:
+
+#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
+
+#define R_AARCH64_RELATIVE	0x403
+#define R_AARCH64_ABS64		0x101
+
+	/*
+	 * Iterate over each entry in the relocation table, and apply the
+	 * relocations in place.
+	 */
+	adr_l	x8, __dynsym_start		// start of symbol table
+	adr_l	x9, __reloc_start		// start of reloc table
+	adr_l	x10, __reloc_end		// end of reloc table
+
+0:	cmp	x9, x10
+	b.hs	2f
+	ldp	x11, x12, [x9], #24
+	ldr	x13, [x9, #-8]
+	cmp	w12, #R_AARCH64_RELATIVE
+	b.ne	1f
+	add	x13, x13, x23			// relocate
+	str	x13, [x11, x23]
+	b	0b
+
+1:	cmp	w12, #R_AARCH64_ABS64
+	b.ne	0b
+	add	x12, x12, x12, lsl #1		// symtab offset: 24x top word
+	add	x12, x8, x12, lsr #(32 - 3)	// ... shifted into bottom word
+	ldrsh	w14, [x12, #6]			// Elf64_Sym::st_shndx
+	ldr	x15, [x12, #8]			// Elf64_Sym::st_value
+	cmp	w14, #-0xf			// SHN_ABS (0xfff1) ?
+	add	x14, x15, x23			// relocate
+	csel	x15, x14, x15, ne
+	add	x15, x13, x15
+	str	x15, [x11, x23]
+	b	0b
+
+2:	adr_l	x8, kimage_vaddr		// make relocated kimage_vaddr
+	dc	cvac, x8			// value visible to secondaries
+	dsb	sy				// with MMU off
+#endif
+
 	adr_l	sp, initial_sp, x4
 	str_l	x21, __fdt_pointer, x5		// Save FDT pointer
 
-	ldr	x0, =KIMAGE_VADDR		// Save the offset between
+	ldr_l	x0, kimage_vaddr		// Save the offset between
 	sub	x24, x0, x24			// the kernel virtual and
 	str_l	x24, kimage_voffset, x0		// physical mappings
 
@@ -462,6 +527,10 @@ ENDPROC(__mmap_switched)
  * hotplug and needs to have the same protections as the text region
  */
 	.section ".text","ax"
+
+ENTRY(kimage_vaddr)
+	.quad		_text - TEXT_OFFSET
+
 /*
 * If we're fortunate enough to boot at EL2, ensure that the world is
  * sane before dropping to EL1.
@@ -622,7 +691,7 @@ ENTRY(secondary_startup)
 	adrp	x26, swapper_pg_dir
 	bl	__cpu_setup			// initialise processor
 
-	ldr	x8, =KIMAGE_VADDR
+	ldr	x8, kimage_vaddr
 	ldr	w9, 0f
 	sub	x27, x8, w9, sxtw		// address to jump to after enabling the MMU
 	b	__enable_mmu
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 96177a7c0f05..2faee6042e99 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -292,16 +292,15 @@ u64 __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = INVALID_HWID };
 
 void __init setup_arch(char **cmdline_p)
 {
-	static struct vm_struct vmlinux_vm __initdata = {
-		.addr		= (void *)KIMAGE_VADDR,
-		.size		= 0,
-		.flags		= VM_IOREMAP,
-		.caller		= setup_arch,
-	};
-
-	vmlinux_vm.size = round_up((unsigned long)_end - KIMAGE_VADDR,
-				   1 << SWAPPER_BLOCK_SHIFT);
-	vmlinux_vm.phys_addr = __pa(KIMAGE_VADDR);
+	static struct vm_struct vmlinux_vm __initdata;
+
+	vmlinux_vm.addr = (void *)kimage_vaddr;
+	vmlinux_vm.size = round_up((u64)_end - kimage_vaddr,
+				   SWAPPER_BLOCK_SIZE);
+	vmlinux_vm.phys_addr = __pa(kimage_vaddr);
+	vmlinux_vm.flags = VM_IOREMAP;
+	vmlinux_vm.caller = setup_arch;
+
 	vm_area_add_early(&vmlinux_vm);
 
 	pr_info("Boot CPU: AArch64 Processor [%08x]\n", read_cpuid_id());
@@ -367,7 +366,8 @@ void __init setup_arch(char **cmdline_p)
 	conswitchp = &dummy_con;
 #endif
 #endif
-	if (boot_args[1] || boot_args[2] || boot_args[3]) {
+	if ((!IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && boot_args[1]) ||
+	    boot_args[2] || boot_args[3]) {
 		pr_err("WARNING: x1-x3 nonzero in violation of boot protocol:\n"
 			"\tx1: %016llx\n\tx2: %016llx\n\tx3: %016llx\n"
 			"This indicates a broken bootloader or old kernel\n",
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index f935f082188d..cc1486039338 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -148,6 +148,15 @@ SECTIONS
 	.altinstr_replacement : {
 		*(.altinstr_replacement)
 	}
+	.rela : ALIGN(8) {
+		__reloc_start = .;
+		*(.rela .rela*)
+		__reloc_end = .;
+	}
+	.dynsym : ALIGN(8) {
+		__dynsym_start = .;
+		*(.dynsym)
+	}
 
 	. = ALIGN(PAGE_SIZE);
 	__init_end = .;
diff --git a/scripts/sortextable.c b/scripts/sortextable.c
index af247c70fb66..5ecbedefdb0f 100644
--- a/scripts/sortextable.c
+++ b/scripts/sortextable.c
@@ -266,9 +266,9 @@ do_file(char const *const fname)
 		break;
 	}  /* end switch */
 	if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0
-	||  r2(&ehdr->e_type) != ET_EXEC
+	|| (r2(&ehdr->e_type) != ET_EXEC && r2(&ehdr->e_type) != ET_DYN)
 	||  ehdr->e_ident[EI_VERSION] != EV_CURRENT) {
-		fprintf(stderr, "unrecognized ET_EXEC file %s\n", fname);
+		fprintf(stderr, "unrecognized ET_EXEC/ET_DYN file %s\n", fname);
 		fail_file();
 	}
 
-- 
2.5.0

* [kernel-hardening] [PATCH v2 12/13] arm64: add support for relocatable kernel
@ 2015-12-30 15:26   ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

This adds support for runtime relocation of the kernel Image, by
building it as a PIE (ET_DYN) executable and applying the dynamic
relocations in the early boot code.
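
As a side note, the way the random value in x1 is turned into a virtual
displacement (see the preserve_boot_args hunk below) can be summarised
by this C sketch; the function name is invented for illustration:

	/*
	 * Hypothetical C rendering of the masking done in assembly: shift
	 * the seed up to SWAPPER_BLOCK granularity, then confine it to the
	 * lower half of the VMALLOC area.
	 */
	static u64 example_kaslr_displacement(u64 seed)
	{
		u64 mask = (1UL << (VA_BITS - 2)) - 1;

		return (seed << SWAPPER_BLOCK_SHIFT) & mask;
	}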

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 Documentation/arm64/booting.txt |  3 +-
 arch/arm64/Kconfig              | 13 ++++
 arch/arm64/Makefile             |  6 +-
 arch/arm64/include/asm/memory.h |  3 +
 arch/arm64/kernel/head.S        | 75 +++++++++++++++++++-
 arch/arm64/kernel/setup.c       | 22 +++---
 arch/arm64/kernel/vmlinux.lds.S |  9 +++
 scripts/sortextable.c           |  4 +-
 8 files changed, 117 insertions(+), 18 deletions(-)

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 03e02ebc1b0c..b17181eb4a43 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -109,7 +109,8 @@ Header notes:
 			1 - 4K
 			2 - 16K
 			3 - 64K
-  Bits 3-63:	Reserved.
+  Bit 3:	Relocatable kernel.
+  Bits 4-63:	Reserved.
 
 - When image_size is zero, a bootloader should attempt to keep as much
   memory as possible free for use by the kernel immediately after the
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 54eeab140bca..f458fb9e0dce 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -363,6 +363,7 @@ config ARM64_ERRATUM_843419
 	bool "Cortex-A53: 843419: A load or store might access an incorrect address"
 	depends on MODULES
 	default y
+	select ARM64_MODULE_CMODEL_LARGE
 	help
 	  This option builds kernel modules using the large memory model in
 	  order to avoid the use of the ADRP instruction, which can cause
@@ -709,6 +710,18 @@ config ARM64_MODULE_PLTS
 	bool
 	select HAVE_MOD_ARCH_SPECIFIC
 
+config ARM64_MODULE_CMODEL_LARGE
+	bool
+
+config ARM64_RELOCATABLE_KERNEL
+	bool "Kernel address space layout randomization (KASLR)"
+	select ARM64_MODULE_PLTS
+	select ARM64_MODULE_CMODEL_LARGE
+	help
+	  This feature randomizes the virtual address of the kernel image, to
+	  harden against exploits that rely on knowledge about the absolute
+	  addresses of certain kernel data structures.
+
 endmenu
 
 menu "Boot options"
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index d4654830e536..75dc477d45f5 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -15,6 +15,10 @@ CPPFLAGS_vmlinux.lds = -DTEXT_OFFSET=$(TEXT_OFFSET)
 OBJCOPYFLAGS	:=-O binary -R .note -R .note.gnu.build-id -R .comment -S
 GZFLAGS		:=-9
 
+ifneq ($(CONFIG_ARM64_RELOCATABLE_KERNEL),)
+LDFLAGS_vmlinux		+= -pie
+endif
+
 KBUILD_DEFCONFIG := defconfig
 
 # Check for binutils support for specific extensions
@@ -41,7 +45,7 @@ endif
 
 CHECKFLAGS	+= -D__aarch64__
 
-ifeq ($(CONFIG_ARM64_ERRATUM_843419), y)
+ifeq ($(CONFIG_ARM64_MODULE_CMODEL_LARGE), y)
 KBUILD_CFLAGS_MODULE	+= -mcmodel=large
 endif
 
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 557228658666..afab3e669e19 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -121,6 +121,9 @@ extern phys_addr_t		memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
 #define PHYS_OFFSET		({ memstart_addr; })
 
+/* the virtual base of the kernel image (minus TEXT_OFFSET) */
+extern u64			kimage_vaddr;
+
 /* the offset between the kernel virtual and physical mappings */
 extern u64			kimage_voffset;
 
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 01a33e42ed70..ab582ee58b58 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -59,8 +59,15 @@
 
 #define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2)
 
+#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
+#define __HEAD_FLAG_RELOC	1
+#else
+#define __HEAD_FLAG_RELOC	0
+#endif
+
 #define __HEAD_FLAGS	((__HEAD_FLAG_BE << 0) |	\
-			 (__HEAD_FLAG_PAGE_SIZE << 1))
+			 (__HEAD_FLAG_PAGE_SIZE << 1) |	\
+			 (__HEAD_FLAG_RELOC << 3))
 
 /*
  * Kernel startup entry point.
@@ -231,6 +238,9 @@ ENTRY(stext)
 	 */
 	ldr	x27, 0f				// address to jump to after
 						// MMU has been enabled
+#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
+	add	x27, x27, x23			// add KASLR displacement
+#endif
 	adr_l	lr, __enable_mmu		// return (PIC) address
 	b	__cpu_setup			// initialise processor
 ENDPROC(stext)
@@ -243,6 +253,16 @@ ENDPROC(stext)
 preserve_boot_args:
 	mov	x21, x0				// x21=FDT
 
+#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
+	/*
+	 * Mask off the bits of the random value supplied in x1 so it can serve
+	 * as a KASLR displacement value which will move the kernel image to a
+	 * random offset in the lower half of the VMALLOC area.
+	 */
+	mov	x23, #(1 << (VA_BITS - 2)) - 1
+	and	x23, x23, x1, lsl #SWAPPER_BLOCK_SHIFT
+#endif
+
 	adr_l	x0, boot_args			// record the contents of
 	stp	x21, x1, [x0]			// x0 .. x3 at kernel entry
 	stp	x2, x3, [x0, #16]
@@ -402,6 +422,9 @@ __create_page_tables:
 	 */
 	mov	x0, x26				// swapper_pg_dir
 	ldr	x5, =KIMAGE_VADDR
+#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
+	add	x5, x5, x23			// add KASLR displacement
+#endif
 	create_pgd_entry x0, x5, x3, x6
 	ldr	w6, kernel_img_size
 	add	x6, x6, x5
@@ -443,10 +466,52 @@ __mmap_switched:
 	str	xzr, [x6], #8			// Clear BSS
 	b	1b
 2:
+
+#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
+
+#define R_AARCH64_RELATIVE	0x403
+#define R_AARCH64_ABS64		0x101
+
+	/*
+	 * Iterate over each entry in the relocation table, and apply the
+	 * relocations in place.
+	 */
+	adr_l	x8, __dynsym_start		// start of symbol table
+	adr_l	x9, __reloc_start		// start of reloc table
+	adr_l	x10, __reloc_end		// end of reloc table
+
+0:	cmp	x9, x10
+	b.hs	2f
+	ldp	x11, x12, [x9], #24
+	ldr	x13, [x9, #-8]
+	cmp	w12, #R_AARCH64_RELATIVE
+	b.ne	1f
+	add	x13, x13, x23			// relocate
+	str	x13, [x11, x23]
+	b	0b
+
+1:	cmp	w12, #R_AARCH64_ABS64
+	b.ne	0b
+	add	x12, x12, x12, lsl #1		// symtab offset: 24x top word
+	add	x12, x8, x12, lsr #(32 - 3)	// ... shifted into bottom word
+	ldrsh	w14, [x12, #6]			// Elf64_Sym::st_shndx
+	ldr	x15, [x12, #8]			// Elf64_Sym::st_value
+	cmp	w14, #-0xf			// SHN_ABS (0xfff1) ?
+	add	x14, x15, x23			// relocate
+	csel	x15, x14, x15, ne
+	add	x15, x13, x15
+	str	x15, [x11, x23]
+	b	0b
+
+2:	adr_l	x8, kimage_vaddr		// make relocated kimage_vaddr
+	dc	cvac, x8			// value visible to secondaries
+	dsb	sy				// with MMU off
+#endif
+
 	adr_l	sp, initial_sp, x4
 	str_l	x21, __fdt_pointer, x5		// Save FDT pointer
 
-	ldr	x0, =KIMAGE_VADDR		// Save the offset between
+	ldr_l	x0, kimage_vaddr		// Save the offset between
 	sub	x24, x0, x24			// the kernel virtual and
 	str_l	x24, kimage_voffset, x0		// physical mappings
 
@@ -462,6 +527,10 @@ ENDPROC(__mmap_switched)
  * hotplug and needs to have the same protections as the text region
  */
 	.section ".text","ax"
+
+ENTRY(kimage_vaddr)
+	.quad		_text - TEXT_OFFSET
+
 /*
  * If we're fortunate enough to boot at EL2, ensure that the world is
  * sane before dropping to EL1.
@@ -622,7 +691,7 @@ ENTRY(secondary_startup)
 	adrp	x26, swapper_pg_dir
 	bl	__cpu_setup			// initialise processor
 
-	ldr	x8, =KIMAGE_VADDR
+	ldr	x8, kimage_vaddr
 	ldr	w9, 0f
 	sub	x27, x8, w9, sxtw		// address to jump to after enabling the MMU
 	b	__enable_mmu
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 96177a7c0f05..2faee6042e99 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -292,16 +292,15 @@ u64 __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = INVALID_HWID };
 
 void __init setup_arch(char **cmdline_p)
 {
-	static struct vm_struct vmlinux_vm __initdata = {
-		.addr		= (void *)KIMAGE_VADDR,
-		.size		= 0,
-		.flags		= VM_IOREMAP,
-		.caller		= setup_arch,
-	};
-
-	vmlinux_vm.size = round_up((unsigned long)_end - KIMAGE_VADDR,
-				   1 << SWAPPER_BLOCK_SHIFT);
-	vmlinux_vm.phys_addr = __pa(KIMAGE_VADDR);
+	static struct vm_struct vmlinux_vm __initdata;
+
+	vmlinux_vm.addr = (void *)kimage_vaddr;
+	vmlinux_vm.size = round_up((u64)_end - kimage_vaddr,
+				   SWAPPER_BLOCK_SIZE);
+	vmlinux_vm.phys_addr = __pa(kimage_vaddr);
+	vmlinux_vm.flags = VM_IOREMAP;
+	vmlinux_vm.caller = setup_arch;
+
 	vm_area_add_early(&vmlinux_vm);
 
 	pr_info("Boot CPU: AArch64 Processor [%08x]\n", read_cpuid_id());
@@ -367,7 +366,8 @@ void __init setup_arch(char **cmdline_p)
 	conswitchp = &dummy_con;
 #endif
 #endif
-	if (boot_args[1] || boot_args[2] || boot_args[3]) {
+	if ((!IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && boot_args[1]) ||
+	    boot_args[2] || boot_args[3]) {
 		pr_err("WARNING: x1-x3 nonzero in violation of boot protocol:\n"
 			"\tx1: %016llx\n\tx2: %016llx\n\tx3: %016llx\n"
 			"This indicates a broken bootloader or old kernel\n",
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index f935f082188d..cc1486039338 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -148,6 +148,15 @@ SECTIONS
 	.altinstr_replacement : {
 		*(.altinstr_replacement)
 	}
+	.rela : ALIGN(8) {
+		__reloc_start = .;
+		*(.rela .rela*)
+		__reloc_end = .;
+	}
+	.dynsym : ALIGN(8) {
+		__dynsym_start = .;
+		*(.dynsym)
+	}
 
 	. = ALIGN(PAGE_SIZE);
 	__init_end = .;
diff --git a/scripts/sortextable.c b/scripts/sortextable.c
index af247c70fb66..5ecbedefdb0f 100644
--- a/scripts/sortextable.c
+++ b/scripts/sortextable.c
@@ -266,9 +266,9 @@ do_file(char const *const fname)
 		break;
 	}  /* end switch */
 	if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0
-	||  r2(&ehdr->e_type) != ET_EXEC
+	|| (r2(&ehdr->e_type) != ET_EXEC && r2(&ehdr->e_type) != ET_DYN)
 	||  ehdr->e_ident[EI_VERSION] != EV_CURRENT) {
-		fprintf(stderr, "unrecognized ET_EXEC file %s\n", fname);
+		fprintf(stderr, "unrecognized ET_EXEC/ET_DYN file %s\n", fname);
 		fail_file();
 	}
 
-- 
2.5.0

* [PATCH v2 13/13] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
  2015-12-30 15:25 ` Ard Biesheuvel
@ 2015-12-30 15:26   ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

Since arm64 does not use a decompressor, there is no early execution
environment in which it would be feasible to gather randomness, so the
arm64 KASLR kernel depends on the bootloader to supply some random bits
in register x1 upon kernel entry.

On UEFI systems, we can use the EFI_RNG_PROTOCOL, if supplied, to obtain
some random bits. At the same time, use it to randomize the offset of the
kernel Image in physical memory.
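
The physical randomization boils down to scaling the low 16 bits of the
seed across the available DRAM window, as in the following sketch (the
function name is invented; the actual code is in handle_kernel_image()
below):

	/* illustrative only: pick a 2^16-granular point in [dram_base, dram_top) */
	static u64 example_pick_phys_base(u64 dram_base, u64 dram_top, u64 phys_seed)
	{
		return dram_base + ((dram_top - dram_base) >> 16) * (u16)phys_seed;
	}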

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/efi-entry.S             |   7 +-
 drivers/firmware/efi/libstub/arm-stub.c   |   1 -
 drivers/firmware/efi/libstub/arm64-stub.c | 134 +++++++++++++++++---
 include/linux/efi.h                       |   5 +-
 4 files changed, 127 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
index f82036e02485..f41073dde7e0 100644
--- a/arch/arm64/kernel/efi-entry.S
+++ b/arch/arm64/kernel/efi-entry.S
@@ -110,7 +110,7 @@ ENTRY(entry)
 2:
 	/* Jump to kernel entry point */
 	mov	x0, x20
-	mov	x1, xzr
+	ldr	x1, efi_rnd
 	mov	x2, xzr
 	mov	x3, xzr
 	br	x21
@@ -119,6 +119,9 @@ efi_load_fail:
 	mov	x0, #EFI_LOAD_ERROR
 	ldp	x29, x30, [sp], #32
 	ret
+ENDPROC(entry)
+
+ENTRY(efi_rnd)
+	.quad	0, 0
 
 entry_end:
-ENDPROC(entry)
diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c
index 950c87f5d279..f580bcdfae4f 100644
--- a/drivers/firmware/efi/libstub/arm-stub.c
+++ b/drivers/firmware/efi/libstub/arm-stub.c
@@ -145,7 +145,6 @@ void efi_char16_printk(efi_system_table_t *sys_table_arg,
 	out->output_string(out, str);
 }
 
-
 /*
  * This function handles the architcture specific differences between arm and
  * arm64 regarding where the kernel image must be loaded and any memory that
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index 78dfbd34b6bf..4e5c306346b4 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -13,6 +13,68 @@
 #include <asm/efi.h>
 #include <asm/sections.h>
 
+struct efi_rng_protocol_t {
+	efi_status_t (*get_info)(struct efi_rng_protocol_t *,
+				 unsigned long *,
+				 efi_guid_t *);
+	efi_status_t (*get_rng)(struct efi_rng_protocol_t *,
+				efi_guid_t *,
+				unsigned long,
+				u8 *out);
+};
+
+extern struct {
+	u64	virt_seed;
+	u64	phys_seed;
+} efi_rnd;
+
+static int efi_get_random_bytes(efi_system_table_t *sys_table)
+{
+	efi_guid_t rng_proto = EFI_RNG_PROTOCOL_GUID;
+	efi_status_t status;
+	struct efi_rng_protocol_t *rng;
+
+	status = sys_table->boottime->locate_protocol(&rng_proto, NULL,
+						      (void **)&rng);
+	if (status == EFI_NOT_FOUND) {
+		pr_efi(sys_table, "EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
+		return EFI_SUCCESS;
+	}
+
+	if (status != EFI_SUCCESS)
+		return status;
+
+	return rng->get_rng(rng, NULL, sizeof(efi_rnd), (u8 *)&efi_rnd);
+}
+
+static efi_status_t get_dram_top(efi_system_table_t *sys_table_arg, u64 *top)
+{
+	unsigned long map_size, desc_size;
+	efi_memory_desc_t *memory_map;
+	efi_status_t status;
+	int l;
+
+	status = efi_get_memory_map(sys_table_arg, &memory_map, &map_size,
+				    &desc_size, NULL, NULL);
+	if (status != EFI_SUCCESS)
+		return status;
+
+	for (l = 0; l < map_size; l += desc_size) {
+		efi_memory_desc_t *md = (void *)memory_map + l;
+
+		if (md->attribute & EFI_MEMORY_WB) {
+			u64 phys_end = md->phys_addr +
+				       md->num_pages * EFI_PAGE_SIZE;
+			if (phys_end > *top)
+				*top = phys_end;
+		}
+	}
+
+	efi_call_early(free_pool, memory_map);
+
+	return EFI_SUCCESS;
+}
+
 efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 					unsigned long *image_addr,
 					unsigned long *image_size,
@@ -27,6 +89,14 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 	void *old_image_addr = (void *)*image_addr;
 	unsigned long preferred_offset;
 
+	if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL)) {
+		status = efi_get_random_bytes(sys_table_arg);
+		if (status != EFI_SUCCESS) {
+			pr_efi_err(sys_table_arg, "efi_get_random_bytes() failed\n");
+			return status;
+		}
+	}
+
 	/*
 	 * The preferred offset of the kernel Image is TEXT_OFFSET bytes beyond
 	 * a 2 MB aligned base, which itself may be lower than dram_base, as
@@ -36,13 +106,42 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 	if (preferred_offset < dram_base)
 		preferred_offset += SZ_2M;
 
-	/* Relocate the image, if required. */
 	kernel_size = _edata - _text;
-	if (*image_addr != preferred_offset) {
-		kernel_memsize = kernel_size + (_end - _edata);
+	kernel_memsize = kernel_size + (_end - _edata);
+
+	if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && efi_rnd.phys_seed) {
+		/*
+		 * If KASLR is enabled, and we have some randomness available,
+		 * locate the kernel at a randomized offset in physical memory.
+		 */
+		u64 dram_top = dram_base;
+
+		status = get_dram_top(sys_table_arg, &dram_top);
+		if (status != EFI_SUCCESS) {
+			pr_efi_err(sys_table_arg, "get_dram_size() failed\n");
+			return status;
+		}
+
+		kernel_memsize += SZ_2M;
+		nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
+				    EFI_PAGE_SIZE;
 
 		/*
-		 * First, try a straight allocation at the preferred offset.
+		 * Use the random seed to scale the size and add it to the DRAM
+		 * base. Note that this may give suboptimal results on systems
+		 * with discontiguous DRAM regions with large holes between them.
+		 */
+		*reserve_addr = dram_base +
+			((dram_top - dram_base) >> 16) * (u16)efi_rnd.phys_seed;
+
+		status = efi_call_early(allocate_pages, EFI_ALLOCATE_MAX_ADDRESS,
+					EFI_LOADER_DATA, nr_pages,
+					(efi_physical_addr_t *)reserve_addr);
+
+		*image_addr = round_up(*reserve_addr, SZ_2M) + TEXT_OFFSET;
+	} else {
+		/*
+		 * Else, try a straight allocation at the preferred offset.
 		 * This will work around the issue where, if dram_base == 0x0,
 		 * efi_low_alloc() refuses to allocate at 0x0 (to prevent the
 		 * address of the allocation to be mistaken for a FAIL return
@@ -52,27 +151,30 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 		 * Mustang), we can still place the kernel at the address
 		 * 'dram_base + TEXT_OFFSET'.
 		 */
+		if (*image_addr == preferred_offset)
+			return EFI_SUCCESS;
+
 		*image_addr = *reserve_addr = preferred_offset;
 		nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
 			   EFI_PAGE_SIZE;
 		status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
 					EFI_LOADER_DATA, nr_pages,
 					(efi_physical_addr_t *)reserve_addr);
+	}
+
+	if (status != EFI_SUCCESS) {
+		kernel_memsize += TEXT_OFFSET;
+		status = efi_low_alloc(sys_table_arg, kernel_memsize,
+				       SZ_2M, reserve_addr);
+
 		if (status != EFI_SUCCESS) {
-			kernel_memsize += TEXT_OFFSET;
-			status = efi_low_alloc(sys_table_arg, kernel_memsize,
-					       SZ_2M, reserve_addr);
-
-			if (status != EFI_SUCCESS) {
-				pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
-				return status;
-			}
-			*image_addr = *reserve_addr + TEXT_OFFSET;
+			pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
+			return status;
 		}
-		memcpy((void *)*image_addr, old_image_addr, kernel_size);
-		*reserve_size = kernel_memsize;
+		*image_addr = *reserve_addr + TEXT_OFFSET;
 	}
-
+	memcpy((void *)*image_addr, old_image_addr, kernel_size);
+	*reserve_size = kernel_memsize;
 
 	return EFI_SUCCESS;
 }
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 569b5a866bb1..13783fdc9bdd 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -299,7 +299,7 @@ typedef struct {
 	void *open_protocol_information;
 	void *protocols_per_handle;
 	void *locate_handle_buffer;
-	void *locate_protocol;
+	efi_status_t (*locate_protocol)(efi_guid_t *, void *, void **);
 	void *install_multiple_protocol_interfaces;
 	void *uninstall_multiple_protocol_interfaces;
 	void *calculate_crc32;
@@ -599,6 +599,9 @@ void efi_native_runtime_setup(void);
 #define EFI_PROPERTIES_TABLE_GUID \
     EFI_GUID(  0x880aaca3, 0x4adc, 0x4a04, 0x90, 0x79, 0xb7, 0x47, 0x34, 0x08, 0x25, 0xe5 )
 
+#define EFI_RNG_PROTOCOL_GUID \
+    EFI_GUID(  0x3152bca5, 0xeade, 0x433d, 0x86, 0x2e, 0xc0, 0x1c, 0xdc, 0x29, 0x1f, 0x44 )
+
 typedef struct {
 	efi_guid_t guid;
 	u64 table;
-- 
2.5.0


* [PATCH v2 13/13] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
@ 2015-12-30 15:26   ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel

Since arm64 does not use a decompressor, there is no early execution
environment in which it would be feasible to gather randomness, so the
arm64 KASLR kernel depends on the bootloader to supply some random bits
in register x1 upon kernel entry.

On UEFI systems, we can use the EFI_RNG_PROTOCOL, if supplied, to obtain
some random bits. At the same time, use it to randomize the offset of the
kernel Image in physical memory.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/efi-entry.S             |   7 +-
 drivers/firmware/efi/libstub/arm-stub.c   |   1 -
 drivers/firmware/efi/libstub/arm64-stub.c | 134 +++++++++++++++++---
 include/linux/efi.h                       |   5 +-
 4 files changed, 127 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
index f82036e02485..f41073dde7e0 100644
--- a/arch/arm64/kernel/efi-entry.S
+++ b/arch/arm64/kernel/efi-entry.S
@@ -110,7 +110,7 @@ ENTRY(entry)
 2:
 	/* Jump to kernel entry point */
 	mov	x0, x20
-	mov	x1, xzr
+	ldr	x1, efi_rnd
 	mov	x2, xzr
 	mov	x3, xzr
 	br	x21
@@ -119,6 +119,9 @@ efi_load_fail:
 	mov	x0, #EFI_LOAD_ERROR
 	ldp	x29, x30, [sp], #32
 	ret
+ENDPROC(entry)
+
+ENTRY(efi_rnd)
+	.quad	0, 0
 
 entry_end:
-ENDPROC(entry)
diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c
index 950c87f5d279..f580bcdfae4f 100644
--- a/drivers/firmware/efi/libstub/arm-stub.c
+++ b/drivers/firmware/efi/libstub/arm-stub.c
@@ -145,7 +145,6 @@ void efi_char16_printk(efi_system_table_t *sys_table_arg,
 	out->output_string(out, str);
 }
 
-
 /*
  * This function handles the architcture specific differences between arm and
  * arm64 regarding where the kernel image must be loaded and any memory that
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index 78dfbd34b6bf..4e5c306346b4 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -13,6 +13,68 @@
 #include <asm/efi.h>
 #include <asm/sections.h>
 
+struct efi_rng_protocol_t {
+	efi_status_t (*get_info)(struct efi_rng_protocol_t *,
+				 unsigned long *,
+				 efi_guid_t *);
+	efi_status_t (*get_rng)(struct efi_rng_protocol_t *,
+				efi_guid_t *,
+				unsigned long,
+				u8 *out);
+};
+
+extern struct {
+	u64	virt_seed;
+	u64	phys_seed;
+} efi_rnd;
+
+static int efi_get_random_bytes(efi_system_table_t *sys_table)
+{
+	efi_guid_t rng_proto = EFI_RNG_PROTOCOL_GUID;
+	efi_status_t status;
+	struct efi_rng_protocol_t *rng;
+
+	status = sys_table->boottime->locate_protocol(&rng_proto, NULL,
+						      (void **)&rng);
+	if (status == EFI_NOT_FOUND) {
+		pr_efi(sys_table, "EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
+		return EFI_SUCCESS;
+	}
+
+	if (status != EFI_SUCCESS)
+		return status;
+
+	return rng->get_rng(rng, NULL, sizeof(efi_rnd), (u8 *)&efi_rnd);
+}
+
+static efi_status_t get_dram_top(efi_system_table_t *sys_table_arg, u64 *top)
+{
+	unsigned long map_size, desc_size;
+	efi_memory_desc_t *memory_map;
+	efi_status_t status;
+	int l;
+
+	status = efi_get_memory_map(sys_table_arg, &memory_map, &map_size,
+				    &desc_size, NULL, NULL);
+	if (status != EFI_SUCCESS)
+		return status;
+
+	for (l = 0; l < map_size; l += desc_size) {
+		efi_memory_desc_t *md = (void *)memory_map + l;
+
+		if (md->attribute & EFI_MEMORY_WB) {
+			u64 phys_end = md->phys_addr +
+				       md->num_pages * EFI_PAGE_SIZE;
+			if (phys_end > *top)
+				*top = phys_end;
+		}
+	}
+
+	efi_call_early(free_pool, memory_map);
+
+	return EFI_SUCCESS;
+}
+
 efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 					unsigned long *image_addr,
 					unsigned long *image_size,
@@ -27,6 +89,14 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 	void *old_image_addr = (void *)*image_addr;
 	unsigned long preferred_offset;
 
+	if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL)) {
+		status = efi_get_random_bytes(sys_table_arg);
+		if (status != EFI_SUCCESS) {
+			pr_efi_err(sys_table_arg, "efi_get_random_bytes() failed\n");
+			return status;
+		}
+	}
+
 	/*
 	 * The preferred offset of the kernel Image is TEXT_OFFSET bytes beyond
 	 * a 2 MB aligned base, which itself may be lower than dram_base, as
@@ -36,13 +106,42 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 	if (preferred_offset < dram_base)
 		preferred_offset += SZ_2M;
 
-	/* Relocate the image, if required. */
 	kernel_size = _edata - _text;
-	if (*image_addr != preferred_offset) {
-		kernel_memsize = kernel_size + (_end - _edata);
+	kernel_memsize = kernel_size + (_end - _edata);
+
+	if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && efi_rnd.phys_seed) {
+		/*
+		 * If KASLR is enabled, and we have some randomness available,
+		 * locate the kernel at a randomized offset in physical memory.
+		 */
+		u64 dram_top = dram_base;
+
+		status = get_dram_top(sys_table_arg, &dram_top);
+		if (status != EFI_SUCCESS) {
+			pr_efi_err(sys_table_arg, "get_dram_size() failed\n");
+			return status;
+		}
+
+		kernel_memsize += SZ_2M;
+		nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
+				    EFI_PAGE_SIZE;
 
 		/*
-		 * First, try a straight allocation at the preferred offset.
+		 * Use the random seed to scale the size and add it to the DRAM
+		 * base. Note that this may give suboptimal results on systems
+		 * with discontiguous DRAM regions with large holes between them.
+		 */
+		*reserve_addr = dram_base +
+			((dram_top - dram_base) >> 16) * (u16)efi_rnd.phys_seed;
+
+		status = efi_call_early(allocate_pages, EFI_ALLOCATE_MAX_ADDRESS,
+					EFI_LOADER_DATA, nr_pages,
+					(efi_physical_addr_t *)reserve_addr);
+
+		*image_addr = round_up(*reserve_addr, SZ_2M) + TEXT_OFFSET;
+	} else {
+		/*
+		 * Else, try a straight allocation at the preferred offset.
 		 * This will work around the issue where, if dram_base == 0x0,
 		 * efi_low_alloc() refuses to allocate at 0x0 (to prevent the
 		 * address of the allocation to be mistaken for a FAIL return
@@ -52,27 +151,30 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 		 * Mustang), we can still place the kernel at the address
 		 * 'dram_base + TEXT_OFFSET'.
 		 */
+		if (*image_addr == preferred_offset)
+			return EFI_SUCCESS;
+
 		*image_addr = *reserve_addr = preferred_offset;
 		nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
 			   EFI_PAGE_SIZE;
 		status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
 					EFI_LOADER_DATA, nr_pages,
 					(efi_physical_addr_t *)reserve_addr);
+	}
+
+	if (status != EFI_SUCCESS) {
+		kernel_memsize += TEXT_OFFSET;
+		status = efi_low_alloc(sys_table_arg, kernel_memsize,
+				       SZ_2M, reserve_addr);
+
 		if (status != EFI_SUCCESS) {
-			kernel_memsize += TEXT_OFFSET;
-			status = efi_low_alloc(sys_table_arg, kernel_memsize,
-					       SZ_2M, reserve_addr);
-
-			if (status != EFI_SUCCESS) {
-				pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
-				return status;
-			}
-			*image_addr = *reserve_addr + TEXT_OFFSET;
+			pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
+			return status;
 		}
-		memcpy((void *)*image_addr, old_image_addr, kernel_size);
-		*reserve_size = kernel_memsize;
+		*image_addr = *reserve_addr + TEXT_OFFSET;
 	}
-
+	memcpy((void *)*image_addr, old_image_addr, kernel_size);
+	*reserve_size = kernel_memsize;
 
 	return EFI_SUCCESS;
 }
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 569b5a866bb1..13783fdc9bdd 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -299,7 +299,7 @@ typedef struct {
 	void *open_protocol_information;
 	void *protocols_per_handle;
 	void *locate_handle_buffer;
-	void *locate_protocol;
+	efi_status_t (*locate_protocol)(efi_guid_t *, void *, void **);
 	void *install_multiple_protocol_interfaces;
 	void *uninstall_multiple_protocol_interfaces;
 	void *calculate_crc32;
@@ -599,6 +599,9 @@ void efi_native_runtime_setup(void);
 #define EFI_PROPERTIES_TABLE_GUID \
     EFI_GUID(  0x880aaca3, 0x4adc, 0x4a04, 0x90, 0x79, 0xb7, 0x47, 0x34, 0x08, 0x25, 0xe5 )
 
+#define EFI_RNG_PROTOCOL_GUID \
+    EFI_GUID(  0x3152bca5, 0xeade, 0x433d, 0x86, 0x2e, 0xc0, 0x1c, 0xdc, 0x29, 0x1f, 0x44 )
+
 typedef struct {
 	efi_guid_t guid;
 	u64 table;
-- 
2.5.0

* [kernel-hardening] [PATCH v2 13/13] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
@ 2015-12-30 15:26   ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2015-12-30 15:26 UTC (permalink / raw)
  To: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall, Ard Biesheuvel

Since arm64 does not use a decompressor that supplies an execution
environment where it is feasible to some extent to provide a source of
randomness, the arm64 KASLR kernel depends on the bootloader to supply
some random bits in register x1 upon kernel entry.

On UEFI systems, we can use the EFI_RNG_PROTOCOL, if supplied, to obtain
some random bits. At the same time, use it to randomize the offset of the
kernel Image in physical memory.
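
For illustration only (hypothetical numbers, not taken from the patch):
with dram_base = 0x80000000, dram_top = 0x100000000 and the default
TEXT_OFFSET of 0x80000, the scaling performed in the arm64-stub.c hunk
below works out as

    window = dram_top - dram_base        = 0x80000000   /* 2 GB          */
    step   = window >> 16                = 0x8000       /* 32 KB         */
    offset = step * (u16)phys_seed       = 0x091a0000   /* seed 0x1234   */
    *reserve_addr = dram_base + offset   = 0x891a0000
    *image_addr   = round_up(*reserve_addr, SZ_2M) + TEXT_OFFSET
                                         = 0x89200000 + 0x80000

i.e. only the low 16 bits of phys_seed are consumed, in 32 KB granules
across the DRAM window, and allocate_pages(EFI_ALLOCATE_MAX_ADDRESS) may
still move the reservation downwards if that part of the window is in use.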

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/efi-entry.S             |   7 +-
 drivers/firmware/efi/libstub/arm-stub.c   |   1 -
 drivers/firmware/efi/libstub/arm64-stub.c | 134 +++++++++++++++++---
 include/linux/efi.h                       |   5 +-
 4 files changed, 127 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
index f82036e02485..f41073dde7e0 100644
--- a/arch/arm64/kernel/efi-entry.S
+++ b/arch/arm64/kernel/efi-entry.S
@@ -110,7 +110,7 @@ ENTRY(entry)
 2:
 	/* Jump to kernel entry point */
 	mov	x0, x20
-	mov	x1, xzr
+	ldr	x1, efi_rnd
 	mov	x2, xzr
 	mov	x3, xzr
 	br	x21
@@ -119,6 +119,9 @@ efi_load_fail:
 	mov	x0, #EFI_LOAD_ERROR
 	ldp	x29, x30, [sp], #32
 	ret
+ENDPROC(entry)
+
+ENTRY(efi_rnd)
+	.quad	0, 0
 
 entry_end:
-ENDPROC(entry)
diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c
index 950c87f5d279..f580bcdfae4f 100644
--- a/drivers/firmware/efi/libstub/arm-stub.c
+++ b/drivers/firmware/efi/libstub/arm-stub.c
@@ -145,7 +145,6 @@ void efi_char16_printk(efi_system_table_t *sys_table_arg,
 	out->output_string(out, str);
 }
 
-
 /*
  * This function handles the architcture specific differences between arm and
  * arm64 regarding where the kernel image must be loaded and any memory that
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index 78dfbd34b6bf..4e5c306346b4 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -13,6 +13,68 @@
 #include <asm/efi.h>
 #include <asm/sections.h>
 
+struct efi_rng_protocol_t {
+	efi_status_t (*get_info)(struct efi_rng_protocol_t *,
+				 unsigned long *,
+				 efi_guid_t *);
+	efi_status_t (*get_rng)(struct efi_rng_protocol_t *,
+				efi_guid_t *,
+				unsigned long,
+				u8 *out);
+};
+
+extern struct {
+	u64	virt_seed;
+	u64	phys_seed;
+} efi_rnd;
+
+static int efi_get_random_bytes(efi_system_table_t *sys_table)
+{
+	efi_guid_t rng_proto = EFI_RNG_PROTOCOL_GUID;
+	efi_status_t status;
+	struct efi_rng_protocol_t *rng;
+
+	status = sys_table->boottime->locate_protocol(&rng_proto, NULL,
+						      (void **)&rng);
+	if (status == EFI_NOT_FOUND) {
+		pr_efi(sys_table, "EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
+		return EFI_SUCCESS;
+	}
+
+	if (status != EFI_SUCCESS)
+		return status;
+
+	return rng->get_rng(rng, NULL, sizeof(efi_rnd), (u8 *)&efi_rnd);
+}
+
+static efi_status_t get_dram_top(efi_system_table_t *sys_table_arg, u64 *top)
+{
+	unsigned long map_size, desc_size;
+	efi_memory_desc_t *memory_map;
+	efi_status_t status;
+	int l;
+
+	status = efi_get_memory_map(sys_table_arg, &memory_map, &map_size,
+				    &desc_size, NULL, NULL);
+	if (status != EFI_SUCCESS)
+		return status;
+
+	for (l = 0; l < map_size; l += desc_size) {
+		efi_memory_desc_t *md = (void *)memory_map + l;
+
+		if (md->attribute & EFI_MEMORY_WB) {
+			u64 phys_end = md->phys_addr +
+				       md->num_pages * EFI_PAGE_SIZE;
+			if (phys_end > *top)
+				*top = phys_end;
+		}
+	}
+
+	efi_call_early(free_pool, memory_map);
+
+	return EFI_SUCCESS;
+}
+
 efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 					unsigned long *image_addr,
 					unsigned long *image_size,
@@ -27,6 +89,14 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 	void *old_image_addr = (void *)*image_addr;
 	unsigned long preferred_offset;
 
+	if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL)) {
+		status = efi_get_random_bytes(sys_table_arg);
+		if (status != EFI_SUCCESS) {
+			pr_efi_err(sys_table_arg, "efi_get_random_bytes() failed\n");
+			return status;
+		}
+	}
+
 	/*
 	 * The preferred offset of the kernel Image is TEXT_OFFSET bytes beyond
 	 * a 2 MB aligned base, which itself may be lower than dram_base, as
@@ -36,13 +106,42 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 	if (preferred_offset < dram_base)
 		preferred_offset += SZ_2M;
 
-	/* Relocate the image, if required. */
 	kernel_size = _edata - _text;
-	if (*image_addr != preferred_offset) {
-		kernel_memsize = kernel_size + (_end - _edata);
+	kernel_memsize = kernel_size + (_end - _edata);
+
+	if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && efi_rnd.phys_seed) {
+		/*
+		 * If KASLR is enabled, and we have some randomness available,
+		 * locate the kernel at a randomized offset in physical memory.
+		 */
+		u64 dram_top = dram_base;
+
+		status = get_dram_top(sys_table_arg, &dram_top);
+		if (status != EFI_SUCCESS) {
+			pr_efi_err(sys_table_arg, "get_dram_size() failed\n");
+			return status;
+		}
+
+		kernel_memsize += SZ_2M;
+		nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
+				    EFI_PAGE_SIZE;
 
 		/*
-		 * First, try a straight allocation at the preferred offset.
+		 * Use the random seed to scale the size and add it to the DRAM
+		 * base. Note that this may give suboptimal results on systems
+		 * with discontiguous DRAM regions with large holes between them.
+		 */
+		*reserve_addr = dram_base +
+			((dram_top - dram_base) >> 16) * (u16)efi_rnd.phys_seed;
+
+		status = efi_call_early(allocate_pages, EFI_ALLOCATE_MAX_ADDRESS,
+					EFI_LOADER_DATA, nr_pages,
+					(efi_physical_addr_t *)reserve_addr);
+
+		*image_addr = round_up(*reserve_addr, SZ_2M) + TEXT_OFFSET;
+	} else {
+		/*
+		 * Else, try a straight allocation at the preferred offset.
 		 * This will work around the issue where, if dram_base == 0x0,
 		 * efi_low_alloc() refuses to allocate at 0x0 (to prevent the
 		 * address of the allocation to be mistaken for a FAIL return
@@ -52,27 +151,30 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
 		 * Mustang), we can still place the kernel at the address
 		 * 'dram_base + TEXT_OFFSET'.
 		 */
+		if (*image_addr == preferred_offset)
+			return EFI_SUCCESS;
+
 		*image_addr = *reserve_addr = preferred_offset;
 		nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
 			   EFI_PAGE_SIZE;
 		status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
 					EFI_LOADER_DATA, nr_pages,
 					(efi_physical_addr_t *)reserve_addr);
+	}
+
+	if (status != EFI_SUCCESS) {
+		kernel_memsize += TEXT_OFFSET;
+		status = efi_low_alloc(sys_table_arg, kernel_memsize,
+				       SZ_2M, reserve_addr);
+
 		if (status != EFI_SUCCESS) {
-			kernel_memsize += TEXT_OFFSET;
-			status = efi_low_alloc(sys_table_arg, kernel_memsize,
-					       SZ_2M, reserve_addr);
-
-			if (status != EFI_SUCCESS) {
-				pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
-				return status;
-			}
-			*image_addr = *reserve_addr + TEXT_OFFSET;
+			pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
+			return status;
 		}
-		memcpy((void *)*image_addr, old_image_addr, kernel_size);
-		*reserve_size = kernel_memsize;
+		*image_addr = *reserve_addr + TEXT_OFFSET;
 	}
-
+	memcpy((void *)*image_addr, old_image_addr, kernel_size);
+	*reserve_size = kernel_memsize;
 
 	return EFI_SUCCESS;
 }
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 569b5a866bb1..13783fdc9bdd 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -299,7 +299,7 @@ typedef struct {
 	void *open_protocol_information;
 	void *protocols_per_handle;
 	void *locate_handle_buffer;
-	void *locate_protocol;
+	efi_status_t (*locate_protocol)(efi_guid_t *, void *, void **);
 	void *install_multiple_protocol_interfaces;
 	void *uninstall_multiple_protocol_interfaces;
 	void *calculate_crc32;
@@ -599,6 +599,9 @@ void efi_native_runtime_setup(void);
 #define EFI_PROPERTIES_TABLE_GUID \
     EFI_GUID(  0x880aaca3, 0x4adc, 0x4a04, 0x90, 0x79, 0xb7, 0x47, 0x34, 0x08, 0x25, 0xe5 )
 
+#define EFI_RNG_PROTOCOL_GUID \
+    EFI_GUID(  0x3152bca5, 0xeade, 0x433d, 0x86, 0x2e, 0xc0, 0x1c, 0xdc, 0x29, 0x1f, 0x44 )
+
 typedef struct {
 	efi_guid_t guid;
 	u64 table;
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 05/13] arm64: kvm: deal with kernel symbols outside of linear mapping
  2015-12-30 15:26   ` Ard Biesheuvel
  (?)
@ 2016-01-04 10:08     ` Marc Zyngier
  -1 siblings, 0 replies; 156+ messages in thread
From: Marc Zyngier @ 2016-01-04 10:08 UTC (permalink / raw)
  To: Ard Biesheuvel, linux-arm-kernel, kernel-hardening, will.deacon,
	catalin.marinas, mark.rutland, leif.lindholm, keescook,
	linux-kernel
  Cc: stuart.yoder, bhupesh.sharma, arnd, christoffer.dall

Hi Ard,

On 30/12/15 15:26, Ard Biesheuvel wrote:
> KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
> the HYP mapping at EL2. Before we can move the kernel virtual mapping
> out of the linear mapping, we have to make sure that references to kernel
> symbols that are accessed via the HYP mapping are translated to their
> linear equivalent.
> 
> To prevent inadvertent direct references from sneaking in later, change
> the type of all extern declarations to HYP kernel symbols to the opaque
> 'struct kvm_ksym', which does not decay to a pointer type like char arrays
> and function references. This is not bullet proof, but at least forces the
> user to take the address explicitly rather than referencing it directly.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

This looks good to me, a few comments below.

> ---
>  arch/arm/include/asm/kvm_asm.h   |  2 ++
>  arch/arm/include/asm/kvm_mmu.h   |  2 ++
>  arch/arm/kvm/arm.c               |  9 +++++----
>  arch/arm/kvm/mmu.c               | 12 +++++------
>  arch/arm64/include/asm/kvm_asm.h | 21 +++++++++++---------
>  arch/arm64/include/asm/kvm_mmu.h |  2 ++
>  arch/arm64/include/asm/virt.h    |  4 ----
>  arch/arm64/kernel/vmlinux.lds.S  |  4 ++--
>  arch/arm64/kvm/debug.c           |  4 +++-
>  virt/kvm/arm/vgic-v3.c           |  2 +-
>  10 files changed, 34 insertions(+), 28 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
> index 194c91b610ff..484ffdf7c70b 100644
> --- a/arch/arm/include/asm/kvm_asm.h
> +++ b/arch/arm/include/asm/kvm_asm.h
> @@ -99,6 +99,8 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>  extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>  
>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
> +
> +extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>  #endif
>  
>  #endif /* __ARM_KVM_ASM_H__ */
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 405aa1883307..412b363f79e9 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -30,6 +30,8 @@
>  #define HYP_PAGE_OFFSET		PAGE_OFFSET
>  #define KERN_TO_HYP(kva)	(kva)
>  
> +#define kvm_ksym_ref(kva)	(kva)
> +
>  /*
>   * Our virtual mapping for the boot-time MMU-enable code. Must be
>   * shared across all the page-tables. Conveniently, we use the vectors
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index e06fd299de08..014b542ea658 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -427,7 +427,7 @@ static void update_vttbr(struct kvm *kvm)
>  		 * shareable domain to make sure all data structures are
>  		 * clean.
>  		 */
> -		kvm_call_hyp(__kvm_flush_vm_context);
> +		kvm_call_hyp(kvm_ksym_ref(__kvm_flush_vm_context));
>  	}
>  
>  	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> @@ -600,7 +600,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  		__kvm_guest_enter();
>  		vcpu->mode = IN_GUEST_MODE;
>  
> -		ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
> +		ret = kvm_call_hyp(kvm_ksym_ref(__kvm_vcpu_run), vcpu);
>  
>  		vcpu->mode = OUTSIDE_GUEST_MODE;
>  		/*
> @@ -969,7 +969,7 @@ static void cpu_init_hyp_mode(void *dummy)
>  	pgd_ptr = kvm_mmu_get_httbr();
>  	stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
>  	hyp_stack_ptr = stack_page + PAGE_SIZE;
> -	vector_ptr = (unsigned long)__kvm_hyp_vector;
> +	vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
>  
>  	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
>  
> @@ -1061,7 +1061,8 @@ static int init_hyp_mode(void)
>  	/*
>  	 * Map the Hyp-code called directly from the host
>  	 */
> -	err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
> +	err = create_hyp_mappings(kvm_ksym_ref(__kvm_hyp_code_start),
> +				  kvm_ksym_ref(__kvm_hyp_code_end));
>  	if (err) {
>  		kvm_err("Cannot map world-switch code\n");
>  		goto out_free_mappings;
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 7dace909d5cf..7c448b943e3a 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -31,8 +31,6 @@
>  
>  #include "trace.h"
>  
> -extern char  __hyp_idmap_text_start[], __hyp_idmap_text_end[];
> -
>  static pgd_t *boot_hyp_pgd;
>  static pgd_t *hyp_pgd;
>  static pgd_t *merged_hyp_pgd;
> @@ -63,7 +61,7 @@ static bool memslot_is_logging(struct kvm_memory_slot *memslot)
>   */
>  void kvm_flush_remote_tlbs(struct kvm *kvm)
>  {
> -	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
> +	kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid), kvm);

Any chance we could bury kvm_ksym_ref in kvm_call_hyp? It may make the
change more readable, but I have the feeling it would require an
intermediate #define...

>  }
>  
>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
> @@ -75,7 +73,7 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>  	 * anything there.
>  	 */
>  	if (kvm)
> -		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
> +		kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid_ipa), kvm, ipa);
>  }
>  
>  /*
> @@ -1647,9 +1645,9 @@ int kvm_mmu_init(void)
>  {
>  	int err;
>  
> -	hyp_idmap_start = kvm_virt_to_phys(__hyp_idmap_text_start);
> -	hyp_idmap_end = kvm_virt_to_phys(__hyp_idmap_text_end);
> -	hyp_idmap_vector = kvm_virt_to_phys(__kvm_hyp_init);
> +	hyp_idmap_start = kvm_virt_to_phys(&__hyp_idmap_text_start);
> +	hyp_idmap_end = kvm_virt_to_phys(&__hyp_idmap_text_end);
> +	hyp_idmap_vector = kvm_virt_to_phys(&__kvm_hyp_init);

Why don't you need to use kvm_ksym_ref here? Is the idmap treated
differently?

>  
>  	/*
>  	 * We rely on the linker script to ensure at build time that the HYP
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 5e377101f919..830402f847e0 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -105,24 +105,27 @@
>  #ifndef __ASSEMBLY__
>  struct kvm;
>  struct kvm_vcpu;
> +struct kvm_ksym;

And that's it? Never actually defined? That's cunning! ;-)
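
The trick works because an object of incomplete type has no value: it can
neither decay nor be passed by accident, and the only thing the compiler
allows is taking its address. A minimal sketch (not part of the patch):

    extern struct kvm_ksym __kvm_vcpu_run;

    kvm_call_hyp(kvm_ksym_ref(__kvm_vcpu_run), vcpu); /* ok: &sym inside the macro  */
    kvm_call_hyp(__kvm_vcpu_run, vcpu);               /* compile error: value of an
                                                         incomplete type            */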

>  
>  extern char __kvm_hyp_init[];
>  extern char __kvm_hyp_init_end[];
>  
> -extern char __kvm_hyp_vector[];
> +extern struct kvm_ksym __kvm_hyp_vector;
>  
> -#define	__kvm_hyp_code_start	__hyp_text_start
> -#define	__kvm_hyp_code_end	__hyp_text_end
> +extern struct kvm_ksym __kvm_hyp_code_start;
> +extern struct kvm_ksym __kvm_hyp_code_end;
>  
> -extern void __kvm_flush_vm_context(void);
> -extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
> -extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
> +extern struct kvm_ksym __kvm_flush_vm_context;
> +extern struct kvm_ksym __kvm_tlb_flush_vmid_ipa;
> +extern struct kvm_ksym __kvm_tlb_flush_vmid;
>  
> -extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
> +extern struct kvm_ksym __kvm_vcpu_run;
>  
> -extern u64 __vgic_v3_get_ich_vtr_el2(void);
> +extern struct kvm_ksym __hyp_idmap_text_start, __hyp_idmap_text_end;
>  
> -extern u32 __kvm_get_mdcr_el2(void);
> +extern struct kvm_ksym __vgic_v3_get_ich_vtr_el2;
> +
> +extern struct kvm_ksym __kvm_get_mdcr_el2;
>  
>  #endif
>  
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 61505676d085..0899026a2821 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -73,6 +73,8 @@
>  
>  #define KERN_TO_HYP(kva)	((unsigned long)kva - PAGE_OFFSET + HYP_PAGE_OFFSET)
>  
> +#define kvm_ksym_ref(sym)	((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
> +
>  /*
>   * We currently only support a 40bit IPA.
>   */
> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
> index 7a5df5252dd7..215ad4649dd7 100644
> --- a/arch/arm64/include/asm/virt.h
> +++ b/arch/arm64/include/asm/virt.h
> @@ -50,10 +50,6 @@ static inline bool is_hyp_mode_mismatched(void)
>  	return __boot_cpu_mode[0] != __boot_cpu_mode[1];
>  }
>  
> -/* The section containing the hypervisor text */
> -extern char __hyp_text_start[];
> -extern char __hyp_text_end[];
> -
>  #endif /* __ASSEMBLY__ */
>  
>  #endif /* ! __ASM__VIRT_H */
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index 363c2f529951..f935f082188d 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -35,9 +35,9 @@ jiffies = jiffies_64;
>  	VMLINUX_SYMBOL(__hyp_idmap_text_start) = .;	\
>  	*(.hyp.idmap.text)				\
>  	VMLINUX_SYMBOL(__hyp_idmap_text_end) = .;	\
> -	VMLINUX_SYMBOL(__hyp_text_start) = .;		\
> +	VMLINUX_SYMBOL(__kvm_hyp_code_start) = .;	\
>  	*(.hyp.text)					\
> -	VMLINUX_SYMBOL(__hyp_text_end) = .;
> +	VMLINUX_SYMBOL(__kvm_hyp_code_end) = .;

I have a couple of patches going in the exact opposite direction (making
arm more similar to arm64):

http://git.kernel.org/cgit/linux/kernel/git/maz/arm-platforms.git/commit/?h=kvm-arm/wsinc&id=94a3d4d4ff1d8ad59f9150dfa9fdd1685ab03950
http://git.kernel.org/cgit/linux/kernel/git/maz/arm-platforms.git/commit/?h=kvm-arm/wsinc&id=44aec57b62dca67cf91f425e3707f257b9bbeb18

As at least the first patch is required to convert the 32bit HYP code to
C, I'd rather not change this in the 64bit code.

>  
>  #define IDMAP_TEXT					\
>  	. = ALIGN(SZ_4K);				\
> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> index 47e5f0feaee8..99e5a403af4e 100644
> --- a/arch/arm64/kvm/debug.c
> +++ b/arch/arm64/kvm/debug.c
> @@ -24,6 +24,7 @@
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_arm.h>
>  #include <asm/kvm_emulate.h>
> +#include <asm/kvm_mmu.h>
>  
>  #include "trace.h"
>  
> @@ -72,7 +73,8 @@ static void restore_guest_debug_regs(struct kvm_vcpu *vcpu)
>  
>  void kvm_arm_init_debug(void)
>  {
> -	__this_cpu_write(mdcr_el2, kvm_call_hyp(__kvm_get_mdcr_el2));
> +	__this_cpu_write(mdcr_el2,
> +			 kvm_call_hyp(kvm_ksym_ref(__kvm_get_mdcr_el2)));
>  }
>  
>  /**
> diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
> index 487d6357b7e7..58f5a6521307 100644
> --- a/virt/kvm/arm/vgic-v3.c
> +++ b/virt/kvm/arm/vgic-v3.c
> @@ -247,7 +247,7 @@ int vgic_v3_probe(struct device_node *vgic_node,
>  		goto out;
>  	}
>  
> -	ich_vtr_el2 = kvm_call_hyp(__vgic_v3_get_ich_vtr_el2);
> +	ich_vtr_el2 = kvm_call_hyp(kvm_ksym_ref(__vgic_v3_get_ich_vtr_el2));
>  
>  	/*
>  	 * The ListRegs field is 5 bits, but there is a architectural
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 05/13] arm64: kvm: deal with kernel symbols outside of linear mapping
  2016-01-04 10:08     ` Marc Zyngier
  (?)
@ 2016-01-04 10:31       ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-04 10:31 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, Kees Cook, linux-kernel,
	Stuart Yoder, Sharma Bhupesh, Arnd Bergmann, Christoffer Dall

On 4 January 2016 at 11:08, Marc Zyngier <marc.zyngier@arm.com> wrote:
> Hi Ard,
>
> On 30/12/15 15:26, Ard Biesheuvel wrote:
>> KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
>> the HYP mapping at EL2. Before we can move the kernel virtual mapping
>> out of the linear mapping, we have to make sure that references to kernel
>> symbols that are accessed via the HYP mapping are translated to their
>> linear equivalent.
>>
>> To prevent inadvertent direct references from sneaking in later, change
>> the type of all extern declarations to HYP kernel symbols to the opaque
>> 'struct kvm_ksym', which does not decay to a pointer type like char arrays
>> and function references. This is not bullet proof, but at least forces the
>> user to take the address explicitly rather than referencing it directly.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>
> This looks good to me, a few comments below.
>
>> ---
>>  arch/arm/include/asm/kvm_asm.h   |  2 ++
>>  arch/arm/include/asm/kvm_mmu.h   |  2 ++
>>  arch/arm/kvm/arm.c               |  9 +++++----
>>  arch/arm/kvm/mmu.c               | 12 +++++------
>>  arch/arm64/include/asm/kvm_asm.h | 21 +++++++++++---------
>>  arch/arm64/include/asm/kvm_mmu.h |  2 ++
>>  arch/arm64/include/asm/virt.h    |  4 ----
>>  arch/arm64/kernel/vmlinux.lds.S  |  4 ++--
>>  arch/arm64/kvm/debug.c           |  4 +++-
>>  virt/kvm/arm/vgic-v3.c           |  2 +-
>>  10 files changed, 34 insertions(+), 28 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
>> index 194c91b610ff..484ffdf7c70b 100644
>> --- a/arch/arm/include/asm/kvm_asm.h
>> +++ b/arch/arm/include/asm/kvm_asm.h
>> @@ -99,6 +99,8 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>>  extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>>
>>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>> +
>> +extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>>  #endif
>>
>>  #endif /* __ARM_KVM_ASM_H__ */
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> index 405aa1883307..412b363f79e9 100644
>> --- a/arch/arm/include/asm/kvm_mmu.h
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -30,6 +30,8 @@
>>  #define HYP_PAGE_OFFSET              PAGE_OFFSET
>>  #define KERN_TO_HYP(kva)     (kva)
>>
>> +#define kvm_ksym_ref(kva)    (kva)
>> +
>>  /*
>>   * Our virtual mapping for the boot-time MMU-enable code. Must be
>>   * shared across all the page-tables. Conveniently, we use the vectors
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index e06fd299de08..014b542ea658 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -427,7 +427,7 @@ static void update_vttbr(struct kvm *kvm)
>>                * shareable domain to make sure all data structures are
>>                * clean.
>>                */
>> -             kvm_call_hyp(__kvm_flush_vm_context);
>> +             kvm_call_hyp(kvm_ksym_ref(__kvm_flush_vm_context));
>>       }
>>
>>       kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>> @@ -600,7 +600,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>               __kvm_guest_enter();
>>               vcpu->mode = IN_GUEST_MODE;
>>
>> -             ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
>> +             ret = kvm_call_hyp(kvm_ksym_ref(__kvm_vcpu_run), vcpu);
>>
>>               vcpu->mode = OUTSIDE_GUEST_MODE;
>>               /*
>> @@ -969,7 +969,7 @@ static void cpu_init_hyp_mode(void *dummy)
>>       pgd_ptr = kvm_mmu_get_httbr();
>>       stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
>>       hyp_stack_ptr = stack_page + PAGE_SIZE;
>> -     vector_ptr = (unsigned long)__kvm_hyp_vector;
>> +     vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
>>
>>       __cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
>>
>> @@ -1061,7 +1061,8 @@ static int init_hyp_mode(void)
>>       /*
>>        * Map the Hyp-code called directly from the host
>>        */
>> -     err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
>> +     err = create_hyp_mappings(kvm_ksym_ref(__kvm_hyp_code_start),
>> +                               kvm_ksym_ref(__kvm_hyp_code_end));
>>       if (err) {
>>               kvm_err("Cannot map world-switch code\n");
>>               goto out_free_mappings;
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 7dace909d5cf..7c448b943e3a 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -31,8 +31,6 @@
>>
>>  #include "trace.h"
>>
>> -extern char  __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>> -
>>  static pgd_t *boot_hyp_pgd;
>>  static pgd_t *hyp_pgd;
>>  static pgd_t *merged_hyp_pgd;
>> @@ -63,7 +61,7 @@ static bool memslot_is_logging(struct kvm_memory_slot *memslot)
>>   */
>>  void kvm_flush_remote_tlbs(struct kvm *kvm)
>>  {
>> -     kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>> +     kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid), kvm);
>
> Any chance we could bury kvm_ksym_ref in kvm_call_hyp? It may make the
> change more readable, but I have the feeling it would require an
> intermediate #define...
>

Yes, we'd have to rename the actual kvm_call_hyp definition so we can
wrap it in a macro.

And the call in __cpu_init_hyp_mode() would need to omit the macro,
since it passes a pointer into the linear mapping, not a kernel
symbol.
So if you think that's ok, I'm happy to change that.
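
A minimal sketch of what that could look like (the __kvm_call_hyp name and
the exact signature are assumptions, not something posted in this series):

    /* the asm entry point, renamed so the old name is free for the macro */
    extern u64 __kvm_call_hyp(void *hypfn, ...);

    /* apply the kernel-symbol -> linear-address translation in the wrapper */
    #define kvm_call_hyp(f, ...)	__kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__)

Callers such as __cpu_init_hyp_mode(), which already pass a linear address
rather than a kernel symbol, would then invoke __kvm_call_hyp() directly.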

>>  }
>>
>>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>> @@ -75,7 +73,7 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>>        * anything there.
>>        */
>>       if (kvm)
>> -             kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
>> +             kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid_ipa), kvm, ipa);
>>  }
>>
>>  /*
>> @@ -1647,9 +1645,9 @@ int kvm_mmu_init(void)
>>  {
>>       int err;
>>
>> -     hyp_idmap_start = kvm_virt_to_phys(__hyp_idmap_text_start);
>> -     hyp_idmap_end = kvm_virt_to_phys(__hyp_idmap_text_end);
>> -     hyp_idmap_vector = kvm_virt_to_phys(__kvm_hyp_init);
>> +     hyp_idmap_start = kvm_virt_to_phys(&__hyp_idmap_text_start);
>> +     hyp_idmap_end = kvm_virt_to_phys(&__hyp_idmap_text_end);
>> +     hyp_idmap_vector = kvm_virt_to_phys(&__kvm_hyp_init);
>
> Why don't you need to use kvm_ksym_ref here? Is the idmap treated
> differently?
>

No, but we are taking the physical address, which ultimately produces
the same value whether we use kvm_ksym_ref() or not.
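
As an illustration (not a quote from the series), the two views of such a
symbol alias the same physical page:

    kimage VA : &__hyp_idmap_text_start
    linear VA : kvm_ksym_ref(__hyp_idmap_text_start)
              == (void *)&__hyp_idmap_text_start - KIMAGE_VADDR + PAGE_OFFSET

so kvm_virt_to_phys() resolves to the same physical address either way, and
the idmap setup can keep taking &sym directly.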

>>
>>       /*
>>        * We rely on the linker script to ensure at build time that the HYP
>> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
>> index 5e377101f919..830402f847e0 100644
>> --- a/arch/arm64/include/asm/kvm_asm.h
>> +++ b/arch/arm64/include/asm/kvm_asm.h
>> @@ -105,24 +105,27 @@
>>  #ifndef __ASSEMBLY__
>>  struct kvm;
>>  struct kvm_vcpu;
>> +struct kvm_ksym;
>
> And that's it? Never actually defined? That's cunning! ;-)
>
>>
>>  extern char __kvm_hyp_init[];
>>  extern char __kvm_hyp_init_end[];
>>
>> -extern char __kvm_hyp_vector[];
>> +extern struct kvm_ksym __kvm_hyp_vector;
>>
>> -#define      __kvm_hyp_code_start    __hyp_text_start
>> -#define      __kvm_hyp_code_end      __hyp_text_end
>> +extern struct kvm_ksym __kvm_hyp_code_start;
>> +extern struct kvm_ksym __kvm_hyp_code_end;
>>
>> -extern void __kvm_flush_vm_context(void);
>> -extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>> -extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>> +extern struct kvm_ksym __kvm_flush_vm_context;
>> +extern struct kvm_ksym __kvm_tlb_flush_vmid_ipa;
>> +extern struct kvm_ksym __kvm_tlb_flush_vmid;
>>
>> -extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>> +extern struct kvm_ksym __kvm_vcpu_run;
>>
>> -extern u64 __vgic_v3_get_ich_vtr_el2(void);
>> +extern struct kvm_ksym __hyp_idmap_text_start, __hyp_idmap_text_end;
>>
>> -extern u32 __kvm_get_mdcr_el2(void);
>> +extern struct kvm_ksym __vgic_v3_get_ich_vtr_el2;
>> +
>> +extern struct kvm_ksym __kvm_get_mdcr_el2;
>>
>>  #endif
>>
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>> index 61505676d085..0899026a2821 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -73,6 +73,8 @@
>>
>>  #define KERN_TO_HYP(kva)     ((unsigned long)kva - PAGE_OFFSET + HYP_PAGE_OFFSET)
>>
>> +#define kvm_ksym_ref(sym)    ((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
>> +
>>  /*
>>   * We currently only support a 40bit IPA.
>>   */
>> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
>> index 7a5df5252dd7..215ad4649dd7 100644
>> --- a/arch/arm64/include/asm/virt.h
>> +++ b/arch/arm64/include/asm/virt.h
>> @@ -50,10 +50,6 @@ static inline bool is_hyp_mode_mismatched(void)
>>       return __boot_cpu_mode[0] != __boot_cpu_mode[1];
>>  }
>>
>> -/* The section containing the hypervisor text */
>> -extern char __hyp_text_start[];
>> -extern char __hyp_text_end[];
>> -
>>  #endif /* __ASSEMBLY__ */
>>
>>  #endif /* ! __ASM__VIRT_H */
>> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
>> index 363c2f529951..f935f082188d 100644
>> --- a/arch/arm64/kernel/vmlinux.lds.S
>> +++ b/arch/arm64/kernel/vmlinux.lds.S
>> @@ -35,9 +35,9 @@ jiffies = jiffies_64;
>>       VMLINUX_SYMBOL(__hyp_idmap_text_start) = .;     \
>>       *(.hyp.idmap.text)                              \
>>       VMLINUX_SYMBOL(__hyp_idmap_text_end) = .;       \
>> -     VMLINUX_SYMBOL(__hyp_text_start) = .;           \
>> +     VMLINUX_SYMBOL(__kvm_hyp_code_start) = .;       \
>>       *(.hyp.text)                                    \
>> -     VMLINUX_SYMBOL(__hyp_text_end) = .;
>> +     VMLINUX_SYMBOL(__kvm_hyp_code_end) = .;
>
> I have a couple of patches going in the exact opposite direction (making
> arm more similar to arm64):
>
> http://git.kernel.org/cgit/linux/kernel/git/maz/arm-platforms.git/commit/?h=kvm-arm/wsinc&id=94a3d4d4ff1d8ad59f9150dfa9fdd1685ab03950
> http://git.kernel.org/cgit/linux/kernel/git/maz/arm-platforms.git/commit/?h=kvm-arm/wsinc&id=44aec57b62dca67cf91f425e3707f257b9bbeb18
>
> As at least the first patch is required to convert the 32bit HYP code to
> C, I'd rather not change this in the 64bit code.
>

OK, I will align with those changes instead.


>>
>>  #define IDMAP_TEXT                                   \
>>       . = ALIGN(SZ_4K);                               \
>> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
>> index 47e5f0feaee8..99e5a403af4e 100644
>> --- a/arch/arm64/kvm/debug.c
>> +++ b/arch/arm64/kvm/debug.c
>> @@ -24,6 +24,7 @@
>>  #include <asm/kvm_asm.h>
>>  #include <asm/kvm_arm.h>
>>  #include <asm/kvm_emulate.h>
>> +#include <asm/kvm_mmu.h>
>>
>>  #include "trace.h"
>>
>> @@ -72,7 +73,8 @@ static void restore_guest_debug_regs(struct kvm_vcpu *vcpu)
>>
>>  void kvm_arm_init_debug(void)
>>  {
>> -     __this_cpu_write(mdcr_el2, kvm_call_hyp(__kvm_get_mdcr_el2));
>> +     __this_cpu_write(mdcr_el2,
>> +                      kvm_call_hyp(kvm_ksym_ref(__kvm_get_mdcr_el2)));
>>  }
>>
>>  /**
>> diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
>> index 487d6357b7e7..58f5a6521307 100644
>> --- a/virt/kvm/arm/vgic-v3.c
>> +++ b/virt/kvm/arm/vgic-v3.c
>> @@ -247,7 +247,7 @@ int vgic_v3_probe(struct device_node *vgic_node,
>>               goto out;
>>       }
>>
>> -     ich_vtr_el2 = kvm_call_hyp(__vgic_v3_get_ich_vtr_el2);
>> +     ich_vtr_el2 = kvm_call_hyp(kvm_ksym_ref(__vgic_v3_get_ich_vtr_el2));
>>
>>       /*
>>        * The ListRegs field is 5 bits, but there is a architectural
>>
>
> Thanks,
>
>         M.
> --
> Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [PATCH v2 05/13] arm64: kvm: deal with kernel symbols outside of linear mapping
@ 2016-01-04 10:31       ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-04 10:31 UTC (permalink / raw)
  To: linux-arm-kernel

On 4 January 2016 at 11:08, Marc Zyngier <marc.zyngier@arm.com> wrote:
> Hi Ard,
>
> On 30/12/15 15:26, Ard Biesheuvel wrote:
>> KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
>> the HYP mapping at EL2. Before we can move the kernel virtual mapping
>> out of the linear mapping, we have to make sure that references to kernel
>> symbols that are accessed via the HYP mapping are translated to their
>> linear equivalent.
>>
>> To prevent inadvertent direct references from sneaking in later, change
>> the type of all extern declarations to HYP kernel symbols to the opaque
>> 'struct kvm_ksym', which does not decay to a pointer type like char arrays
>> and function references. This is not bullet proof, but at least forces the
>> user to take the address explicitly rather than referencing it directly.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>
> This looks good to me, a few comments below.
>
>> ---
>>  arch/arm/include/asm/kvm_asm.h   |  2 ++
>>  arch/arm/include/asm/kvm_mmu.h   |  2 ++
>>  arch/arm/kvm/arm.c               |  9 +++++----
>>  arch/arm/kvm/mmu.c               | 12 +++++------
>>  arch/arm64/include/asm/kvm_asm.h | 21 +++++++++++---------
>>  arch/arm64/include/asm/kvm_mmu.h |  2 ++
>>  arch/arm64/include/asm/virt.h    |  4 ----
>>  arch/arm64/kernel/vmlinux.lds.S  |  4 ++--
>>  arch/arm64/kvm/debug.c           |  4 +++-
>>  virt/kvm/arm/vgic-v3.c           |  2 +-
>>  10 files changed, 34 insertions(+), 28 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
>> index 194c91b610ff..484ffdf7c70b 100644
>> --- a/arch/arm/include/asm/kvm_asm.h
>> +++ b/arch/arm/include/asm/kvm_asm.h
>> @@ -99,6 +99,8 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>>  extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>>
>>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>> +
>> +extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>>  #endif
>>
>>  #endif /* __ARM_KVM_ASM_H__ */
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> index 405aa1883307..412b363f79e9 100644
>> --- a/arch/arm/include/asm/kvm_mmu.h
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -30,6 +30,8 @@
>>  #define HYP_PAGE_OFFSET              PAGE_OFFSET
>>  #define KERN_TO_HYP(kva)     (kva)
>>
>> +#define kvm_ksym_ref(kva)    (kva)
>> +
>>  /*
>>   * Our virtual mapping for the boot-time MMU-enable code. Must be
>>   * shared across all the page-tables. Conveniently, we use the vectors
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index e06fd299de08..014b542ea658 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -427,7 +427,7 @@ static void update_vttbr(struct kvm *kvm)
>>                * shareable domain to make sure all data structures are
>>                * clean.
>>                */
>> -             kvm_call_hyp(__kvm_flush_vm_context);
>> +             kvm_call_hyp(kvm_ksym_ref(__kvm_flush_vm_context));
>>       }
>>
>>       kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>> @@ -600,7 +600,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>               __kvm_guest_enter();
>>               vcpu->mode = IN_GUEST_MODE;
>>
>> -             ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
>> +             ret = kvm_call_hyp(kvm_ksym_ref(__kvm_vcpu_run), vcpu);
>>
>>               vcpu->mode = OUTSIDE_GUEST_MODE;
>>               /*
>> @@ -969,7 +969,7 @@ static void cpu_init_hyp_mode(void *dummy)
>>       pgd_ptr = kvm_mmu_get_httbr();
>>       stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
>>       hyp_stack_ptr = stack_page + PAGE_SIZE;
>> -     vector_ptr = (unsigned long)__kvm_hyp_vector;
>> +     vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
>>
>>       __cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
>>
>> @@ -1061,7 +1061,8 @@ static int init_hyp_mode(void)
>>       /*
>>        * Map the Hyp-code called directly from the host
>>        */
>> -     err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
>> +     err = create_hyp_mappings(kvm_ksym_ref(__kvm_hyp_code_start),
>> +                               kvm_ksym_ref(__kvm_hyp_code_end));
>>       if (err) {
>>               kvm_err("Cannot map world-switch code\n");
>>               goto out_free_mappings;
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 7dace909d5cf..7c448b943e3a 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -31,8 +31,6 @@
>>
>>  #include "trace.h"
>>
>> -extern char  __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>> -
>>  static pgd_t *boot_hyp_pgd;
>>  static pgd_t *hyp_pgd;
>>  static pgd_t *merged_hyp_pgd;
>> @@ -63,7 +61,7 @@ static bool memslot_is_logging(struct kvm_memory_slot *memslot)
>>   */
>>  void kvm_flush_remote_tlbs(struct kvm *kvm)
>>  {
>> -     kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>> +     kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid), kvm);
>
> Any chance we could bury kvm_ksym_ref in kvm_call_hyp? It may make the
> change more readable, but I have the feeling it would require an
> intermediate #define...
>

Yes, we'd have to rename the actual kvm_call_hyp definition so we can
wrap it in a macro

And the call in __cpu_init_hyp_mode() would need to omit the macro,
since it passes a pointer into the linear mapping, not a kernel
symbol.
So if you think that's ok, I'm happy to change that.
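
For illustration, a minimal sketch of what that could look like (the
__kvm_call_hyp name and the exact macro shape are assumptions on my
part, not necessarily what the final patch would use):

/*
 * Sketch only: rename the existing asm entry point to __kvm_call_hyp()
 * and let a wrapper macro apply kvm_ksym_ref() to the symbol argument,
 * so ordinary call sites keep passing the bare symbol.
 */
unsigned long __kvm_call_hyp(void *hypfn, ...);

#define kvm_call_hyp(f, ...)	__kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__)

/*
 * __cpu_init_hyp_mode() passes pointers into the linear mapping rather
 * than kernel symbols, so it would call __kvm_call_hyp() directly and
 * bypass the macro.
 */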

>>  }
>>
>>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>> @@ -75,7 +73,7 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>>        * anything there.
>>        */
>>       if (kvm)
>> -             kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
>> +             kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid_ipa), kvm, ipa);
>>  }
>>
>>  /*
>> @@ -1647,9 +1645,9 @@ int kvm_mmu_init(void)
>>  {
>>       int err;
>>
>> -     hyp_idmap_start = kvm_virt_to_phys(__hyp_idmap_text_start);
>> -     hyp_idmap_end = kvm_virt_to_phys(__hyp_idmap_text_end);
>> -     hyp_idmap_vector = kvm_virt_to_phys(__kvm_hyp_init);
>> +     hyp_idmap_start = kvm_virt_to_phys(&__hyp_idmap_text_start);
>> +     hyp_idmap_end = kvm_virt_to_phys(&__hyp_idmap_text_end);
>> +     hyp_idmap_vector = kvm_virt_to_phys(&__kvm_hyp_init);
>
> Why don't you need to use kvm_ksym_ref here? Is the idmap treated
> differently?
>

No, but we are taking the physical address, which ultimately produces
the same value whether we use kvm_ksym_ref() or not.
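
To illustrate (for reference only, using the definitions from this
series; not part of the patch): with the updated __virt_to_phys() from
patch #2, both forms reduce to the same physical address:

  kvm_virt_to_phys(&__hyp_idmap_text_start)
      == addr - KIMAGE_VADDR + PHYS_OFFSET

  kvm_virt_to_phys(kvm_ksym_ref(__hyp_idmap_text_start))
      == (addr - KIMAGE_VADDR + PAGE_OFFSET) - PAGE_OFFSET + PHYS_OFFSET
      == addr - KIMAGE_VADDR + PHYS_OFFSET

where addr is the kernel-image virtual address of the symbol; the
PAGE_OFFSET terms cancel, so kvm_ksym_ref() makes no difference when we
only want the physical address.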

>>
>>       /*
>>        * We rely on the linker script to ensure at build time that the HYP
>> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
>> index 5e377101f919..830402f847e0 100644
>> --- a/arch/arm64/include/asm/kvm_asm.h
>> +++ b/arch/arm64/include/asm/kvm_asm.h
>> @@ -105,24 +105,27 @@
>>  #ifndef __ASSEMBLY__
>>  struct kvm;
>>  struct kvm_vcpu;
>> +struct kvm_ksym;
>
> And that's it? Never actually defined? That's cunning! ;-)
>
>>
>>  extern char __kvm_hyp_init[];
>>  extern char __kvm_hyp_init_end[];
>>
>> -extern char __kvm_hyp_vector[];
>> +extern struct kvm_ksym __kvm_hyp_vector;
>>
>> -#define      __kvm_hyp_code_start    __hyp_text_start
>> -#define      __kvm_hyp_code_end      __hyp_text_end
>> +extern struct kvm_ksym __kvm_hyp_code_start;
>> +extern struct kvm_ksym __kvm_hyp_code_end;
>>
>> -extern void __kvm_flush_vm_context(void);
>> -extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>> -extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>> +extern struct kvm_ksym __kvm_flush_vm_context;
>> +extern struct kvm_ksym __kvm_tlb_flush_vmid_ipa;
>> +extern struct kvm_ksym __kvm_tlb_flush_vmid;
>>
>> -extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>> +extern struct kvm_ksym __kvm_vcpu_run;
>>
>> -extern u64 __vgic_v3_get_ich_vtr_el2(void);
>> +extern struct kvm_ksym __hyp_idmap_text_start, __hyp_idmap_text_end;
>>
>> -extern u32 __kvm_get_mdcr_el2(void);
>> +extern struct kvm_ksym __vgic_v3_get_ich_vtr_el2;
>> +
>> +extern struct kvm_ksym __kvm_get_mdcr_el2;
>>
>>  #endif
>>
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>> index 61505676d085..0899026a2821 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -73,6 +73,8 @@
>>
>>  #define KERN_TO_HYP(kva)     ((unsigned long)kva - PAGE_OFFSET + HYP_PAGE_OFFSET)
>>
>> +#define kvm_ksym_ref(sym)    ((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
>> +
>>  /*
>>   * We currently only support a 40bit IPA.
>>   */
>> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
>> index 7a5df5252dd7..215ad4649dd7 100644
>> --- a/arch/arm64/include/asm/virt.h
>> +++ b/arch/arm64/include/asm/virt.h
>> @@ -50,10 +50,6 @@ static inline bool is_hyp_mode_mismatched(void)
>>       return __boot_cpu_mode[0] != __boot_cpu_mode[1];
>>  }
>>
>> -/* The section containing the hypervisor text */
>> -extern char __hyp_text_start[];
>> -extern char __hyp_text_end[];
>> -
>>  #endif /* __ASSEMBLY__ */
>>
>>  #endif /* ! __ASM__VIRT_H */
>> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
>> index 363c2f529951..f935f082188d 100644
>> --- a/arch/arm64/kernel/vmlinux.lds.S
>> +++ b/arch/arm64/kernel/vmlinux.lds.S
>> @@ -35,9 +35,9 @@ jiffies = jiffies_64;
>>       VMLINUX_SYMBOL(__hyp_idmap_text_start) = .;     \
>>       *(.hyp.idmap.text)                              \
>>       VMLINUX_SYMBOL(__hyp_idmap_text_end) = .;       \
>> -     VMLINUX_SYMBOL(__hyp_text_start) = .;           \
>> +     VMLINUX_SYMBOL(__kvm_hyp_code_start) = .;       \
>>       *(.hyp.text)                                    \
>> -     VMLINUX_SYMBOL(__hyp_text_end) = .;
>> +     VMLINUX_SYMBOL(__kvm_hyp_code_end) = .;
>
> I have a couple of patches going in the exact opposite direction (making
> arm more similar to arm64):
>
> http://git.kernel.org/cgit/linux/kernel/git/maz/arm-platforms.git/commit/?h=kvm-arm/wsinc&id=94a3d4d4ff1d8ad59f9150dfa9fdd1685ab03950
> http://git.kernel.org/cgit/linux/kernel/git/maz/arm-platforms.git/commit/?h=kvm-arm/wsinc&id=44aec57b62dca67cf91f425e3707f257b9bbeb18
>
> As at least the first patch is required to convert the 32bit HYP code to
> C, I'd rather not change this in the 64bit code.
>

OK, I will align with those changes instead.


>>
>>  #define IDMAP_TEXT                                   \
>>       . = ALIGN(SZ_4K);                               \
>> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
>> index 47e5f0feaee8..99e5a403af4e 100644
>> --- a/arch/arm64/kvm/debug.c
>> +++ b/arch/arm64/kvm/debug.c
>> @@ -24,6 +24,7 @@
>>  #include <asm/kvm_asm.h>
>>  #include <asm/kvm_arm.h>
>>  #include <asm/kvm_emulate.h>
>> +#include <asm/kvm_mmu.h>
>>
>>  #include "trace.h"
>>
>> @@ -72,7 +73,8 @@ static void restore_guest_debug_regs(struct kvm_vcpu *vcpu)
>>
>>  void kvm_arm_init_debug(void)
>>  {
>> -     __this_cpu_write(mdcr_el2, kvm_call_hyp(__kvm_get_mdcr_el2));
>> +     __this_cpu_write(mdcr_el2,
>> +                      kvm_call_hyp(kvm_ksym_ref(__kvm_get_mdcr_el2)));
>>  }
>>
>>  /**
>> diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
>> index 487d6357b7e7..58f5a6521307 100644
>> --- a/virt/kvm/arm/vgic-v3.c
>> +++ b/virt/kvm/arm/vgic-v3.c
>> @@ -247,7 +247,7 @@ int vgic_v3_probe(struct device_node *vgic_node,
>>               goto out;
>>       }
>>
>> -     ich_vtr_el2 = kvm_call_hyp(__vgic_v3_get_ich_vtr_el2);
>> +     ich_vtr_el2 = kvm_call_hyp(kvm_ksym_ref(__vgic_v3_get_ich_vtr_el2));
>>
>>       /*
>>        * The ListRegs field is 5 bits, but there is a architectural
>>
>
> Thanks,
>
>         M.
> --
> Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [kernel-hardening] Re: [PATCH v2 05/13] arm64: kvm: deal with kernel symbols outside of linear mapping
@ 2016-01-04 10:31       ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-04 10:31 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, Kees Cook, linux-kernel,
	Stuart Yoder, Sharma Bhupesh, Arnd Bergmann, Christoffer Dall

On 4 January 2016 at 11:08, Marc Zyngier <marc.zyngier@arm.com> wrote:
> Hi Ard,
>
> On 30/12/15 15:26, Ard Biesheuvel wrote:
>> KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
>> the HYP mapping at EL2. Before we can move the kernel virtual mapping
>> out of the linear mapping, we have to make sure that references to kernel
>> symbols that are accessed via the HYP mapping are translated to their
>> linear equivalent.
>>
>> To prevent inadvertent direct references from sneaking in later, change
>> the type of all extern declarations to HYP kernel symbols to the opaque
>> 'struct kvm_ksym', which does not decay to a pointer type like char arrays
>> and function references. This is not bullet proof, but at least forces the
>> user to take the address explicitly rather than referencing it directly.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>
> This looks good to me, a few comments below.
>
>> ---
>>  arch/arm/include/asm/kvm_asm.h   |  2 ++
>>  arch/arm/include/asm/kvm_mmu.h   |  2 ++
>>  arch/arm/kvm/arm.c               |  9 +++++----
>>  arch/arm/kvm/mmu.c               | 12 +++++------
>>  arch/arm64/include/asm/kvm_asm.h | 21 +++++++++++---------
>>  arch/arm64/include/asm/kvm_mmu.h |  2 ++
>>  arch/arm64/include/asm/virt.h    |  4 ----
>>  arch/arm64/kernel/vmlinux.lds.S  |  4 ++--
>>  arch/arm64/kvm/debug.c           |  4 +++-
>>  virt/kvm/arm/vgic-v3.c           |  2 +-
>>  10 files changed, 34 insertions(+), 28 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
>> index 194c91b610ff..484ffdf7c70b 100644
>> --- a/arch/arm/include/asm/kvm_asm.h
>> +++ b/arch/arm/include/asm/kvm_asm.h
>> @@ -99,6 +99,8 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>>  extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>>
>>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>> +
>> +extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>>  #endif
>>
>>  #endif /* __ARM_KVM_ASM_H__ */
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> index 405aa1883307..412b363f79e9 100644
>> --- a/arch/arm/include/asm/kvm_mmu.h
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -30,6 +30,8 @@
>>  #define HYP_PAGE_OFFSET              PAGE_OFFSET
>>  #define KERN_TO_HYP(kva)     (kva)
>>
>> +#define kvm_ksym_ref(kva)    (kva)
>> +
>>  /*
>>   * Our virtual mapping for the boot-time MMU-enable code. Must be
>>   * shared across all the page-tables. Conveniently, we use the vectors
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index e06fd299de08..014b542ea658 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -427,7 +427,7 @@ static void update_vttbr(struct kvm *kvm)
>>                * shareable domain to make sure all data structures are
>>                * clean.
>>                */
>> -             kvm_call_hyp(__kvm_flush_vm_context);
>> +             kvm_call_hyp(kvm_ksym_ref(__kvm_flush_vm_context));
>>       }
>>
>>       kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>> @@ -600,7 +600,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>               __kvm_guest_enter();
>>               vcpu->mode = IN_GUEST_MODE;
>>
>> -             ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
>> +             ret = kvm_call_hyp(kvm_ksym_ref(__kvm_vcpu_run), vcpu);
>>
>>               vcpu->mode = OUTSIDE_GUEST_MODE;
>>               /*
>> @@ -969,7 +969,7 @@ static void cpu_init_hyp_mode(void *dummy)
>>       pgd_ptr = kvm_mmu_get_httbr();
>>       stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
>>       hyp_stack_ptr = stack_page + PAGE_SIZE;
>> -     vector_ptr = (unsigned long)__kvm_hyp_vector;
>> +     vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
>>
>>       __cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
>>
>> @@ -1061,7 +1061,8 @@ static int init_hyp_mode(void)
>>       /*
>>        * Map the Hyp-code called directly from the host
>>        */
>> -     err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
>> +     err = create_hyp_mappings(kvm_ksym_ref(__kvm_hyp_code_start),
>> +                               kvm_ksym_ref(__kvm_hyp_code_end));
>>       if (err) {
>>               kvm_err("Cannot map world-switch code\n");
>>               goto out_free_mappings;
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 7dace909d5cf..7c448b943e3a 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -31,8 +31,6 @@
>>
>>  #include "trace.h"
>>
>> -extern char  __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>> -
>>  static pgd_t *boot_hyp_pgd;
>>  static pgd_t *hyp_pgd;
>>  static pgd_t *merged_hyp_pgd;
>> @@ -63,7 +61,7 @@ static bool memslot_is_logging(struct kvm_memory_slot *memslot)
>>   */
>>  void kvm_flush_remote_tlbs(struct kvm *kvm)
>>  {
>> -     kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>> +     kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid), kvm);
>
> Any chance we could bury kvm_ksym_ref in kvm_call_hyp? It may make the
> change more readable, but I have the feeling it would require an
> intermediate #define...
>

Yes, we'd have to rename the actual kvm_call_hyp definition so we can
wrap it in a macro

And the call in __cpu_init_hyp_mode() would need to omit the macro,
since it passes a pointer into the linear mapping, not a kernel
symbol.
So if you think that's ok, I'm happy to change that.

>>  }
>>
>>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>> @@ -75,7 +73,7 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>>        * anything there.
>>        */
>>       if (kvm)
>> -             kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
>> +             kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid_ipa), kvm, ipa);
>>  }
>>
>>  /*
>> @@ -1647,9 +1645,9 @@ int kvm_mmu_init(void)
>>  {
>>       int err;
>>
>> -     hyp_idmap_start = kvm_virt_to_phys(__hyp_idmap_text_start);
>> -     hyp_idmap_end = kvm_virt_to_phys(__hyp_idmap_text_end);
>> -     hyp_idmap_vector = kvm_virt_to_phys(__kvm_hyp_init);
>> +     hyp_idmap_start = kvm_virt_to_phys(&__hyp_idmap_text_start);
>> +     hyp_idmap_end = kvm_virt_to_phys(&__hyp_idmap_text_end);
>> +     hyp_idmap_vector = kvm_virt_to_phys(&__kvm_hyp_init);
>
> Why don't you need to use kvm_ksym_ref here? Is the idmap treated
> differently?
>

No, but we are taking the physical address, which ultimately produces
the same value whether we use kvm_ksym_ref() or not.

>>
>>       /*
>>        * We rely on the linker script to ensure at build time that the HYP
>> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
>> index 5e377101f919..830402f847e0 100644
>> --- a/arch/arm64/include/asm/kvm_asm.h
>> +++ b/arch/arm64/include/asm/kvm_asm.h
>> @@ -105,24 +105,27 @@
>>  #ifndef __ASSEMBLY__
>>  struct kvm;
>>  struct kvm_vcpu;
>> +struct kvm_ksym;
>
> And that's it? Never actually defined? That's cunning! ;-)
>
>>
>>  extern char __kvm_hyp_init[];
>>  extern char __kvm_hyp_init_end[];
>>
>> -extern char __kvm_hyp_vector[];
>> +extern struct kvm_ksym __kvm_hyp_vector;
>>
>> -#define      __kvm_hyp_code_start    __hyp_text_start
>> -#define      __kvm_hyp_code_end      __hyp_text_end
>> +extern struct kvm_ksym __kvm_hyp_code_start;
>> +extern struct kvm_ksym __kvm_hyp_code_end;
>>
>> -extern void __kvm_flush_vm_context(void);
>> -extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>> -extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>> +extern struct kvm_ksym __kvm_flush_vm_context;
>> +extern struct kvm_ksym __kvm_tlb_flush_vmid_ipa;
>> +extern struct kvm_ksym __kvm_tlb_flush_vmid;
>>
>> -extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>> +extern struct kvm_ksym __kvm_vcpu_run;
>>
>> -extern u64 __vgic_v3_get_ich_vtr_el2(void);
>> +extern struct kvm_ksym __hyp_idmap_text_start, __hyp_idmap_text_end;
>>
>> -extern u32 __kvm_get_mdcr_el2(void);
>> +extern struct kvm_ksym __vgic_v3_get_ich_vtr_el2;
>> +
>> +extern struct kvm_ksym __kvm_get_mdcr_el2;
>>
>>  #endif
>>
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>> index 61505676d085..0899026a2821 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -73,6 +73,8 @@
>>
>>  #define KERN_TO_HYP(kva)     ((unsigned long)kva - PAGE_OFFSET + HYP_PAGE_OFFSET)
>>
>> +#define kvm_ksym_ref(sym)    ((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
>> +
>>  /*
>>   * We currently only support a 40bit IPA.
>>   */
>> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
>> index 7a5df5252dd7..215ad4649dd7 100644
>> --- a/arch/arm64/include/asm/virt.h
>> +++ b/arch/arm64/include/asm/virt.h
>> @@ -50,10 +50,6 @@ static inline bool is_hyp_mode_mismatched(void)
>>       return __boot_cpu_mode[0] != __boot_cpu_mode[1];
>>  }
>>
>> -/* The section containing the hypervisor text */
>> -extern char __hyp_text_start[];
>> -extern char __hyp_text_end[];
>> -
>>  #endif /* __ASSEMBLY__ */
>>
>>  #endif /* ! __ASM__VIRT_H */
>> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
>> index 363c2f529951..f935f082188d 100644
>> --- a/arch/arm64/kernel/vmlinux.lds.S
>> +++ b/arch/arm64/kernel/vmlinux.lds.S
>> @@ -35,9 +35,9 @@ jiffies = jiffies_64;
>>       VMLINUX_SYMBOL(__hyp_idmap_text_start) = .;     \
>>       *(.hyp.idmap.text)                              \
>>       VMLINUX_SYMBOL(__hyp_idmap_text_end) = .;       \
>> -     VMLINUX_SYMBOL(__hyp_text_start) = .;           \
>> +     VMLINUX_SYMBOL(__kvm_hyp_code_start) = .;       \
>>       *(.hyp.text)                                    \
>> -     VMLINUX_SYMBOL(__hyp_text_end) = .;
>> +     VMLINUX_SYMBOL(__kvm_hyp_code_end) = .;
>
> I have a couple of patches going in the exact opposite direction (making
> arm more similar to arm64):
>
> http://git.kernel.org/cgit/linux/kernel/git/maz/arm-platforms.git/commit/?h=kvm-arm/wsinc&id=94a3d4d4ff1d8ad59f9150dfa9fdd1685ab03950
> http://git.kernel.org/cgit/linux/kernel/git/maz/arm-platforms.git/commit/?h=kvm-arm/wsinc&id=44aec57b62dca67cf91f425e3707f257b9bbeb18
>
> As at least the first patch is required to convert the 32bit HYP code to
> C, I'd rather not change this in the 64bit code.
>

OK, I will align with those changes instead.


>>
>>  #define IDMAP_TEXT                                   \
>>       . = ALIGN(SZ_4K);                               \
>> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
>> index 47e5f0feaee8..99e5a403af4e 100644
>> --- a/arch/arm64/kvm/debug.c
>> +++ b/arch/arm64/kvm/debug.c
>> @@ -24,6 +24,7 @@
>>  #include <asm/kvm_asm.h>
>>  #include <asm/kvm_arm.h>
>>  #include <asm/kvm_emulate.h>
>> +#include <asm/kvm_mmu.h>
>>
>>  #include "trace.h"
>>
>> @@ -72,7 +73,8 @@ static void restore_guest_debug_regs(struct kvm_vcpu *vcpu)
>>
>>  void kvm_arm_init_debug(void)
>>  {
>> -     __this_cpu_write(mdcr_el2, kvm_call_hyp(__kvm_get_mdcr_el2));
>> +     __this_cpu_write(mdcr_el2,
>> +                      kvm_call_hyp(kvm_ksym_ref(__kvm_get_mdcr_el2)));
>>  }
>>
>>  /**
>> diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
>> index 487d6357b7e7..58f5a6521307 100644
>> --- a/virt/kvm/arm/vgic-v3.c
>> +++ b/virt/kvm/arm/vgic-v3.c
>> @@ -247,7 +247,7 @@ int vgic_v3_probe(struct device_node *vgic_node,
>>               goto out;
>>       }
>>
>> -     ich_vtr_el2 = kvm_call_hyp(__vgic_v3_get_ich_vtr_el2);
>> +     ich_vtr_el2 = kvm_call_hyp(kvm_ksym_ref(__vgic_v3_get_ich_vtr_el2));
>>
>>       /*
>>        * The ListRegs field is 5 bits, but there is a architectural
>>
>
> Thanks,
>
>         M.
> --
> Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 05/13] arm64: kvm: deal with kernel symbols outside of linear mapping
  2016-01-04 10:31       ` Ard Biesheuvel
  (?)
@ 2016-01-04 11:02         ` Marc Zyngier
  -1 siblings, 0 replies; 156+ messages in thread
From: Marc Zyngier @ 2016-01-04 11:02 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, Kees Cook, linux-kernel,
	Stuart Yoder, Sharma Bhupesh, Arnd Bergmann, Christoffer Dall

On 04/01/16 10:31, Ard Biesheuvel wrote:
> On 4 January 2016 at 11:08, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> Hi Ard,
>>
>> On 30/12/15 15:26, Ard Biesheuvel wrote:
>>> KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
>>> the HYP mapping at EL2. Before we can move the kernel virtual mapping
>>> out of the linear mapping, we have to make sure that references to kernel
>>> symbols that are accessed via the HYP mapping are translated to their
>>> linear equivalent.
>>>
>>> To prevent inadvertent direct references from sneaking in later, change
>>> the type of all extern declarations to HYP kernel symbols to the opaque
>>> 'struct kvm_ksym', which does not decay to a pointer type like char arrays
>>> and function references. This is not bullet proof, but at least forces the
>>> user to take the address explicitly rather than referencing it directly.
>>>
>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>>
>> This looks good to me, a few comments below.
>>
>>> ---
>>>  arch/arm/include/asm/kvm_asm.h   |  2 ++
>>>  arch/arm/include/asm/kvm_mmu.h   |  2 ++
>>>  arch/arm/kvm/arm.c               |  9 +++++----
>>>  arch/arm/kvm/mmu.c               | 12 +++++------
>>>  arch/arm64/include/asm/kvm_asm.h | 21 +++++++++++---------
>>>  arch/arm64/include/asm/kvm_mmu.h |  2 ++
>>>  arch/arm64/include/asm/virt.h    |  4 ----
>>>  arch/arm64/kernel/vmlinux.lds.S  |  4 ++--
>>>  arch/arm64/kvm/debug.c           |  4 +++-
>>>  virt/kvm/arm/vgic-v3.c           |  2 +-
>>>  10 files changed, 34 insertions(+), 28 deletions(-)
>>>
>>> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
>>> index 194c91b610ff..484ffdf7c70b 100644
>>> --- a/arch/arm/include/asm/kvm_asm.h
>>> +++ b/arch/arm/include/asm/kvm_asm.h
>>> @@ -99,6 +99,8 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>>>  extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>>>
>>>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>>> +
>>> +extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>>>  #endif
>>>
>>>  #endif /* __ARM_KVM_ASM_H__ */
>>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>>> index 405aa1883307..412b363f79e9 100644
>>> --- a/arch/arm/include/asm/kvm_mmu.h
>>> +++ b/arch/arm/include/asm/kvm_mmu.h
>>> @@ -30,6 +30,8 @@
>>>  #define HYP_PAGE_OFFSET              PAGE_OFFSET
>>>  #define KERN_TO_HYP(kva)     (kva)
>>>
>>> +#define kvm_ksym_ref(kva)    (kva)
>>> +
>>>  /*
>>>   * Our virtual mapping for the boot-time MMU-enable code. Must be
>>>   * shared across all the page-tables. Conveniently, we use the vectors
>>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>>> index e06fd299de08..014b542ea658 100644
>>> --- a/arch/arm/kvm/arm.c
>>> +++ b/arch/arm/kvm/arm.c
>>> @@ -427,7 +427,7 @@ static void update_vttbr(struct kvm *kvm)
>>>                * shareable domain to make sure all data structures are
>>>                * clean.
>>>                */
>>> -             kvm_call_hyp(__kvm_flush_vm_context);
>>> +             kvm_call_hyp(kvm_ksym_ref(__kvm_flush_vm_context));
>>>       }
>>>
>>>       kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>> @@ -600,7 +600,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>>               __kvm_guest_enter();
>>>               vcpu->mode = IN_GUEST_MODE;
>>>
>>> -             ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
>>> +             ret = kvm_call_hyp(kvm_ksym_ref(__kvm_vcpu_run), vcpu);
>>>
>>>               vcpu->mode = OUTSIDE_GUEST_MODE;
>>>               /*
>>> @@ -969,7 +969,7 @@ static void cpu_init_hyp_mode(void *dummy)
>>>       pgd_ptr = kvm_mmu_get_httbr();
>>>       stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
>>>       hyp_stack_ptr = stack_page + PAGE_SIZE;
>>> -     vector_ptr = (unsigned long)__kvm_hyp_vector;
>>> +     vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
>>>
>>>       __cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
>>>
>>> @@ -1061,7 +1061,8 @@ static int init_hyp_mode(void)
>>>       /*
>>>        * Map the Hyp-code called directly from the host
>>>        */
>>> -     err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
>>> +     err = create_hyp_mappings(kvm_ksym_ref(__kvm_hyp_code_start),
>>> +                               kvm_ksym_ref(__kvm_hyp_code_end));
>>>       if (err) {
>>>               kvm_err("Cannot map world-switch code\n");
>>>               goto out_free_mappings;
>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>>> index 7dace909d5cf..7c448b943e3a 100644
>>> --- a/arch/arm/kvm/mmu.c
>>> +++ b/arch/arm/kvm/mmu.c
>>> @@ -31,8 +31,6 @@
>>>
>>>  #include "trace.h"
>>>
>>> -extern char  __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>>> -
>>>  static pgd_t *boot_hyp_pgd;
>>>  static pgd_t *hyp_pgd;
>>>  static pgd_t *merged_hyp_pgd;
>>> @@ -63,7 +61,7 @@ static bool memslot_is_logging(struct kvm_memory_slot *memslot)
>>>   */
>>>  void kvm_flush_remote_tlbs(struct kvm *kvm)
>>>  {
>>> -     kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>>> +     kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid), kvm);
>>
>> Any chance we could bury kvm_ksym_ref in kvm_call_hyp? It may make the
>> change more readable, but I have the feeling it would require an
>> intermediate #define...
>>
> 
> Yes, we'd have to rename the actual kvm_call_hyp definition so we can
> wrap it in a macro
> 
> And the call in __cpu_init_hyp_mode() would need to omit the macro,
> since it passes a pointer into the linear mapping, not a kernel
> symbol.
> So if you think that's ok, I'm happy to change that.

That'd be great, thanks.

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [PATCH v2 05/13] arm64: kvm: deal with kernel symbols outside of linear mapping
@ 2016-01-04 11:02         ` Marc Zyngier
  0 siblings, 0 replies; 156+ messages in thread
From: Marc Zyngier @ 2016-01-04 11:02 UTC (permalink / raw)
  To: linux-arm-kernel

On 04/01/16 10:31, Ard Biesheuvel wrote:
> On 4 January 2016 at 11:08, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> Hi Ard,
>>
>> On 30/12/15 15:26, Ard Biesheuvel wrote:
>>> KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
>>> the HYP mapping at EL2. Before we can move the kernel virtual mapping
>>> out of the linear mapping, we have to make sure that references to kernel
>>> symbols that are accessed via the HYP mapping are translated to their
>>> linear equivalent.
>>>
>>> To prevent inadvertent direct references from sneaking in later, change
>>> the type of all extern declarations to HYP kernel symbols to the opaque
>>> 'struct kvm_ksym', which does not decay to a pointer type like char arrays
>>> and function references. This is not bullet proof, but at least forces the
>>> user to take the address explicitly rather than referencing it directly.
>>>
>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>>
>> This looks good to me, a few comments below.
>>
>>> ---
>>>  arch/arm/include/asm/kvm_asm.h   |  2 ++
>>>  arch/arm/include/asm/kvm_mmu.h   |  2 ++
>>>  arch/arm/kvm/arm.c               |  9 +++++----
>>>  arch/arm/kvm/mmu.c               | 12 +++++------
>>>  arch/arm64/include/asm/kvm_asm.h | 21 +++++++++++---------
>>>  arch/arm64/include/asm/kvm_mmu.h |  2 ++
>>>  arch/arm64/include/asm/virt.h    |  4 ----
>>>  arch/arm64/kernel/vmlinux.lds.S  |  4 ++--
>>>  arch/arm64/kvm/debug.c           |  4 +++-
>>>  virt/kvm/arm/vgic-v3.c           |  2 +-
>>>  10 files changed, 34 insertions(+), 28 deletions(-)
>>>
>>> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
>>> index 194c91b610ff..484ffdf7c70b 100644
>>> --- a/arch/arm/include/asm/kvm_asm.h
>>> +++ b/arch/arm/include/asm/kvm_asm.h
>>> @@ -99,6 +99,8 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>>>  extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>>>
>>>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>>> +
>>> +extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>>>  #endif
>>>
>>>  #endif /* __ARM_KVM_ASM_H__ */
>>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>>> index 405aa1883307..412b363f79e9 100644
>>> --- a/arch/arm/include/asm/kvm_mmu.h
>>> +++ b/arch/arm/include/asm/kvm_mmu.h
>>> @@ -30,6 +30,8 @@
>>>  #define HYP_PAGE_OFFSET              PAGE_OFFSET
>>>  #define KERN_TO_HYP(kva)     (kva)
>>>
>>> +#define kvm_ksym_ref(kva)    (kva)
>>> +
>>>  /*
>>>   * Our virtual mapping for the boot-time MMU-enable code. Must be
>>>   * shared across all the page-tables. Conveniently, we use the vectors
>>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>>> index e06fd299de08..014b542ea658 100644
>>> --- a/arch/arm/kvm/arm.c
>>> +++ b/arch/arm/kvm/arm.c
>>> @@ -427,7 +427,7 @@ static void update_vttbr(struct kvm *kvm)
>>>                * shareable domain to make sure all data structures are
>>>                * clean.
>>>                */
>>> -             kvm_call_hyp(__kvm_flush_vm_context);
>>> +             kvm_call_hyp(kvm_ksym_ref(__kvm_flush_vm_context));
>>>       }
>>>
>>>       kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>> @@ -600,7 +600,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>>               __kvm_guest_enter();
>>>               vcpu->mode = IN_GUEST_MODE;
>>>
>>> -             ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
>>> +             ret = kvm_call_hyp(kvm_ksym_ref(__kvm_vcpu_run), vcpu);
>>>
>>>               vcpu->mode = OUTSIDE_GUEST_MODE;
>>>               /*
>>> @@ -969,7 +969,7 @@ static void cpu_init_hyp_mode(void *dummy)
>>>       pgd_ptr = kvm_mmu_get_httbr();
>>>       stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
>>>       hyp_stack_ptr = stack_page + PAGE_SIZE;
>>> -     vector_ptr = (unsigned long)__kvm_hyp_vector;
>>> +     vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
>>>
>>>       __cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
>>>
>>> @@ -1061,7 +1061,8 @@ static int init_hyp_mode(void)
>>>       /*
>>>        * Map the Hyp-code called directly from the host
>>>        */
>>> -     err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
>>> +     err = create_hyp_mappings(kvm_ksym_ref(__kvm_hyp_code_start),
>>> +                               kvm_ksym_ref(__kvm_hyp_code_end));
>>>       if (err) {
>>>               kvm_err("Cannot map world-switch code\n");
>>>               goto out_free_mappings;
>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>>> index 7dace909d5cf..7c448b943e3a 100644
>>> --- a/arch/arm/kvm/mmu.c
>>> +++ b/arch/arm/kvm/mmu.c
>>> @@ -31,8 +31,6 @@
>>>
>>>  #include "trace.h"
>>>
>>> -extern char  __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>>> -
>>>  static pgd_t *boot_hyp_pgd;
>>>  static pgd_t *hyp_pgd;
>>>  static pgd_t *merged_hyp_pgd;
>>> @@ -63,7 +61,7 @@ static bool memslot_is_logging(struct kvm_memory_slot *memslot)
>>>   */
>>>  void kvm_flush_remote_tlbs(struct kvm *kvm)
>>>  {
>>> -     kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>>> +     kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid), kvm);
>>
>> Any chance we could bury kvm_ksym_ref in kvm_call_hyp? It may make the
>> change more readable, but I have the feeling it would require an
>> intermediate #define...
>>
> 
> Yes, we'd have to rename the actual kvm_call_hyp definition so we can
> wrap it in a macro
> 
> And the call in __cpu_init_hyp_mode() would need to omit the macro,
> since it passes a pointer into the linear mapping, not a kernel
> symbol.
> So if you think that's ok, I'm happy to change that.

That'd be great, thanks.

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [kernel-hardening] Re: [PATCH v2 05/13] arm64: kvm: deal with kernel symbols outside of linear mapping
@ 2016-01-04 11:02         ` Marc Zyngier
  0 siblings, 0 replies; 156+ messages in thread
From: Marc Zyngier @ 2016-01-04 11:02 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, Kees Cook, linux-kernel,
	Stuart Yoder, Sharma Bhupesh, Arnd Bergmann, Christoffer Dall

On 04/01/16 10:31, Ard Biesheuvel wrote:
> On 4 January 2016 at 11:08, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> Hi Ard,
>>
>> On 30/12/15 15:26, Ard Biesheuvel wrote:
>>> KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
>>> the HYP mapping at EL2. Before we can move the kernel virtual mapping
>>> out of the linear mapping, we have to make sure that references to kernel
>>> symbols that are accessed via the HYP mapping are translated to their
>>> linear equivalent.
>>>
>>> To prevent inadvertent direct references from sneaking in later, change
>>> the type of all extern declarations to HYP kernel symbols to the opaque
>>> 'struct kvm_ksym', which does not decay to a pointer type like char arrays
>>> and function references. This is not bullet proof, but at least forces the
>>> user to take the address explicitly rather than referencing it directly.
>>>
>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>>
>> This looks good to me, a few comments below.
>>
>>> ---
>>>  arch/arm/include/asm/kvm_asm.h   |  2 ++
>>>  arch/arm/include/asm/kvm_mmu.h   |  2 ++
>>>  arch/arm/kvm/arm.c               |  9 +++++----
>>>  arch/arm/kvm/mmu.c               | 12 +++++------
>>>  arch/arm64/include/asm/kvm_asm.h | 21 +++++++++++---------
>>>  arch/arm64/include/asm/kvm_mmu.h |  2 ++
>>>  arch/arm64/include/asm/virt.h    |  4 ----
>>>  arch/arm64/kernel/vmlinux.lds.S  |  4 ++--
>>>  arch/arm64/kvm/debug.c           |  4 +++-
>>>  virt/kvm/arm/vgic-v3.c           |  2 +-
>>>  10 files changed, 34 insertions(+), 28 deletions(-)
>>>
>>> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
>>> index 194c91b610ff..484ffdf7c70b 100644
>>> --- a/arch/arm/include/asm/kvm_asm.h
>>> +++ b/arch/arm/include/asm/kvm_asm.h
>>> @@ -99,6 +99,8 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>>>  extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>>>
>>>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>>> +
>>> +extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>>>  #endif
>>>
>>>  #endif /* __ARM_KVM_ASM_H__ */
>>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>>> index 405aa1883307..412b363f79e9 100644
>>> --- a/arch/arm/include/asm/kvm_mmu.h
>>> +++ b/arch/arm/include/asm/kvm_mmu.h
>>> @@ -30,6 +30,8 @@
>>>  #define HYP_PAGE_OFFSET              PAGE_OFFSET
>>>  #define KERN_TO_HYP(kva)     (kva)
>>>
>>> +#define kvm_ksym_ref(kva)    (kva)
>>> +
>>>  /*
>>>   * Our virtual mapping for the boot-time MMU-enable code. Must be
>>>   * shared across all the page-tables. Conveniently, we use the vectors
>>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>>> index e06fd299de08..014b542ea658 100644
>>> --- a/arch/arm/kvm/arm.c
>>> +++ b/arch/arm/kvm/arm.c
>>> @@ -427,7 +427,7 @@ static void update_vttbr(struct kvm *kvm)
>>>                * shareable domain to make sure all data structures are
>>>                * clean.
>>>                */
>>> -             kvm_call_hyp(__kvm_flush_vm_context);
>>> +             kvm_call_hyp(kvm_ksym_ref(__kvm_flush_vm_context));
>>>       }
>>>
>>>       kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>> @@ -600,7 +600,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>>               __kvm_guest_enter();
>>>               vcpu->mode = IN_GUEST_MODE;
>>>
>>> -             ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
>>> +             ret = kvm_call_hyp(kvm_ksym_ref(__kvm_vcpu_run), vcpu);
>>>
>>>               vcpu->mode = OUTSIDE_GUEST_MODE;
>>>               /*
>>> @@ -969,7 +969,7 @@ static void cpu_init_hyp_mode(void *dummy)
>>>       pgd_ptr = kvm_mmu_get_httbr();
>>>       stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
>>>       hyp_stack_ptr = stack_page + PAGE_SIZE;
>>> -     vector_ptr = (unsigned long)__kvm_hyp_vector;
>>> +     vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
>>>
>>>       __cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
>>>
>>> @@ -1061,7 +1061,8 @@ static int init_hyp_mode(void)
>>>       /*
>>>        * Map the Hyp-code called directly from the host
>>>        */
>>> -     err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
>>> +     err = create_hyp_mappings(kvm_ksym_ref(__kvm_hyp_code_start),
>>> +                               kvm_ksym_ref(__kvm_hyp_code_end));
>>>       if (err) {
>>>               kvm_err("Cannot map world-switch code\n");
>>>               goto out_free_mappings;
>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>>> index 7dace909d5cf..7c448b943e3a 100644
>>> --- a/arch/arm/kvm/mmu.c
>>> +++ b/arch/arm/kvm/mmu.c
>>> @@ -31,8 +31,6 @@
>>>
>>>  #include "trace.h"
>>>
>>> -extern char  __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>>> -
>>>  static pgd_t *boot_hyp_pgd;
>>>  static pgd_t *hyp_pgd;
>>>  static pgd_t *merged_hyp_pgd;
>>> @@ -63,7 +61,7 @@ static bool memslot_is_logging(struct kvm_memory_slot *memslot)
>>>   */
>>>  void kvm_flush_remote_tlbs(struct kvm *kvm)
>>>  {
>>> -     kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>>> +     kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid), kvm);
>>
>> Any chance we could bury kvm_ksym_ref in kvm_call_hyp? It may make the
>> change more readable, but I have the feeling it would require an
>> intermediate #define...
>>
> 
> Yes, we'd have to rename the actual kvm_call_hyp definition so we can
> wrap it in a macro
> 
> And the call in __cpu_init_hyp_mode() would need to omit the macro,
> since it passes a pointer into the linear mapping, not a kernel
> symbol.
> So if you think that's ok, I'm happy to change that.

That'd be great, thanks.

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 02/13] arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
  2015-12-30 15:26   ` Ard Biesheuvel
  (?)
@ 2016-01-05 14:36     ` Christoffer Dall
  -1 siblings, 0 replies; 156+ messages in thread
From: Christoffer Dall @ 2016-01-05 14:36 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel,
	stuart.yoder, bhupesh.sharma, arnd, marc.zyngier

On Wed, Dec 30, 2015 at 04:26:01PM +0100, Ard Biesheuvel wrote:
> This introduces the preprocessor symbol KIMAGE_VADDR which will serve as
> the symbolic virtual base of the kernel region, i.e., the kernel's virtual
> offset will be KIMAGE_VADDR + TEXT_OFFSET. For now, we define it as being
> equal to PAGE_OFFSET, but in the future, it will be moved below it once
> we move the kernel virtual mapping out of the linear mapping.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/include/asm/memory.h | 10 ++++++++--
>  arch/arm64/kernel/head.S        |  2 +-
>  arch/arm64/kernel/vmlinux.lds.S |  4 ++--
>  3 files changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 853953cd1f08..bea9631b34a8 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -51,7 +51,8 @@
>  #define VA_BITS			(CONFIG_ARM64_VA_BITS)
>  #define VA_START		(UL(0xffffffffffffffff) << VA_BITS)
>  #define PAGE_OFFSET		(UL(0xffffffffffffffff) << (VA_BITS - 1))
> -#define MODULES_END		(PAGE_OFFSET)
> +#define KIMAGE_VADDR		(PAGE_OFFSET)
> +#define MODULES_END		(KIMAGE_VADDR)
>  #define MODULES_VADDR		(MODULES_END - SZ_64M)
>  #define PCI_IO_END		(MODULES_VADDR - SZ_2M)
>  #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
> @@ -75,8 +76,13 @@
>   * private definitions which should NOT be used outside memory.h
>   * files.  Use virt_to_phys/phys_to_virt/__pa/__va instead.
>   */
> -#define __virt_to_phys(x)	(((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
> +#define __virt_to_phys(x) ({						\
> +	phys_addr_t __x = (phys_addr_t)(x);				\
> +	__x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) :	\
> +			     (__x - KIMAGE_VADDR + PHYS_OFFSET); })

so __virt_to_phys will now work with a subset of the non-linear
addresses, namely all except vmalloced and ioremapped ones?

> +
>  #define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
> +#define __phys_to_kimg(x)	((unsigned long)((x) - PHYS_OFFSET + KIMAGE_VADDR))
>  
>  /*
>   * Convert a page to/from a physical address
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 23cfc08fc8ba..6434c844a0e4 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -389,7 +389,7 @@ __create_page_tables:
>  	 * Map the kernel image (starting with PHYS_OFFSET).
>  	 */
>  	mov	x0, x26				// swapper_pg_dir
> -	mov	x5, #PAGE_OFFSET
> +	ldr	x5, =KIMAGE_VADDR
>  	create_pgd_entry x0, x5, x3, x6
>  	ldr	x6, =KERNEL_END			// __va(KERNEL_END)
>  	mov	x3, x24				// phys offset
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index 7de6c39858a5..ced0dedcabcc 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -88,7 +88,7 @@ SECTIONS
>  		*(.discard.*)
>  	}
>  
> -	. = PAGE_OFFSET + TEXT_OFFSET;
> +	. = KIMAGE_VADDR + TEXT_OFFSET;
>  
>  	.head.text : {
>  		_text = .;
> @@ -185,4 +185,4 @@ ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
>  /*
>   * If padding is applied before .head.text, virt<->phys conversions will fail.
>   */
> -ASSERT(_text == (PAGE_OFFSET + TEXT_OFFSET), "HEAD is misaligned")
> +ASSERT(_text == (KIMAGE_VADDR + TEXT_OFFSET), "HEAD is misaligned")
> -- 
> 2.5.0
> 

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [PATCH v2 02/13] arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
@ 2016-01-05 14:36     ` Christoffer Dall
  0 siblings, 0 replies; 156+ messages in thread
From: Christoffer Dall @ 2016-01-05 14:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 30, 2015 at 04:26:01PM +0100, Ard Biesheuvel wrote:
> This introduces the preprocessor symbol KIMAGE_VADDR which will serve as
> the symbolic virtual base of the kernel region, i.e., the kernel's virtual
> offset will be KIMAGE_VADDR + TEXT_OFFSET. For now, we define it as being
> equal to PAGE_OFFSET, but in the future, it will be moved below it once
> we move the kernel virtual mapping out of the linear mapping.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/include/asm/memory.h | 10 ++++++++--
>  arch/arm64/kernel/head.S        |  2 +-
>  arch/arm64/kernel/vmlinux.lds.S |  4 ++--
>  3 files changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 853953cd1f08..bea9631b34a8 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -51,7 +51,8 @@
>  #define VA_BITS			(CONFIG_ARM64_VA_BITS)
>  #define VA_START		(UL(0xffffffffffffffff) << VA_BITS)
>  #define PAGE_OFFSET		(UL(0xffffffffffffffff) << (VA_BITS - 1))
> -#define MODULES_END		(PAGE_OFFSET)
> +#define KIMAGE_VADDR		(PAGE_OFFSET)
> +#define MODULES_END		(KIMAGE_VADDR)
>  #define MODULES_VADDR		(MODULES_END - SZ_64M)
>  #define PCI_IO_END		(MODULES_VADDR - SZ_2M)
>  #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
> @@ -75,8 +76,13 @@
>   * private definitions which should NOT be used outside memory.h
>   * files.  Use virt_to_phys/phys_to_virt/__pa/__va instead.
>   */
> -#define __virt_to_phys(x)	(((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
> +#define __virt_to_phys(x) ({						\
> +	phys_addr_t __x = (phys_addr_t)(x);				\
> +	__x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) :	\
> +			     (__x - KIMAGE_VADDR + PHYS_OFFSET); })

so __virt_to_phys will now work with a subset of the non-linear
addresses, namely all except vmalloced and ioremapped ones?

> +
>  #define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
> +#define __phys_to_kimg(x)	((unsigned long)((x) - PHYS_OFFSET + KIMAGE_VADDR))
>  
>  /*
>   * Convert a page to/from a physical address
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 23cfc08fc8ba..6434c844a0e4 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -389,7 +389,7 @@ __create_page_tables:
>  	 * Map the kernel image (starting with PHYS_OFFSET).
>  	 */
>  	mov	x0, x26				// swapper_pg_dir
> -	mov	x5, #PAGE_OFFSET
> +	ldr	x5, =KIMAGE_VADDR
>  	create_pgd_entry x0, x5, x3, x6
>  	ldr	x6, =KERNEL_END			// __va(KERNEL_END)
>  	mov	x3, x24				// phys offset
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index 7de6c39858a5..ced0dedcabcc 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -88,7 +88,7 @@ SECTIONS
>  		*(.discard.*)
>  	}
>  
> -	. = PAGE_OFFSET + TEXT_OFFSET;
> +	. = KIMAGE_VADDR + TEXT_OFFSET;
>  
>  	.head.text : {
>  		_text = .;
> @@ -185,4 +185,4 @@ ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
>  /*
>   * If padding is applied before .head.text, virt<->phys conversions will fail.
>   */
> -ASSERT(_text == (PAGE_OFFSET + TEXT_OFFSET), "HEAD is misaligned")
> +ASSERT(_text == (KIMAGE_VADDR + TEXT_OFFSET), "HEAD is misaligned")
> -- 
> 2.5.0
> 

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [kernel-hardening] Re: [PATCH v2 02/13] arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
@ 2016-01-05 14:36     ` Christoffer Dall
  0 siblings, 0 replies; 156+ messages in thread
From: Christoffer Dall @ 2016-01-05 14:36 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel,
	stuart.yoder, bhupesh.sharma, arnd, marc.zyngier

On Wed, Dec 30, 2015 at 04:26:01PM +0100, Ard Biesheuvel wrote:
> This introduces the preprocessor symbol KIMAGE_VADDR which will serve as
> the symbolic virtual base of the kernel region, i.e., the kernel's virtual
> offset will be KIMAGE_VADDR + TEXT_OFFSET. For now, we define it as being
> equal to PAGE_OFFSET, but in the future, it will be moved below it once
> we move the kernel virtual mapping out of the linear mapping.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/include/asm/memory.h | 10 ++++++++--
>  arch/arm64/kernel/head.S        |  2 +-
>  arch/arm64/kernel/vmlinux.lds.S |  4 ++--
>  3 files changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 853953cd1f08..bea9631b34a8 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -51,7 +51,8 @@
>  #define VA_BITS			(CONFIG_ARM64_VA_BITS)
>  #define VA_START		(UL(0xffffffffffffffff) << VA_BITS)
>  #define PAGE_OFFSET		(UL(0xffffffffffffffff) << (VA_BITS - 1))
> -#define MODULES_END		(PAGE_OFFSET)
> +#define KIMAGE_VADDR		(PAGE_OFFSET)
> +#define MODULES_END		(KIMAGE_VADDR)
>  #define MODULES_VADDR		(MODULES_END - SZ_64M)
>  #define PCI_IO_END		(MODULES_VADDR - SZ_2M)
>  #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
> @@ -75,8 +76,13 @@
>   * private definitions which should NOT be used outside memory.h
>   * files.  Use virt_to_phys/phys_to_virt/__pa/__va instead.
>   */
> -#define __virt_to_phys(x)	(((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
> +#define __virt_to_phys(x) ({						\
> +	phys_addr_t __x = (phys_addr_t)(x);				\
> +	__x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) :	\
> +			     (__x - KIMAGE_VADDR + PHYS_OFFSET); })

so __virt_to_phys will now work with a subset of the non-linear
addresses, namely all except vmalloced and ioremapped ones?

> +
>  #define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
> +#define __phys_to_kimg(x)	((unsigned long)((x) - PHYS_OFFSET + KIMAGE_VADDR))
>  
>  /*
>   * Convert a page to/from a physical address
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 23cfc08fc8ba..6434c844a0e4 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -389,7 +389,7 @@ __create_page_tables:
>  	 * Map the kernel image (starting with PHYS_OFFSET).
>  	 */
>  	mov	x0, x26				// swapper_pg_dir
> -	mov	x5, #PAGE_OFFSET
> +	ldr	x5, =KIMAGE_VADDR
>  	create_pgd_entry x0, x5, x3, x6
>  	ldr	x6, =KERNEL_END			// __va(KERNEL_END)
>  	mov	x3, x24				// phys offset
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index 7de6c39858a5..ced0dedcabcc 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -88,7 +88,7 @@ SECTIONS
>  		*(.discard.*)
>  	}
>  
> -	. = PAGE_OFFSET + TEXT_OFFSET;
> +	. = KIMAGE_VADDR + TEXT_OFFSET;
>  
>  	.head.text : {
>  		_text = .;
> @@ -185,4 +185,4 @@ ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
>  /*
>   * If padding is applied before .head.text, virt<->phys conversions will fail.
>   */
> -ASSERT(_text == (PAGE_OFFSET + TEXT_OFFSET), "HEAD is misaligned")
> +ASSERT(_text == (KIMAGE_VADDR + TEXT_OFFSET), "HEAD is misaligned")
> -- 
> 2.5.0
> 

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 05/13] arm64: kvm: deal with kernel symbols outside of linear mapping
  2015-12-30 15:26   ` Ard Biesheuvel
  (?)
@ 2016-01-05 14:41     ` Christoffer Dall
  -1 siblings, 0 replies; 156+ messages in thread
From: Christoffer Dall @ 2016-01-05 14:41 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel,
	stuart.yoder, bhupesh.sharma, arnd, marc.zyngier

On Wed, Dec 30, 2015 at 04:26:04PM +0100, Ard Biesheuvel wrote:
> KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
> the HYP mapping at EL2. Before we can move the kernel virtual mapping
> out of the linear mapping, we have to make sure that references to kernel
> symbols that are accessed via the HYP mapping are translated to their
> linear equivalent.
> 
> To prevent inadvertent direct references from sneaking in later, change
> the type of all extern declarations to HYP kernel symbols to the opaque
> 'struct kvm_ksym', which does not decay to a pointer type like char arrays
> and function references. This is not bullet proof, but at least forces the
> user to take the address explicitly rather than referencing it directly.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm/include/asm/kvm_asm.h   |  2 ++
>  arch/arm/include/asm/kvm_mmu.h   |  2 ++
>  arch/arm/kvm/arm.c               |  9 +++++----
>  arch/arm/kvm/mmu.c               | 12 +++++------
>  arch/arm64/include/asm/kvm_asm.h | 21 +++++++++++---------
>  arch/arm64/include/asm/kvm_mmu.h |  2 ++
>  arch/arm64/include/asm/virt.h    |  4 ----
>  arch/arm64/kernel/vmlinux.lds.S  |  4 ++--
>  arch/arm64/kvm/debug.c           |  4 +++-
>  virt/kvm/arm/vgic-v3.c           |  2 +-
>  10 files changed, 34 insertions(+), 28 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
> index 194c91b610ff..484ffdf7c70b 100644
> --- a/arch/arm/include/asm/kvm_asm.h
> +++ b/arch/arm/include/asm/kvm_asm.h
> @@ -99,6 +99,8 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>  extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>  
>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
> +
> +extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>  #endif
>  
>  #endif /* __ARM_KVM_ASM_H__ */
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 405aa1883307..412b363f79e9 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -30,6 +30,8 @@
>  #define HYP_PAGE_OFFSET		PAGE_OFFSET
>  #define KERN_TO_HYP(kva)	(kva)
>  
> +#define kvm_ksym_ref(kva)	(kva)
> +
>  /*
>   * Our virtual mapping for the boot-time MMU-enable code. Must be
>   * shared across all the page-tables. Conveniently, we use the vectors
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index e06fd299de08..014b542ea658 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -427,7 +427,7 @@ static void update_vttbr(struct kvm *kvm)
>  		 * shareable domain to make sure all data structures are
>  		 * clean.
>  		 */
> -		kvm_call_hyp(__kvm_flush_vm_context);
> +		kvm_call_hyp(kvm_ksym_ref(__kvm_flush_vm_context));
>  	}
>  
>  	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> @@ -600,7 +600,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  		__kvm_guest_enter();
>  		vcpu->mode = IN_GUEST_MODE;
>  
> -		ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
> +		ret = kvm_call_hyp(kvm_ksym_ref(__kvm_vcpu_run), vcpu);
>  
>  		vcpu->mode = OUTSIDE_GUEST_MODE;
>  		/*
> @@ -969,7 +969,7 @@ static void cpu_init_hyp_mode(void *dummy)
>  	pgd_ptr = kvm_mmu_get_httbr();
>  	stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
>  	hyp_stack_ptr = stack_page + PAGE_SIZE;
> -	vector_ptr = (unsigned long)__kvm_hyp_vector;
> +	vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
>  
>  	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
>  
> @@ -1061,7 +1061,8 @@ static int init_hyp_mode(void)
>  	/*
>  	 * Map the Hyp-code called directly from the host
>  	 */
> -	err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
> +	err = create_hyp_mappings(kvm_ksym_ref(__kvm_hyp_code_start),
> +				  kvm_ksym_ref(__kvm_hyp_code_end));
>  	if (err) {
>  		kvm_err("Cannot map world-switch code\n");
>  		goto out_free_mappings;
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 7dace909d5cf..7c448b943e3a 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -31,8 +31,6 @@
>  
>  #include "trace.h"
>  
> -extern char  __hyp_idmap_text_start[], __hyp_idmap_text_end[];
> -
>  static pgd_t *boot_hyp_pgd;
>  static pgd_t *hyp_pgd;
>  static pgd_t *merged_hyp_pgd;
> @@ -63,7 +61,7 @@ static bool memslot_is_logging(struct kvm_memory_slot *memslot)
>   */
>  void kvm_flush_remote_tlbs(struct kvm *kvm)
>  {
> -	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
> +	kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid), kvm);
>  }
>  
>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
> @@ -75,7 +73,7 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>  	 * anything there.
>  	 */
>  	if (kvm)
> -		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
> +		kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid_ipa), kvm, ipa);
>  }
>  
>  /*
> @@ -1647,9 +1645,9 @@ int kvm_mmu_init(void)
>  {
>  	int err;
>  
> -	hyp_idmap_start = kvm_virt_to_phys(__hyp_idmap_text_start);
> -	hyp_idmap_end = kvm_virt_to_phys(__hyp_idmap_text_end);
> -	hyp_idmap_vector = kvm_virt_to_phys(__kvm_hyp_init);
> +	hyp_idmap_start = kvm_virt_to_phys(&__hyp_idmap_text_start);
> +	hyp_idmap_end = kvm_virt_to_phys(&__hyp_idmap_text_end);
> +	hyp_idmap_vector = kvm_virt_to_phys(&__kvm_hyp_init);
>  
>  	/*
>  	 * We rely on the linker script to ensure at build time that the HYP
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 5e377101f919..830402f847e0 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -105,24 +105,27 @@
>  #ifndef __ASSEMBLY__
>  struct kvm;
>  struct kvm_vcpu;
> +struct kvm_ksym;
>  
>  extern char __kvm_hyp_init[];
>  extern char __kvm_hyp_init_end[];
>  
> -extern char __kvm_hyp_vector[];
> +extern struct kvm_ksym __kvm_hyp_vector;
>  
> -#define	__kvm_hyp_code_start	__hyp_text_start
> -#define	__kvm_hyp_code_end	__hyp_text_end
> +extern struct kvm_ksym __kvm_hyp_code_start;
> +extern struct kvm_ksym __kvm_hyp_code_end;
>  
> -extern void __kvm_flush_vm_context(void);
> -extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
> -extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
> +extern struct kvm_ksym __kvm_flush_vm_context;
> +extern struct kvm_ksym __kvm_tlb_flush_vmid_ipa;
> +extern struct kvm_ksym __kvm_tlb_flush_vmid;
>  
> -extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
> +extern struct kvm_ksym __kvm_vcpu_run;
>  
> -extern u64 __vgic_v3_get_ich_vtr_el2(void);
> +extern struct kvm_ksym __hyp_idmap_text_start, __hyp_idmap_text_end;
>  
> -extern u32 __kvm_get_mdcr_el2(void);
> +extern struct kvm_ksym __vgic_v3_get_ich_vtr_el2;
> +
> +extern struct kvm_ksym __kvm_get_mdcr_el2;
>  
>  #endif
>  
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 61505676d085..0899026a2821 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -73,6 +73,8 @@
>  
>  #define KERN_TO_HYP(kva)	((unsigned long)kva - PAGE_OFFSET + HYP_PAGE_OFFSET)
>  
> +#define kvm_ksym_ref(sym)	((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
> +
>  /*
>   * We currently only support a 40bit IPA.
>   */
> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
> index 7a5df5252dd7..215ad4649dd7 100644
> --- a/arch/arm64/include/asm/virt.h
> +++ b/arch/arm64/include/asm/virt.h
> @@ -50,10 +50,6 @@ static inline bool is_hyp_mode_mismatched(void)
>  	return __boot_cpu_mode[0] != __boot_cpu_mode[1];
>  }
>  
> -/* The section containing the hypervisor text */
> -extern char __hyp_text_start[];
> -extern char __hyp_text_end[];
> -
>  #endif /* __ASSEMBLY__ */
>  
>  #endif /* ! __ASM__VIRT_H */
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index 363c2f529951..f935f082188d 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -35,9 +35,9 @@ jiffies = jiffies_64;
>  	VMLINUX_SYMBOL(__hyp_idmap_text_start) = .;	\
>  	*(.hyp.idmap.text)				\
>  	VMLINUX_SYMBOL(__hyp_idmap_text_end) = .;	\
> -	VMLINUX_SYMBOL(__hyp_text_start) = .;		\
> +	VMLINUX_SYMBOL(__kvm_hyp_code_start) = .;	\
>  	*(.hyp.text)					\
> -	VMLINUX_SYMBOL(__hyp_text_end) = .;
> +	VMLINUX_SYMBOL(__kvm_hyp_code_end) = .;

why this rename?

-Christoffer

^ permalink raw reply	[flat|nested] 156+ messages in thread
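
The effect of the patch is easier to see in isolation: because 'struct
kvm_ksym' is an incomplete type, the bare symbol name can no longer be used
as a value, so callers are forced to take its address explicitly and pass it
through kvm_ksym_ref(), which rewrites the kernel-image address of a HYP
symbol into its linear-map alias. A rough, self-contained GNU C sketch of the
idea follows; the KIMAGE_VADDR/PAGE_OFFSET values and the stand-in symbol
definition are made up for illustration and are not the actual kernel code.

/* Sketch only: illustrates the opaque-symbol technique, not kernel code. */
#include <stdio.h>

struct kvm_ksym;	/* incomplete type: cannot be used as a value */
extern struct kvm_ksym __kvm_flush_vm_context;

/*
 * Stand-in definition so this sketch links on its own; in the kernel the
 * symbol is emitted by the HYP assembly code, not defined like this.
 */
__asm__(".globl __kvm_flush_vm_context\n__kvm_flush_vm_context:\n");

#define KIMAGE_VADDR	0xffffff8008000000UL	/* hypothetical value */
#define PAGE_OFFSET	0xffffffc000000000UL	/* hypothetical value */

/* Translate the image-virtual address of a HYP symbol to its linear alias. */
#define kvm_ksym_ref(sym)	((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)

int main(void)
{
	/*
	 * Writing "__kvm_flush_vm_context" as a value would not compile,
	 * since the type is incomplete; only &__kvm_flush_vm_context (here
	 * taken via kvm_ksym_ref) is allowed, which is exactly the point.
	 * The printed value is meaningless in this sketch because the
	 * stand-in symbol does not really live at an image-virtual address.
	 */
	printf("linear alias: %p\n", kvm_ksym_ref(__kvm_flush_vm_context));
	return 0;
}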

* Re: [PATCH v2 02/13] arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
  2016-01-05 14:36     ` Christoffer Dall
@ 2016-01-05 14:46       ` Mark Rutland
  -1 siblings, 0 replies; 156+ messages in thread
From: Mark Rutland @ 2016-01-05 14:46 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Ard Biesheuvel, linux-arm-kernel, kernel-hardening, will.deacon,
	catalin.marinas, leif.lindholm, keescook, linux-kernel,
	stuart.yoder, bhupesh.sharma, arnd, marc.zyngier

On Tue, Jan 05, 2016 at 03:36:34PM +0100, Christoffer Dall wrote:
> On Wed, Dec 30, 2015 at 04:26:01PM +0100, Ard Biesheuvel wrote:
> > This introduces the preprocessor symbol KIMAGE_VADDR which will serve as
> > the symbolic virtual base of the kernel region, i.e., the kernel's virtual
> > offset will be KIMAGE_VADDR + TEXT_OFFSET. For now, we define it as being
> > equal to PAGE_OFFSET, but in the future, it will be moved below it once
> > we move the kernel virtual mapping out of the linear mapping.
> > 
> > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > ---
> >  arch/arm64/include/asm/memory.h | 10 ++++++++--
> >  arch/arm64/kernel/head.S        |  2 +-
> >  arch/arm64/kernel/vmlinux.lds.S |  4 ++--
> >  3 files changed, 11 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> > index 853953cd1f08..bea9631b34a8 100644
> > --- a/arch/arm64/include/asm/memory.h
> > +++ b/arch/arm64/include/asm/memory.h
> > @@ -51,7 +51,8 @@
> >  #define VA_BITS			(CONFIG_ARM64_VA_BITS)
> >  #define VA_START		(UL(0xffffffffffffffff) << VA_BITS)
> >  #define PAGE_OFFSET		(UL(0xffffffffffffffff) << (VA_BITS - 1))
> > -#define MODULES_END		(PAGE_OFFSET)
> > +#define KIMAGE_VADDR		(PAGE_OFFSET)
> > +#define MODULES_END		(KIMAGE_VADDR)
> >  #define MODULES_VADDR		(MODULES_END - SZ_64M)
> >  #define PCI_IO_END		(MODULES_VADDR - SZ_2M)
> >  #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
> > @@ -75,8 +76,13 @@
> >   * private definitions which should NOT be used outside memory.h
> >   * files.  Use virt_to_phys/phys_to_virt/__pa/__va instead.
> >   */
> > -#define __virt_to_phys(x)	(((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
> > +#define __virt_to_phys(x) ({						\
> > +	phys_addr_t __x = (phys_addr_t)(x);				\
> > +	__x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) :	\
> > +			     (__x - KIMAGE_VADDR + PHYS_OFFSET); })
> 
> so __virt_to_phys will now work with a subset of the non-linear mappings,
> namely all except vmalloc'ed and ioremapped ones?

It will work for linear mapped memory and for the kernel image, which is
what it used to do. It's just that the relationship between the image
and the linear map is broken.

The same rules apply to x86, where their virt_to_phys eventually boils down to:

static inline unsigned long __phys_addr_nodebug(unsigned long x)
{
        unsigned long y = x - __START_KERNEL_map;

        /* use the carry flag to determine if x was < __START_KERNEL_map */
        x = y + ((x > y) ? phys_base : (__START_KERNEL_map - PAGE_OFFSET));

        return x;
}

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 156+ messages in thread
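
The distinction drawn above is easier to see with concrete numbers: the new
__virt_to_phys() simply applies a different offset depending on which of the
two regions the virtual address falls in, much like the x86 helper quoted
above. Below is a small user-space sketch of the same logic, reusing the
macro from the patch; the values chosen for PAGE_OFFSET, KIMAGE_VADDR and
PHYS_OFFSET are entirely hypothetical and only serve to show both branches
being taken.

/* Sketch only: the two-range __virt_to_phys() with made-up constants. */
#include <stdio.h>

typedef unsigned long long phys_addr_t;

#define PAGE_OFFSET	0xffffffc000000000ULL	/* linear map base (hypothetical) */
#define KIMAGE_VADDR	0xffffff8000000000ULL	/* kernel image base (hypothetical) */
#define PHYS_OFFSET	0x0000000080000000ULL	/* physical RAM base (hypothetical) */

#define __virt_to_phys(x) ({						\
	phys_addr_t __x = (phys_addr_t)(x);				\
	__x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) :	\
			     (__x - KIMAGE_VADDR + PHYS_OFFSET); })

int main(void)
{
	/* One address inside the linear map, one inside the kernel image. */
	unsigned long long linear = PAGE_OFFSET + 0x1000;
	unsigned long long image  = KIMAGE_VADDR + 0x200000;

	printf("linear 0x%llx -> phys 0x%llx\n", linear, __virt_to_phys(linear));
	printf("image  0x%llx -> phys 0x%llx\n", image,  __virt_to_phys(image));
	return 0;
}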

* Re: [PATCH v2 05/13] arm64: kvm: deal with kernel symbols outside of linear mapping
  2016-01-05 14:41     ` Christoffer Dall
@ 2016-01-05 14:51       ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-05 14:51 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, Kees Cook, linux-kernel,
	Stuart Yoder, Sharma Bhupesh, Arnd Bergmann, Marc Zyngier

On 5 January 2016 at 15:41, Christoffer Dall
<christoffer.dall@linaro.org> wrote:
> On Wed, Dec 30, 2015 at 04:26:04PM +0100, Ard Biesheuvel wrote:
>> KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
>> the HYP mapping at EL2. Before we can move the kernel virtual mapping
>> out of the linear mapping, we have to make sure that references to kernel
>> symbols that are accessed via the HYP mapping are translated to their
>> linear equivalent.
>>
>> To prevent inadvertent direct references from sneaking in later, change
>> the type of all extern declarations to HYP kernel symbols to the opaque
>> 'struct kvm_ksym', which does not decay to a pointer type like char arrays
>> and function references. This is not bullet proof, but at least forces the
>> user to take the address explicitly rather than referencing it directly.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  arch/arm/include/asm/kvm_asm.h   |  2 ++
>>  arch/arm/include/asm/kvm_mmu.h   |  2 ++
>>  arch/arm/kvm/arm.c               |  9 +++++----
>>  arch/arm/kvm/mmu.c               | 12 +++++------
>>  arch/arm64/include/asm/kvm_asm.h | 21 +++++++++++---------
>>  arch/arm64/include/asm/kvm_mmu.h |  2 ++
>>  arch/arm64/include/asm/virt.h    |  4 ----
>>  arch/arm64/kernel/vmlinux.lds.S  |  4 ++--
>>  arch/arm64/kvm/debug.c           |  4 +++-
>>  virt/kvm/arm/vgic-v3.c           |  2 +-
>>  10 files changed, 34 insertions(+), 28 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
>> index 194c91b610ff..484ffdf7c70b 100644
>> --- a/arch/arm/include/asm/kvm_asm.h
>> +++ b/arch/arm/include/asm/kvm_asm.h
>> @@ -99,6 +99,8 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>>  extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>>
>>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>> +
>> +extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>>  #endif
>>
>>  #endif /* __ARM_KVM_ASM_H__ */
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> index 405aa1883307..412b363f79e9 100644
>> --- a/arch/arm/include/asm/kvm_mmu.h
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -30,6 +30,8 @@
>>  #define HYP_PAGE_OFFSET              PAGE_OFFSET
>>  #define KERN_TO_HYP(kva)     (kva)
>>
>> +#define kvm_ksym_ref(kva)    (kva)
>> +
>>  /*
>>   * Our virtual mapping for the boot-time MMU-enable code. Must be
>>   * shared across all the page-tables. Conveniently, we use the vectors
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index e06fd299de08..014b542ea658 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -427,7 +427,7 @@ static void update_vttbr(struct kvm *kvm)
>>                * shareable domain to make sure all data structures are
>>                * clean.
>>                */
>> -             kvm_call_hyp(__kvm_flush_vm_context);
>> +             kvm_call_hyp(kvm_ksym_ref(__kvm_flush_vm_context));
>>       }
>>
>>       kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>> @@ -600,7 +600,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>               __kvm_guest_enter();
>>               vcpu->mode = IN_GUEST_MODE;
>>
>> -             ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
>> +             ret = kvm_call_hyp(kvm_ksym_ref(__kvm_vcpu_run), vcpu);
>>
>>               vcpu->mode = OUTSIDE_GUEST_MODE;
>>               /*
>> @@ -969,7 +969,7 @@ static void cpu_init_hyp_mode(void *dummy)
>>       pgd_ptr = kvm_mmu_get_httbr();
>>       stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
>>       hyp_stack_ptr = stack_page + PAGE_SIZE;
>> -     vector_ptr = (unsigned long)__kvm_hyp_vector;
>> +     vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
>>
>>       __cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
>>
>> @@ -1061,7 +1061,8 @@ static int init_hyp_mode(void)
>>       /*
>>        * Map the Hyp-code called directly from the host
>>        */
>> -     err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
>> +     err = create_hyp_mappings(kvm_ksym_ref(__kvm_hyp_code_start),
>> +                               kvm_ksym_ref(__kvm_hyp_code_end));
>>       if (err) {
>>               kvm_err("Cannot map world-switch code\n");
>>               goto out_free_mappings;
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 7dace909d5cf..7c448b943e3a 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -31,8 +31,6 @@
>>
>>  #include "trace.h"
>>
>> -extern char  __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>> -
>>  static pgd_t *boot_hyp_pgd;
>>  static pgd_t *hyp_pgd;
>>  static pgd_t *merged_hyp_pgd;
>> @@ -63,7 +61,7 @@ static bool memslot_is_logging(struct kvm_memory_slot *memslot)
>>   */
>>  void kvm_flush_remote_tlbs(struct kvm *kvm)
>>  {
>> -     kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>> +     kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid), kvm);
>>  }
>>
>>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>> @@ -75,7 +73,7 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>>        * anything there.
>>        */
>>       if (kvm)
>> -             kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
>> +             kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid_ipa), kvm, ipa);
>>  }
>>
>>  /*
>> @@ -1647,9 +1645,9 @@ int kvm_mmu_init(void)
>>  {
>>       int err;
>>
>> -     hyp_idmap_start = kvm_virt_to_phys(__hyp_idmap_text_start);
>> -     hyp_idmap_end = kvm_virt_to_phys(__hyp_idmap_text_end);
>> -     hyp_idmap_vector = kvm_virt_to_phys(__kvm_hyp_init);
>> +     hyp_idmap_start = kvm_virt_to_phys(&__hyp_idmap_text_start);
>> +     hyp_idmap_end = kvm_virt_to_phys(&__hyp_idmap_text_end);
>> +     hyp_idmap_vector = kvm_virt_to_phys(&__kvm_hyp_init);
>>
>>       /*
>>        * We rely on the linker script to ensure at build time that the HYP
>> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
>> index 5e377101f919..830402f847e0 100644
>> --- a/arch/arm64/include/asm/kvm_asm.h
>> +++ b/arch/arm64/include/asm/kvm_asm.h
>> @@ -105,24 +105,27 @@
>>  #ifndef __ASSEMBLY__
>>  struct kvm;
>>  struct kvm_vcpu;
>> +struct kvm_ksym;
>>
>>  extern char __kvm_hyp_init[];
>>  extern char __kvm_hyp_init_end[];
>>
>> -extern char __kvm_hyp_vector[];
>> +extern struct kvm_ksym __kvm_hyp_vector;
>>
>> -#define      __kvm_hyp_code_start    __hyp_text_start
>> -#define      __kvm_hyp_code_end      __hyp_text_end
>> +extern struct kvm_ksym __kvm_hyp_code_start;
>> +extern struct kvm_ksym __kvm_hyp_code_end;
>>
>> -extern void __kvm_flush_vm_context(void);
>> -extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>> -extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>> +extern struct kvm_ksym __kvm_flush_vm_context;
>> +extern struct kvm_ksym __kvm_tlb_flush_vmid_ipa;
>> +extern struct kvm_ksym __kvm_tlb_flush_vmid;
>>
>> -extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>> +extern struct kvm_ksym __kvm_vcpu_run;
>>
>> -extern u64 __vgic_v3_get_ich_vtr_el2(void);
>> +extern struct kvm_ksym __hyp_idmap_text_start, __hyp_idmap_text_end;
>>
>> -extern u32 __kvm_get_mdcr_el2(void);
>> +extern struct kvm_ksym __vgic_v3_get_ich_vtr_el2;
>> +
>> +extern struct kvm_ksym __kvm_get_mdcr_el2;
>>
>>  #endif
>>
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>> index 61505676d085..0899026a2821 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -73,6 +73,8 @@
>>
>>  #define KERN_TO_HYP(kva)     ((unsigned long)kva - PAGE_OFFSET + HYP_PAGE_OFFSET)
>>
>> +#define kvm_ksym_ref(sym)    ((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
>> +
>>  /*
>>   * We currently only support a 40bit IPA.
>>   */
>> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
>> index 7a5df5252dd7..215ad4649dd7 100644
>> --- a/arch/arm64/include/asm/virt.h
>> +++ b/arch/arm64/include/asm/virt.h
>> @@ -50,10 +50,6 @@ static inline bool is_hyp_mode_mismatched(void)
>>       return __boot_cpu_mode[0] != __boot_cpu_mode[1];
>>  }
>>
>> -/* The section containing the hypervisor text */
>> -extern char __hyp_text_start[];
>> -extern char __hyp_text_end[];
>> -
>>  #endif /* __ASSEMBLY__ */
>>
>>  #endif /* ! __ASM__VIRT_H */
>> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
>> index 363c2f529951..f935f082188d 100644
>> --- a/arch/arm64/kernel/vmlinux.lds.S
>> +++ b/arch/arm64/kernel/vmlinux.lds.S
>> @@ -35,9 +35,9 @@ jiffies = jiffies_64;
>>       VMLINUX_SYMBOL(__hyp_idmap_text_start) = .;     \
>>       *(.hyp.idmap.text)                              \
>>       VMLINUX_SYMBOL(__hyp_idmap_text_end) = .;       \
>> -     VMLINUX_SYMBOL(__hyp_text_start) = .;           \
>> +     VMLINUX_SYMBOL(__kvm_hyp_code_start) = .;       \
>>       *(.hyp.text)                                    \
>> -     VMLINUX_SYMBOL(__hyp_text_end) = .;
>> +     VMLINUX_SYMBOL(__kvm_hyp_code_end) = .;
>
> why this rename?
>

I already got rid of it based on Marc's feedback. The only reason for the
rename was to align the symbol names between ARM and arm64, but he is
already doing the same in the opposite direction.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [PATCH v2 05/13] arm64: kvm: deal with kernel symbols outside of linear mapping
@ 2016-01-05 14:51       ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-05 14:51 UTC (permalink / raw)
  To: linux-arm-kernel

On 5 January 2016 at 15:41, Christoffer Dall
<christoffer.dall@linaro.org> wrote:
> On Wed, Dec 30, 2015 at 04:26:04PM +0100, Ard Biesheuvel wrote:
>> KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
>> the HYP mapping at EL2. Before we can move the kernel virtual mapping
>> out of the linear mapping, we have to make sure that references to kernel
>> symbols that are accessed via the HYP mapping are translated to their
>> linear equivalent.
>>
>> To prevent inadvertent direct references from sneaking in later, change
>> the type of all extern declarations to HYP kernel symbols to the opaque
>> 'struct kvm_ksym', which does not decay to a pointer type like char arrays
>> and function references. This is not bullet proof, but at least forces the
>> user to take the address explicitly rather than referencing it directly.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  arch/arm/include/asm/kvm_asm.h   |  2 ++
>>  arch/arm/include/asm/kvm_mmu.h   |  2 ++
>>  arch/arm/kvm/arm.c               |  9 +++++----
>>  arch/arm/kvm/mmu.c               | 12 +++++------
>>  arch/arm64/include/asm/kvm_asm.h | 21 +++++++++++---------
>>  arch/arm64/include/asm/kvm_mmu.h |  2 ++
>>  arch/arm64/include/asm/virt.h    |  4 ----
>>  arch/arm64/kernel/vmlinux.lds.S  |  4 ++--
>>  arch/arm64/kvm/debug.c           |  4 +++-
>>  virt/kvm/arm/vgic-v3.c           |  2 +-
>>  10 files changed, 34 insertions(+), 28 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
>> index 194c91b610ff..484ffdf7c70b 100644
>> --- a/arch/arm/include/asm/kvm_asm.h
>> +++ b/arch/arm/include/asm/kvm_asm.h
>> @@ -99,6 +99,8 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>>  extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>>
>>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>> +
>> +extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>>  #endif
>>
>>  #endif /* __ARM_KVM_ASM_H__ */
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> index 405aa1883307..412b363f79e9 100644
>> --- a/arch/arm/include/asm/kvm_mmu.h
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -30,6 +30,8 @@
>>  #define HYP_PAGE_OFFSET              PAGE_OFFSET
>>  #define KERN_TO_HYP(kva)     (kva)
>>
>> +#define kvm_ksym_ref(kva)    (kva)
>> +
>>  /*
>>   * Our virtual mapping for the boot-time MMU-enable code. Must be
>>   * shared across all the page-tables. Conveniently, we use the vectors
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index e06fd299de08..014b542ea658 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -427,7 +427,7 @@ static void update_vttbr(struct kvm *kvm)
>>                * shareable domain to make sure all data structures are
>>                * clean.
>>                */
>> -             kvm_call_hyp(__kvm_flush_vm_context);
>> +             kvm_call_hyp(kvm_ksym_ref(__kvm_flush_vm_context));
>>       }
>>
>>       kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>> @@ -600,7 +600,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>               __kvm_guest_enter();
>>               vcpu->mode = IN_GUEST_MODE;
>>
>> -             ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
>> +             ret = kvm_call_hyp(kvm_ksym_ref(__kvm_vcpu_run), vcpu);
>>
>>               vcpu->mode = OUTSIDE_GUEST_MODE;
>>               /*
>> @@ -969,7 +969,7 @@ static void cpu_init_hyp_mode(void *dummy)
>>       pgd_ptr = kvm_mmu_get_httbr();
>>       stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
>>       hyp_stack_ptr = stack_page + PAGE_SIZE;
>> -     vector_ptr = (unsigned long)__kvm_hyp_vector;
>> +     vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
>>
>>       __cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
>>
>> @@ -1061,7 +1061,8 @@ static int init_hyp_mode(void)
>>       /*
>>        * Map the Hyp-code called directly from the host
>>        */
>> -     err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
>> +     err = create_hyp_mappings(kvm_ksym_ref(__kvm_hyp_code_start),
>> +                               kvm_ksym_ref(__kvm_hyp_code_end));
>>       if (err) {
>>               kvm_err("Cannot map world-switch code\n");
>>               goto out_free_mappings;
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 7dace909d5cf..7c448b943e3a 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -31,8 +31,6 @@
>>
>>  #include "trace.h"
>>
>> -extern char  __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>> -
>>  static pgd_t *boot_hyp_pgd;
>>  static pgd_t *hyp_pgd;
>>  static pgd_t *merged_hyp_pgd;
>> @@ -63,7 +61,7 @@ static bool memslot_is_logging(struct kvm_memory_slot *memslot)
>>   */
>>  void kvm_flush_remote_tlbs(struct kvm *kvm)
>>  {
>> -     kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>> +     kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid), kvm);
>>  }
>>
>>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>> @@ -75,7 +73,7 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>>        * anything there.
>>        */
>>       if (kvm)
>> -             kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
>> +             kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid_ipa), kvm, ipa);
>>  }
>>
>>  /*
>> @@ -1647,9 +1645,9 @@ int kvm_mmu_init(void)
>>  {
>>       int err;
>>
>> -     hyp_idmap_start = kvm_virt_to_phys(__hyp_idmap_text_start);
>> -     hyp_idmap_end = kvm_virt_to_phys(__hyp_idmap_text_end);
>> -     hyp_idmap_vector = kvm_virt_to_phys(__kvm_hyp_init);
>> +     hyp_idmap_start = kvm_virt_to_phys(&__hyp_idmap_text_start);
>> +     hyp_idmap_end = kvm_virt_to_phys(&__hyp_idmap_text_end);
>> +     hyp_idmap_vector = kvm_virt_to_phys(&__kvm_hyp_init);
>>
>>       /*
>>        * We rely on the linker script to ensure at build time that the HYP
>> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
>> index 5e377101f919..830402f847e0 100644
>> --- a/arch/arm64/include/asm/kvm_asm.h
>> +++ b/arch/arm64/include/asm/kvm_asm.h
>> @@ -105,24 +105,27 @@
>>  #ifndef __ASSEMBLY__
>>  struct kvm;
>>  struct kvm_vcpu;
>> +struct kvm_ksym;
>>
>>  extern char __kvm_hyp_init[];
>>  extern char __kvm_hyp_init_end[];
>>
>> -extern char __kvm_hyp_vector[];
>> +extern struct kvm_ksym __kvm_hyp_vector;
>>
>> -#define      __kvm_hyp_code_start    __hyp_text_start
>> -#define      __kvm_hyp_code_end      __hyp_text_end
>> +extern struct kvm_ksym __kvm_hyp_code_start;
>> +extern struct kvm_ksym __kvm_hyp_code_end;
>>
>> -extern void __kvm_flush_vm_context(void);
>> -extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>> -extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>> +extern struct kvm_ksym __kvm_flush_vm_context;
>> +extern struct kvm_ksym __kvm_tlb_flush_vmid_ipa;
>> +extern struct kvm_ksym __kvm_tlb_flush_vmid;
>>
>> -extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>> +extern struct kvm_ksym __kvm_vcpu_run;
>>
>> -extern u64 __vgic_v3_get_ich_vtr_el2(void);
>> +extern struct kvm_ksym __hyp_idmap_text_start, __hyp_idmap_text_end;
>>
>> -extern u32 __kvm_get_mdcr_el2(void);
>> +extern struct kvm_ksym __vgic_v3_get_ich_vtr_el2;
>> +
>> +extern struct kvm_ksym __kvm_get_mdcr_el2;
>>
>>  #endif
>>
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>> index 61505676d085..0899026a2821 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -73,6 +73,8 @@
>>
>>  #define KERN_TO_HYP(kva)     ((unsigned long)kva - PAGE_OFFSET + HYP_PAGE_OFFSET)
>>
>> +#define kvm_ksym_ref(sym)    ((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
>> +
>>  /*
>>   * We currently only support a 40bit IPA.
>>   */
>> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
>> index 7a5df5252dd7..215ad4649dd7 100644
>> --- a/arch/arm64/include/asm/virt.h
>> +++ b/arch/arm64/include/asm/virt.h
>> @@ -50,10 +50,6 @@ static inline bool is_hyp_mode_mismatched(void)
>>       return __boot_cpu_mode[0] != __boot_cpu_mode[1];
>>  }
>>
>> -/* The section containing the hypervisor text */
>> -extern char __hyp_text_start[];
>> -extern char __hyp_text_end[];
>> -
>>  #endif /* __ASSEMBLY__ */
>>
>>  #endif /* ! __ASM__VIRT_H */
>> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
>> index 363c2f529951..f935f082188d 100644
>> --- a/arch/arm64/kernel/vmlinux.lds.S
>> +++ b/arch/arm64/kernel/vmlinux.lds.S
>> @@ -35,9 +35,9 @@ jiffies = jiffies_64;
>>       VMLINUX_SYMBOL(__hyp_idmap_text_start) = .;     \
>>       *(.hyp.idmap.text)                              \
>>       VMLINUX_SYMBOL(__hyp_idmap_text_end) = .;       \
>> -     VMLINUX_SYMBOL(__hyp_text_start) = .;           \
>> +     VMLINUX_SYMBOL(__kvm_hyp_code_start) = .;       \
>>       *(.hyp.text)                                    \
>> -     VMLINUX_SYMBOL(__hyp_text_end) = .;
>> +     VMLINUX_SYMBOL(__kvm_hyp_code_end) = .;
>
> why this rename?
>

I already got rid of it based on Marc's feedback. The only reason was
to align between ARM and arm64, but he is already doing the same in
the opposite direction

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [kernel-hardening] Re: [PATCH v2 05/13] arm64: kvm: deal with kernel symbols outside of linear mapping
@ 2016-01-05 14:51       ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-05 14:51 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, Kees Cook, linux-kernel,
	Stuart Yoder, Sharma Bhupesh, Arnd Bergmann, Marc Zyngier

On 5 January 2016 at 15:41, Christoffer Dall
<christoffer.dall@linaro.org> wrote:
> On Wed, Dec 30, 2015 at 04:26:04PM +0100, Ard Biesheuvel wrote:
>> KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
>> the HYP mapping at EL2. Before we can move the kernel virtual mapping
>> out of the linear mapping, we have to make sure that references to kernel
>> symbols that are accessed via the HYP mapping are translated to their
>> linear equivalent.
>>
>> To prevent inadvertent direct references from sneaking in later, change
>> the type of all extern declarations to HYP kernel symbols to the opaque
>> 'struct kvm_ksym', which does not decay to a pointer type like char arrays
>> and function references. This is not bullet proof, but at least forces the
>> user to take the address explicitly rather than referencing it directly.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  arch/arm/include/asm/kvm_asm.h   |  2 ++
>>  arch/arm/include/asm/kvm_mmu.h   |  2 ++
>>  arch/arm/kvm/arm.c               |  9 +++++----
>>  arch/arm/kvm/mmu.c               | 12 +++++------
>>  arch/arm64/include/asm/kvm_asm.h | 21 +++++++++++---------
>>  arch/arm64/include/asm/kvm_mmu.h |  2 ++
>>  arch/arm64/include/asm/virt.h    |  4 ----
>>  arch/arm64/kernel/vmlinux.lds.S  |  4 ++--
>>  arch/arm64/kvm/debug.c           |  4 +++-
>>  virt/kvm/arm/vgic-v3.c           |  2 +-
>>  10 files changed, 34 insertions(+), 28 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
>> index 194c91b610ff..484ffdf7c70b 100644
>> --- a/arch/arm/include/asm/kvm_asm.h
>> +++ b/arch/arm/include/asm/kvm_asm.h
>> @@ -99,6 +99,8 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>>  extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>>
>>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>> +
>> +extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>>  #endif
>>
>>  #endif /* __ARM_KVM_ASM_H__ */
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> index 405aa1883307..412b363f79e9 100644
>> --- a/arch/arm/include/asm/kvm_mmu.h
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -30,6 +30,8 @@
>>  #define HYP_PAGE_OFFSET              PAGE_OFFSET
>>  #define KERN_TO_HYP(kva)     (kva)
>>
>> +#define kvm_ksym_ref(kva)    (kva)
>> +
>>  /*
>>   * Our virtual mapping for the boot-time MMU-enable code. Must be
>>   * shared across all the page-tables. Conveniently, we use the vectors
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index e06fd299de08..014b542ea658 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -427,7 +427,7 @@ static void update_vttbr(struct kvm *kvm)
>>                * shareable domain to make sure all data structures are
>>                * clean.
>>                */
>> -             kvm_call_hyp(__kvm_flush_vm_context);
>> +             kvm_call_hyp(kvm_ksym_ref(__kvm_flush_vm_context));
>>       }
>>
>>       kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>> @@ -600,7 +600,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>               __kvm_guest_enter();
>>               vcpu->mode = IN_GUEST_MODE;
>>
>> -             ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
>> +             ret = kvm_call_hyp(kvm_ksym_ref(__kvm_vcpu_run), vcpu);
>>
>>               vcpu->mode = OUTSIDE_GUEST_MODE;
>>               /*
>> @@ -969,7 +969,7 @@ static void cpu_init_hyp_mode(void *dummy)
>>       pgd_ptr = kvm_mmu_get_httbr();
>>       stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
>>       hyp_stack_ptr = stack_page + PAGE_SIZE;
>> -     vector_ptr = (unsigned long)__kvm_hyp_vector;
>> +     vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
>>
>>       __cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
>>
>> @@ -1061,7 +1061,8 @@ static int init_hyp_mode(void)
>>       /*
>>        * Map the Hyp-code called directly from the host
>>        */
>> -     err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
>> +     err = create_hyp_mappings(kvm_ksym_ref(__kvm_hyp_code_start),
>> +                               kvm_ksym_ref(__kvm_hyp_code_end));
>>       if (err) {
>>               kvm_err("Cannot map world-switch code\n");
>>               goto out_free_mappings;
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 7dace909d5cf..7c448b943e3a 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -31,8 +31,6 @@
>>
>>  #include "trace.h"
>>
>> -extern char  __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>> -
>>  static pgd_t *boot_hyp_pgd;
>>  static pgd_t *hyp_pgd;
>>  static pgd_t *merged_hyp_pgd;
>> @@ -63,7 +61,7 @@ static bool memslot_is_logging(struct kvm_memory_slot *memslot)
>>   */
>>  void kvm_flush_remote_tlbs(struct kvm *kvm)
>>  {
>> -     kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>> +     kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid), kvm);
>>  }
>>
>>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>> @@ -75,7 +73,7 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>>        * anything there.
>>        */
>>       if (kvm)
>> -             kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
>> +             kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid_ipa), kvm, ipa);
>>  }
>>
>>  /*
>> @@ -1647,9 +1645,9 @@ int kvm_mmu_init(void)
>>  {
>>       int err;
>>
>> -     hyp_idmap_start = kvm_virt_to_phys(__hyp_idmap_text_start);
>> -     hyp_idmap_end = kvm_virt_to_phys(__hyp_idmap_text_end);
>> -     hyp_idmap_vector = kvm_virt_to_phys(__kvm_hyp_init);
>> +     hyp_idmap_start = kvm_virt_to_phys(&__hyp_idmap_text_start);
>> +     hyp_idmap_end = kvm_virt_to_phys(&__hyp_idmap_text_end);
>> +     hyp_idmap_vector = kvm_virt_to_phys(&__kvm_hyp_init);
>>
>>       /*
>>        * We rely on the linker script to ensure at build time that the HYP
>> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
>> index 5e377101f919..830402f847e0 100644
>> --- a/arch/arm64/include/asm/kvm_asm.h
>> +++ b/arch/arm64/include/asm/kvm_asm.h
>> @@ -105,24 +105,27 @@
>>  #ifndef __ASSEMBLY__
>>  struct kvm;
>>  struct kvm_vcpu;
>> +struct kvm_ksym;
>>
>>  extern char __kvm_hyp_init[];
>>  extern char __kvm_hyp_init_end[];
>>
>> -extern char __kvm_hyp_vector[];
>> +extern struct kvm_ksym __kvm_hyp_vector;
>>
>> -#define      __kvm_hyp_code_start    __hyp_text_start
>> -#define      __kvm_hyp_code_end      __hyp_text_end
>> +extern struct kvm_ksym __kvm_hyp_code_start;
>> +extern struct kvm_ksym __kvm_hyp_code_end;
>>
>> -extern void __kvm_flush_vm_context(void);
>> -extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>> -extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>> +extern struct kvm_ksym __kvm_flush_vm_context;
>> +extern struct kvm_ksym __kvm_tlb_flush_vmid_ipa;
>> +extern struct kvm_ksym __kvm_tlb_flush_vmid;
>>
>> -extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>> +extern struct kvm_ksym __kvm_vcpu_run;
>>
>> -extern u64 __vgic_v3_get_ich_vtr_el2(void);
>> +extern struct kvm_ksym __hyp_idmap_text_start, __hyp_idmap_text_end;
>>
>> -extern u32 __kvm_get_mdcr_el2(void);
>> +extern struct kvm_ksym __vgic_v3_get_ich_vtr_el2;
>> +
>> +extern struct kvm_ksym __kvm_get_mdcr_el2;
>>
>>  #endif
>>
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>> index 61505676d085..0899026a2821 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -73,6 +73,8 @@
>>
>>  #define KERN_TO_HYP(kva)     ((unsigned long)kva - PAGE_OFFSET + HYP_PAGE_OFFSET)
>>
>> +#define kvm_ksym_ref(sym)    ((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
>> +
>>  /*
>>   * We currently only support a 40bit IPA.
>>   */
>> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
>> index 7a5df5252dd7..215ad4649dd7 100644
>> --- a/arch/arm64/include/asm/virt.h
>> +++ b/arch/arm64/include/asm/virt.h
>> @@ -50,10 +50,6 @@ static inline bool is_hyp_mode_mismatched(void)
>>       return __boot_cpu_mode[0] != __boot_cpu_mode[1];
>>  }
>>
>> -/* The section containing the hypervisor text */
>> -extern char __hyp_text_start[];
>> -extern char __hyp_text_end[];
>> -
>>  #endif /* __ASSEMBLY__ */
>>
>>  #endif /* ! __ASM__VIRT_H */
>> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
>> index 363c2f529951..f935f082188d 100644
>> --- a/arch/arm64/kernel/vmlinux.lds.S
>> +++ b/arch/arm64/kernel/vmlinux.lds.S
>> @@ -35,9 +35,9 @@ jiffies = jiffies_64;
>>       VMLINUX_SYMBOL(__hyp_idmap_text_start) = .;     \
>>       *(.hyp.idmap.text)                              \
>>       VMLINUX_SYMBOL(__hyp_idmap_text_end) = .;       \
>> -     VMLINUX_SYMBOL(__hyp_text_start) = .;           \
>> +     VMLINUX_SYMBOL(__kvm_hyp_code_start) = .;       \
>>       *(.hyp.text)                                    \
>> -     VMLINUX_SYMBOL(__hyp_text_end) = .;
>> +     VMLINUX_SYMBOL(__kvm_hyp_code_end) = .;
>
> why this rename?
>

I already got rid of it based on Marc's feedback. The only reason was
to align between ARM and arm64, but he is already doing the same in
the opposite direction

^ permalink raw reply	[flat|nested] 156+ messages in thread
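
For reference, a minimal C sketch of the opaque-symbol trick described in the
patch quoted above: an incomplete 'struct kvm_ksym' cannot decay to a pointer,
so callers must take the address explicitly and route it through
kvm_ksym_ref(). The KIMAGE_VADDR/PAGE_OFFSET values below are placeholders and
not the real arm64 layout, the helper name is illustrative only, and the
pointer arithmetic on void * assumes GNU C as in the kernel.

/* Old style: a char array decays to a pointer and can be used directly. */
extern char __kvm_hyp_vector_old[];

/* New style: an incomplete type; using the symbol by value will not
 * compile, so its address must be taken explicitly with '&'. */
struct kvm_ksym;
extern struct kvm_ksym __kvm_hyp_vector;

#define KIMAGE_VADDR	0xffffff8000000000UL	/* placeholder value */
#define PAGE_OFFSET	0xffffffc000000000UL	/* placeholder value */

/* Translate the image-relative symbol address to its linear-map alias,
 * which is what the HYP mapping is based on. */
#define kvm_ksym_ref(sym)	((void *)&(sym) - KIMAGE_VADDR + PAGE_OFFSET)

static void *hyp_vector_ptr(void)
{
	return kvm_ksym_ref(__kvm_hyp_vector);	/* &__kvm_hyp_vector, translated */
}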

* Re: [PATCH v2 05/13] arm64: kvm: deal with kernel symbols outside of linear mapping
  2016-01-05 14:51       ` Ard Biesheuvel
  (?)
@ 2016-01-05 14:56         ` Christoffer Dall
  -1 siblings, 0 replies; 156+ messages in thread
From: Christoffer Dall @ 2016-01-05 14:56 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, Kees Cook, linux-kernel,
	Stuart Yoder, Sharma Bhupesh, Arnd Bergmann, Marc Zyngier

On Tue, Jan 05, 2016 at 03:51:58PM +0100, Ard Biesheuvel wrote:
> On 5 January 2016 at 15:41, Christoffer Dall
> <christoffer.dall@linaro.org> wrote:
> > On Wed, Dec 30, 2015 at 04:26:04PM +0100, Ard Biesheuvel wrote:
> >> KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
> >> the HYP mapping at EL2. Before we can move the kernel virtual mapping
> >> out of the linear mapping, we have to make sure that references to kernel
> >> symbols that are accessed via the HYP mapping are translated to their
> >> linear equivalent.
> >>
> >> To prevent inadvertent direct references from sneaking in later, change
> >> the type of all extern declarations to HYP kernel symbols to the opaque
> >> 'struct kvm_ksym', which does not decay to a pointer type like char arrays
> >> and function references. This is not bullet proof, but at least forces the
> >> user to take the address explicitly rather than referencing it directly.
> >>
> >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> >> ---
> >>  arch/arm/include/asm/kvm_asm.h   |  2 ++
> >>  arch/arm/include/asm/kvm_mmu.h   |  2 ++
> >>  arch/arm/kvm/arm.c               |  9 +++++----
> >>  arch/arm/kvm/mmu.c               | 12 +++++------
> >>  arch/arm64/include/asm/kvm_asm.h | 21 +++++++++++---------
> >>  arch/arm64/include/asm/kvm_mmu.h |  2 ++
> >>  arch/arm64/include/asm/virt.h    |  4 ----
> >>  arch/arm64/kernel/vmlinux.lds.S  |  4 ++--
> >>  arch/arm64/kvm/debug.c           |  4 +++-
> >>  virt/kvm/arm/vgic-v3.c           |  2 +-
> >>  10 files changed, 34 insertions(+), 28 deletions(-)
> >>
> >> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
> >> index 194c91b610ff..484ffdf7c70b 100644
> >> --- a/arch/arm/include/asm/kvm_asm.h
> >> +++ b/arch/arm/include/asm/kvm_asm.h
> >> @@ -99,6 +99,8 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
> >>  extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
> >>
> >>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
> >> +
> >> +extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
> >>  #endif
> >>
> >>  #endif /* __ARM_KVM_ASM_H__ */
> >> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> >> index 405aa1883307..412b363f79e9 100644
> >> --- a/arch/arm/include/asm/kvm_mmu.h
> >> +++ b/arch/arm/include/asm/kvm_mmu.h
> >> @@ -30,6 +30,8 @@
> >>  #define HYP_PAGE_OFFSET              PAGE_OFFSET
> >>  #define KERN_TO_HYP(kva)     (kva)
> >>
> >> +#define kvm_ksym_ref(kva)    (kva)
> >> +
> >>  /*
> >>   * Our virtual mapping for the boot-time MMU-enable code. Must be
> >>   * shared across all the page-tables. Conveniently, we use the vectors
> >> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> >> index e06fd299de08..014b542ea658 100644
> >> --- a/arch/arm/kvm/arm.c
> >> +++ b/arch/arm/kvm/arm.c
> >> @@ -427,7 +427,7 @@ static void update_vttbr(struct kvm *kvm)
> >>                * shareable domain to make sure all data structures are
> >>                * clean.
> >>                */
> >> -             kvm_call_hyp(__kvm_flush_vm_context);
> >> +             kvm_call_hyp(kvm_ksym_ref(__kvm_flush_vm_context));
> >>       }
> >>
> >>       kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >> @@ -600,7 +600,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >>               __kvm_guest_enter();
> >>               vcpu->mode = IN_GUEST_MODE;
> >>
> >> -             ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
> >> +             ret = kvm_call_hyp(kvm_ksym_ref(__kvm_vcpu_run), vcpu);
> >>
> >>               vcpu->mode = OUTSIDE_GUEST_MODE;
> >>               /*
> >> @@ -969,7 +969,7 @@ static void cpu_init_hyp_mode(void *dummy)
> >>       pgd_ptr = kvm_mmu_get_httbr();
> >>       stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
> >>       hyp_stack_ptr = stack_page + PAGE_SIZE;
> >> -     vector_ptr = (unsigned long)__kvm_hyp_vector;
> >> +     vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
> >>
> >>       __cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
> >>
> >> @@ -1061,7 +1061,8 @@ static int init_hyp_mode(void)
> >>       /*
> >>        * Map the Hyp-code called directly from the host
> >>        */
> >> -     err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
> >> +     err = create_hyp_mappings(kvm_ksym_ref(__kvm_hyp_code_start),
> >> +                               kvm_ksym_ref(__kvm_hyp_code_end));
> >>       if (err) {
> >>               kvm_err("Cannot map world-switch code\n");
> >>               goto out_free_mappings;
> >> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> >> index 7dace909d5cf..7c448b943e3a 100644
> >> --- a/arch/arm/kvm/mmu.c
> >> +++ b/arch/arm/kvm/mmu.c
> >> @@ -31,8 +31,6 @@
> >>
> >>  #include "trace.h"
> >>
> >> -extern char  __hyp_idmap_text_start[], __hyp_idmap_text_end[];
> >> -
> >>  static pgd_t *boot_hyp_pgd;
> >>  static pgd_t *hyp_pgd;
> >>  static pgd_t *merged_hyp_pgd;
> >> @@ -63,7 +61,7 @@ static bool memslot_is_logging(struct kvm_memory_slot *memslot)
> >>   */
> >>  void kvm_flush_remote_tlbs(struct kvm *kvm)
> >>  {
> >> -     kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
> >> +     kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid), kvm);
> >>  }
> >>
> >>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
> >> @@ -75,7 +73,7 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
> >>        * anything there.
> >>        */
> >>       if (kvm)
> >> -             kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
> >> +             kvm_call_hyp(kvm_ksym_ref(__kvm_tlb_flush_vmid_ipa), kvm, ipa);
> >>  }
> >>
> >>  /*
> >> @@ -1647,9 +1645,9 @@ int kvm_mmu_init(void)
> >>  {
> >>       int err;
> >>
> >> -     hyp_idmap_start = kvm_virt_to_phys(__hyp_idmap_text_start);
> >> -     hyp_idmap_end = kvm_virt_to_phys(__hyp_idmap_text_end);
> >> -     hyp_idmap_vector = kvm_virt_to_phys(__kvm_hyp_init);
> >> +     hyp_idmap_start = kvm_virt_to_phys(&__hyp_idmap_text_start);
> >> +     hyp_idmap_end = kvm_virt_to_phys(&__hyp_idmap_text_end);
> >> +     hyp_idmap_vector = kvm_virt_to_phys(&__kvm_hyp_init);
> >>
> >>       /*
> >>        * We rely on the linker script to ensure at build time that the HYP
> >> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> >> index 5e377101f919..830402f847e0 100644
> >> --- a/arch/arm64/include/asm/kvm_asm.h
> >> +++ b/arch/arm64/include/asm/kvm_asm.h
> >> @@ -105,24 +105,27 @@
> >>  #ifndef __ASSEMBLY__
> >>  struct kvm;
> >>  struct kvm_vcpu;
> >> +struct kvm_ksym;
> >>
> >>  extern char __kvm_hyp_init[];
> >>  extern char __kvm_hyp_init_end[];
> >>
> >> -extern char __kvm_hyp_vector[];
> >> +extern struct kvm_ksym __kvm_hyp_vector;
> >>
> >> -#define      __kvm_hyp_code_start    __hyp_text_start
> >> -#define      __kvm_hyp_code_end      __hyp_text_end
> >> +extern struct kvm_ksym __kvm_hyp_code_start;
> >> +extern struct kvm_ksym __kvm_hyp_code_end;
> >>
> >> -extern void __kvm_flush_vm_context(void);
> >> -extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
> >> -extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
> >> +extern struct kvm_ksym __kvm_flush_vm_context;
> >> +extern struct kvm_ksym __kvm_tlb_flush_vmid_ipa;
> >> +extern struct kvm_ksym __kvm_tlb_flush_vmid;
> >>
> >> -extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
> >> +extern struct kvm_ksym __kvm_vcpu_run;
> >>
> >> -extern u64 __vgic_v3_get_ich_vtr_el2(void);
> >> +extern struct kvm_ksym __hyp_idmap_text_start, __hyp_idmap_text_end;
> >>
> >> -extern u32 __kvm_get_mdcr_el2(void);
> >> +extern struct kvm_ksym __vgic_v3_get_ich_vtr_el2;
> >> +
> >> +extern struct kvm_ksym __kvm_get_mdcr_el2;
> >>
> >>  #endif
> >>
> >> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> >> index 61505676d085..0899026a2821 100644
> >> --- a/arch/arm64/include/asm/kvm_mmu.h
> >> +++ b/arch/arm64/include/asm/kvm_mmu.h
> >> @@ -73,6 +73,8 @@
> >>
> >>  #define KERN_TO_HYP(kva)     ((unsigned long)kva - PAGE_OFFSET + HYP_PAGE_OFFSET)
> >>
> >> +#define kvm_ksym_ref(sym)    ((void *)&sym - KIMAGE_VADDR + PAGE_OFFSET)
> >> +
> >>  /*
> >>   * We currently only support a 40bit IPA.
> >>   */
> >> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
> >> index 7a5df5252dd7..215ad4649dd7 100644
> >> --- a/arch/arm64/include/asm/virt.h
> >> +++ b/arch/arm64/include/asm/virt.h
> >> @@ -50,10 +50,6 @@ static inline bool is_hyp_mode_mismatched(void)
> >>       return __boot_cpu_mode[0] != __boot_cpu_mode[1];
> >>  }
> >>
> >> -/* The section containing the hypervisor text */
> >> -extern char __hyp_text_start[];
> >> -extern char __hyp_text_end[];
> >> -
> >>  #endif /* __ASSEMBLY__ */
> >>
> >>  #endif /* ! __ASM__VIRT_H */
> >> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> >> index 363c2f529951..f935f082188d 100644
> >> --- a/arch/arm64/kernel/vmlinux.lds.S
> >> +++ b/arch/arm64/kernel/vmlinux.lds.S
> >> @@ -35,9 +35,9 @@ jiffies = jiffies_64;
> >>       VMLINUX_SYMBOL(__hyp_idmap_text_start) = .;     \
> >>       *(.hyp.idmap.text)                              \
> >>       VMLINUX_SYMBOL(__hyp_idmap_text_end) = .;       \
> >> -     VMLINUX_SYMBOL(__hyp_text_start) = .;           \
> >> +     VMLINUX_SYMBOL(__kvm_hyp_code_start) = .;       \
> >>       *(.hyp.text)                                    \
> >> -     VMLINUX_SYMBOL(__hyp_text_end) = .;
> >> +     VMLINUX_SYMBOL(__kvm_hyp_code_end) = .;
> >
> > why this rename?
> >
> 
> I already got rid of it based on Marc's feedback. The only reason was
> to align between ARM and arm64, but he is already doing the same in
> the opposite direction

ah, now I understand what Marc was referring to in his comment, thanks.

-Christoffer

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 02/13] arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
  2016-01-05 14:46       ` Mark Rutland
  (?)
@ 2016-01-05 14:58         ` Christoffer Dall
  -1 siblings, 0 replies; 156+ messages in thread
From: Christoffer Dall @ 2016-01-05 14:58 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Ard Biesheuvel, linux-arm-kernel, kernel-hardening, will.deacon,
	catalin.marinas, leif.lindholm, keescook, linux-kernel,
	stuart.yoder, bhupesh.sharma, arnd, marc.zyngier

On Tue, Jan 05, 2016 at 02:46:50PM +0000, Mark Rutland wrote:
> On Tue, Jan 05, 2016 at 03:36:34PM +0100, Christoffer Dall wrote:
> > On Wed, Dec 30, 2015 at 04:26:01PM +0100, Ard Biesheuvel wrote:
> > > This introduces the preprocessor symbol KIMAGE_VADDR which will serve as
> > > the symbolic virtual base of the kernel region, i.e., the kernel's virtual
> > > offset will be KIMAGE_VADDR + TEXT_OFFSET. For now, we define it as being
> > > equal to PAGE_OFFSET, but in the future, it will be moved below it once
> > > we move the kernel virtual mapping out of the linear mapping.
> > > 
> > > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > > ---
> > >  arch/arm64/include/asm/memory.h | 10 ++++++++--
> > >  arch/arm64/kernel/head.S        |  2 +-
> > >  arch/arm64/kernel/vmlinux.lds.S |  4 ++--
> > >  3 files changed, 11 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> > > index 853953cd1f08..bea9631b34a8 100644
> > > --- a/arch/arm64/include/asm/memory.h
> > > +++ b/arch/arm64/include/asm/memory.h
> > > @@ -51,7 +51,8 @@
> > >  #define VA_BITS			(CONFIG_ARM64_VA_BITS)
> > >  #define VA_START		(UL(0xffffffffffffffff) << VA_BITS)
> > >  #define PAGE_OFFSET		(UL(0xffffffffffffffff) << (VA_BITS - 1))
> > > -#define MODULES_END		(PAGE_OFFSET)
> > > +#define KIMAGE_VADDR		(PAGE_OFFSET)
> > > +#define MODULES_END		(KIMAGE_VADDR)
> > >  #define MODULES_VADDR		(MODULES_END - SZ_64M)
> > >  #define PCI_IO_END		(MODULES_VADDR - SZ_2M)
> > >  #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
> > > @@ -75,8 +76,13 @@
> > >   * private definitions which should NOT be used outside memory.h
> > >   * files.  Use virt_to_phys/phys_to_virt/__pa/__va instead.
> > >   */
> > > -#define __virt_to_phys(x)	(((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
> > > +#define __virt_to_phys(x) ({						\
> > > +	phys_addr_t __x = (phys_addr_t)(x);				\
> > > +	__x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) :	\
> > > +			     (__x - KIMAGE_VADDR + PHYS_OFFSET); })
> > 
> > so __virt_to_phys will now work with a subset of the non-linear namely
> > all except vmalloced and ioremapped ones?
> 
> It will work for linear mapped memory and for the kernel image, which is
> what it used to do. It's just that the relationship between the image
> and the linear map is broken.
> 
> The same rules apply to x86, where their virt_to_phys eventually boils down to:
> 
> static inline unsigned long __phys_addr_nodebug(unsigned long x)
> {
>         unsigned long y = x - __START_KERNEL_map;
> 
>         /* use the carry flag to determine if x was < __START_KERNEL_map */
>         x = y + ((x > y) ? phys_base : (__START_KERNEL_map - PAGE_OFFSET));
> 
>         return x;
> }
> 
ok, thanks for the snippet :)

-Christoffer

^ permalink raw reply	[flat|nested] 156+ messages in thread
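
For reference, a minimal sketch of the two-range translation discussed in the
reply above, written in the ternary form of the arm64 macro rather than the
branchless carry-flag form used by x86. The layout constants are placeholders
for illustration only; vmalloc and ioremap addresses remain invalid inputs,
exactly as noted in the thread.

#include <stdint.h>

#define PAGE_OFFSET	0xffffffc000000000UL	/* placeholder: start of linear map */
#define KIMAGE_VADDR	0xffffff8000000000UL	/* placeholder: kernel image base   */
#define PHYS_OFFSET	0x0000000080000000UL	/* placeholder: start of RAM        */

/* Addresses at or above PAGE_OFFSET are linear-map addresses; anything
 * below is assumed to be a kernel-image address once the image is moved
 * out of the linear mapping. */
static uint64_t virt_to_phys_sketch(uint64_t va)
{
	return va >= PAGE_OFFSET ? va - PAGE_OFFSET  + PHYS_OFFSET
				 : va - KIMAGE_VADDR + PHYS_OFFSET;
}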

* Re: [PATCH v2 12/13] arm64: add support for relocatable kernel
  2015-12-30 15:26   ` Ard Biesheuvel
  (?)
@ 2016-01-05 19:51     ` Kees Cook
  -1 siblings, 0 replies; 156+ messages in thread
From: Kees Cook @ 2016-01-05 19:51 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, LKML, stuart.yoder, bhupesh.sharma,
	Arnd Bergmann, Marc Zyngier, Christoffer Dall

On Wed, Dec 30, 2015 at 7:26 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> This adds support for runtime relocation of the kernel Image, by
> building it as a PIE (ET_DYN) executable and applying the dynamic
> relocations in the early boot code.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  Documentation/arm64/booting.txt |  3 +-
>  arch/arm64/Kconfig              | 13 ++++
>  arch/arm64/Makefile             |  6 +-
>  arch/arm64/include/asm/memory.h |  3 +
>  arch/arm64/kernel/head.S        | 75 +++++++++++++++++++-
>  arch/arm64/kernel/setup.c       | 22 +++---
>  arch/arm64/kernel/vmlinux.lds.S |  9 +++
>  scripts/sortextable.c           |  4 +-
>  8 files changed, 117 insertions(+), 18 deletions(-)
>
> diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
> index 03e02ebc1b0c..b17181eb4a43 100644
> --- a/Documentation/arm64/booting.txt
> +++ b/Documentation/arm64/booting.txt
> @@ -109,7 +109,8 @@ Header notes:
>                         1 - 4K
>                         2 - 16K
>                         3 - 64K
> -  Bits 3-63:   Reserved.
> +  Bit 3:       Relocatable kernel.
> +  Bits 4-63:   Reserved.
>
>  - When image_size is zero, a bootloader should attempt to keep as much
>    memory as possible free for use by the kernel immediately after the
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 54eeab140bca..f458fb9e0dce 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -363,6 +363,7 @@ config ARM64_ERRATUM_843419
>         bool "Cortex-A53: 843419: A load or store might access an incorrect address"
>         depends on MODULES
>         default y
> +       select ARM64_MODULE_CMODEL_LARGE
>         help
>           This option builds kernel modules using the large memory model in
>           order to avoid the use of the ADRP instruction, which can cause
> @@ -709,6 +710,18 @@ config ARM64_MODULE_PLTS
>         bool
>         select HAVE_MOD_ARCH_SPECIFIC
>
> +config ARM64_MODULE_CMODEL_LARGE
> +       bool
> +
> +config ARM64_RELOCATABLE_KERNEL

Should this be called "CONFIG_RELOCATABLE" instead, just to keep
naming the same across x86, powerpc, and arm64?

> +       bool "Kernel address space layout randomization (KASLR)"

Strictly speaking, this enables KASLR, but doesn't provide it,
correct? It still relies on the boot loader for the randomness?

> +       select ARM64_MODULE_PLTS
> +       select ARM64_MODULE_CMODEL_LARGE
> +       help
> +         This feature randomizes the virtual address of the kernel image, to
> +         harden against exploits that rely on knowledge about the absolute
> +         addresses of certain kernel data structures.
> +
>  endmenu
>
>  menu "Boot options"
> diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
> index d4654830e536..75dc477d45f5 100644
> --- a/arch/arm64/Makefile
> +++ b/arch/arm64/Makefile
> @@ -15,6 +15,10 @@ CPPFLAGS_vmlinux.lds = -DTEXT_OFFSET=$(TEXT_OFFSET)
>  OBJCOPYFLAGS   :=-O binary -R .note -R .note.gnu.build-id -R .comment -S
>  GZFLAGS                :=-9
>
> +ifneq ($(CONFIG_ARM64_RELOCATABLE_KERNEL),)
> +LDFLAGS_vmlinux                += -pie
> +endif
> +
>  KBUILD_DEFCONFIG := defconfig
>
>  # Check for binutils support for specific extensions
> @@ -41,7 +45,7 @@ endif
>
>  CHECKFLAGS     += -D__aarch64__
>
> -ifeq ($(CONFIG_ARM64_ERRATUM_843419), y)
> +ifeq ($(CONFIG_ARM64_MODULE_CMODEL_LARGE), y)
>  KBUILD_CFLAGS_MODULE   += -mcmodel=large
>  endif
>
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 557228658666..afab3e669e19 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -121,6 +121,9 @@ extern phys_addr_t          memstart_addr;
>  /* PHYS_OFFSET - the physical address of the start of memory. */
>  #define PHYS_OFFSET            ({ memstart_addr; })
>
> +/* the virtual base of the kernel image (minus TEXT_OFFSET) */
> +extern u64                     kimage_vaddr;
> +
>  /* the offset between the kernel virtual and physical mappings */
>  extern u64                     kimage_voffset;
>
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 01a33e42ed70..ab582ee58b58 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -59,8 +59,15 @@
>
>  #define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2)
>
> +#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
> +#define __HEAD_FLAG_RELOC      1
> +#else
> +#define __HEAD_FLAG_RELOC      0
> +#endif
> +
>  #define __HEAD_FLAGS   ((__HEAD_FLAG_BE << 0) |        \
> -                        (__HEAD_FLAG_PAGE_SIZE << 1))
> +                        (__HEAD_FLAG_PAGE_SIZE << 1) | \
> +                        (__HEAD_FLAG_RELOC << 3))
>
>  /*
>   * Kernel startup entry point.
> @@ -231,6 +238,9 @@ ENTRY(stext)
>          */
>         ldr     x27, 0f                         // address to jump to after
>                                                 // MMU has been enabled
> +#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
> +       add     x27, x27, x23                   // add KASLR displacement
> +#endif
>         adr_l   lr, __enable_mmu                // return (PIC) address
>         b       __cpu_setup                     // initialise processor
>  ENDPROC(stext)
> @@ -243,6 +253,16 @@ ENDPROC(stext)
>  preserve_boot_args:
>         mov     x21, x0                         // x21=FDT
>
> +#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
> +       /*
> +        * Mask off the bits of the random value supplied in x1 so it can serve
> +        * as a KASLR displacement value which will move the kernel image to a
> +        * random offset in the lower half of the VMALLOC area.
> +        */
> +       mov     x23, #(1 << (VA_BITS - 2)) - 1
> +       and     x23, x23, x1, lsl #SWAPPER_BLOCK_SHIFT
> +#endif
> +
>         adr_l   x0, boot_args                   // record the contents of
>         stp     x21, x1, [x0]                   // x0 .. x3 at kernel entry
>         stp     x2, x3, [x0, #16]
> @@ -402,6 +422,9 @@ __create_page_tables:
>          */
>         mov     x0, x26                         // swapper_pg_dir
>         ldr     x5, =KIMAGE_VADDR
> +#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
> +       add     x5, x5, x23                     // add KASLR displacement
> +#endif
>         create_pgd_entry x0, x5, x3, x6
>         ldr     w6, kernel_img_size
>         add     x6, x6, x5
> @@ -443,10 +466,52 @@ __mmap_switched:
>         str     xzr, [x6], #8                   // Clear BSS
>         b       1b
>  2:
> +
> +#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
> +
> +#define R_AARCH64_RELATIVE     0x403
> +#define R_AARCH64_ABS64                0x101
> +
> +       /*
> +        * Iterate over each entry in the relocation table, and apply the
> +        * relocations in place.
> +        */
> +       adr_l   x8, __dynsym_start              // start of symbol table
> +       adr_l   x9, __reloc_start               // start of reloc table
> +       adr_l   x10, __reloc_end                // end of reloc table
> +
> +0:     cmp     x9, x10
> +       b.hs    2f
> +       ldp     x11, x12, [x9], #24
> +       ldr     x13, [x9, #-8]
> +       cmp     w12, #R_AARCH64_RELATIVE
> +       b.ne    1f
> +       add     x13, x13, x23                   // relocate
> +       str     x13, [x11, x23]
> +       b       0b
> +
> +1:     cmp     w12, #R_AARCH64_ABS64
> +       b.ne    0b
> +       add     x12, x12, x12, lsl #1           // symtab offset: 24x top word
> +       add     x12, x8, x12, lsr #(32 - 3)     // ... shifted into bottom word
> +       ldrsh   w14, [x12, #6]                  // Elf64_Sym::st_shndx
> +       ldr     x15, [x12, #8]                  // Elf64_Sym::st_value
> +       cmp     w14, #-0xf                      // SHN_ABS (0xfff1) ?
> +       add     x14, x15, x23                   // relocate
> +       csel    x15, x14, x15, ne
> +       add     x15, x13, x15
> +       str     x15, [x11, x23]
> +       b       0b
> +
> +2:     adr_l   x8, kimage_vaddr                // make relocated kimage_vaddr
> +       dc      cvac, x8                        // value visible to secondaries
> +       dsb     sy                              // with MMU off
> +#endif
> +
>         adr_l   sp, initial_sp, x4
>         str_l   x21, __fdt_pointer, x5          // Save FDT pointer
>
> -       ldr     x0, =KIMAGE_VADDR               // Save the offset between
> +       ldr_l   x0, kimage_vaddr                // Save the offset between
>         sub     x24, x0, x24                    // the kernel virtual and
>         str_l   x24, kimage_voffset, x0         // physical mappings
>
> @@ -462,6 +527,10 @@ ENDPROC(__mmap_switched)
>   * hotplug and needs to have the same protections as the text region
>   */
>         .section ".text","ax"
> +
> +ENTRY(kimage_vaddr)
> +       .quad           _text - TEXT_OFFSET
> +
>  /*
>   * If we're fortunate enough to boot at EL2, ensure that the world is
>   * sane before dropping to EL1.
> @@ -622,7 +691,7 @@ ENTRY(secondary_startup)
>         adrp    x26, swapper_pg_dir
>         bl      __cpu_setup                     // initialise processor
>
> -       ldr     x8, =KIMAGE_VADDR
> +       ldr     x8, kimage_vaddr
>         ldr     w9, 0f
>         sub     x27, x8, w9, sxtw               // address to jump to after enabling the MMU
>         b       __enable_mmu
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index 96177a7c0f05..2faee6042e99 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -292,16 +292,15 @@ u64 __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = INVALID_HWID };
>
>  void __init setup_arch(char **cmdline_p)
>  {
> -       static struct vm_struct vmlinux_vm __initdata = {
> -               .addr           = (void *)KIMAGE_VADDR,
> -               .size           = 0,
> -               .flags          = VM_IOREMAP,
> -               .caller         = setup_arch,
> -       };
> -
> -       vmlinux_vm.size = round_up((unsigned long)_end - KIMAGE_VADDR,
> -                                  1 << SWAPPER_BLOCK_SHIFT);
> -       vmlinux_vm.phys_addr = __pa(KIMAGE_VADDR);
> +       static struct vm_struct vmlinux_vm __initdata;
> +
> +       vmlinux_vm.addr = (void *)kimage_vaddr;
> +       vmlinux_vm.size = round_up((u64)_end - kimage_vaddr,
> +                                  SWAPPER_BLOCK_SIZE);
> +       vmlinux_vm.phys_addr = __pa(kimage_vaddr);
> +       vmlinux_vm.flags = VM_IOREMAP;
> +       vmlinux_vm.caller = setup_arch;
> +
>         vm_area_add_early(&vmlinux_vm);
>
>         pr_info("Boot CPU: AArch64 Processor [%08x]\n", read_cpuid_id());
> @@ -367,7 +366,8 @@ void __init setup_arch(char **cmdline_p)
>         conswitchp = &dummy_con;
>  #endif
>  #endif
> -       if (boot_args[1] || boot_args[2] || boot_args[3]) {
> +       if ((!IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && boot_args[1]) ||
> +           boot_args[2] || boot_args[3]) {
>                 pr_err("WARNING: x1-x3 nonzero in violation of boot protocol:\n"
>                         "\tx1: %016llx\n\tx2: %016llx\n\tx3: %016llx\n"
>                         "This indicates a broken bootloader or old kernel\n",
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index f935f082188d..cc1486039338 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -148,6 +148,15 @@ SECTIONS
>         .altinstr_replacement : {
>                 *(.altinstr_replacement)
>         }
> +       .rela : ALIGN(8) {
> +               __reloc_start = .;
> +               *(.rela .rela*)
> +               __reloc_end = .;
> +       }
> +       .dynsym : ALIGN(8) {
> +               __dynsym_start = .;
> +               *(.dynsym)
> +       }
>
>         . = ALIGN(PAGE_SIZE);
>         __init_end = .;
> diff --git a/scripts/sortextable.c b/scripts/sortextable.c
> index af247c70fb66..5ecbedefdb0f 100644
> --- a/scripts/sortextable.c
> +++ b/scripts/sortextable.c
> @@ -266,9 +266,9 @@ do_file(char const *const fname)
>                 break;
>         }  /* end switch */
>         if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0
> -       ||  r2(&ehdr->e_type) != ET_EXEC
> +       || (r2(&ehdr->e_type) != ET_EXEC && r2(&ehdr->e_type) != ET_DYN)
>         ||  ehdr->e_ident[EI_VERSION] != EV_CURRENT) {
> -               fprintf(stderr, "unrecognized ET_EXEC file %s\n", fname);
> +               fprintf(stderr, "unrecognized ET_EXEC/ET_DYN file %s\n", fname);
>                 fail_file();
>         }
>
> --
> 2.5.0
>

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 156+ messages in thread
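
For reference, a rough C rendering of the relocation loop in the head.S hunk
quoted above, assuming the standard <elf.h> definitions; the function and
parameter names are illustrative, and the real code runs in assembly with the
MMU off on the in-memory image. The displacement itself is derived from the
boot-supplied random value in x1, masked to a SWAPPER_BLOCK-aligned offset in
the lower half of the vmalloc area.

#include <elf.h>
#include <stdint.h>

#ifndef R_AARCH64_RELATIVE
#define R_AARCH64_RELATIVE	0x403
#endif
#ifndef R_AARCH64_ABS64
#define R_AARCH64_ABS64		0x101
#endif

/* Apply the PIE image's dynamic relocations; 'offset' is the KASLR
 * displacement held in x23 in the assembly version. */
static void apply_relocations(Elf64_Rela *rela, Elf64_Rela *rela_end,
			      Elf64_Sym *dynsym, uint64_t offset)
{
	for (; rela < rela_end; rela++) {
		uint64_t *place = (uint64_t *)(rela->r_offset + offset);
		Elf64_Sym *sym;
		uint64_t val;

		switch (ELF64_R_TYPE(rela->r_info)) {
		case R_AARCH64_RELATIVE:
			/* B + A: the addend, shifted by the displacement */
			*place = rela->r_addend + offset;
			break;
		case R_AARCH64_ABS64:
			/* S + A: symbol value plus addend; SHN_ABS symbols
			 * are absolute and are not displaced. */
			sym = &dynsym[ELF64_R_SYM(rela->r_info)];
			val = sym->st_value;
			if (sym->st_shndx != SHN_ABS)
				val += offset;
			*place = val + rela->r_addend;
			break;
		}
	}
}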

* Re: [PATCH v2 13/13] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
  2015-12-30 15:26   ` Ard Biesheuvel
  (?)
@ 2016-01-05 19:53     ` Kees Cook
  -1 siblings, 0 replies; 156+ messages in thread
From: Kees Cook @ 2016-01-05 19:53 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, LKML, stuart.yoder, bhupesh.sharma,
	Arnd Bergmann, Marc Zyngier, Christoffer Dall

On Wed, Dec 30, 2015 at 7:26 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> Since arm64 does not use a decompressor, there is no execution environment
> in which randomness can feasibly be collected before the kernel starts, so
> the arm64 KASLR kernel depends on the bootloader to supply some random bits
> in register x1 upon kernel entry.
>
> On UEFI systems, we can use the EFI_RNG_PROTOCOL, if supplied, to obtain
> some random bits. At the same time, use it to randomize the offset of the
> kernel Image in physical memory.

This logic seems like it should be under the name
CONFIG_RANDOMIZE_BASE and depend on UEFI? (Again, I'm just trying to
keep naming conventions the same across architectures to avoid
confusion.)

-Kees
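
(A purely hypothetical sketch of that naming, for illustration; the option
name and help text below are not taken from this series:)

config RANDOMIZE_BASE
        bool "Randomize the address of the kernel image (KASLR)"
        select ARM64_MODULE_PLTS
        select ARM64_MODULE_CMODEL_LARGE
        help
          Same semantics as ARM64_RELOCATABLE_KERNEL in this series; the
          entropy still has to come from the bootloader (x1) or from
          EFI_RNG_PROTOCOL.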

>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/kernel/efi-entry.S             |   7 +-
>  drivers/firmware/efi/libstub/arm-stub.c   |   1 -
>  drivers/firmware/efi/libstub/arm64-stub.c | 134 +++++++++++++++++---
>  include/linux/efi.h                       |   5 +-
>  4 files changed, 127 insertions(+), 20 deletions(-)
>
> diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
> index f82036e02485..f41073dde7e0 100644
> --- a/arch/arm64/kernel/efi-entry.S
> +++ b/arch/arm64/kernel/efi-entry.S
> @@ -110,7 +110,7 @@ ENTRY(entry)
>  2:
>         /* Jump to kernel entry point */
>         mov     x0, x20
> -       mov     x1, xzr
> +       ldr     x1, efi_rnd
>         mov     x2, xzr
>         mov     x3, xzr
>         br      x21
> @@ -119,6 +119,9 @@ efi_load_fail:
>         mov     x0, #EFI_LOAD_ERROR
>         ldp     x29, x30, [sp], #32
>         ret
> +ENDPROC(entry)
> +
> +ENTRY(efi_rnd)
> +       .quad   0, 0
>
>  entry_end:
> -ENDPROC(entry)
> diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c
> index 950c87f5d279..f580bcdfae4f 100644
> --- a/drivers/firmware/efi/libstub/arm-stub.c
> +++ b/drivers/firmware/efi/libstub/arm-stub.c
> @@ -145,7 +145,6 @@ void efi_char16_printk(efi_system_table_t *sys_table_arg,
>         out->output_string(out, str);
>  }
>
> -
>  /*
>   * This function handles the architecture specific differences between arm and
>   * arm64 regarding where the kernel image must be loaded and any memory that
> diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
> index 78dfbd34b6bf..4e5c306346b4 100644
> --- a/drivers/firmware/efi/libstub/arm64-stub.c
> +++ b/drivers/firmware/efi/libstub/arm64-stub.c
> @@ -13,6 +13,68 @@
>  #include <asm/efi.h>
>  #include <asm/sections.h>
>
> +struct efi_rng_protocol_t {
> +       efi_status_t (*get_info)(struct efi_rng_protocol_t *,
> +                                unsigned long *,
> +                                efi_guid_t *);
> +       efi_status_t (*get_rng)(struct efi_rng_protocol_t *,
> +                               efi_guid_t *,
> +                               unsigned long,
> +                               u8 *out);
> +};
> +
> +extern struct {
> +       u64     virt_seed;
> +       u64     phys_seed;
> +} efi_rnd;
> +
> +static int efi_get_random_bytes(efi_system_table_t *sys_table)
> +{
> +       efi_guid_t rng_proto = EFI_RNG_PROTOCOL_GUID;
> +       efi_status_t status;
> +       struct efi_rng_protocol_t *rng;
> +
> +       status = sys_table->boottime->locate_protocol(&rng_proto, NULL,
> +                                                     (void **)&rng);
> +       if (status == EFI_NOT_FOUND) {
> +               pr_efi(sys_table, "EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
> +               return EFI_SUCCESS;
> +       }
> +
> +       if (status != EFI_SUCCESS)
> +               return status;
> +
> +       return rng->get_rng(rng, NULL, sizeof(efi_rnd), (u8 *)&efi_rnd);
> +}
> +
> +static efi_status_t get_dram_top(efi_system_table_t *sys_table_arg, u64 *top)
> +{
> +       unsigned long map_size, desc_size;
> +       efi_memory_desc_t *memory_map;
> +       efi_status_t status;
> +       int l;
> +
> +       status = efi_get_memory_map(sys_table_arg, &memory_map, &map_size,
> +                                   &desc_size, NULL, NULL);
> +       if (status != EFI_SUCCESS)
> +               return status;
> +
> +       for (l = 0; l < map_size; l += desc_size) {
> +               efi_memory_desc_t *md = (void *)memory_map + l;
> +
> +               if (md->attribute & EFI_MEMORY_WB) {
> +                       u64 phys_end = md->phys_addr +
> +                                      md->num_pages * EFI_PAGE_SIZE;
> +                       if (phys_end > *top)
> +                               *top = phys_end;
> +               }
> +       }
> +
> +       efi_call_early(free_pool, memory_map);
> +
> +       return EFI_SUCCESS;
> +}
> +
>  efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>                                         unsigned long *image_addr,
>                                         unsigned long *image_size,
> @@ -27,6 +89,14 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>         void *old_image_addr = (void *)*image_addr;
>         unsigned long preferred_offset;
>
> +       if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL)) {
> +               status = efi_get_random_bytes(sys_table_arg);
> +               if (status != EFI_SUCCESS) {
> +                       pr_efi_err(sys_table_arg, "efi_get_random_bytes() failed\n");
> +                       return status;
> +               }
> +       }
> +
>         /*
>          * The preferred offset of the kernel Image is TEXT_OFFSET bytes beyond
>          * a 2 MB aligned base, which itself may be lower than dram_base, as
> @@ -36,13 +106,42 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>         if (preferred_offset < dram_base)
>                 preferred_offset += SZ_2M;
>
> -       /* Relocate the image, if required. */
>         kernel_size = _edata - _text;
> -       if (*image_addr != preferred_offset) {
> -               kernel_memsize = kernel_size + (_end - _edata);
> +       kernel_memsize = kernel_size + (_end - _edata);
> +
> +       if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && efi_rnd.phys_seed) {
> +               /*
> +                * If KASLR is enabled, and we have some randomness available,
> +                * locate the kernel at a randomized offset in physical memory.
> +                */
> +               u64 dram_top = dram_base;
> +
> +               status = get_dram_top(sys_table_arg, &dram_top);
> +               if (status != EFI_SUCCESS) {
> +                       pr_efi_err(sys_table_arg, "get_dram_size() failed\n");
> +                       return status;
> +               }
> +
> +               kernel_memsize += SZ_2M;
> +               nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
> +                                   EFI_PAGE_SIZE;
>
>                 /*
> -                * First, try a straight allocation at the preferred offset.
> +                * Use the random seed to scale the size and add it to the DRAM
> +                * base. Note that this may give suboptimal results on systems
> +                * with discontiguous DRAM regions with large holes between them.
> +                */
> +               *reserve_addr = dram_base +
> +                       ((dram_top - dram_base) >> 16) * (u16)efi_rnd.phys_seed;
> +
> +               status = efi_call_early(allocate_pages, EFI_ALLOCATE_MAX_ADDRESS,
> +                                       EFI_LOADER_DATA, nr_pages,
> +                                       (efi_physical_addr_t *)reserve_addr);
> +
> +               *image_addr = round_up(*reserve_addr, SZ_2M) + TEXT_OFFSET;
> +       } else {
> +               /*
> +                * Else, try a straight allocation at the preferred offset.
>                  * This will work around the issue where, if dram_base == 0x0,
>                  * efi_low_alloc() refuses to allocate at 0x0 (to prevent the
>                  * address of the allocation to be mistaken for a FAIL return
> @@ -52,27 +151,30 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>                  * Mustang), we can still place the kernel at the address
>                  * 'dram_base + TEXT_OFFSET'.
>                  */
> +               if (*image_addr == preferred_offset)
> +                       return EFI_SUCCESS;
> +
>                 *image_addr = *reserve_addr = preferred_offset;
>                 nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
>                            EFI_PAGE_SIZE;
>                 status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
>                                         EFI_LOADER_DATA, nr_pages,
>                                         (efi_physical_addr_t *)reserve_addr);
> +       }
> +
> +       if (status != EFI_SUCCESS) {
> +               kernel_memsize += TEXT_OFFSET;
> +               status = efi_low_alloc(sys_table_arg, kernel_memsize,
> +                                      SZ_2M, reserve_addr);
> +
>                 if (status != EFI_SUCCESS) {
> -                       kernel_memsize += TEXT_OFFSET;
> -                       status = efi_low_alloc(sys_table_arg, kernel_memsize,
> -                                              SZ_2M, reserve_addr);
> -
> -                       if (status != EFI_SUCCESS) {
> -                               pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
> -                               return status;
> -                       }
> -                       *image_addr = *reserve_addr + TEXT_OFFSET;
> +                       pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
> +                       return status;
>                 }
> -               memcpy((void *)*image_addr, old_image_addr, kernel_size);
> -               *reserve_size = kernel_memsize;
> +               *image_addr = *reserve_addr + TEXT_OFFSET;
>         }
> -
> +       memcpy((void *)*image_addr, old_image_addr, kernel_size);
> +       *reserve_size = kernel_memsize;
>
>         return EFI_SUCCESS;
>  }
> diff --git a/include/linux/efi.h b/include/linux/efi.h
> index 569b5a866bb1..13783fdc9bdd 100644
> --- a/include/linux/efi.h
> +++ b/include/linux/efi.h
> @@ -299,7 +299,7 @@ typedef struct {
>         void *open_protocol_information;
>         void *protocols_per_handle;
>         void *locate_handle_buffer;
> -       void *locate_protocol;
> +       efi_status_t (*locate_protocol)(efi_guid_t *, void *, void **);
>         void *install_multiple_protocol_interfaces;
>         void *uninstall_multiple_protocol_interfaces;
>         void *calculate_crc32;
> @@ -599,6 +599,9 @@ void efi_native_runtime_setup(void);
>  #define EFI_PROPERTIES_TABLE_GUID \
>      EFI_GUID(  0x880aaca3, 0x4adc, 0x4a04, 0x90, 0x79, 0xb7, 0x47, 0x34, 0x08, 0x25, 0xe5 )
>
> +#define EFI_RNG_PROTOCOL_GUID \
> +    EFI_GUID(  0x3152bca5, 0xeade, 0x433d, 0x86, 0x2e, 0xc0, 0x1c, 0xdc, 0x29, 0x1f, 0x44 )
> +
>  typedef struct {
>         efi_guid_t guid;
>         u64 table;
> --
> 2.5.0
>



-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 156+ messages in thread
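
The physical randomization in the patch above turns the low 16 bits of
phys_seed into a fraction of the DRAM span and then asks the firmware for
pages at or below the resulting address (EFI_ALLOCATE_MAX_ADDRESS). A small
self-contained sketch of that arithmetic, using made-up example values:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
        /* Example values only: 4 GiB of DRAM starting at 1 GiB, seed 0x8000. */
        uint64_t dram_base = 0x40000000;
        uint64_t dram_top  = 0x140000000;
        uint16_t seed      = 0x8000;    /* low 16 bits of efi_rnd.phys_seed */

        /* Same scaling as the stub: (span / 65536) * seed, added to the base. */
        uint64_t reserve_addr = dram_base + ((dram_top - dram_base) >> 16) * seed;

        /*
         * 0x40000000 + 0x10000 * 0x8000 = 0xc0000000, roughly half-way up DRAM;
         * the stub then rounds the allocation up to a 2 MB boundary and adds
         * TEXT_OFFSET to obtain the image address.
         */
        printf("reserve_addr = %#llx\n", (unsigned long long)reserve_addr);
        return 0;
}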

* [PATCH v2 13/13] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
@ 2016-01-05 19:53     ` Kees Cook
  0 siblings, 0 replies; 156+ messages in thread
From: Kees Cook @ 2016-01-05 19:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 30, 2015 at 7:26 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> Since arm64 does not use a decompressor that supplies an execution
> environment where it is feasible to some extent to provide a source of
> randomness, the arm64 KASLR kernel depends on the bootloader to supply
> some random bits in register x1 upon kernel entry.
>
> On UEFI systems, we can use the EFI_RNG_PROTOCOL, if supplied, to obtain
> some random bits. At the same time, use it to randomize the offset of the
> kernel Image in physical memory.

This logic seems like it should be under the name
CONFIG_RANDOMIZE_BASE and depend on UEFI? (Again, I'm just trying to
keep naming conventions the same across architectures to avoid
confusion.)

-Kees

>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/kernel/efi-entry.S             |   7 +-
>  drivers/firmware/efi/libstub/arm-stub.c   |   1 -
>  drivers/firmware/efi/libstub/arm64-stub.c | 134 +++++++++++++++++---
>  include/linux/efi.h                       |   5 +-
>  4 files changed, 127 insertions(+), 20 deletions(-)
>
> diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
> index f82036e02485..f41073dde7e0 100644
> --- a/arch/arm64/kernel/efi-entry.S
> +++ b/arch/arm64/kernel/efi-entry.S
> @@ -110,7 +110,7 @@ ENTRY(entry)
>  2:
>         /* Jump to kernel entry point */
>         mov     x0, x20
> -       mov     x1, xzr
> +       ldr     x1, efi_rnd
>         mov     x2, xzr
>         mov     x3, xzr
>         br      x21
> @@ -119,6 +119,9 @@ efi_load_fail:
>         mov     x0, #EFI_LOAD_ERROR
>         ldp     x29, x30, [sp], #32
>         ret
> +ENDPROC(entry)
> +
> +ENTRY(efi_rnd)
> +       .quad   0, 0
>
>  entry_end:
> -ENDPROC(entry)
> diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c
> index 950c87f5d279..f580bcdfae4f 100644
> --- a/drivers/firmware/efi/libstub/arm-stub.c
> +++ b/drivers/firmware/efi/libstub/arm-stub.c
> @@ -145,7 +145,6 @@ void efi_char16_printk(efi_system_table_t *sys_table_arg,
>         out->output_string(out, str);
>  }
>
> -
>  /*
>   * This function handles the architcture specific differences between arm and
>   * arm64 regarding where the kernel image must be loaded and any memory that
> diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
> index 78dfbd34b6bf..4e5c306346b4 100644
> --- a/drivers/firmware/efi/libstub/arm64-stub.c
> +++ b/drivers/firmware/efi/libstub/arm64-stub.c
> @@ -13,6 +13,68 @@
>  #include <asm/efi.h>
>  #include <asm/sections.h>
>
> +struct efi_rng_protocol_t {
> +       efi_status_t (*get_info)(struct efi_rng_protocol_t *,
> +                                unsigned long *,
> +                                efi_guid_t *);
> +       efi_status_t (*get_rng)(struct efi_rng_protocol_t *,
> +                               efi_guid_t *,
> +                               unsigned long,
> +                               u8 *out);
> +};
> +
> +extern struct {
> +       u64     virt_seed;
> +       u64     phys_seed;
> +} efi_rnd;
> +
> +static int efi_get_random_bytes(efi_system_table_t *sys_table)
> +{
> +       efi_guid_t rng_proto = EFI_RNG_PROTOCOL_GUID;
> +       efi_status_t status;
> +       struct efi_rng_protocol_t *rng;
> +
> +       status = sys_table->boottime->locate_protocol(&rng_proto, NULL,
> +                                                     (void **)&rng);
> +       if (status == EFI_NOT_FOUND) {
> +               pr_efi(sys_table, "EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
> +               return EFI_SUCCESS;
> +       }
> +
> +       if (status != EFI_SUCCESS)
> +               return status;
> +
> +       return rng->get_rng(rng, NULL, sizeof(efi_rnd), (u8 *)&efi_rnd);
> +}
> +
> +static efi_status_t get_dram_top(efi_system_table_t *sys_table_arg, u64 *top)
> +{
> +       unsigned long map_size, desc_size;
> +       efi_memory_desc_t *memory_map;
> +       efi_status_t status;
> +       int l;
> +
> +       status = efi_get_memory_map(sys_table_arg, &memory_map, &map_size,
> +                                   &desc_size, NULL, NULL);
> +       if (status != EFI_SUCCESS)
> +               return status;
> +
> +       for (l = 0; l < map_size; l += desc_size) {
> +               efi_memory_desc_t *md = (void *)memory_map + l;
> +
> +               if (md->attribute & EFI_MEMORY_WB) {
> +                       u64 phys_end = md->phys_addr +
> +                                      md->num_pages * EFI_PAGE_SIZE;
> +                       if (phys_end > *top)
> +                               *top = phys_end;
> +               }
> +       }
> +
> +       efi_call_early(free_pool, memory_map);
> +
> +       return EFI_SUCCESS;
> +}
> +
>  efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>                                         unsigned long *image_addr,
>                                         unsigned long *image_size,
> @@ -27,6 +89,14 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>         void *old_image_addr = (void *)*image_addr;
>         unsigned long preferred_offset;
>
> +       if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL)) {
> +               status = efi_get_random_bytes(sys_table_arg);
> +               if (status != EFI_SUCCESS) {
> +                       pr_efi_err(sys_table_arg, "efi_get_random_bytes() failed\n");
> +                       return status;
> +               }
> +       }
> +
>         /*
>          * The preferred offset of the kernel Image is TEXT_OFFSET bytes beyond
>          * a 2 MB aligned base, which itself may be lower than dram_base, as
> @@ -36,13 +106,42 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>         if (preferred_offset < dram_base)
>                 preferred_offset += SZ_2M;
>
> -       /* Relocate the image, if required. */
>         kernel_size = _edata - _text;
> -       if (*image_addr != preferred_offset) {
> -               kernel_memsize = kernel_size + (_end - _edata);
> +       kernel_memsize = kernel_size + (_end - _edata);
> +
> +       if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && efi_rnd.phys_seed) {
> +               /*
> +                * If KASLR is enabled, and we have some randomness available,
> +                * locate the kernel at a randomized offset in physical memory.
> +                */
> +               u64 dram_top = dram_base;
> +
> +               status = get_dram_top(sys_table_arg, &dram_top);
> +               if (status != EFI_SUCCESS) {
> +                       pr_efi_err(sys_table_arg, "get_dram_size() failed\n");
> +                       return status;
> +               }
> +
> +               kernel_memsize += SZ_2M;
> +               nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
> +                                   EFI_PAGE_SIZE;
>
>                 /*
> -                * First, try a straight allocation at the preferred offset.
> +                * Use the random seed to scale the size and add it to the DRAM
> +                * base. Note that this may give suboptimal results on systems
> +                * with discontiguous DRAM regions with large holes between them.
> +                */
> +               *reserve_addr = dram_base +
> +                       ((dram_top - dram_base) >> 16) * (u16)efi_rnd.phys_seed;
> +
> +               status = efi_call_early(allocate_pages, EFI_ALLOCATE_MAX_ADDRESS,
> +                                       EFI_LOADER_DATA, nr_pages,
> +                                       (efi_physical_addr_t *)reserve_addr);
> +
> +               *image_addr = round_up(*reserve_addr, SZ_2M) + TEXT_OFFSET;
> +       } else {
> +               /*
> +                * Else, try a straight allocation at the preferred offset.
>                  * This will work around the issue where, if dram_base == 0x0,
>                  * efi_low_alloc() refuses to allocate at 0x0 (to prevent the
>                  * address of the allocation to be mistaken for a FAIL return
> @@ -52,27 +151,30 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>                  * Mustang), we can still place the kernel at the address
>                  * 'dram_base + TEXT_OFFSET'.
>                  */
> +               if (*image_addr == preferred_offset)
> +                       return EFI_SUCCESS;
> +
>                 *image_addr = *reserve_addr = preferred_offset;
>                 nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
>                            EFI_PAGE_SIZE;
>                 status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
>                                         EFI_LOADER_DATA, nr_pages,
>                                         (efi_physical_addr_t *)reserve_addr);
> +       }
> +
> +       if (status != EFI_SUCCESS) {
> +               kernel_memsize += TEXT_OFFSET;
> +               status = efi_low_alloc(sys_table_arg, kernel_memsize,
> +                                      SZ_2M, reserve_addr);
> +
>                 if (status != EFI_SUCCESS) {
> -                       kernel_memsize += TEXT_OFFSET;
> -                       status = efi_low_alloc(sys_table_arg, kernel_memsize,
> -                                              SZ_2M, reserve_addr);
> -
> -                       if (status != EFI_SUCCESS) {
> -                               pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
> -                               return status;
> -                       }
> -                       *image_addr = *reserve_addr + TEXT_OFFSET;
> +                       pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
> +                       return status;
>                 }
> -               memcpy((void *)*image_addr, old_image_addr, kernel_size);
> -               *reserve_size = kernel_memsize;
> +               *image_addr = *reserve_addr + TEXT_OFFSET;
>         }
> -
> +       memcpy((void *)*image_addr, old_image_addr, kernel_size);
> +       *reserve_size = kernel_memsize;
>
>         return EFI_SUCCESS;
>  }
> diff --git a/include/linux/efi.h b/include/linux/efi.h
> index 569b5a866bb1..13783fdc9bdd 100644
> --- a/include/linux/efi.h
> +++ b/include/linux/efi.h
> @@ -299,7 +299,7 @@ typedef struct {
>         void *open_protocol_information;
>         void *protocols_per_handle;
>         void *locate_handle_buffer;
> -       void *locate_protocol;
> +       efi_status_t (*locate_protocol)(efi_guid_t *, void *, void **);
>         void *install_multiple_protocol_interfaces;
>         void *uninstall_multiple_protocol_interfaces;
>         void *calculate_crc32;
> @@ -599,6 +599,9 @@ void efi_native_runtime_setup(void);
>  #define EFI_PROPERTIES_TABLE_GUID \
>      EFI_GUID(  0x880aaca3, 0x4adc, 0x4a04, 0x90, 0x79, 0xb7, 0x47, 0x34, 0x08, 0x25, 0xe5 )
>
> +#define EFI_RNG_PROTOCOL_GUID \
> +    EFI_GUID(  0x3152bca5, 0xeade, 0x433d, 0x86, 0x2e, 0xc0, 0x1c, 0xdc, 0x29, 0x1f, 0x44 )
> +
>  typedef struct {
>         efi_guid_t guid;
>         u64 table;
> --
> 2.5.0
>



-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [kernel-hardening] Re: [PATCH v2 13/13] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
@ 2016-01-05 19:53     ` Kees Cook
  0 siblings, 0 replies; 156+ messages in thread
From: Kees Cook @ 2016-01-05 19:53 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, LKML, stuart.yoder, bhupesh.sharma,
	Arnd Bergmann, Marc Zyngier, Christoffer Dall

On Wed, Dec 30, 2015 at 7:26 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> Since arm64 does not use a decompressor that supplies an execution
> environment where it is feasible to some extent to provide a source of
> randomness, the arm64 KASLR kernel depends on the bootloader to supply
> some random bits in register x1 upon kernel entry.
>
> On UEFI systems, we can use the EFI_RNG_PROTOCOL, if supplied, to obtain
> some random bits. At the same time, use it to randomize the offset of the
> kernel Image in physical memory.

This logic seems like it should be under the name
CONFIG_RANDOMIZE_BASE and depend on UEFI? (Again, I'm just trying to
keep naming conventions the same across architectures to avoid
confusion.)

-Kees

>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/kernel/efi-entry.S             |   7 +-
>  drivers/firmware/efi/libstub/arm-stub.c   |   1 -
>  drivers/firmware/efi/libstub/arm64-stub.c | 134 +++++++++++++++++---
>  include/linux/efi.h                       |   5 +-
>  4 files changed, 127 insertions(+), 20 deletions(-)
>
> diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
> index f82036e02485..f41073dde7e0 100644
> --- a/arch/arm64/kernel/efi-entry.S
> +++ b/arch/arm64/kernel/efi-entry.S
> @@ -110,7 +110,7 @@ ENTRY(entry)
>  2:
>         /* Jump to kernel entry point */
>         mov     x0, x20
> -       mov     x1, xzr
> +       ldr     x1, efi_rnd
>         mov     x2, xzr
>         mov     x3, xzr
>         br      x21
> @@ -119,6 +119,9 @@ efi_load_fail:
>         mov     x0, #EFI_LOAD_ERROR
>         ldp     x29, x30, [sp], #32
>         ret
> +ENDPROC(entry)
> +
> +ENTRY(efi_rnd)
> +       .quad   0, 0
>
>  entry_end:
> -ENDPROC(entry)
> diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c
> index 950c87f5d279..f580bcdfae4f 100644
> --- a/drivers/firmware/efi/libstub/arm-stub.c
> +++ b/drivers/firmware/efi/libstub/arm-stub.c
> @@ -145,7 +145,6 @@ void efi_char16_printk(efi_system_table_t *sys_table_arg,
>         out->output_string(out, str);
>  }
>
> -
>  /*
>   * This function handles the architcture specific differences between arm and
>   * arm64 regarding where the kernel image must be loaded and any memory that
> diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
> index 78dfbd34b6bf..4e5c306346b4 100644
> --- a/drivers/firmware/efi/libstub/arm64-stub.c
> +++ b/drivers/firmware/efi/libstub/arm64-stub.c
> @@ -13,6 +13,68 @@
>  #include <asm/efi.h>
>  #include <asm/sections.h>
>
> +struct efi_rng_protocol_t {
> +       efi_status_t (*get_info)(struct efi_rng_protocol_t *,
> +                                unsigned long *,
> +                                efi_guid_t *);
> +       efi_status_t (*get_rng)(struct efi_rng_protocol_t *,
> +                               efi_guid_t *,
> +                               unsigned long,
> +                               u8 *out);
> +};
> +
> +extern struct {
> +       u64     virt_seed;
> +       u64     phys_seed;
> +} efi_rnd;
> +
> +static int efi_get_random_bytes(efi_system_table_t *sys_table)
> +{
> +       efi_guid_t rng_proto = EFI_RNG_PROTOCOL_GUID;
> +       efi_status_t status;
> +       struct efi_rng_protocol_t *rng;
> +
> +       status = sys_table->boottime->locate_protocol(&rng_proto, NULL,
> +                                                     (void **)&rng);
> +       if (status == EFI_NOT_FOUND) {
> +               pr_efi(sys_table, "EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
> +               return EFI_SUCCESS;
> +       }
> +
> +       if (status != EFI_SUCCESS)
> +               return status;
> +
> +       return rng->get_rng(rng, NULL, sizeof(efi_rnd), (u8 *)&efi_rnd);
> +}
> +
> +static efi_status_t get_dram_top(efi_system_table_t *sys_table_arg, u64 *top)
> +{
> +       unsigned long map_size, desc_size;
> +       efi_memory_desc_t *memory_map;
> +       efi_status_t status;
> +       int l;
> +
> +       status = efi_get_memory_map(sys_table_arg, &memory_map, &map_size,
> +                                   &desc_size, NULL, NULL);
> +       if (status != EFI_SUCCESS)
> +               return status;
> +
> +       for (l = 0; l < map_size; l += desc_size) {
> +               efi_memory_desc_t *md = (void *)memory_map + l;
> +
> +               if (md->attribute & EFI_MEMORY_WB) {
> +                       u64 phys_end = md->phys_addr +
> +                                      md->num_pages * EFI_PAGE_SIZE;
> +                       if (phys_end > *top)
> +                               *top = phys_end;
> +               }
> +       }
> +
> +       efi_call_early(free_pool, memory_map);
> +
> +       return EFI_SUCCESS;
> +}
> +
>  efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>                                         unsigned long *image_addr,
>                                         unsigned long *image_size,
> @@ -27,6 +89,14 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>         void *old_image_addr = (void *)*image_addr;
>         unsigned long preferred_offset;
>
> +       if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL)) {
> +               status = efi_get_random_bytes(sys_table_arg);
> +               if (status != EFI_SUCCESS) {
> +                       pr_efi_err(sys_table_arg, "efi_get_random_bytes() failed\n");
> +                       return status;
> +               }
> +       }
> +
>         /*
>          * The preferred offset of the kernel Image is TEXT_OFFSET bytes beyond
>          * a 2 MB aligned base, which itself may be lower than dram_base, as
> @@ -36,13 +106,42 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>         if (preferred_offset < dram_base)
>                 preferred_offset += SZ_2M;
>
> -       /* Relocate the image, if required. */
>         kernel_size = _edata - _text;
> -       if (*image_addr != preferred_offset) {
> -               kernel_memsize = kernel_size + (_end - _edata);
> +       kernel_memsize = kernel_size + (_end - _edata);
> +
> +       if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && efi_rnd.phys_seed) {
> +               /*
> +                * If KASLR is enabled, and we have some randomness available,
> +                * locate the kernel at a randomized offset in physical memory.
> +                */
> +               u64 dram_top = dram_base;
> +
> +               status = get_dram_top(sys_table_arg, &dram_top);
> +               if (status != EFI_SUCCESS) {
> +                       pr_efi_err(sys_table_arg, "get_dram_size() failed\n");
> +                       return status;
> +               }
> +
> +               kernel_memsize += SZ_2M;
> +               nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
> +                                   EFI_PAGE_SIZE;
>
>                 /*
> -                * First, try a straight allocation at the preferred offset.
> +                * Use the random seed to scale the size and add it to the DRAM
> +                * base. Note that this may give suboptimal results on systems
> +                * with discontiguous DRAM regions with large holes between them.
> +                */
> +               *reserve_addr = dram_base +
> +                       ((dram_top - dram_base) >> 16) * (u16)efi_rnd.phys_seed;
> +
> +               status = efi_call_early(allocate_pages, EFI_ALLOCATE_MAX_ADDRESS,
> +                                       EFI_LOADER_DATA, nr_pages,
> +                                       (efi_physical_addr_t *)reserve_addr);
> +
> +               *image_addr = round_up(*reserve_addr, SZ_2M) + TEXT_OFFSET;
> +       } else {
> +               /*
> +                * Else, try a straight allocation at the preferred offset.
>                  * This will work around the issue where, if dram_base == 0x0,
>                  * efi_low_alloc() refuses to allocate at 0x0 (to prevent the
>                  * address of the allocation to be mistaken for a FAIL return
> @@ -52,27 +151,30 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>                  * Mustang), we can still place the kernel at the address
>                  * 'dram_base + TEXT_OFFSET'.
>                  */
> +               if (*image_addr == preferred_offset)
> +                       return EFI_SUCCESS;
> +
>                 *image_addr = *reserve_addr = preferred_offset;
>                 nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
>                            EFI_PAGE_SIZE;
>                 status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
>                                         EFI_LOADER_DATA, nr_pages,
>                                         (efi_physical_addr_t *)reserve_addr);
> +       }
> +
> +       if (status != EFI_SUCCESS) {
> +               kernel_memsize += TEXT_OFFSET;
> +               status = efi_low_alloc(sys_table_arg, kernel_memsize,
> +                                      SZ_2M, reserve_addr);
> +
>                 if (status != EFI_SUCCESS) {
> -                       kernel_memsize += TEXT_OFFSET;
> -                       status = efi_low_alloc(sys_table_arg, kernel_memsize,
> -                                              SZ_2M, reserve_addr);
> -
> -                       if (status != EFI_SUCCESS) {
> -                               pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
> -                               return status;
> -                       }
> -                       *image_addr = *reserve_addr + TEXT_OFFSET;
> +                       pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
> +                       return status;
>                 }
> -               memcpy((void *)*image_addr, old_image_addr, kernel_size);
> -               *reserve_size = kernel_memsize;
> +               *image_addr = *reserve_addr + TEXT_OFFSET;
>         }
> -
> +       memcpy((void *)*image_addr, old_image_addr, kernel_size);
> +       *reserve_size = kernel_memsize;
>
>         return EFI_SUCCESS;
>  }
> diff --git a/include/linux/efi.h b/include/linux/efi.h
> index 569b5a866bb1..13783fdc9bdd 100644
> --- a/include/linux/efi.h
> +++ b/include/linux/efi.h
> @@ -299,7 +299,7 @@ typedef struct {
>         void *open_protocol_information;
>         void *protocols_per_handle;
>         void *locate_handle_buffer;
> -       void *locate_protocol;
> +       efi_status_t (*locate_protocol)(efi_guid_t *, void *, void **);
>         void *install_multiple_protocol_interfaces;
>         void *uninstall_multiple_protocol_interfaces;
>         void *calculate_crc32;
> @@ -599,6 +599,9 @@ void efi_native_runtime_setup(void);
>  #define EFI_PROPERTIES_TABLE_GUID \
>      EFI_GUID(  0x880aaca3, 0x4adc, 0x4a04, 0x90, 0x79, 0xb7, 0x47, 0x34, 0x08, 0x25, 0xe5 )
>
> +#define EFI_RNG_PROTOCOL_GUID \
> +    EFI_GUID(  0x3152bca5, 0xeade, 0x433d, 0x86, 0x2e, 0xc0, 0x1c, 0xdc, 0x29, 0x1f, 0x44 )
> +
>  typedef struct {
>         efi_guid_t guid;
>         u64 table;
> --
> 2.5.0
>



-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 00/13] arm64: implement support for KASLR
  2015-12-30 15:25 ` Ard Biesheuvel
  (?)
@ 2016-01-05 20:08   ` Kees Cook
  -1 siblings, 0 replies; 156+ messages in thread
From: Kees Cook @ 2016-01-05 20:08 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, LKML, stuart.yoder, bhupesh.sharma,
	Arnd Bergmann, Marc Zyngier, Christoffer Dall

On Wed, Dec 30, 2015 at 7:25 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> This series implements KASLR for arm64, by building the kernel as a PIE
> executable that can relocate itself at runtime, and moving it to a random
> offset in the vmalloc area. This v2 also implements physical randomization,
> i.e., it allows the kernel to deal with being loaded at any physical offset
> (modulo the required alignment), and invokes the EFI_RNG_PROTOCOL from the
> UEFI stub to obtain random bits and perform the actual randomization of the
> physical load address.

This is great! Thanks for working through all these details.

> Changes since v1/RFC:
> - This series now implements fully independent virtual and physical address
>   randomization at load time. I have recycled some patches from this series:
>   http://thread.gmane.org/gmane.linux.ports.arm.kernel/455151, and updated the
>   final UEFI stub patch to randomize the physical address as well.

I'd love to get virt/phy separated on x86. There was a series, but it
still needs more work. Anyone on the kernel-hardening list want to
take a stab at this?

> - Added a patch to deal with the way KVM on arm64 makes assumptions about the
>   relation between kernel symbols and the linear mapping (on which the HYP
>   mapping is based), as these assumptions cease to be valid once we move the
>   kernel Image out of the linear mapping.
> - Updated the module PLT patch so it works on BE kernels as well.
> - Moved the constant Image header values to head.S, and updated the linker
>   script to provide the kernel size using R_AARCH64_ABS32 relocation rather
>   than a R_AARCH64_ABS64 relocation, since those are always resolved at build
>   time. This allows me to get rid of the post-build perl script to swab header
>   values on BE kernels.
> - Minor style tweaks.
>
> Notes:
> - These patches apply on top of Mark Rutland's pagetable rework series:
>   http://thread.gmane.org/gmane.linux.ports.arm.kernel/462438
> - The arm64 Image is uncompressed by default, and the Elf64_Rela format uses
>   24 bytes per relocation entry. This results in considerable bloat (i.e., a
>   couple of MBs worth of relocation data in an .init section). However, no
>   build time postprocessing is required, we rely fully on the toolchain to
>   produce the image
> - We have to rely on the bootloader to supply some randomness in register x1
>   upon kernel entry. Since we have no decompressor, it is simply not feasible
>   to collect randomness in the head.S code path before mapping the kernel and
>   enabling the MMU.
> - The EFI_RNG_PROTOCOL that is invoked in patch #13 to supply randomness on
>   UEFI systems is not universally available. A QEMU/KVM firmware image that
>   implements a pseudo-random version is available here:
>   http://people.linaro.org/~ard.biesheuvel/QEMU_EFI.fd.aarch64-rng.bz2
>   (requires access to PMCCNTR_EL0 and support for AES instructions)
>   See below for instructions how to run the pseudo-random version on real
>   hardware.
> - Only mildly tested. Help appreciated.
>
> Code can be found here:
> git://git.linaro.org/people/ard.biesheuvel/linux-arm.git arm64-kaslr-v2
> https://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-kaslr-v2
>
> Patch #1 updates the OF code to allow the minimum memblock physical address to
> be overridden by the arch.
>
> Patch #2 introduces KIMAGE_VADDR as the base of the kernel virtual region.
>
> Patch #3 memblock_reserve()'s the .bss, swapper_pg_dir and idmap_pg_dir
> individually.
>
> Patch #4 rewrites early_fixmap_init() so it does not rely on the linear mapping
> (i.e., the use of phys_to_virt() is avoided)
>
> Patch #5 updates KVM on arm64 so it can deal with kernel symbols whose addresses
> are not covered by the linear mapping.
>
> Patch #6 moves the kernel virtual mapping to the vmalloc area, along with the
> module region which is kept right below it, as before.
>
> Patch #7 adds support for PLTs in modules so that relative branches can be
> resolved via a PLT if the target is out of range.
>
> Patch #8 moves to the x86 version of the extable implementation so that it no
> longer contains absolute addresses that require fixing up at relocation time,
> but uses relative offsets instead.
>
> Patch #9 reverts some changes to the Image header population code so we no
> longer depend on the linker to populate the header fields. This is necessary
> since the R_AARCH64_ABS relocations that are emitted for these fields are not
> resolved at build time for PIE executables.
>
> Patch #10 updates the code in head.S that needs to execute before relocation to
> avoid the use of values that are subject to dynamic relocation. These values
> will not be populated in PIE executables.
>
> Patch #11 allows the kernel Image to be loaded anywhere in physical memory, by
> decoupling PHYS_OFFSET from the base of the kernel image.
>
> Patch #12 implements the core KASLR, by taking randomness supplied in register x1
> and using it to move the kernel inside the vmalloc area.
>
> Patch #13 adds an invocation of the EFI_RNG_PROTOCOL to supply randomness to the
> kernel proper.

I see a few other things that we'll probably want to add:

- kaslr/nokaslr command line (to either ignore boot loader hint or UEFI rng)

- randomization of module load address (see get_module_load_offset in
arch/x86/kernel/module.c)

- panic reporting of offset (see register_kernel_offset_dumper in
arch/x86/kernel/setup.c; a rough arm64 sketch follows below)

- vmcoreinfo reporting of offset (though I can't find vmcoreinfo on
arm64, so maybe not, as kexec appears unimplemented)
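
As a rough illustration of the panic reporting item above, an arm64
analogue of x86's register_kernel_offset_dumper() might look like the
sketch below. The use of _text and KIMAGE_VADDR to derive the offset is
an assumption for this sketch, not code from this series:

/* Sketch only: needs <linux/init.h>, <linux/kernel.h>, <linux/notifier.h>,
 * <asm/memory.h> and <asm/sections.h>.
 */
static int dump_kernel_offset(struct notifier_block *self, unsigned long v,
                              void *p)
{
        /* offset of the running image relative to its default link address */
        pr_emerg("Kernel Offset: 0x%llx from 0x%llx\n",
                 (u64)_text - KIMAGE_VADDR, (u64)KIMAGE_VADDR);
        return 0;
}

static struct notifier_block kernel_offset_notifier = {
        .notifier_call = dump_kernel_offset,
};

static int __init register_kernel_offset_dumper(void)
{
        atomic_notifier_chain_register(&panic_notifier_list,
                                       &kernel_offset_notifier);
        return 0;
}
__initcall(register_kernel_offset_dumper);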

> Ard Biesheuvel (13):
>   of/fdt: make memblock minimum physical address arch configurable
>   arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
>   arm64: use more granular reservations for static page table
>     allocations
>   arm64: decouple early fixmap init from linear mapping
>   arm64: kvm: deal with kernel symbols outside of linear mapping
>   arm64: move kernel image to base of vmalloc area
>   arm64: add support for module PLTs
>   arm64: use relative references in exception tables
>   arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
>   arm64: avoid dynamic relocations in early boot code
>   arm64: allow kernel Image to be loaded anywhere in physical memory
>   arm64: add support for relocatable kernel
>   arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
>
>  Documentation/arm64/booting.txt           |  15 ++-
>  arch/arm/include/asm/kvm_asm.h            |   2 +
>  arch/arm/include/asm/kvm_mmu.h            |   2 +
>  arch/arm/kvm/arm.c                        |   9 +-
>  arch/arm/kvm/mmu.c                        |  12 +-
>  arch/arm64/Kconfig                        |  18 +++
>  arch/arm64/Makefile                       |  10 +-
>  arch/arm64/include/asm/assembler.h        |  17 ++-
>  arch/arm64/include/asm/boot.h             |   5 +
>  arch/arm64/include/asm/compiler.h         |   2 +
>  arch/arm64/include/asm/futex.h            |   4 +-
>  arch/arm64/include/asm/kasan.h            |  17 +--
>  arch/arm64/include/asm/kernel-pgtable.h   |   5 +-
>  arch/arm64/include/asm/kvm_asm.h          |  21 +--
>  arch/arm64/include/asm/kvm_mmu.h          |   2 +
>  arch/arm64/include/asm/memory.h           |  37 ++++--
>  arch/arm64/include/asm/module.h           |  11 ++
>  arch/arm64/include/asm/pgtable.h          |   7 -
>  arch/arm64/include/asm/uaccess.h          |  16 +--
>  arch/arm64/include/asm/virt.h             |   4 -
>  arch/arm64/kernel/Makefile                |   1 +
>  arch/arm64/kernel/armv8_deprecated.c      |   4 +-
>  arch/arm64/kernel/efi-entry.S             |   9 +-
>  arch/arm64/kernel/head.S                  | 133 ++++++++++++++++---
>  arch/arm64/kernel/image.h                 |  37 ++----
>  arch/arm64/kernel/module-plts.c           | 137 ++++++++++++++++++++
>  arch/arm64/kernel/module.c                |   7 +
>  arch/arm64/kernel/module.lds              |   4 +
>  arch/arm64/kernel/setup.c                 |  15 ++-
>  arch/arm64/kernel/vmlinux.lds.S           |  29 +++--
>  arch/arm64/kvm/debug.c                    |   4 +-
>  arch/arm64/mm/dump.c                      |  12 +-
>  arch/arm64/mm/extable.c                   | 102 ++++++++++++++-
>  arch/arm64/mm/init.c                      |  75 +++++++++--
>  arch/arm64/mm/mmu.c                       | 132 +++++++------------
>  drivers/firmware/efi/libstub/arm-stub.c   |   1 -
>  drivers/firmware/efi/libstub/arm64-stub.c | 134 ++++++++++++++++---
>  drivers/of/fdt.c                          |   5 +-
>  include/linux/efi.h                       |   5 +-
>  scripts/sortextable.c                     |   6 +-
>  virt/kvm/arm/vgic-v3.c                    |   2 +-
>  41 files changed, 813 insertions(+), 257 deletions(-)
>  create mode 100644 arch/arm64/kernel/module-plts.c
>  create mode 100644 arch/arm64/kernel/module.lds
>
>
> EFI_RNG_PROTOCOL on real hardware
> =================================
>
> To test whether your UEFI implements the EFI_RNG_PROTOCOL, download the
> following executable and run it from the UEFI Shell:
> http://people.linaro.org/~ard.biesheuvel/RngTest.efi
>
> FS0:\> rngtest
> UEFI RNG Protocol Testing :
> ----------------------------
>  -- Locate UEFI RNG Protocol : [Fail - Status = Not Found]
>
> If your UEFI does not implement the EFI_RNG_PROTOCOL, you can download and
> install the pseudo-random version that uses the generic timer and PMCCNTR_EL0
> values and permutes them using a couple of rounds of AES.
> http://people.linaro.org/~ard.biesheuvel/RngDxe.efi
>
> NOTE: not for production!! This is a quick and dirty hack to test the KASLR
> code, and is not suitable for anything else.
>
> FS0:\> rngdxe
> FS0:\> rngtest
> UEFI RNG Protocol Testing :
> ----------------------------
>  -- Locate UEFI RNG Protocol : [Pass]
>  -- Call RNG->GetInfo() interface :
>      >> Supported RNG Algorithm (Count = 2) :
>           0) 44F0DE6E-4D8C-4045-A8C7-4DD168856B9E
>           1) E43176D7-B6E8-4827-B784-7FFDC4B68561
>  -- Call RNG->GetRNG() interface :
>      >> RNG with default algorithm : [Pass]
>      >> RNG with SP800-90-HMAC-256 : [Fail - Status = Unsupported]
>      >> RNG with SP800-90-Hash-256 : [Fail - Status = Unsupported]
>      >> RNG with SP800-90-CTR-256 : [Pass]
>      >> RNG with X9.31-3DES : [Fail - Status = Unsupported]
>      >> RNG with X9.31-AES : [Fail - Status = Unsupported]
>      >> RNG with RAW Entropy : [Pass]
>  -- Random Number Generation Test with default RNG Algorithm (20 Rounds):
>           01) - 27
>           02) - 61E8
>           03) - 496FD8
>           04) - DDD793BF
>           05) - B6C37C8E23
>           06) - 4D183C604A96
>           07) - 9363311DB61298
>           08) - 5715A7294F4E436E
>           09) - F0D4D7BAA0DD52318E
>           10) - C88C6EBCF4C0474D87C3
>           11) - B5594602B482A643932172
>           12) - CA7573F704B2089B726B9CF1
>           13) - A93E9451CB533DCFBA87B97C33
>           14) - 45AA7B83DB6044F7BBAB031F0D24
>           15) - 3DD7A4D61F34ADCB400B5976730DCF
>           16) - 4DD168D21FAB8F59708330D6A9BEB021
>           17) - 4BBB225E61C465F174254159467E65939F
>           18) - 030A156C9616337A20070941E702827DA8E1
>           19) - AB0FC11C9A4E225011382A9D164D9D55CA2B64
>           20) - 72B9B4735DC445E5DA6AF88DE965B7E87CB9A23C
>

Have you done any repeated boot testing? When I originally did x86
kASLR, I had a machine rebooting over and over spitting the _text line
from /proc/kallsyms to the console. This both caught page table corner
cases where the system was unbootable and let me run a statistical
analysis of the offsets, just to make sure there wasn't any glaring
error in either the RNG or the relocation.

Very cool!

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 00/13] arm64: implement support for KASLR
  2016-01-05 20:08   ` Kees Cook
  (?)
@ 2016-01-05 21:24     ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-05 21:24 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, LKML, Stuart Yoder, Sharma Bhupesh,
	Arnd Bergmann, Marc Zyngier, Christoffer Dall

On 5 January 2016 at 21:08, Kees Cook <keescook@chromium.org> wrote:
> On Wed, Dec 30, 2015 at 7:25 AM, Ard Biesheuvel
> <ard.biesheuvel@linaro.org> wrote:
>> This series implements KASLR for arm64, by building the kernel as a PIE
>> executable that can relocate itself at runtime, and moving it to a random
>> offset in the vmalloc area. This v2 also implements physical randomization,
>> i.e., it allows the kernel to deal with being loaded at any physical offset
>> (modulo the required alignment), and invokes the EFI_RNG_PROTOCOL from the
>> UEFI stub to obtain random bits and perform the actual randomization of the
>> physical load address.
>
> This is great! Thanks for working through all these details.
>
>> Changes since v1/RFC:
>> - This series now implements fully independent virtual and physical address
>>   randomization at load time. I have recycled some patches from this series:
>>   http://thread.gmane.org/gmane.linux.ports.arm.kernel/455151, and updated the
>>   final UEFI stub patch to randomize the physical address as well.
>
> I'd love to get virt/phy separated on x86. There was a series, but it
> still needs more work. Anyone on the kernel-hardening list want to
> take a stab at this?
>
>> - Added a patch to deal with the way KVM on arm64 makes assumptions about the
>>   relation between kernel symbols and the linear mapping (on which the HYP
>>   mapping is based), as these assumptions cease to be valid once we move the
>>   kernel Image out of the linear mapping.
>> - Updated the module PLT patch so it works on BE kernels as well.
>> - Moved the constant Image header values to head.S, and updated the linker
>>   script to provide the kernel size using R_AARCH64_ABS32 relocation rather
>>   than a R_AARCH64_ABS64 relocation, since those are always resolved at build
>>   time. This allows me to get rid of the post-build perl script to swab header
>>   values on BE kernels.
>> - Minor style tweaks.
>>
>> Notes:
>> - These patches apply on top of Mark Rutland's pagetable rework series:
>>   http://thread.gmane.org/gmane.linux.ports.arm.kernel/462438
>> - The arm64 Image is uncompressed by default, and the Elf64_Rela format uses
>>   24 bytes per relocation entry. This results in considerable bloat (i.e., a
>>   couple of MBs worth of relocation data in an .init section). However, no
>>   build time postprocessing is required, we rely fully on the toolchain to
>>   produce the image
>> - We have to rely on the bootloader to supply some randomness in register x1
>>   upon kernel entry. Since we have no decompressor, it is simply not feasible
>>   to collect randomness in the head.S code path before mapping the kernel and
>>   enabling the MMU.
>> - The EFI_RNG_PROTOCOL that is invoked in patch #13 to supply randomness on
>>   UEFI systems is not universally available. A QEMU/KVM firmware image that
>>   implements a pseudo-random version is available here:
>>   http://people.linaro.org/~ard.biesheuvel/QEMU_EFI.fd.aarch64-rng.bz2
>>   (requires access to PMCCNTR_EL0 and support for AES instructions)
>>   See below for instructions how to run the pseudo-random version on real
>>   hardware.
>> - Only mildly tested. Help appreciated.
>>
>> Code can be found here:
>> git://git.linaro.org/people/ard.biesheuvel/linux-arm.git arm64-kaslr-v2
>> https://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-kaslr-v2
>>
>> Patch #1 updates the OF code to allow the minimum memblock physical address to
>> be overridden by the arch.
>>
>> Patch #2 introduces KIMAGE_VADDR as the base of the kernel virtual region.
>>
>> Patch #3 memblock_reserve()'s the .bss, swapper_pg_dir and idmap_pg_dir
>> individually.
>>
>> Patch #4 rewrites early_fixmap_init() so it does not rely on the linear mapping
>> (i.e., the use of phys_to_virt() is avoided)
>>
>> Patch #5 updates KVM on arm64 so it can deal with kernel symbols whose addresses
>> are not covered by the linear mapping.
>>
>> Patch #6 moves the kernel virtual mapping to the vmalloc area, along with the
>> module region which is kept right below it, as before.
>>
>> Patch #7 adds support for PLTs in modules so that relative branches can be
>> resolved via a PLT if the target is out of range.
>>
>> Patch #8 moves to the x86 version of the extable implementation so that it no
>> longer contains absolute addresses that require fixing up at relocation time,
>> but uses relative offsets instead.
>>
>> Patch #9 reverts some changes to the Image header population code so we no
>> longer depend on the linker to populate the header fields. This is necessary
>> since the R_AARCH64_ABS relocations that are emitted for these fields are not
>> resolved at build time for PIE executables.
>>
>> Patch #10 updates the code in head.S that needs to execute before relocation to
>> avoid the use of values that are subject to dynamic relocation. These values
>> will not be populated in PIE executables.
>>
>> Patch #11 allows the kernel Image to be loaded anywhere in physical memory, by
>> decoupling PHYS_OFFSET from the base of the kernel image.
>>
>> Patch #12 implements the core KASLR, by taking randomness supplied in register x1
>> and using it to move the kernel inside the vmalloc area.
>>
>> Patch #13 adds an invocation of the EFI_RNG_PROTOCOL to supply randomness to the
>> kernel proper.
>
> I see a few other things that we'll probably want to add:
>
> - kaslr/nokaslr command line (to either ignore boot loader hint or UEFI rng)
>

Yes, that makes sense. For the UEFI stub version of the randomization,
that is trivially achievable, since we already parse the command line
in that context. For the bare bootloader case, that is a bit more
involved, since we'd need to parse the FDT before applying the
relocations, or apply the relocations twice.
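
For the UEFI stub case, the check could be as small as the sketch below
(assuming the command line string is already available in this code path,
and that a strstr()-style helper is usable in the stub):

/* Sketch only: gate the EFI_RNG_PROTOCOL call on "nokaslr". */
static bool cmdline_has_nokaslr(const char *cmdline)
{
        return cmdline && strstr(cmdline, "nokaslr") != NULL;
}

In handle_kernel_image(), the existing
IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) condition would then also
test !cmdline_has_nokaslr(cmdline) before calling efi_get_random_bytes().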

> - randomization of module load address (see get_module_load_offset in
> arch/x86/kernel/module.c)
>
> - panic reporting of offset (see register_kernel_offset_dumper in
> arch/x86/kernel/setup.c)
>
> - vmcoreinfo reporting of offset (though I can't find vmcoreinfo on
> arm64, so maybe not, as kexec appears unimplemented)
>

I will look into all of these. kexec support has been in flight for a
while now, but I have no idea when it is expected to land.

>> Ard Biesheuvel (13):
>>   of/fdt: make memblock minimum physical address arch configurable
>>   arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
>>   arm64: use more granular reservations for static page table
>>     allocations
>>   arm64: decouple early fixmap init from linear mapping
>>   arm64: kvm: deal with kernel symbols outside of linear mapping
>>   arm64: move kernel image to base of vmalloc area
>>   arm64: add support for module PLTs
>>   arm64: use relative references in exception tables
>>   arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
>>   arm64: avoid dynamic relocations in early boot code
>>   arm64: allow kernel Image to be loaded anywhere in physical memory
>>   arm64: add support for relocatable kernel
>>   arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
>>
>>  Documentation/arm64/booting.txt           |  15 ++-
>>  arch/arm/include/asm/kvm_asm.h            |   2 +
>>  arch/arm/include/asm/kvm_mmu.h            |   2 +
>>  arch/arm/kvm/arm.c                        |   9 +-
>>  arch/arm/kvm/mmu.c                        |  12 +-
>>  arch/arm64/Kconfig                        |  18 +++
>>  arch/arm64/Makefile                       |  10 +-
>>  arch/arm64/include/asm/assembler.h        |  17 ++-
>>  arch/arm64/include/asm/boot.h             |   5 +
>>  arch/arm64/include/asm/compiler.h         |   2 +
>>  arch/arm64/include/asm/futex.h            |   4 +-
>>  arch/arm64/include/asm/kasan.h            |  17 +--
>>  arch/arm64/include/asm/kernel-pgtable.h   |   5 +-
>>  arch/arm64/include/asm/kvm_asm.h          |  21 +--
>>  arch/arm64/include/asm/kvm_mmu.h          |   2 +
>>  arch/arm64/include/asm/memory.h           |  37 ++++--
>>  arch/arm64/include/asm/module.h           |  11 ++
>>  arch/arm64/include/asm/pgtable.h          |   7 -
>>  arch/arm64/include/asm/uaccess.h          |  16 +--
>>  arch/arm64/include/asm/virt.h             |   4 -
>>  arch/arm64/kernel/Makefile                |   1 +
>>  arch/arm64/kernel/armv8_deprecated.c      |   4 +-
>>  arch/arm64/kernel/efi-entry.S             |   9 +-
>>  arch/arm64/kernel/head.S                  | 133 ++++++++++++++++---
>>  arch/arm64/kernel/image.h                 |  37 ++----
>>  arch/arm64/kernel/module-plts.c           | 137 ++++++++++++++++++++
>>  arch/arm64/kernel/module.c                |   7 +
>>  arch/arm64/kernel/module.lds              |   4 +
>>  arch/arm64/kernel/setup.c                 |  15 ++-
>>  arch/arm64/kernel/vmlinux.lds.S           |  29 +++--
>>  arch/arm64/kvm/debug.c                    |   4 +-
>>  arch/arm64/mm/dump.c                      |  12 +-
>>  arch/arm64/mm/extable.c                   | 102 ++++++++++++++-
>>  arch/arm64/mm/init.c                      |  75 +++++++++--
>>  arch/arm64/mm/mmu.c                       | 132 +++++++------------
>>  drivers/firmware/efi/libstub/arm-stub.c   |   1 -
>>  drivers/firmware/efi/libstub/arm64-stub.c | 134 ++++++++++++++++---
>>  drivers/of/fdt.c                          |   5 +-
>>  include/linux/efi.h                       |   5 +-
>>  scripts/sortextable.c                     |   6 +-
>>  virt/kvm/arm/vgic-v3.c                    |   2 +-
>>  41 files changed, 813 insertions(+), 257 deletions(-)
>>  create mode 100644 arch/arm64/kernel/module-plts.c
>>  create mode 100644 arch/arm64/kernel/module.lds
>>
>>
>> EFI_RNG_PROTOCOL on real hardware
>> =================================
>>
>> To test whether your UEFI implements the EFI_RNG_PROTOCOL, download the
>> following executable and run it from the UEFI Shell:
>> http://people.linaro.org/~ard.biesheuvel/RngTest.efi
>>
>> FS0:\> rngtest
>> UEFI RNG Protocol Testing :
>> ----------------------------
>>  -- Locate UEFI RNG Protocol : [Fail - Status = Not Found]
>>
>> If your UEFI does not implement the EFI_RNG_PROTOCOL, you can download and
>> install the pseudo-random version that uses the generic timer and PMCCNTR_EL0
>> values and permutes them using a couple of rounds of AES.
>> http://people.linaro.org/~ard.biesheuvel/RngDxe.efi
>>
>> NOTE: not for production!! This is a quick and dirty hack to test the KASLR
>> code, and is not suitable for anything else.
>>
>> FS0:\> rngdxe
>> FS0:\> rngtest
>> UEFI RNG Protocol Testing :
>> ----------------------------
>>  -- Locate UEFI RNG Protocol : [Pass]
>>  -- Call RNG->GetInfo() interface :
>>      >> Supported RNG Algorithm (Count = 2) :
>>           0) 44F0DE6E-4D8C-4045-A8C7-4DD168856B9E
>>           1) E43176D7-B6E8-4827-B784-7FFDC4B68561
>>  -- Call RNG->GetRNG() interface :
>>      >> RNG with default algorithm : [Pass]
>>      >> RNG with SP800-90-HMAC-256 : [Fail - Status = Unsupported]
>>      >> RNG with SP800-90-Hash-256 : [Fail - Status = Unsupported]
>>      >> RNG with SP800-90-CTR-256 : [Pass]
>>      >> RNG with X9.31-3DES : [Fail - Status = Unsupported]
>>      >> RNG with X9.31-AES : [Fail - Status = Unsupported]
>>      >> RNG with RAW Entropy : [Pass]
>>  -- Random Number Generation Test with default RNG Algorithm (20 Rounds):
>>           01) - 27
>>           02) - 61E8
>>           03) - 496FD8
>>           04) - DDD793BF
>>           05) - B6C37C8E23
>>           06) - 4D183C604A96
>>           07) - 9363311DB61298
>>           08) - 5715A7294F4E436E
>>           09) - F0D4D7BAA0DD52318E
>>           10) - C88C6EBCF4C0474D87C3
>>           11) - B5594602B482A643932172
>>           12) - CA7573F704B2089B726B9CF1
>>           13) - A93E9451CB533DCFBA87B97C33
>>           14) - 45AA7B83DB6044F7BBAB031F0D24
>>           15) - 3DD7A4D61F34ADCB400B5976730DCF
>>           16) - 4DD168D21FAB8F59708330D6A9BEB021
>>           17) - 4BBB225E61C465F174254159467E65939F
>>           18) - 030A156C9616337A20070941E702827DA8E1
>>           19) - AB0FC11C9A4E225011382A9D164D9D55CA2B64
>>           20) - 72B9B4735DC445E5DA6AF88DE965B7E87CB9A23C
>>
>
> Have you done any repeated boot testing?

Not automatically, no.

> When I originally did x86
> kASLR, I had a machine rebooting over and over spitting the _text line
> from /proc/kallsyms to the console. This both caught page table corner
> cases where the system was unbootable and let me run a statistical
> analysis of the offsets, just to make sure there wasn't any glaring
> error in either the RNG or the relocation.
>

Well, I conveniently punted the RNG problem to the firmware, so as far
as the quality of the randomness is concerned, it is easy to shift the
blame. For the other stuff (including the quality of the translation
between a random number and a KASLR offset), I would highly appreciate your
input.
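
As one reference point for that translation, a uniform mapping from a
seed to a 2 MB aligned virtual offset can be as simple as the sketch
below (illustrative only, not the code from patch #12; it assumes the
randomization window is a power-of-two number of 2 MB granules, so the
modulo stays unbiased):

/* Sketch only: pick a 2 MB aligned offset inside a 'window'-byte range. */
static u64 seed_to_kaslr_offset(u64 seed, u64 window)
{
        u64 slots = window >> 21;       /* number of 2 MB granules */

        return (seed % slots) << 21;
}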

> Very cool!
>

Thanks!

-- 
Ard.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [PATCH v2 00/13] arm64: implement support for KASLR
@ 2016-01-05 21:24     ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-05 21:24 UTC (permalink / raw)
  To: linux-arm-kernel

On 5 January 2016 at 21:08, Kees Cook <keescook@chromium.org> wrote:
> On Wed, Dec 30, 2015 at 7:25 AM, Ard Biesheuvel
> <ard.biesheuvel@linaro.org> wrote:
>> This series implements KASLR for arm64, by building the kernel as a PIE
>> executable that can relocate itself at runtime, and moving it to a random
>> offset in the vmalloc area. This v2 also implements physical randomization,
>> i.e., it allows the kernel to deal with being loaded at any physical offset
>> (modulo the required alignment), and invokes the EFI_RNG_PROTOCOL from the
>> UEFI stub to obtain random bits and perform the actual randomization of the
>> physical load address.
>
> This is great! Thanks for working through all these details.
>
>> Changes since v1/RFC:
>> - This series now implements fully independent virtual and physical address
>>   randomization at load time. I have recycled some patches from this series:
>>   http://thread.gmane.org/gmane.linux.ports.arm.kernel/455151, and updated the
>>   final UEFI stub patch to randomize the physical address as well.
>
> I'd love to get virt/phy separated on x86. There was a series, but it
> still needs more work. Anyone on the kernel-hardening list want to
> take a stab at this?
>
>> - Added a patch to deal with the way KVM on arm64 makes assumptions about the
>>   relation between kernel symbols and the linear mapping (on which the HYP
>>   mapping is based), as these assumptions cease to be valid once we move the
>>   kernel Image out of the linear mapping.
>> - Updated the module PLT patch so it works on BE kernels as well.
>> - Moved the constant Image header values to head.S, and updated the linker
>>   script to provide the kernel size using R_AARCH64_ABS32 relocation rather
>>   than a R_AARCH64_ABS64 relocation, since those are always resolved at build
>>   time. This allows me to get rid of the post-build perl script to swab header
>>   values on BE kernels.
>> - Minor style tweaks.
>>
>> Notes:
>> - These patches apply on top of Mark Rutland's pagetable rework series:
>>   http://thread.gmane.org/gmane.linux.ports.arm.kernel/462438
>> - The arm64 Image is uncompressed by default, and the Elf64_Rela format uses
>>   24 bytes per relocation entry. This results in considerable bloat (i.e., a
>>   couple of MBs worth of relocation data in an .init section). However, no
>>   build time postprocessing is required, we rely fully on the toolchain to
>>   produce the image
>> - We have to rely on the bootloader to supply some randomness in register x1
>>   upon kernel entry. Since we have no decompressor, it is simply not feasible
>>   to collect randomness in the head.S code path before mapping the kernel and
>>   enabling the MMU.
>> - The EFI_RNG_PROTOCOL that is invoked in patch #13 to supply randomness on
>>   UEFI systems is not universally available. A QEMU/KVM firmware image that
>>   implements a pseudo-random version is available here:
>>   http://people.linaro.org/~ard.biesheuvel/QEMU_EFI.fd.aarch64-rng.bz2
>>   (requires access to PMCCNTR_EL0 and support for AES instructions)
>>   See below for instructions how to run the pseudo-random version on real
>>   hardware.
>> - Only mildly tested. Help appreciated.
>>
>> Code can be found here:
>> git://git.linaro.org/people/ard.biesheuvel/linux-arm.git arm64-kaslr-v2
>> https://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-kaslr-v2
>>
>> Patch #1 updates the OF code to allow the minimum memblock physical address to
>> be overridden by the arch.
>>
>> Patch #2 introduces KIMAGE_VADDR as the base of the kernel virtual region.
>>
>> Patch #3 memblock_reserve()'s the .bss, swapper_pg_dir and idmap_pg_dir
>> individually.
>>
>> Patch #4 rewrites early_fixmap_init() so it does not rely on the linear mapping
>> (i.e., the use of phys_to_virt() is avoided)
>>
>> Patch #5 updates KVM on arm64 so it can deal with kernel symbols whose addresses
>> are not covered by the linear mapping.
>>
>> Patch #6 moves the kernel virtual mapping to the vmalloc area, along with the
>> module region which is kept right below it, as before.
>>
>> Patch #7 adds support for PLTs in modules so that relative branches can be
>> resolved via a PLT if the target is out of range.
>>
>> Patch #8 moves to the x86 version of the extable implementation so that it no
>> longer contains absolute addresses that require fixing up at relocation time,
>> but uses relative offsets instead.
>>
>> Patch #9 reverts some changes to the Image header population code so we no
>> longer depend on the linker to populate the header fields. This is necessary
>> since the R_AARCH64_ABS relocations that are emitted for these fields are not
>> resolved at build time for PIE executables.
>>
>> Patch #10 updates the code in head.S that needs to execute before relocation to
>> avoid the use of values that are subject to dynamic relocation. These values
>> will not be populated in PIE executables.
>>
>> Patch #11 allows the kernel Image to be loaded anywhere in physical memory, by
>> decoupling PHYS_OFFSET from the base of the kernel image.
>>
>> Patch #12 implements the core KASLR, by taking randomness supplied in register x1
>> and using it to move the kernel inside the vmalloc area.
>>
>> Patch #13 adds an invocation of the EFI_RNG_PROTOCOL to supply randomness to the
>> kernel proper.
>
> I see a few other things that we'll probably want to add:
>
> - kaslr/nokaslr command line (to either ignore boot loader hint or UEFI rng)
>

Yes, that makes sense. For the UEFI stub version of the randomization,
that is trivially achievable, since we already parse the command line
in that context. For the bare bootloader case, that is a bit more
involved, since we'd need to parse the FDT before applying the
relocations, or apply the relocations twice.
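
Roughly, I imagine the bare-bootloader path would need something like
the sketch below (untested and purely illustrative; the function name
is made up, and it glosses over the fact that C code running this
early must itself avoid statically initialised pointers that still
need relocating):

    #include <linux/init.h>
    #include <linux/libfdt.h>
    #include <linux/string.h>
    #include <linux/types.h>

    /* Illustrative only: scan /chosen/bootargs in the DT passed in x0 */
    static bool __init early_fdt_has_nokaslr(const void *fdt)
    {
            const char *cmdline;
            int node;

            if (fdt_check_header(fdt))
                    return false;

            node = fdt_path_offset(fdt, "/chosen");
            if (node < 0)
                    return false;

            cmdline = fdt_getprop(fdt, node, "bootargs", NULL);
            if (!cmdline)
                    return false;

            return strstr(cmdline, "nokaslr") != NULL;
    }

If that returns true, we would simply force the displacement in x23 to
zero before creating the page tables.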

> - randomization of module load address (see get_module_load_offset in
> arch/x86/kernel/module.c)
>
> - panic reporting of offset (see register_kernel_offset_dumper in
> arch/x86/kernel/setup.c)
>
> - vmcoreinfo reporting of offset (though I can't find vmcoreinfo on
> arm64, so maybe not, as kexec appears unimplemented)
>

I will look into all of these. kexec support has been in flight for a
while now, but I have no idea when it is expected to land.
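
For the panic reporting of the offset, I would expect something very
close to the x86 approach to work; a rough, untested sketch (the
kimage_vaddr/KIMAGE_VADDR names follow this series, everything else is
illustrative):

    #include <linux/init.h>
    #include <linux/kernel.h>
    #include <linux/notifier.h>
    #include <asm/memory.h>

    /* Dump the KASLR displacement when the kernel panics */
    static int dump_kernel_offset(struct notifier_block *self,
                                  unsigned long v, void *p)
    {
            pr_emerg("Kernel Offset: 0x%llx from 0x%lx\n",
                     kimage_vaddr - KIMAGE_VADDR,
                     (unsigned long)KIMAGE_VADDR);
            return 0;
    }

    static struct notifier_block kernel_offset_notifier = {
            .notifier_call = dump_kernel_offset
    };

    static int __init register_kernel_offset_dumper(void)
    {
            atomic_notifier_chain_register(&panic_notifier_list,
                                           &kernel_offset_notifier);
            return 0;
    }
    __initcall(register_kernel_offset_dumper);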

>> Ard Biesheuvel (13):
>>   of/fdt: make memblock minimum physical address arch configurable
>>   arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
>>   arm64: use more granular reservations for static page table
>>     allocations
>>   arm64: decouple early fixmap init from linear mapping
>>   arm64: kvm: deal with kernel symbols outside of linear mapping
>>   arm64: move kernel image to base of vmalloc area
>>   arm64: add support for module PLTs
>>   arm64: use relative references in exception tables
>>   arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
>>   arm64: avoid dynamic relocations in early boot code
>>   arm64: allow kernel Image to be loaded anywhere in physical memory
>>   arm64: add support for relocatable kernel
>>   arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
>>
>>  Documentation/arm64/booting.txt           |  15 ++-
>>  arch/arm/include/asm/kvm_asm.h            |   2 +
>>  arch/arm/include/asm/kvm_mmu.h            |   2 +
>>  arch/arm/kvm/arm.c                        |   9 +-
>>  arch/arm/kvm/mmu.c                        |  12 +-
>>  arch/arm64/Kconfig                        |  18 +++
>>  arch/arm64/Makefile                       |  10 +-
>>  arch/arm64/include/asm/assembler.h        |  17 ++-
>>  arch/arm64/include/asm/boot.h             |   5 +
>>  arch/arm64/include/asm/compiler.h         |   2 +
>>  arch/arm64/include/asm/futex.h            |   4 +-
>>  arch/arm64/include/asm/kasan.h            |  17 +--
>>  arch/arm64/include/asm/kernel-pgtable.h   |   5 +-
>>  arch/arm64/include/asm/kvm_asm.h          |  21 +--
>>  arch/arm64/include/asm/kvm_mmu.h          |   2 +
>>  arch/arm64/include/asm/memory.h           |  37 ++++--
>>  arch/arm64/include/asm/module.h           |  11 ++
>>  arch/arm64/include/asm/pgtable.h          |   7 -
>>  arch/arm64/include/asm/uaccess.h          |  16 +--
>>  arch/arm64/include/asm/virt.h             |   4 -
>>  arch/arm64/kernel/Makefile                |   1 +
>>  arch/arm64/kernel/armv8_deprecated.c      |   4 +-
>>  arch/arm64/kernel/efi-entry.S             |   9 +-
>>  arch/arm64/kernel/head.S                  | 133 ++++++++++++++++---
>>  arch/arm64/kernel/image.h                 |  37 ++----
>>  arch/arm64/kernel/module-plts.c           | 137 ++++++++++++++++++++
>>  arch/arm64/kernel/module.c                |   7 +
>>  arch/arm64/kernel/module.lds              |   4 +
>>  arch/arm64/kernel/setup.c                 |  15 ++-
>>  arch/arm64/kernel/vmlinux.lds.S           |  29 +++--
>>  arch/arm64/kvm/debug.c                    |   4 +-
>>  arch/arm64/mm/dump.c                      |  12 +-
>>  arch/arm64/mm/extable.c                   | 102 ++++++++++++++-
>>  arch/arm64/mm/init.c                      |  75 +++++++++--
>>  arch/arm64/mm/mmu.c                       | 132 +++++++------------
>>  drivers/firmware/efi/libstub/arm-stub.c   |   1 -
>>  drivers/firmware/efi/libstub/arm64-stub.c | 134 ++++++++++++++++---
>>  drivers/of/fdt.c                          |   5 +-
>>  include/linux/efi.h                       |   5 +-
>>  scripts/sortextable.c                     |   6 +-
>>  virt/kvm/arm/vgic-v3.c                    |   2 +-
>>  41 files changed, 813 insertions(+), 257 deletions(-)
>>  create mode 100644 arch/arm64/kernel/module-plts.c
>>  create mode 100644 arch/arm64/kernel/module.lds
>>
>>
>> EFI_RNG_PROTOCOL on real hardware
>> =================================
>>
>> To test whether your UEFI implements the EFI_RNG_PROTOCOL, download the
>> following executable and run it from the UEFI Shell:
>> http://people.linaro.org/~ard.biesheuvel/RngTest.efi
>>
>> FS0:\> rngtest
>> UEFI RNG Protocol Testing :
>> ----------------------------
>>  -- Locate UEFI RNG Protocol : [Fail - Status = Not Found]
>>
>> If your UEFI does not implement the EFI_RNG_PROTOCOL, you can download and
>> install the pseudo-random version that uses the generic timer and PMCCNTR_EL0
>> values and permutes them using a couple of rounds of AES.
>> http://people.linaro.org/~ard.biesheuvel/RngDxe.efi
>>
>> NOTE: not for production!! This is a quick and dirty hack to test the KASLR
>> code, and is not suitable for anything else.
>>
>> FS0:\> rngdxe
>> FS0:\> rngtest
>> UEFI RNG Protocol Testing :
>> ----------------------------
>>  -- Locate UEFI RNG Protocol : [Pass]
>>  -- Call RNG->GetInfo() interface :
>>      >> Supported RNG Algorithm (Count = 2) :
>>           0) 44F0DE6E-4D8C-4045-A8C7-4DD168856B9E
>>           1) E43176D7-B6E8-4827-B784-7FFDC4B68561
>>  -- Call RNG->GetRNG() interface :
>>      >> RNG with default algorithm : [Pass]
>>      >> RNG with SP800-90-HMAC-256 : [Fail - Status = Unsupported]
>>      >> RNG with SP800-90-Hash-256 : [Fail - Status = Unsupported]
>>      >> RNG with SP800-90-CTR-256 : [Pass]
>>      >> RNG with X9.31-3DES : [Fail - Status = Unsupported]
>>      >> RNG with X9.31-AES : [Fail - Status = Unsupported]
>>      >> RNG with RAW Entropy : [Pass]
>>  -- Random Number Generation Test with default RNG Algorithm (20 Rounds):
>>           01) - 27
>>           02) - 61E8
>>           03) - 496FD8
>>           04) - DDD793BF
>>           05) - B6C37C8E23
>>           06) - 4D183C604A96
>>           07) - 9363311DB61298
>>           08) - 5715A7294F4E436E
>>           09) - F0D4D7BAA0DD52318E
>>           10) - C88C6EBCF4C0474D87C3
>>           11) - B5594602B482A643932172
>>           12) - CA7573F704B2089B726B9CF1
>>           13) - A93E9451CB533DCFBA87B97C33
>>           14) - 45AA7B83DB6044F7BBAB031F0D24
>>           15) - 3DD7A4D61F34ADCB400B5976730DCF
>>           16) - 4DD168D21FAB8F59708330D6A9BEB021
>>           17) - 4BBB225E61C465F174254159467E65939F
>>           18) - 030A156C9616337A20070941E702827DA8E1
>>           19) - AB0FC11C9A4E225011382A9D164D9D55CA2B64
>>           20) - 72B9B4735DC445E5DA6AF88DE965B7E87CB9A23C
>>
>
> Have you done any repeated boot testing?

Not automatically, no.

> When I originally did x86
> kASLR, I had a machine rebooting over and over spitting the _text line
> from /proc/kallsyms to the console. This both caught page table corner
> cases where the system was unbootable and let me run a statistical
> analysis of the offsets, just to make sure there wasn't any glaring
> error in either the RNG or the relocation.
>

Well, I conveniently punted the RNG problem to the firmware, so as far
as the quality of the randomness is concerned, it is easy to shift the
blame. For the other stuff (including the quality of the translation
between a random number and a KASLR offset), I would highly appreciate
your input.
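
For reference, that translation is nothing more than the masking done
in preserve_boot_args in patch #12, which in pseudo-C reads roughly as
follows (VA_BITS and SWAPPER_BLOCK_SHIFT as defined in asm/memory.h and
asm/kernel-pgtable.h):

    /*
     * Rough C equivalent of the x23 computation in patch #12: shift the
     * seed from x1 up so the displacement stays SWAPPER_BLOCK aligned,
     * and mask it to below 1 << (VA_BITS - 2) so the image lands in the
     * lower half of the vmalloc area.
     */
    static u64 kaslr_displacement(u64 seed)
    {
            u64 mask = (1UL << (VA_BITS - 2)) - 1;

            return (seed << SWAPPER_BLOCK_SHIFT) & mask;
    }

So any statistical anomaly in the resulting offsets would point either
at the seed itself or at a bug in the page table or relocation code.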

> Very cool!
>

Thanks!

-- 
Ard.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 12/13] arm64: add support for relocatable kernel
  2016-01-05 19:51     ` Kees Cook
  (?)
@ 2016-01-06  7:51       ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-06  7:51 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, LKML, Stuart Yoder, Sharma Bhupesh,
	Arnd Bergmann, Marc Zyngier, Christoffer Dall

On 5 January 2016 at 20:51, Kees Cook <keescook@chromium.org> wrote:
> On Wed, Dec 30, 2015 at 7:26 AM, Ard Biesheuvel
> <ard.biesheuvel@linaro.org> wrote:
>> This adds support for runtime relocation of the kernel Image, by
>> building it as a PIE (ET_DYN) executable and applying the dynamic
>> relocations in the early boot code.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  Documentation/arm64/booting.txt |  3 +-
>>  arch/arm64/Kconfig              | 13 ++++
>>  arch/arm64/Makefile             |  6 +-
>>  arch/arm64/include/asm/memory.h |  3 +
>>  arch/arm64/kernel/head.S        | 75 +++++++++++++++++++-
>>  arch/arm64/kernel/setup.c       | 22 +++---
>>  arch/arm64/kernel/vmlinux.lds.S |  9 +++
>>  scripts/sortextable.c           |  4 +-
>>  8 files changed, 117 insertions(+), 18 deletions(-)
>>
>> diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
>> index 03e02ebc1b0c..b17181eb4a43 100644
>> --- a/Documentation/arm64/booting.txt
>> +++ b/Documentation/arm64/booting.txt
>> @@ -109,7 +109,8 @@ Header notes:
>>                         1 - 4K
>>                         2 - 16K
>>                         3 - 64K
>> -  Bits 3-63:   Reserved.
>> +  Bit 3:       Relocatable kernel.
>> +  Bits 4-63:   Reserved.
>>
>>  - When image_size is zero, a bootloader should attempt to keep as much
>>    memory as possible free for use by the kernel immediately after the
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 54eeab140bca..f458fb9e0dce 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -363,6 +363,7 @@ config ARM64_ERRATUM_843419
>>         bool "Cortex-A53: 843419: A load or store might access an incorrect address"
>>         depends on MODULES
>>         default y
>> +       select ARM64_MODULE_CMODEL_LARGE
>>         help
>>           This option builds kernel modules using the large memory model in
>>           order to avoid the use of the ADRP instruction, which can cause
>> @@ -709,6 +710,18 @@ config ARM64_MODULE_PLTS
>>         bool
>>         select HAVE_MOD_ARCH_SPECIFIC
>>
>> +config ARM64_MODULE_CMODEL_LARGE
>> +       bool
>> +
>> +config ARM64_RELOCATABLE_KERNEL
>
> Should this be called "CONFIG_RELOCATABLE" instead, just to keep
> naming the same across x86, powerpc, and arm64?
>

Yes, I will change that.

>> +       bool "Kernel address space layout randomization (KASLR)"
>
> Strictly speaking, this enables KASLR, but doesn't provide it,
> correct? It still relies on the boot loader for the randomness?
>

Indeed.

>> +       select ARM64_MODULE_PLTS
>> +       select ARM64_MODULE_CMODEL_LARGE
>> +       help
>> +         This feature randomizes the virtual address of the kernel image, to
>> +         harden against exploits that rely on knowledge about the absolute
>> +         addresses of certain kernel data structures.
>> +
>>  endmenu
>>
>>  menu "Boot options"
>> diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
>> index d4654830e536..75dc477d45f5 100644
>> --- a/arch/arm64/Makefile
>> +++ b/arch/arm64/Makefile
>> @@ -15,6 +15,10 @@ CPPFLAGS_vmlinux.lds = -DTEXT_OFFSET=$(TEXT_OFFSET)
>>  OBJCOPYFLAGS   :=-O binary -R .note -R .note.gnu.build-id -R .comment -S
>>  GZFLAGS                :=-9
>>
>> +ifneq ($(CONFIG_ARM64_RELOCATABLE_KERNEL),)
>> +LDFLAGS_vmlinux                += -pie
>> +endif
>> +
>>  KBUILD_DEFCONFIG := defconfig
>>
>>  # Check for binutils support for specific extensions
>> @@ -41,7 +45,7 @@ endif
>>
>>  CHECKFLAGS     += -D__aarch64__
>>
>> -ifeq ($(CONFIG_ARM64_ERRATUM_843419), y)
>> +ifeq ($(CONFIG_ARM64_MODULE_CMODEL_LARGE), y)
>>  KBUILD_CFLAGS_MODULE   += -mcmodel=large
>>  endif
>>
>> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
>> index 557228658666..afab3e669e19 100644
>> --- a/arch/arm64/include/asm/memory.h
>> +++ b/arch/arm64/include/asm/memory.h
>> @@ -121,6 +121,9 @@ extern phys_addr_t          memstart_addr;
>>  /* PHYS_OFFSET - the physical address of the start of memory. */
>>  #define PHYS_OFFSET            ({ memstart_addr; })
>>
>> +/* the virtual base of the kernel image (minus TEXT_OFFSET) */
>> +extern u64                     kimage_vaddr;
>> +
>>  /* the offset between the kernel virtual and physical mappings */
>>  extern u64                     kimage_voffset;
>>
>> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
>> index 01a33e42ed70..ab582ee58b58 100644
>> --- a/arch/arm64/kernel/head.S
>> +++ b/arch/arm64/kernel/head.S
>> @@ -59,8 +59,15 @@
>>
>>  #define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2)
>>
>> +#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
>> +#define __HEAD_FLAG_RELOC      1
>> +#else
>> +#define __HEAD_FLAG_RELOC      0
>> +#endif
>> +
>>  #define __HEAD_FLAGS   ((__HEAD_FLAG_BE << 0) |        \
>> -                        (__HEAD_FLAG_PAGE_SIZE << 1))
>> +                        (__HEAD_FLAG_PAGE_SIZE << 1) | \
>> +                        (__HEAD_FLAG_RELOC << 3))
>>
>>  /*
>>   * Kernel startup entry point.
>> @@ -231,6 +238,9 @@ ENTRY(stext)
>>          */
>>         ldr     x27, 0f                         // address to jump to after
>>                                                 // MMU has been enabled
>> +#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
>> +       add     x27, x27, x23                   // add KASLR displacement
>> +#endif
>>         adr_l   lr, __enable_mmu                // return (PIC) address
>>         b       __cpu_setup                     // initialise processor
>>  ENDPROC(stext)
>> @@ -243,6 +253,16 @@ ENDPROC(stext)
>>  preserve_boot_args:
>>         mov     x21, x0                         // x21=FDT
>>
>> +#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
>> +       /*
>> +        * Mask off the bits of the random value supplied in x1 so it can serve
>> +        * as a KASLR displacement value which will move the kernel image to a
>> +        * random offset in the lower half of the VMALLOC area.
>> +        */
>> +       mov     x23, #(1 << (VA_BITS - 2)) - 1
>> +       and     x23, x23, x1, lsl #SWAPPER_BLOCK_SHIFT
>> +#endif
>> +
>>         adr_l   x0, boot_args                   // record the contents of
>>         stp     x21, x1, [x0]                   // x0 .. x3 at kernel entry
>>         stp     x2, x3, [x0, #16]
>> @@ -402,6 +422,9 @@ __create_page_tables:
>>          */
>>         mov     x0, x26                         // swapper_pg_dir
>>         ldr     x5, =KIMAGE_VADDR
>> +#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
>> +       add     x5, x5, x23                     // add KASLR displacement
>> +#endif
>>         create_pgd_entry x0, x5, x3, x6
>>         ldr     w6, kernel_img_size
>>         add     x6, x6, x5
>> @@ -443,10 +466,52 @@ __mmap_switched:
>>         str     xzr, [x6], #8                   // Clear BSS
>>         b       1b
>>  2:
>> +
>> +#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
>> +
>> +#define R_AARCH64_RELATIVE     0x403
>> +#define R_AARCH64_ABS64                0x101
>> +
>> +       /*
>> +        * Iterate over each entry in the relocation table, and apply the
>> +        * relocations in place.
>> +        */
>> +       adr_l   x8, __dynsym_start              // start of symbol table
>> +       adr_l   x9, __reloc_start               // start of reloc table
>> +       adr_l   x10, __reloc_end                // end of reloc table
>> +
>> +0:     cmp     x9, x10
>> +       b.hs    2f
>> +       ldp     x11, x12, [x9], #24
>> +       ldr     x13, [x9, #-8]
>> +       cmp     w12, #R_AARCH64_RELATIVE
>> +       b.ne    1f
>> +       add     x13, x13, x23                   // relocate
>> +       str     x13, [x11, x23]
>> +       b       0b
>> +
>> +1:     cmp     w12, #R_AARCH64_ABS64
>> +       b.ne    0b
>> +       add     x12, x12, x12, lsl #1           // symtab offset: 24x top word
>> +       add     x12, x8, x12, lsr #(32 - 3)     // ... shifted into bottom word
>> +       ldrsh   w14, [x12, #6]                  // Elf64_Sym::st_shndx
>> +       ldr     x15, [x12, #8]                  // Elf64_Sym::st_value
>> +       cmp     w14, #-0xf                      // SHN_ABS (0xfff1) ?
>> +       add     x14, x15, x23                   // relocate
>> +       csel    x15, x14, x15, ne
>> +       add     x15, x13, x15
>> +       str     x15, [x11, x23]
>> +       b       0b
>> +
>> +2:     adr_l   x8, kimage_vaddr                // make relocated kimage_vaddr
>> +       dc      cvac, x8                        // value visible to secondaries
>> +       dsb     sy                              // with MMU off
>> +#endif
>> +
>>         adr_l   sp, initial_sp, x4
>>         str_l   x21, __fdt_pointer, x5          // Save FDT pointer
>>
>> -       ldr     x0, =KIMAGE_VADDR               // Save the offset between
>> +       ldr_l   x0, kimage_vaddr                // Save the offset between
>>         sub     x24, x0, x24                    // the kernel virtual and
>>         str_l   x24, kimage_voffset, x0         // physical mappings
>>
>> @@ -462,6 +527,10 @@ ENDPROC(__mmap_switched)
>>   * hotplug and needs to have the same protections as the text region
>>   */
>>         .section ".text","ax"
>> +
>> +ENTRY(kimage_vaddr)
>> +       .quad           _text - TEXT_OFFSET
>> +
>>  /*
>>   * If we're fortunate enough to boot at EL2, ensure that the world is
>>   * sane before dropping to EL1.
>> @@ -622,7 +691,7 @@ ENTRY(secondary_startup)
>>         adrp    x26, swapper_pg_dir
>>         bl      __cpu_setup                     // initialise processor
>>
>> -       ldr     x8, =KIMAGE_VADDR
>> +       ldr     x8, kimage_vaddr
>>         ldr     w9, 0f
>>         sub     x27, x8, w9, sxtw               // address to jump to after enabling the MMU
>>         b       __enable_mmu
>> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
>> index 96177a7c0f05..2faee6042e99 100644
>> --- a/arch/arm64/kernel/setup.c
>> +++ b/arch/arm64/kernel/setup.c
>> @@ -292,16 +292,15 @@ u64 __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = INVALID_HWID };
>>
>>  void __init setup_arch(char **cmdline_p)
>>  {
>> -       static struct vm_struct vmlinux_vm __initdata = {
>> -               .addr           = (void *)KIMAGE_VADDR,
>> -               .size           = 0,
>> -               .flags          = VM_IOREMAP,
>> -               .caller         = setup_arch,
>> -       };
>> -
>> -       vmlinux_vm.size = round_up((unsigned long)_end - KIMAGE_VADDR,
>> -                                  1 << SWAPPER_BLOCK_SHIFT);
>> -       vmlinux_vm.phys_addr = __pa(KIMAGE_VADDR);
>> +       static struct vm_struct vmlinux_vm __initdata;
>> +
>> +       vmlinux_vm.addr = (void *)kimage_vaddr;
>> +       vmlinux_vm.size = round_up((u64)_end - kimage_vaddr,
>> +                                  SWAPPER_BLOCK_SIZE);
>> +       vmlinux_vm.phys_addr = __pa(kimage_vaddr);
>> +       vmlinux_vm.flags = VM_IOREMAP;
>> +       vmlinux_vm.caller = setup_arch;
>> +
>>         vm_area_add_early(&vmlinux_vm);
>>
>>         pr_info("Boot CPU: AArch64 Processor [%08x]\n", read_cpuid_id());
>> @@ -367,7 +366,8 @@ void __init setup_arch(char **cmdline_p)
>>         conswitchp = &dummy_con;
>>  #endif
>>  #endif
>> -       if (boot_args[1] || boot_args[2] || boot_args[3]) {
>> +       if ((!IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && boot_args[1]) ||
>> +           boot_args[2] || boot_args[3]) {
>>                 pr_err("WARNING: x1-x3 nonzero in violation of boot protocol:\n"
>>                         "\tx1: %016llx\n\tx2: %016llx\n\tx3: %016llx\n"
>>                         "This indicates a broken bootloader or old kernel\n",
>> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
>> index f935f082188d..cc1486039338 100644
>> --- a/arch/arm64/kernel/vmlinux.lds.S
>> +++ b/arch/arm64/kernel/vmlinux.lds.S
>> @@ -148,6 +148,15 @@ SECTIONS
>>         .altinstr_replacement : {
>>                 *(.altinstr_replacement)
>>         }
>> +       .rela : ALIGN(8) {
>> +               __reloc_start = .;
>> +               *(.rela .rela*)
>> +               __reloc_end = .;
>> +       }
>> +       .dynsym : ALIGN(8) {
>> +               __dynsym_start = .;
>> +               *(.dynsym)
>> +       }
>>
>>         . = ALIGN(PAGE_SIZE);
>>         __init_end = .;
>> diff --git a/scripts/sortextable.c b/scripts/sortextable.c
>> index af247c70fb66..5ecbedefdb0f 100644
>> --- a/scripts/sortextable.c
>> +++ b/scripts/sortextable.c
>> @@ -266,9 +266,9 @@ do_file(char const *const fname)
>>                 break;
>>         }  /* end switch */
>>         if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0
>> -       ||  r2(&ehdr->e_type) != ET_EXEC
>> +       || (r2(&ehdr->e_type) != ET_EXEC && r2(&ehdr->e_type) != ET_DYN)
>>         ||  ehdr->e_ident[EI_VERSION] != EV_CURRENT) {
>> -               fprintf(stderr, "unrecognized ET_EXEC file %s\n", fname);
>> +               fprintf(stderr, "unrecognized ET_EXEC/ET_DYN file %s\n", fname);
>>                 fail_file();
>>         }
>>
>> --
>> 2.5.0
>>
>
> -Kees
>
> --
> Kees Cook
> Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 156+ messages in thread

>> -       vmlinux_vm.phys_addr = __pa(KIMAGE_VADDR);
>> +       static struct vm_struct vmlinux_vm __initdata;
>> +
>> +       vmlinux_vm.addr = (void *)kimage_vaddr;
>> +       vmlinux_vm.size = round_up((u64)_end - kimage_vaddr,
>> +                                  SWAPPER_BLOCK_SIZE);
>> +       vmlinux_vm.phys_addr = __pa(kimage_vaddr);
>> +       vmlinux_vm.flags = VM_IOREMAP;
>> +       vmlinux_vm.caller = setup_arch;
>> +
>>         vm_area_add_early(&vmlinux_vm);
>>
>>         pr_info("Boot CPU: AArch64 Processor [%08x]\n", read_cpuid_id());
>> @@ -367,7 +366,8 @@ void __init setup_arch(char **cmdline_p)
>>         conswitchp = &dummy_con;
>>  #endif
>>  #endif
>> -       if (boot_args[1] || boot_args[2] || boot_args[3]) {
>> +       if ((!IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && boot_args[1]) ||
>> +           boot_args[2] || boot_args[3]) {
>>                 pr_err("WARNING: x1-x3 nonzero in violation of boot protocol:\n"
>>                         "\tx1: %016llx\n\tx2: %016llx\n\tx3: %016llx\n"
>>                         "This indicates a broken bootloader or old kernel\n",
>> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
>> index f935f082188d..cc1486039338 100644
>> --- a/arch/arm64/kernel/vmlinux.lds.S
>> +++ b/arch/arm64/kernel/vmlinux.lds.S
>> @@ -148,6 +148,15 @@ SECTIONS
>>         .altinstr_replacement : {
>>                 *(.altinstr_replacement)
>>         }
>> +       .rela : ALIGN(8) {
>> +               __reloc_start = .;
>> +               *(.rela .rela*)
>> +               __reloc_end = .;
>> +       }
>> +       .dynsym : ALIGN(8) {
>> +               __dynsym_start = .;
>> +               *(.dynsym)
>> +       }
>>
>>         . = ALIGN(PAGE_SIZE);
>>         __init_end = .;
>> diff --git a/scripts/sortextable.c b/scripts/sortextable.c
>> index af247c70fb66..5ecbedefdb0f 100644
>> --- a/scripts/sortextable.c
>> +++ b/scripts/sortextable.c
>> @@ -266,9 +266,9 @@ do_file(char const *const fname)
>>                 break;
>>         }  /* end switch */
>>         if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0
>> -       ||  r2(&ehdr->e_type) != ET_EXEC
>> +       || (r2(&ehdr->e_type) != ET_EXEC && r2(&ehdr->e_type) != ET_DYN)
>>         ||  ehdr->e_ident[EI_VERSION] != EV_CURRENT) {
>> -               fprintf(stderr, "unrecognized ET_EXEC file %s\n", fname);
>> +               fprintf(stderr, "unrecognized ET_EXEC/ET_DYN file %s\n", fname);
>>                 fail_file();
>>         }
>>
>> --
>> 2.5.0
>>
>
> -Kees
>
> --
> Kees Cook
> Chrome OS & Brillo Security
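
For readers less familiar with AArch64 relocation processing, the loop added
to __mmap_switched in the hunk quoted above is, in effect, the C routine
below: 'rela'/'rela_end' correspond to __reloc_start/__reloc_end, 'dynsym'
to __dynsym_start, and 'delta' to the KASLR displacement held in x23. This
is an illustrative sketch only, not part of the patch; the ELF structures
and constants are spelled out here rather than taken from the kernel headers.

    #include <stdint.h>

    #define R_AARCH64_ABS64     0x101
    #define R_AARCH64_RELATIVE  0x403
    #define SHN_ABS             0xfff1

    typedef struct {
            uint64_t r_offset;  /* link-time address of the field to patch */
            uint64_t r_info;    /* symbol index (high 32 bits), type (low 32 bits) */
            int64_t  r_addend;
    } Elf64_Rela;

    typedef struct {
            uint32_t st_name;
            uint8_t  st_info;
            uint8_t  st_other;
            uint16_t st_shndx;
            uint64_t st_value;
            uint64_t st_size;
    } Elf64_Sym;

    static void apply_dyn_relocs(const Elf64_Rela *rela,
                                 const Elf64_Rela *rela_end,
                                 const Elf64_Sym *dynsym, uint64_t delta)
    {
            for (; rela < rela_end; rela++) {
                    /* the image is shifted by delta, so the place to patch is too */
                    uint64_t *place = (uint64_t *)(rela->r_offset + delta);
                    uint32_t type = (uint32_t)rela->r_info;

                    if (type == R_AARCH64_RELATIVE) {
                            /* B + A: rebase the link-time value by delta */
                            *place = rela->r_addend + delta;
                    } else if (type == R_AARCH64_ABS64) {
                            /* S + A: absolute (SHN_ABS) symbols are not rebased */
                            const Elf64_Sym *sym = &dynsym[rela->r_info >> 32];
                            uint64_t value = sym->st_value;

                            if (sym->st_shndx != SHN_ABS)
                                    value += delta;
                            *place = value + rela->r_addend;
                    }
            }
    }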

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 13/13] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
  2016-01-05 19:53     ` Kees Cook
  (?)
@ 2016-01-06  7:51       ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-06  7:51 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, LKML, Stuart Yoder, Sharma Bhupesh,
	Arnd Bergmann, Marc Zyngier, Christoffer Dall

On 5 January 2016 at 20:53, Kees Cook <keescook@chromium.org> wrote:
> On Wed, Dec 30, 2015 at 7:26 AM, Ard Biesheuvel
> <ard.biesheuvel@linaro.org> wrote:
>> Since arm64 does not use a decompressor, there is no execution
>> environment in which it is feasible to gather even a small amount of
>> randomness before the kernel proper starts, so the arm64 KASLR kernel
>> depends on the bootloader to supply some random bits in register x1
>> upon kernel entry.
>>
>> On UEFI systems, we can use the EFI_RNG_PROTOCOL, if supplied, to obtain
>> some random bits. At the same time, use it to randomize the offset of the
>> kernel Image in physical memory.
>
> This logic seems like it should be under the name
> CONFIG_RANDOMIZE_BASE and depend on UEFI? (Again, I'm just trying to
> keep naming conventions the same across architectures to avoid
> confusion.)
>

Indeed.
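
Assuming it is just a rename with unchanged semantics, the guard in the stub
hunk quoted further down would then read (hypothetical sketch, only the
Kconfig symbol changes):

    if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
            status = efi_get_random_bytes(sys_table_arg);
            if (status != EFI_SUCCESS) {
                    pr_efi_err(sys_table_arg, "efi_get_random_bytes() failed\n");
                    return status;
            }
    }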


>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  arch/arm64/kernel/efi-entry.S             |   7 +-
>>  drivers/firmware/efi/libstub/arm-stub.c   |   1 -
>>  drivers/firmware/efi/libstub/arm64-stub.c | 134 +++++++++++++++++---
>>  include/linux/efi.h                       |   5 +-
>>  4 files changed, 127 insertions(+), 20 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
>> index f82036e02485..f41073dde7e0 100644
>> --- a/arch/arm64/kernel/efi-entry.S
>> +++ b/arch/arm64/kernel/efi-entry.S
>> @@ -110,7 +110,7 @@ ENTRY(entry)
>>  2:
>>         /* Jump to kernel entry point */
>>         mov     x0, x20
>> -       mov     x1, xzr
>> +       ldr     x1, efi_rnd
>>         mov     x2, xzr
>>         mov     x3, xzr
>>         br      x21
>> @@ -119,6 +119,9 @@ efi_load_fail:
>>         mov     x0, #EFI_LOAD_ERROR
>>         ldp     x29, x30, [sp], #32
>>         ret
>> +ENDPROC(entry)
>> +
>> +ENTRY(efi_rnd)
>> +       .quad   0, 0
>>
>>  entry_end:
>> -ENDPROC(entry)
>> diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c
>> index 950c87f5d279..f580bcdfae4f 100644
>> --- a/drivers/firmware/efi/libstub/arm-stub.c
>> +++ b/drivers/firmware/efi/libstub/arm-stub.c
>> @@ -145,7 +145,6 @@ void efi_char16_printk(efi_system_table_t *sys_table_arg,
>>         out->output_string(out, str);
>>  }
>>
>> -
>>  /*
>>   * This function handles the architcture specific differences between arm and
>>   * arm64 regarding where the kernel image must be loaded and any memory that
>> diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
>> index 78dfbd34b6bf..4e5c306346b4 100644
>> --- a/drivers/firmware/efi/libstub/arm64-stub.c
>> +++ b/drivers/firmware/efi/libstub/arm64-stub.c
>> @@ -13,6 +13,68 @@
>>  #include <asm/efi.h>
>>  #include <asm/sections.h>
>>
>> +struct efi_rng_protocol_t {
>> +       efi_status_t (*get_info)(struct efi_rng_protocol_t *,
>> +                                unsigned long *,
>> +                                efi_guid_t *);
>> +       efi_status_t (*get_rng)(struct efi_rng_protocol_t *,
>> +                               efi_guid_t *,
>> +                               unsigned long,
>> +                               u8 *out);
>> +};
>> +
>> +extern struct {
>> +       u64     virt_seed;
>> +       u64     phys_seed;
>> +} efi_rnd;
>> +
>> +static int efi_get_random_bytes(efi_system_table_t *sys_table)
>> +{
>> +       efi_guid_t rng_proto = EFI_RNG_PROTOCOL_GUID;
>> +       efi_status_t status;
>> +       struct efi_rng_protocol_t *rng;
>> +
>> +       status = sys_table->boottime->locate_protocol(&rng_proto, NULL,
>> +                                                     (void **)&rng);
>> +       if (status == EFI_NOT_FOUND) {
>> +               pr_efi(sys_table, "EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
>> +               return EFI_SUCCESS;
>> +       }
>> +
>> +       if (status != EFI_SUCCESS)
>> +               return status;
>> +
>> +       return rng->get_rng(rng, NULL, sizeof(efi_rnd), (u8 *)&efi_rnd);
>> +}
>> +
>> +static efi_status_t get_dram_top(efi_system_table_t *sys_table_arg, u64 *top)
>> +{
>> +       unsigned long map_size, desc_size;
>> +       efi_memory_desc_t *memory_map;
>> +       efi_status_t status;
>> +       int l;
>> +
>> +       status = efi_get_memory_map(sys_table_arg, &memory_map, &map_size,
>> +                                   &desc_size, NULL, NULL);
>> +       if (status != EFI_SUCCESS)
>> +               return status;
>> +
>> +       for (l = 0; l < map_size; l += desc_size) {
>> +               efi_memory_desc_t *md = (void *)memory_map + l;
>> +
>> +               if (md->attribute & EFI_MEMORY_WB) {
>> +                       u64 phys_end = md->phys_addr +
>> +                                      md->num_pages * EFI_PAGE_SIZE;
>> +                       if (phys_end > *top)
>> +                               *top = phys_end;
>> +               }
>> +       }
>> +
>> +       efi_call_early(free_pool, memory_map);
>> +
>> +       return EFI_SUCCESS;
>> +}
>> +
>>  efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>>                                         unsigned long *image_addr,
>>                                         unsigned long *image_size,
>> @@ -27,6 +89,14 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>>         void *old_image_addr = (void *)*image_addr;
>>         unsigned long preferred_offset;
>>
>> +       if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL)) {
>> +               status = efi_get_random_bytes(sys_table_arg);
>> +               if (status != EFI_SUCCESS) {
>> +                       pr_efi_err(sys_table_arg, "efi_get_random_bytes() failed\n");
>> +                       return status;
>> +               }
>> +       }
>> +
>>         /*
>>          * The preferred offset of the kernel Image is TEXT_OFFSET bytes beyond
>>          * a 2 MB aligned base, which itself may be lower than dram_base, as
>> @@ -36,13 +106,42 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>>         if (preferred_offset < dram_base)
>>                 preferred_offset += SZ_2M;
>>
>> -       /* Relocate the image, if required. */
>>         kernel_size = _edata - _text;
>> -       if (*image_addr != preferred_offset) {
>> -               kernel_memsize = kernel_size + (_end - _edata);
>> +       kernel_memsize = kernel_size + (_end - _edata);
>> +
>> +       if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && efi_rnd.phys_seed) {
>> +               /*
>> +                * If KASLR is enabled, and we have some randomness available,
>> +                * locate the kernel at a randomized offset in physical memory.
>> +                */
>> +               u64 dram_top = dram_base;
>> +
>> +               status = get_dram_top(sys_table_arg, &dram_top);
>> +               if (status != EFI_SUCCESS) {
>> +                       pr_efi_err(sys_table_arg, "get_dram_size() failed\n");
>> +                       return status;
>> +               }
>> +
>> +               kernel_memsize += SZ_2M;
>> +               nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
>> +                                   EFI_PAGE_SIZE;
>>
>>                 /*
>> -                * First, try a straight allocation at the preferred offset.
>> +                * Use the random seed to scale the size and add it to the DRAM
>> +                * base. Note that this may give suboptimal results on systems
>> +                * with discontiguous DRAM regions with large holes between them.
>> +                */
>> +               *reserve_addr = dram_base +
>> +                       ((dram_top - dram_base) >> 16) * (u16)efi_rnd.phys_seed;
>> +
>> +               status = efi_call_early(allocate_pages, EFI_ALLOCATE_MAX_ADDRESS,
>> +                                       EFI_LOADER_DATA, nr_pages,
>> +                                       (efi_physical_addr_t *)reserve_addr);
>> +
>> +               *image_addr = round_up(*reserve_addr, SZ_2M) + TEXT_OFFSET;
>> +       } else {
>> +               /*
>> +                * Else, try a straight allocation at the preferred offset.
>>                  * This will work around the issue where, if dram_base == 0x0,
>>                  * efi_low_alloc() refuses to allocate at 0x0 (to prevent the
>>                  * address of the allocation to be mistaken for a FAIL return
>> @@ -52,27 +151,30 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>>                  * Mustang), we can still place the kernel at the address
>>                  * 'dram_base + TEXT_OFFSET'.
>>                  */
>> +               if (*image_addr == preferred_offset)
>> +                       return EFI_SUCCESS;
>> +
>>                 *image_addr = *reserve_addr = preferred_offset;
>>                 nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
>>                            EFI_PAGE_SIZE;
>>                 status = efi_call_early(allocate_pages, EFI_ALLOCATE_ADDRESS,
>>                                         EFI_LOADER_DATA, nr_pages,
>>                                         (efi_physical_addr_t *)reserve_addr);
>> +       }
>> +
>> +       if (status != EFI_SUCCESS) {
>> +               kernel_memsize += TEXT_OFFSET;
>> +               status = efi_low_alloc(sys_table_arg, kernel_memsize,
>> +                                      SZ_2M, reserve_addr);
>> +
>>                 if (status != EFI_SUCCESS) {
>> -                       kernel_memsize += TEXT_OFFSET;
>> -                       status = efi_low_alloc(sys_table_arg, kernel_memsize,
>> -                                              SZ_2M, reserve_addr);
>> -
>> -                       if (status != EFI_SUCCESS) {
>> -                               pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
>> -                               return status;
>> -                       }
>> -                       *image_addr = *reserve_addr + TEXT_OFFSET;
>> +                       pr_efi_err(sys_table_arg, "Failed to relocate kernel\n");
>> +                       return status;
>>                 }
>> -               memcpy((void *)*image_addr, old_image_addr, kernel_size);
>> -               *reserve_size = kernel_memsize;
>> +               *image_addr = *reserve_addr + TEXT_OFFSET;
>>         }
>> -
>> +       memcpy((void *)*image_addr, old_image_addr, kernel_size);
>> +       *reserve_size = kernel_memsize;
>>
>>         return EFI_SUCCESS;
>>  }
>> diff --git a/include/linux/efi.h b/include/linux/efi.h
>> index 569b5a866bb1..13783fdc9bdd 100644
>> --- a/include/linux/efi.h
>> +++ b/include/linux/efi.h
>> @@ -299,7 +299,7 @@ typedef struct {
>>         void *open_protocol_information;
>>         void *protocols_per_handle;
>>         void *locate_handle_buffer;
>> -       void *locate_protocol;
>> +       efi_status_t (*locate_protocol)(efi_guid_t *, void *, void **);
>>         void *install_multiple_protocol_interfaces;
>>         void *uninstall_multiple_protocol_interfaces;
>>         void *calculate_crc32;
>> @@ -599,6 +599,9 @@ void efi_native_runtime_setup(void);
>>  #define EFI_PROPERTIES_TABLE_GUID \
>>      EFI_GUID(  0x880aaca3, 0x4adc, 0x4a04, 0x90, 0x79, 0xb7, 0x47, 0x34, 0x08, 0x25, 0xe5 )
>>
>> +#define EFI_RNG_PROTOCOL_GUID \
>> +    EFI_GUID(  0x3152bca5, 0xeade, 0x433d, 0x86, 0x2e, 0xc0, 0x1c, 0xdc, 0x29, 0x1f, 0x44 )
>> +
>>  typedef struct {
>>         efi_guid_t guid;
>>         u64 table;
>> --
>> 2.5.0
>>
>
>
>
> --
> Kees Cook
> Chrome OS & Brillo Security
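
To summarise how the two seed words in efi_rnd end up being consumed:
virt_seed travels to the kernel in x1 and is masked into a block-aligned
virtual displacement in head.S (earlier in the series), while phys_seed is
scaled across the available DRAM window in the stub code quoted above. The
C sketch below is purely illustrative and not part of the patch; VA_BITS and
SWAPPER_BLOCK_SHIFT are example values for a 4 KB-page configuration rather
than values taken from the headers.

    #include <stdint.h>

    /* example configuration values (4 KB pages); illustrative only */
    #define VA_BITS              48
    #define SWAPPER_BLOCK_SHIFT  21   /* 2 MB swapper block */

    /*
     * virt_seed: mirrors the masking in head.S/preserve_boot_args, keeping
     * the displacement block-aligned and within the lower half of the
     * vmalloc area.
     */
    static uint64_t kaslr_virt_displacement(uint64_t virt_seed)
    {
            return (virt_seed << SWAPPER_BLOCK_SHIFT) &
                   ((1ULL << (VA_BITS - 2)) - 1);
    }

    /*
     * phys_seed: mirrors handle_kernel_image() in the stub, scaling the low
     * 16 bits of the seed across the [dram_base, dram_top) window.
     */
    static uint64_t kaslr_phys_base(uint64_t dram_base, uint64_t dram_top,
                                    uint64_t phys_seed)
    {
            return dram_base +
                   ((dram_top - dram_base) >> 16) * (uint16_t)phys_seed;
    }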

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 04/13] arm64: decouple early fixmap init from linear mapping
  2015-12-30 15:26   ` Ard Biesheuvel
  (?)
@ 2016-01-06 16:35     ` James Morse
  -1 siblings, 0 replies; 156+ messages in thread
From: James Morse @ 2016-01-06 16:35 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel,
	stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall

Hi Ard!

On 30/12/15 15:26, Ard Biesheuvel wrote:
> Since the early fixmap page tables are populated using pages that are
> part of the static footprint of the kernel, they are covered by the
> initial kernel mapping, and we can refer to them without using __va/__pa
> translations, which are tied to the linear mapping.
> 
> Instead, let's introduce __phys_to_kimg, which will be tied to the kernel
> virtual mapping, regardless of whether or not it intersects with the linear
> mapping. This will allow us to move the kernel out of the linear mapping in
> a subsequent patch.
> 

I gave your arm64-kaslr-v2 branch a go on juno r1, currently with
ARM64_RELOCATABLE_KERNEL=n, to find it didn't boot.

git bisect pointed to this patch. From the debugger it looks like
rubbish is ending up in the page tables after early_fixmap_init(), and
printing bits of bm_pmd and friends shows these aren't zeroed.

I think this is because the section(".pgdir") is dragging these outside
the __bss_start/__bss_stop range that is zeroed in head.S:__mmap_switched().

The following inelegant patch fixes this problem for me:
----------------------------%<----------------------------
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index a78fc5a882da..15fc9712ddc1 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -559,6 +559,7 @@ void __init early_fixmap_init(void)
        if (pgd_none(*pgd)) {
                static pud_t bm_pud[PTRS_PER_PUD] __pgdir;

+               memset(bm_pud, 0, sizeof(bm_pud));
                pgd_populate(&init_mm, pgd, bm_pud);
                memblock_reserve(__pa(bm_pud), sizeof(bm_pud));
        }
@@ -570,6 +571,7 @@ void __init early_fixmap_init(void)
        if (pud_none(*pud)) {
                static pmd_t bm_pmd[PTRS_PER_PMD] __pgdir;

+               memset(bm_pmd, 0, sizeof(bm_pmd));
                pud_populate(&init_mm, pud, bm_pmd);
                memblock_reserve(__pa(bm_pmd), sizeof(bm_pmd));
        }
@@ -580,6 +582,7 @@ void __init early_fixmap_init(void)
        if (pmd_none(*pmd)) {
                static pte_t bm_pte[PTRS_PER_PTE] __pgdir;

+               memset(bm_pte, 0, sizeof(bm_pte));
                pmd_populate_kernel(&init_mm, pmd, bm_pte);
                memblock_reserve(__pa(bm_pte), sizeof(bm_pte));
        }
----------------------------%<----------------------------

I'm sure there is a better way!


Thanks,

James


^ permalink raw reply related	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 04/13] arm64: decouple early fixmap init from linear mapping
  2016-01-06 16:35     ` James Morse
  (?)
@ 2016-01-06 16:42       ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-06 16:42 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, Kees Cook, linux-kernel,
	Stuart Yoder, Sharma Bhupesh, Arnd Bergmann, Marc Zyngier,
	Christoffer Dall

On 6 January 2016 at 17:35, James Morse <james.morse@arm.com> wrote:
> Hi Ard!
>
> On 30/12/15 15:26, Ard Biesheuvel wrote:
>> Since the early fixmap page tables are populated using pages that are
>> part of the static footprint of the kernel, they are covered by the
>> initial kernel mapping, and we can refer to them without using __va/__pa
>> translations, which are tied to the linear mapping.
>>
>> Instead, let's introduce __phys_to_kimg, which will be tied to the kernel
>> virtual mapping, regardless of whether or not it intersects with the linear
>> mapping. This will allow us to move the kernel out of the linear mapping in
>> a subsequent patch.
>>
>
> I gave your arm64-kaslr-v2 branch a go on juno r1, currently with
> ARM64_RELOCATABLE_KERNEL=n, to find it didn't boot.
>
> git bisect pointed to this patch. From the debugger it looks like
> rubbish is ending up the page tables after early_fixmap_init(), printing
> bits of bm_pmd and friends shows these aren't zeroed.
>
> I think this is because the section(".pgdir") is dragging these outside
> the __bss_start/__bss_stop range that is zeroed in head.S:__mmap_switched().
>

Thanks for spotting that! This code runs happily on my Seattle A0, but
it is obviously incorrect.

> The following inelegant patch fixes this problem for me:
> ----------------------------%<----------------------------
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index a78fc5a882da..15fc9712ddc1 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -559,6 +559,7 @@ void __init early_fixmap_init(void)
>         if (pgd_none(*pgd)) {
>                 static pud_t bm_pud[PTRS_PER_PUD] __pgdir;
>
> +               memset(bm_pud, 0, sizeof(bm_pud));
>                 pgd_populate(&init_mm, pgd, bm_pud);
>                 memblock_reserve(__pa(bm_pud), sizeof(bm_pud));
>         }
> @@ -570,6 +571,7 @@ void __init early_fixmap_init(void)
>         if (pud_none(*pud)) {
>                 static pmd_t bm_pmd[PTRS_PER_PMD] __pgdir;
>
> +               memset(bm_pmd, 0, sizeof(bm_pmd));
>                 pud_populate(&init_mm, pud, bm_pmd);
>                 memblock_reserve(__pa(bm_pmd), sizeof(bm_pmd));
>         }
> @@ -580,6 +582,7 @@ void __init early_fixmap_init(void)
>         if (pmd_none(*pmd)) {
>                 static pte_t bm_pte[PTRS_PER_PTE] __pgdir;
>
> +               memset(bm_pte, 0, sizeof(bm_pte));
>                 pmd_populate_kernel(&init_mm, pmd, bm_pte);
>                 memblock_reserve(__pa(bm_pte), sizeof(bm_pte));
>         }
> ----------------------------%<----------------------------
>

Actually, this looks fine to me. I will fold this into my patch.

NOTE: I have a -v3 version up on git.linaro.org now, with a couple of changes.

Thanks!

Ard.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 03/13] arm64: use more granular reservations for static page table allocations
  2015-12-30 15:26   ` Ard Biesheuvel
  (?)
@ 2016-01-07 13:55     ` Mark Rutland
  -1 siblings, 0 replies; 156+ messages in thread
From: Mark Rutland @ 2016-01-07 13:55 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	leif.lindholm, keescook, linux-kernel, stuart.yoder,
	bhupesh.sharma, arnd, marc.zyngier, christoffer.dall

On Wed, Dec 30, 2015 at 04:26:02PM +0100, Ard Biesheuvel wrote:
> Before introducing new statically allocated page tables and increasing
> their alignment in subsequent patches, update the reservation logic
> so that only pages that are in actual use end up as reserved with
> memblock.

Could you add something to the commit message about what this will gain
us (i.e. which pages we don't have to reserve)? It's not immediately
obvious why we'd have page tables we wouldn't want to reserve.

From the looks of the next patch we won't have redundant levels of
fixmap table for a given configuration, so I guess we're catering for
the case the fixmap shares a pgd/pud/pmd entry with the image mapping?

Does that happen? If so that would invalidate the assumption I make when
copying the fixmap over in [1] (see map_kernel).

To handle that either we need some special logic to copy over the
relevant bits for the fixmap (as with kasan_copy_shadow), or we need to
avoid sharing a pgd entry.
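
By "copy over the relevant bits" I mean something along these lines
(rough sketch only; dst_pgd/src_pgd are placeholders for wherever the
new and current tables live):

        /*
         * Copy the pgd entry covering the fixmap so it stays mapped
         * after we switch to the new tables.
         */
        set_pgd(dst_pgd + pgd_index(FIXADDR_START),
                *(src_pgd + pgd_index(FIXADDR_START)));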

Thoughts?

Thanks,
Mark.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/397114.html

> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/mm/init.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 6bacba847923..8e678d05ad84 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -36,6 +36,7 @@
>  #include <linux/swiotlb.h>
>  
>  #include <asm/fixmap.h>
> +#include <asm/kernel-pgtable.h>
>  #include <asm/memory.h>
>  #include <asm/sections.h>
>  #include <asm/setup.h>
> @@ -165,11 +166,13 @@ void __init arm64_memblock_init(void)
>  	 * Register the kernel text, kernel data, initrd, and initial
>  	 * pagetables with memblock.
>  	 */
> -	memblock_reserve(__pa(_text), _end - _text);
> +	memblock_reserve(__pa(_text), __bss_stop - _text);
>  #ifdef CONFIG_BLK_DEV_INITRD
>  	if (initrd_start)
>  		memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
>  #endif
> +	memblock_reserve(__pa(idmap_pg_dir), IDMAP_DIR_SIZE);
> +	memblock_reserve(__pa(swapper_pg_dir), SWAPPER_DIR_SIZE);
>  
>  	early_init_fdt_scan_reserved_mem();
>  
> -- 
> 2.5.0
> 

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 03/13] arm64: use more granular reservations for static page table allocations
  2016-01-07 13:55     ` Mark Rutland
  (?)
@ 2016-01-07 14:02       ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-07 14:02 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Leif Lindholm, Kees Cook, linux-kernel, Stuart Yoder,
	Sharma Bhupesh, Arnd Bergmann, Marc Zyngier, Christoffer Dall

On 7 January 2016 at 14:55, Mark Rutland <mark.rutland@arm.com> wrote:
> On Wed, Dec 30, 2015 at 04:26:02PM +0100, Ard Biesheuvel wrote:
>> Before introducing new statically allocated page tables and increasing
>> their alignment in subsequent patches, update the reservation logic
>> so that only pages that are in actual use end up as reserved with
>> memblock.
>
> Could you add something to the commit message about what this will gain
> us (i.e. which pages we don't have to reserve)? It's not immediately
> obvious why we'd have page tables we wouldn't want to reserve.
>

OK. In the original series, I also aligned the pgdir section to a log2
upper bound of its size, but that is not necessary anymore with your
changes. So the original goal was to avoid reserving the alignment
padding as well as the pgdirs that end up unused.

> From the looks of the next patch we won't have redundant levels of
> fixmap table for a given configuration, so I guess we're catering for
> the case the fixmap shares a pgd/pud/pmd entry with the image mapping?
>
> Does that happen? If so that would invalidate the assumption I make when
> copying the fixmap over in [1] (see map_kernel).
>

It is a lot less likely to happen now that I moved the kernel to the
start of the vmalloc area rather than right below PAGE_OFFSET. But in
general, it seems sensible to only populate entries after confirming
that they are in fact vacant.
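
I.e., roughly (just a sketch of the idea, not the actual code):

        pgd_t *pgd = pgd_offset_k(FIXADDR_START);
        pud_t *pud;

        if (pgd_none(*pgd))
                /* vacant: safe to install our statically allocated table */
                pgd_populate(&init_mm, pgd, bm_pud);

        /* continue the walk through whatever table is in place now */
        pud = pud_offset(pgd, FIXADDR_START);

so a shared entry gets reused rather than clobbered.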

> To handle that either we need some special logic to copy over the
> relevant bits for the fixmap (as with kasan_copy_shadow), or we need to
> avoid sharing a pgd entry.
>
> Thoughts?
>

Yes, I have added that to my v3 version of the vmalloc base move patch here

https://git.linaro.org/people/ard.biesheuvel/linux-arm.git/commitdiff/0beef2c1a6bfc90cc116a6ba1b24f2ba35e7e5f6

but I think 16k/4 levels is the only config affected when the kernel
is always in the lower half of the vmalloc area. That also implies
that the fixmap pgd is either always shared, or never, depending on
the build time config, so I could probably simplify that part
somewhat.

-- 
Ard.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 03/13] arm64: use more granular reservations for static page table allocations
  2016-01-07 14:02       ` Ard Biesheuvel
  (?)
@ 2016-01-07 14:25         ` Mark Rutland
  -1 siblings, 0 replies; 156+ messages in thread
From: Mark Rutland @ 2016-01-07 14:25 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Leif Lindholm, Kees Cook, linux-kernel, Stuart Yoder,
	Sharma Bhupesh, Arnd Bergmann, Marc Zyngier, Christoffer Dall

On Thu, Jan 07, 2016 at 03:02:00PM +0100, Ard Biesheuvel wrote:
> On 7 January 2016 at 14:55, Mark Rutland <mark.rutland@arm.com> wrote:
> > On Wed, Dec 30, 2015 at 04:26:02PM +0100, Ard Biesheuvel wrote:
> >> Before introducing new statically allocated page tables and increasing
> >> their alignment in subsequent patches, update the reservation logic
> >> so that only pages that are in actual use end up as reserved with
> >> memblock.
> >
> > Could you add something to the commit message about what this will gain
> > us (i.e. which pages we don't have to reserve)? It's not immediately
> > obvious why we'd have page tables we wouldn't want to reserve.
> >
> 
> OK. In the original series, I also aligned the pgdir section to a log2
> upper bound of its size, but that is not necessary anymore with your
> changes. So the original goal was to avoid reserving the alignment
> padding as well as the pgdirs that end up unused

Ah, I see.

> > From the looks of the next patch we won't have redundant levels of
> > fixmap table for a given configuration, so I guess we're catering for
> > the case the fixmap shares a pgd/pud/pmd entry with the image mapping?
> >
> > Does that happen? If so that would invalidate the assumption I make when
> > copying the fixmap over in [1] (see map_kernel).
> >
> 
> It is a lot less likely to happen now that I moved the kernel to the
> start of the vmalloc area rather than right below PAGE_OFFSET. But in
> general, it seems sensible to only populate entries after confirming
> that they are in fact vacant.

Sure.

> > To handle that either we need some special logic to copy over the
> > relevant bits for the fixmap (as with kasan_copy_shadow), or we need to
> > avoid sharing a pgd entry.
> >
> > Thoughts?
> >
> 
> Yes, I have added that to my v3 version of the vmalloc base move patch here
> 
> https://git.linaro.org/people/ard.biesheuvel/linux-arm.git/commitdiff/0beef2c1a6bfc90cc116a6ba1b24f2ba35e7e5f6

Ah, great!

> but I think 16k/4 levels is the only config affected when the kernel
> is always in the lower half of the vmalloc area. That also implies
> that the fixmap pgd is either always shared, or never, depending on
> the build time config, so I could probably simplify that part
> somewhat.

Ok.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 13/13] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
  2015-12-30 15:26   ` Ard Biesheuvel
  (?)
@ 2016-01-07 18:46     ` Mark Rutland
  -1 siblings, 0 replies; 156+ messages in thread
From: Mark Rutland @ 2016-01-07 18:46 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	leif.lindholm, keescook, linux-kernel, stuart.yoder,
	bhupesh.sharma, arnd, marc.zyngier, christoffer.dall

Hi Ard,

I had a go at testing this on Juno with a hacked-up PRNG, and while
everything seems to work, I think we need to make the address selection
more robust to sparse memory maps (which I believe are going to be
fairly common).

Info dump and suggestion below.

Other than that, this looks really nice -- I'll do the rest of the
review in a separate reply.

On Wed, Dec 30, 2015 at 04:26:12PM +0100, Ard Biesheuvel wrote:
> Since arm64 does not use a decompressor that supplies an execution
> environment where it is feasible to some extent to provide a source of
> randomness, the arm64 KASLR kernel depends on the bootloader to supply
> some random bits in register x1 upon kernel entry.
> 
> On UEFI systems, we can use the EFI_RNG_PROTOCOL, if supplied, to obtain
> some random bits. At the same time, use it to randomize the offset of the
> kernel Image in physical memory.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/kernel/efi-entry.S             |   7 +-
>  drivers/firmware/efi/libstub/arm-stub.c   |   1 -
>  drivers/firmware/efi/libstub/arm64-stub.c | 134 +++++++++++++++++---
>  include/linux/efi.h                       |   5 +-
>  4 files changed, 127 insertions(+), 20 deletions(-)

[...]

> @@ -36,13 +106,42 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>  	if (preferred_offset < dram_base)
>  		preferred_offset += SZ_2M;
>  
> -	/* Relocate the image, if required. */
>  	kernel_size = _edata - _text;
> -	if (*image_addr != preferred_offset) {
> -		kernel_memsize = kernel_size + (_end - _edata);
> +	kernel_memsize = kernel_size + (_end - _edata);
> +
> +	if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && efi_rnd.phys_seed) {
> +		/*
> +		 * If KASLR is enabled, and we have some randomness available,
> +		 * locate the kernel at a randomized offset in physical memory.
> +		 */
> +		u64 dram_top = dram_base;
> +
> +		status = get_dram_top(sys_table_arg, &dram_top);
> +		if (status != EFI_SUCCESS) {
> +			pr_efi_err(sys_table_arg, "get_dram_size() failed\n");
> +			return status;
> +		}
> +
> +		kernel_memsize += SZ_2M;
> +		nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
> +				    EFI_PAGE_SIZE;
>  
>  		/*
> -		 * First, try a straight allocation at the preferred offset.
> +		 * Use the random seed to scale the size and add it to the DRAM
> +		 * base. Note that this may give suboptimal results on systems
> +		 * with discontiguous DRAM regions with large holes between them.
> +		 */
> +		*reserve_addr = dram_base +
> +			((dram_top - dram_base) >> 16) * (u16)efi_rnd.phys_seed;

I think that "suboptimal" is something of an understatement. Across 10
consecutive runs I ended up getting the same address 7 times:

EFI stub: Seed is 0x0a82016804fdc064
EFI stub: KASLR reserve address is 0x0000000832c48000
EFI stub: Loading kernel to physical address 0x00000000fe080000

EFI stub: Seed is 0x0a820168050c09b2
EFI stub: KASLR reserve address is 0x00000000c59e0000
EFI stub: Loading kernel to physical address 0x00000000c4e80000 *

EFI stub: Seed is 0x0a8001680511c701
EFI stub: KASLR reserve address is 0x00000007feb40000
EFI stub: Loading kernel to physical address 0x00000000fe080000

EFI stub: Seed is 0x0a8001680094d2a2
EFI stub: KASLR reserve address is 0x0000000895bd0000
EFI stub: Loading kernel to physical address 0x0000000895080000 *

EFI stub: Seed is 0x88820167ea986527
EFI stub: KASLR reserve address is 0x00000000bc570000
EFI stub: Loading kernel to physical address 0x00000000bb880000 *

EFI stub: Seed is 0x0882116805029414
EFI stub: KASLR reserve address is 0x00000005955a0000
EFI stub: Loading kernel to physical address 0x00000000fe080000

EFI stub: Seed is 0x8a821168050104ab
EFI stub: KASLR reserve address is 0x0000000639600000
EFI stub: Loading kernel to physical address 0x00000000fe080000

EFI stub: Seed is 0x08820168050671c6
EFI stub: KASLR reserve address is 0x00000005250f0000
EFI stub: Loading kernel to physical address 0x00000000fe080000

EFI stub: Seed is 0x08821167ea67381f
EFI stub: KASLR reserve address is 0x000000080e538000
EFI stub: Loading kernel to physical address 0x00000000fe080000

EFI stub: Seed is 0x0a801168050cb810
EFI stub: KASLR reserve address is 0x00000006b20e0000
EFI stub: Loading kernel to physical address 0x00000000fe080000

My "Seed" here is just the CNTVCT value, with phys_seed being an xor of
each of the 16-bit chunks (see diff at the end of the email). Judging by
the reserve addresses, I don't think the PRNG is to blame -- it's just
that the gaps are large relative to the available RAM and swallow up
much of the entropy, forcing a fallback to the same address.

One thing we could do is to perform the address selection in the space
of available memory, excluding gaps entirely, i.e. sum up the available
memory, select the Nth available byte, then walk the memory map to
convert that back to a real address. We might still choose an address
that cannot be used (e.g. if the kernel would hang over the end of a
region), but it'd be rarer than hitting a gap.
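
Roughly (illustrative only -- mem_region and select_phys_base are
made-up names; the real thing would walk the EFI memory map directly):

        struct mem_region { u64 start, size; };        /* usable RAM only */

        static u64 select_phys_base(const struct mem_region *r, int nr,
                                    u16 seed)
        {
                u64 total = 0, target;
                int i;

                for (i = 0; i < nr; i++)
                        total += r[i].size;

                /* scale the seed over usable bytes only, so the holes
                 * no longer swallow the entropy */
                target = (total >> 16) * seed;

                for (i = 0; i < nr; i++) {
                        if (target < r[i].size)
                                return r[i].start + target;
                        target -= r[i].size;
                }

                return r[0].start;      /* unreachable when total > 0 */
        }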

Thoughts?

For the above, my EFI memory map looks like:

[    0.000000] Processing EFI memory map:
[    0.000000]   0x000008000000-0x00000bffffff [Memory Mapped I/O  |RUN|  |  |  |  |  |   |  |  |  |UC]
[    0.000000]   0x00001c170000-0x00001c170fff [Memory Mapped I/O  |RUN|  |  |  |  |  |   |  |  |  |UC]
[    0.000000]   0x000080000000-0x00008000ffff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x000080010000-0x00009fdfffff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x00009fe00000-0x00009fe0ffff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x00009fe10000-0x0000dfffffff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000e00f0000-0x0000fde49fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fde4a000-0x0000febc9fff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000febca000-0x0000febcdfff [ACPI Reclaim Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000febce000-0x0000febcefff [ACPI Memory NVS    |   |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000febcf000-0x0000febd0fff [ACPI Reclaim Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000febd1000-0x0000feffffff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x000880000000-0x0009f98aafff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009f98ab000-0x0009f98acfff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009f98ad000-0x0009fa42afff [Loader Code        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009fa42b000-0x0009faf6efff [Boot Code          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009faf6f000-0x0009fafa9fff [Runtime Data       |RUN|  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0009fafaa000-0x0009ff767fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009ff768000-0x0009ff768fff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009ff769000-0x0009ff76efff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009ff76f000-0x0009ffdddfff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009ffdde000-0x0009ffe72fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009ffe73000-0x0009fff6dfff [Boot Code          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009fff6e000-0x0009fffaefff [Runtime Code       |RUN|  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0009fffaf000-0x0009ffffefff [Runtime Data       |RUN|  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0009fffff000-0x0009ffffffff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]

I've included my local hacks below in case they are useful.

Thanks,
Mark.

---->8----
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index 27a1a92..00c6640 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -30,6 +30,34 @@ extern struct {
 
 extern bool kaslr;
 
+static void log_hex(efi_system_table_t *sys_table_arg, unsigned long val)
+{
+       const char hex[16] = "0123456789abcdef";
+       char *strp, str[] = "0x0000000000000000";
+       strp = str + 18;
+
+       do {
+               *(--strp) = hex[val & 0xf];
+       } while (val >>= 4);
+
+       efi_printk(sys_table_arg, str);
+}
+
+static void dodgy_get_random_bytes(efi_system_table_t *sys_table)
+{
+       u64 seed;
+       pr_efi(sys_table, "using UNSAFE NON-RANDOM number generator\n");
+
+       asm volatile("mrs %0, cntvct_el0\n" : "=r" (seed));
+
+       pr_efi(sys_table, "Seed is ");
+       log_hex(sys_table, seed);
+       efi_printk(sys_table, "\n");
+
+       efi_rnd.virt_seed = seed;
+       efi_rnd.phys_seed = seed ^ (seed >> 16) ^ (seed >> 32) ^ (seed >> 48);
+}
+
 static int efi_get_random_bytes(efi_system_table_t *sys_table)
 {
        efi_guid_t rng_proto = EFI_RNG_PROTOCOL_GUID;
@@ -40,6 +68,7 @@ static int efi_get_random_bytes(efi_system_table_t *sys_table)
                                                      (void **)&rng);
        if (status == EFI_NOT_FOUND) {
                pr_efi(sys_table, "EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
+               dodgy_get_random_bytes(sys_table);
                return EFI_SUCCESS;
        }
 
@@ -77,6 +106,17 @@ static efi_status_t get_dram_top(efi_system_table_t *sys_table_arg, u64 *top)
        return EFI_SUCCESS;
 }
 
+static void log_kernel_address(efi_system_table_t *sys_table_arg,
+                              unsigned long addr, unsigned long kaslr_addr)
+{
+       pr_efi(sys_table_arg, "KASLR reserve address is ");
+       log_hex(sys_table_arg, kaslr_addr);
+       efi_printk(sys_table_arg, "\n");
+       pr_efi(sys_table_arg, "Loading kernel to physical address ");
+       log_hex(sys_table_arg, addr);
+       efi_printk(sys_table_arg, "\n");
+}
+
 efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
                                        unsigned long *image_addr,
                                        unsigned long *image_size,
@@ -90,6 +130,7 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
        unsigned long nr_pages;
        void *old_image_addr = (void *)*image_addr;
        unsigned long preferred_offset;
+       unsigned long kaslr_address = 0;
 
        if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
                if (kaslr) {
@@ -137,8 +178,9 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
                 * base. Note that this may give suboptimal results on systems
                 * with discontiguous DRAM regions with large holes between them.
                 */
-               *reserve_addr = dram_base +
+               kaslr_address = dram_base +
                        ((dram_top - dram_base) >> 16) * (u16)efi_rnd.phys_seed;
+               *reserve_addr = kaslr_address;
 
                status = efi_call_early(allocate_pages, EFI_ALLOCATE_MAX_ADDRESS,
                                        EFI_LOADER_DATA, nr_pages,
@@ -179,6 +221,9 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
                }
                *image_addr = *reserve_addr + TEXT_OFFSET;
        }
+
+       log_kernel_address(sys_table_arg, *image_addr, kaslr_address);
+
        memcpy((void *)*image_addr, old_image_addr, kernel_size);
        *reserve_size = kernel_memsize;
 

^ permalink raw reply related	[flat|nested] 156+ messages in thread

* [kernel-hardening] Re: [PATCH v2 13/13] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
@ 2016-01-07 18:46     ` Mark Rutland
  0 siblings, 0 replies; 156+ messages in thread
From: Mark Rutland @ 2016-01-07 18:46 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	leif.lindholm, keescook, linux-kernel, stuart.yoder,
	bhupesh.sharma, arnd, marc.zyngier, christoffer.dall

Hi Ard,

I had a go at testing this on Juno with a hacked-up PRNG, and while
everything seems to work, I think we need to make the address selection
more robust to sparse memory maps (which I believe are going to be
fairly common).

Info dump and suggestion below.

Other than that, this looks really nice -- I'll do the rest of the
review in a separate reply.

On Wed, Dec 30, 2015 at 04:26:12PM +0100, Ard Biesheuvel wrote:
> Since arm64 does not use a decompressor that supplies an execution
> environment where it is feasible to some extent to provide a source of
> randomness, the arm64 KASLR kernel depends on the bootloader to supply
> some random bits in register x1 upon kernel entry.
> 
> On UEFI systems, we can use the EFI_RNG_PROTOCOL, if supplied, to obtain
> some random bits. At the same time, use it to randomize the offset of the
> kernel Image in physical memory.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/kernel/efi-entry.S             |   7 +-
>  drivers/firmware/efi/libstub/arm-stub.c   |   1 -
>  drivers/firmware/efi/libstub/arm64-stub.c | 134 +++++++++++++++++---
>  include/linux/efi.h                       |   5 +-
>  4 files changed, 127 insertions(+), 20 deletions(-)

[...]

> @@ -36,13 +106,42 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>  	if (preferred_offset < dram_base)
>  		preferred_offset += SZ_2M;
>  
> -	/* Relocate the image, if required. */
>  	kernel_size = _edata - _text;
> -	if (*image_addr != preferred_offset) {
> -		kernel_memsize = kernel_size + (_end - _edata);
> +	kernel_memsize = kernel_size + (_end - _edata);
> +
> +	if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && efi_rnd.phys_seed) {
> +		/*
> +		 * If KASLR is enabled, and we have some randomness available,
> +		 * locate the kernel at a randomized offset in physical memory.
> +		 */
> +		u64 dram_top = dram_base;
> +
> +		status = get_dram_top(sys_table_arg, &dram_top);
> +		if (status != EFI_SUCCESS) {
> +			pr_efi_err(sys_table_arg, "get_dram_size() failed\n");
> +			return status;
> +		}
> +
> +		kernel_memsize += SZ_2M;
> +		nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
> +				    EFI_PAGE_SIZE;
>  
>  		/*
> -		 * First, try a straight allocation at the preferred offset.
> +		 * Use the random seed to scale the size and add it to the DRAM
> +		 * base. Note that this may give suboptimal results on systems
> +		 * with discontiguous DRAM regions with large holes between them.
> +		 */
> +		*reserve_addr = dram_base +
> +			((dram_top - dram_base) >> 16) * (u16)efi_rnd.phys_seed;

I think that "suboptimal" is something of an understatement. Across 10
consecutive runs I ended up getting the same address 7 times:

EFI stub: Seed is 0x0a82016804fdc064
EFI stub: KASLR reserve address is 0x0000000832c48000
EFI stub: Loading kernel to physical address 0x00000000fe080000

EFI stub: Seed is 0x0a820168050c09b2
EFI stub: KASLR reserve address is 0x00000000c59e0000
EFI stub: Loading kernel to physical address 0x00000000c4e80000 *

EFI stub: Seed is 0x0a8001680511c701
EFI stub: KASLR reserve address is 0x00000007feb40000
EFI stub: Loading kernel to physical address 0x00000000fe080000

EFI stub: Seed is 0x0a8001680094d2a2
EFI stub: KASLR reserve address is 0x0000000895bd0000
EFI stub: Loading kernel to physical address 0x0000000895080000 *

EFI stub: Seed is 0x88820167ea986527
EFI stub: KASLR reserve address is 0x00000000bc570000
EFI stub: Loading kernel to physical address 0x00000000bb880000 *

EFI stub: Seed is 0x0882116805029414
EFI stub: KASLR reserve address is 0x00000005955a0000
EFI stub: Loading kernel to physical address 0x00000000fe080000

EFI stub: Seed is 0x8a821168050104ab
EFI stub: KASLR reserve address is 0x0000000639600000
EFI stub: Loading kernel to physical address 0x00000000fe080000

EFI stub: Seed is 0x08820168050671c6
EFI stub: KASLR reserve address is 0x00000005250f0000
EFI stub: Loading kernel to physical address 0x00000000fe080000

EFI stub: Seed is 0x08821167ea67381f
EFI stub: KASLR reserve address is 0x000000080e538000
EFI stub: Loading kernel to physical address 0x00000000fe080000

EFI stub: Seed is 0x0a801168050cb810
EFI stub: KASLR reserve address is 0x00000006b20e0000
EFI stub: Loading kernel to physical address 0x00000000fe080000

My "Seed" here is just the CNTVCT value, with phys_seed being an xor of
each of the 16-bit chunks (see diff at the end of the email). Judging by
the reserve addresses, I don't think the PRNG is to blame -- it's just
that the gaps are large relative to the available RAM and swallow up
much of the entropy, forcing a fallback to the same address.

One thing we could do is to perform the address selection in the space
of available memory, excluding gaps entirely. i.e. sum up the available
memory, select the Nth available byte, then walk the memory map to
convert that back to a real address. We might still choose an address
that cannot be used (e.g. if the kernel would hang over the end of a
region), but it'd be rarer than hitting a gap.
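
Very roughly, something like this (sketch only: desc_is_usable() is a
made-up helper, and I'm indexing the map as a plain array instead of
striding by desc_size, just to keep the idea readable):

	/*
	 * Pass 1: add up the usable bytes. Pass 2: scale the seed into that
	 * range, pick the Nth usable byte, and convert it back into a
	 * physical address.
	 */
	static u64 select_kaslr_phys_addr(efi_memory_desc_t *map, int nr_desc,
					  u16 seed)
	{
		u64 total = 0, target;
		int i;

		for (i = 0; i < nr_desc; i++)
			if (desc_is_usable(&map[i]))
				total += map[i].num_pages * EFI_PAGE_SIZE;

		target = (total >> 16) * seed;

		for (i = 0; i < nr_desc; i++) {
			u64 size;

			if (!desc_is_usable(&map[i]))
				continue;
			size = map[i].num_pages * EFI_PAGE_SIZE;
			if (target < size)
				return round_down(map[i].phys_addr + target,
						  SZ_2M);
			target -= size;
		}
		return 0;
	}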

Thoughts?

For the above, my EFI memory map looks like:

[    0.000000] Processing EFI memory map:
[    0.000000]   0x000008000000-0x00000bffffff [Memory Mapped I/O  |RUN|  |  |  |  |  |   |  |  |  |UC]
[    0.000000]   0x00001c170000-0x00001c170fff [Memory Mapped I/O  |RUN|  |  |  |  |  |   |  |  |  |UC]
[    0.000000]   0x000080000000-0x00008000ffff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x000080010000-0x00009fdfffff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x00009fe00000-0x00009fe0ffff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x00009fe10000-0x0000dfffffff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000e00f0000-0x0000fde49fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000fde4a000-0x0000febc9fff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0000febca000-0x0000febcdfff [ACPI Reclaim Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000febce000-0x0000febcefff [ACPI Memory NVS    |   |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000febcf000-0x0000febd0fff [ACPI Reclaim Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0000febd1000-0x0000feffffff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x000880000000-0x0009f98aafff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009f98ab000-0x0009f98acfff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009f98ad000-0x0009fa42afff [Loader Code        |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009fa42b000-0x0009faf6efff [Boot Code          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009faf6f000-0x0009fafa9fff [Runtime Data       |RUN|  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0009fafaa000-0x0009ff767fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009ff768000-0x0009ff768fff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009ff769000-0x0009ff76efff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009ff76f000-0x0009ffdddfff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009ffdde000-0x0009ffe72fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009ffe73000-0x0009fff6dfff [Boot Code          |   |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000]   0x0009fff6e000-0x0009fffaefff [Runtime Code       |RUN|  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0009fffaf000-0x0009ffffefff [Runtime Data       |RUN|  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000]   0x0009fffff000-0x0009ffffffff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]

I've included my local hacks below in case they are useful.

Thanks,
Mark.

---->8----
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index 27a1a92..00c6640 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -30,6 +30,34 @@ extern struct {
 
 extern bool kaslr;
 
+static void log_hex(efi_system_table_t *sys_table_arg, unsigned long val)
+{
+       const char hex[16] = "0123456789abcdef";
+       char *strp, str[] = "0x0000000000000000";
+       strp = str + 18;
+
+       do {
+               *(--strp) = hex[val & 0xf];
+       } while (val >>= 4);
+
+       efi_printk(sys_table_arg, str);
+}
+
+static void dodgy_get_random_bytes(efi_system_table_t *sys_table)
+{
+       u64 seed;
+       pr_efi(sys_table, "using UNSAFE NON-RANDOM number generator\n");
+
+       asm volatile("mrs %0, cntvct_el0\n" : "=r" (seed));
+
+       pr_efi(sys_table, "Seed is ");
+       log_hex(sys_table, seed);
+       efi_printk(sys_table, "\n");
+
+       efi_rnd.virt_seed = seed;
+       efi_rnd.phys_seed = seed ^ (seed >> 16) ^ (seed >> 32) ^ (seed >> 48);
+}
+
 static int efi_get_random_bytes(efi_system_table_t *sys_table)
 {
        efi_guid_t rng_proto = EFI_RNG_PROTOCOL_GUID;
@@ -40,6 +68,7 @@ static int efi_get_random_bytes(efi_system_table_t *sys_table)
                                                      (void **)&rng);
        if (status == EFI_NOT_FOUND) {
                pr_efi(sys_table, "EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
+               dodgy_get_random_bytes(sys_table);
                return EFI_SUCCESS;
        }
 
@@ -77,6 +106,17 @@ static efi_status_t get_dram_top(efi_system_table_t *sys_table_arg, u64 *top)
        return EFI_SUCCESS;
 }
 
+static void log_kernel_address(efi_system_table_t *sys_table_arg,
+                              unsigned long addr, unsigned long kaslr_addr)
+{
+       pr_efi(sys_table_arg, "KASLR reserve address is ");
+       log_hex(sys_table_arg, kaslr_addr);
+       efi_printk(sys_table_arg, "\n");
+       pr_efi(sys_table_arg, "Loading kernel to physical address ");
+       log_hex(sys_table_arg, addr);
+       efi_printk(sys_table_arg, "\n");
+}
+
 efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
                                        unsigned long *image_addr,
                                        unsigned long *image_size,
@@ -90,6 +130,7 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
        unsigned long nr_pages;
        void *old_image_addr = (void *)*image_addr;
        unsigned long preferred_offset;
+       unsigned long kaslr_address = 0;
 
        if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
                if (kaslr) {
@@ -137,8 +178,9 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
                 * base. Note that this may give suboptimal results on systems
                 * with discontiguous DRAM regions with large holes between them.
                 */
-               *reserve_addr = dram_base +
+               kaslr_address = dram_base +
                        ((dram_top - dram_base) >> 16) * (u16)efi_rnd.phys_seed;
+               *reserve_addr = kaslr_address;
 
                status = efi_call_early(allocate_pages, EFI_ALLOCATE_MAX_ADDRESS,
                                        EFI_LOADER_DATA, nr_pages,
@@ -179,6 +221,9 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
                }
                *image_addr = *reserve_addr + TEXT_OFFSET;
        }
+
+       log_kernel_address(sys_table_arg, *image_addr, kaslr_address);
+
        memcpy((void *)*image_addr, old_image_addr, kernel_size);
        *reserve_size = kernel_memsize;
 

^ permalink raw reply related	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 13/13] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
  2016-01-07 18:46     ` Mark Rutland
  (?)
@ 2016-01-07 19:07       ` Kees Cook
  -1 siblings, 0 replies; 156+ messages in thread
From: Kees Cook @ 2016-01-07 19:07 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Ard Biesheuvel, linux-arm-kernel, kernel-hardening, Will Deacon,
	Catalin Marinas, Leif Lindholm, LKML, stuart.yoder,
	bhupesh.sharma, Arnd Bergmann, Marc Zyngier, Christoffer Dall

On Thu, Jan 7, 2016 at 10:46 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi Ard,
>
> I had a go at testing this on Juno with a hacked-up PRNG, and while
> everything seems to work, I think we need to make the address selection
> more robust to sparse memory maps (which I believe are going to be
> fairly common).
>
> Info dump below and suggestion below.
>
> Other than that, this looks really nice -- I'll do other review in a
> separate reply.
>
> On Wed, Dec 30, 2015 at 04:26:12PM +0100, Ard Biesheuvel wrote:
>> Since arm64 does not use a decompressor that supplies an execution
>> environment where it is feasible to some extent to provide a source of
>> randomness, the arm64 KASLR kernel depends on the bootloader to supply
>> some random bits in register x1 upon kernel entry.
>>
>> On UEFI systems, we can use the EFI_RNG_PROTOCOL, if supplied, to obtain
>> some random bits. At the same time, use it to randomize the offset of the
>> kernel Image in physical memory.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  arch/arm64/kernel/efi-entry.S             |   7 +-
>>  drivers/firmware/efi/libstub/arm-stub.c   |   1 -
>>  drivers/firmware/efi/libstub/arm64-stub.c | 134 +++++++++++++++++---
>>  include/linux/efi.h                       |   5 +-
>>  4 files changed, 127 insertions(+), 20 deletions(-)
>
> [...]
>
>> @@ -36,13 +106,42 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>>       if (preferred_offset < dram_base)
>>               preferred_offset += SZ_2M;
>>
>> -     /* Relocate the image, if required. */
>>       kernel_size = _edata - _text;
>> -     if (*image_addr != preferred_offset) {
>> -             kernel_memsize = kernel_size + (_end - _edata);
>> +     kernel_memsize = kernel_size + (_end - _edata);
>> +
>> +     if (IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && efi_rnd.phys_seed) {
>> +             /*
>> +              * If KASLR is enabled, and we have some randomness available,
>> +              * locate the kernel at a randomized offset in physical memory.
>> +              */
>> +             u64 dram_top = dram_base;
>> +
>> +             status = get_dram_top(sys_table_arg, &dram_top);
>> +             if (status != EFI_SUCCESS) {
>> +                     pr_efi_err(sys_table_arg, "get_dram_size() failed\n");
>> +                     return status;
>> +             }
>> +
>> +             kernel_memsize += SZ_2M;
>> +             nr_pages = round_up(kernel_memsize, EFI_ALLOC_ALIGN) /
>> +                                 EFI_PAGE_SIZE;
>>
>>               /*
>> -              * First, try a straight allocation at the preferred offset.
>> +              * Use the random seed to scale the size and add it to the DRAM
>> +              * base. Note that this may give suboptimal results on systems
>> +              * with discontiguous DRAM regions with large holes between them.
>> +              */
>> +             *reserve_addr = dram_base +
>> +                     ((dram_top - dram_base) >> 16) * (u16)efi_rnd.phys_seed;
>
> I think that "suboptimal" is somewhat an understatement. Across 10
> consecutive runs I ended up getting the same address 7 times:
>
> EFI stub: Seed is 0x0a82016804fdc064
> EFI stub: KASLR reserve address is 0x0000000832c48000
> EFI stub: Loading kernel to physical address 0x00000000fe080000
>
> EFI stub: Seed is 0x0a820168050c09b2
> EFI stub: KASLR reserve address is 0x00000000c59e0000
> EFI stub: Loading kernel to physical address 0x00000000c4e80000 *
>
> EFI stub: Seed is 0x0a8001680511c701
> EFI stub: KASLR reserve address is 0x00000007feb40000
> EFI stub: Loading kernel to physical address 0x00000000fe080000
>
> EFI stub: Seed is 0x0a8001680094d2a2
> EFI stub: KASLR reserve address is 0x0000000895bd0000
> EFI stub: Loading kernel to physical address 0x0000000895080000 *
>
> EFI stub: Seed is 0x88820167ea986527
> EFI stub: KASLR reserve address is 0x00000000bc570000
> EFI stub: Loading kernel to physical address 0x00000000bb880000 *
>
> EFI stub: Seed is 0x0882116805029414
> EFI stub: KASLR reserve address is 0x00000005955a0000
> EFI stub: Loading kernel to physical address 0x00000000fe080000
>
> EFI stub: Seed is 0x8a821168050104ab
> EFI stub: KASLR reserve address is 0x0000000639600000
> EFI stub: Loading kernel to physical address 0x00000000fe080000
>
> EFI stub: Seed is 0x08820168050671c6
> EFI stub: KASLR reserve address is 0x00000005250f0000
> EFI stub: Loading kernel to physical address 0x00000000fe080000
>
> EFI stub: Seed is 0x08821167ea67381f
> EFI stub: KASLR reserve address is 0x000000080e538000
> EFI stub: Loading kernel to physical address 0x00000000fe080000
>
> EFI stub: Seed is 0x0a801168050cb810
> EFI stub: KASLR reserve address is 0x00000006b20e0000
> EFI stub: Loading kernel to physical address 0x00000000fe080000
>
> My "Seed" here is just the CNTVCT value, with phys_seed being a xor of
> each of the 16 bit chunks (see diff at the end of hte email). Judging by
> the reserve addresses, I don't think the PRNG is to blame -- it's just
> that that gaps are large relative to the available RAM and swallow up
> much of the entropy, forcing a fall back to the same address.
>
> One thing we could do is to perform the address selection in the space
> of available memory, excluding gaps entirely. i.e. sum up the available
> memory, select the Nth available byte, then walk the memory map to
> convert that back to a real address. We might still choose an address
> that cannot be used (e.g. if the kernel would hang over the end of a
> region), but it'd be rarer than hitting a gap.

This is basically what I did on x86. I walked the memory map, counting
the number of positions that the kernel would fit into, then used the
random number to pick between position 0 and the max position, then
walked the memory map again to spit out the memory address for that
position.
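
Roughly the shape of it, minus all the x86 details (struct mem_region and
slots_in() here are stand-ins for illustration, not the real aslr.c code):

	/* How many align-sized slots of [start, end) can hold the image? */
	static unsigned long slots_in(unsigned long start, unsigned long end,
				      unsigned long image_size,
				      unsigned long align)
	{
		unsigned long first = ALIGN(start, align);

		if (end < first + image_size)
			return 0;
		return (end - first - image_size) / align + 1;
	}

	static unsigned long find_random_slot(struct mem_region *r, int nr,
					      unsigned long image_size,
					      unsigned long align, u64 rnd)
	{
		unsigned long total = 0, slot, n;
		int i;

		for (i = 0; i < nr; i++)	/* pass 1: count candidate slots */
			total += slots_in(r[i].start, r[i].end, image_size, align);
		if (!total)
			return 0;

		slot = rnd % total;		/* pick one */

		for (i = 0; i < nr; i++) {	/* pass 2: slot index -> address */
			n = slots_in(r[i].start, r[i].end, image_size, align);
			if (slot < n)
				return ALIGN(r[i].start, align) + slot * align;
			slot -= n;
		}
		return 0;
	}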

Maybe this could be extracted into a lib for all architectures? Might
be more pain than it's worth, but at least there's no reason to write
it from scratch. See find_random_addr() and its helpers in
arch/x86/boot/compressed/aslr.c.

-Kees

>
> Thoughts?
>
> For the above, my EFI memory map looks like:
>
> [    0.000000] Processing EFI memory map:
> [    0.000000]   0x000008000000-0x00000bffffff [Memory Mapped I/O  |RUN|  |  |  |  |  |   |  |  |  |UC]
> [    0.000000]   0x00001c170000-0x00001c170fff [Memory Mapped I/O  |RUN|  |  |  |  |  |   |  |  |  |UC]
> [    0.000000]   0x000080000000-0x00008000ffff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x000080010000-0x00009fdfffff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x00009fe00000-0x00009fe0ffff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x00009fe10000-0x0000dfffffff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x0000e00f0000-0x0000fde49fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x0000fde4a000-0x0000febc9fff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x0000febca000-0x0000febcdfff [ACPI Reclaim Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]*
> [    0.000000]   0x0000febce000-0x0000febcefff [ACPI Memory NVS    |   |  |  |  |  |  |   |WB|WT|WC|UC]*
> [    0.000000]   0x0000febcf000-0x0000febd0fff [ACPI Reclaim Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]*
> [    0.000000]   0x0000febd1000-0x0000feffffff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x000880000000-0x0009f98aafff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x0009f98ab000-0x0009f98acfff [Loader Data        |   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x0009f98ad000-0x0009fa42afff [Loader Code        |   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x0009fa42b000-0x0009faf6efff [Boot Code          |   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x0009faf6f000-0x0009fafa9fff [Runtime Data       |RUN|  |  |  |  |  |   |WB|WT|WC|UC]*
> [    0.000000]   0x0009fafaa000-0x0009ff767fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x0009ff768000-0x0009ff768fff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x0009ff769000-0x0009ff76efff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x0009ff76f000-0x0009ffdddfff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x0009ffdde000-0x0009ffe72fff [Conventional Memory|   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x0009ffe73000-0x0009fff6dfff [Boot Code          |   |  |  |  |  |  |   |WB|WT|WC|UC]
> [    0.000000]   0x0009fff6e000-0x0009fffaefff [Runtime Code       |RUN|  |  |  |  |  |   |WB|WT|WC|UC]*
> [    0.000000]   0x0009fffaf000-0x0009ffffefff [Runtime Data       |RUN|  |  |  |  |  |   |WB|WT|WC|UC]*
> [    0.000000]   0x0009fffff000-0x0009ffffffff [Boot Data          |   |  |  |  |  |  |   |WB|WT|WC|UC]
>
> I've included my local hacks below in case they are useful.
>
> Thanks,
> Mark.
>
> ---->8----
> diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
> index 27a1a92..00c6640 100644
> --- a/drivers/firmware/efi/libstub/arm64-stub.c
> +++ b/drivers/firmware/efi/libstub/arm64-stub.c
> @@ -30,6 +30,34 @@ extern struct {
>
>  extern bool kaslr;
>
> +static void log_hex(efi_system_table_t *sys_table_arg, unsigned long val)
> +{
> +       const char hex[16] = "0123456789abcdef";
> +       char *strp, str[] = "0x0000000000000000";
> +       strp = str + 18;
> +
> +       do {
> +               *(--strp) = hex[val & 0xf];
> +       } while (val >>= 4);
> +
> +       efi_printk(sys_table_arg, str);
> +}
> +
> +static void dodgy_get_random_bytes(efi_system_table_t *sys_table)
> +{
> +       u64 seed;
> +       pr_efi(sys_table, "using UNSAFE NON-RANDOM number generator\n");
> +
> +       asm volatile("mrs %0, cntvct_el0\n" : "=r" (seed));
> +
> +       pr_efi(sys_table, "Seed is ");
> +       log_hex(sys_table, seed);
> +       efi_printk(sys_table, "\n");
> +
> +       efi_rnd.virt_seed = seed;
> +       efi_rnd.phys_seed = seed ^ (seed >> 16) ^ (seed >> 32) ^ (seed >> 48);
> +}
> +
>  static int efi_get_random_bytes(efi_system_table_t *sys_table)
>  {
>         efi_guid_t rng_proto = EFI_RNG_PROTOCOL_GUID;
> @@ -40,6 +68,7 @@ static int efi_get_random_bytes(efi_system_table_t *sys_table)
>                                                       (void **)&rng);
>         if (status == EFI_NOT_FOUND) {
>                 pr_efi(sys_table, "EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
> +               dodgy_get_random_bytes(sys_table);
>                 return EFI_SUCCESS;
>         }
>
> @@ -77,6 +106,17 @@ static efi_status_t get_dram_top(efi_system_table_t *sys_table_arg, u64 *top)
>         return EFI_SUCCESS;
>  }
>
> +static void log_kernel_address(efi_system_table_t *sys_table_arg,
> +                              unsigned long addr, unsigned long kaslr_addr)
> +{
> +       pr_efi(sys_table_arg, "KASLR reserve address is ");
> +       log_hex(sys_table_arg, kaslr_addr);
> +       efi_printk(sys_table_arg, "\n");
> +       pr_efi(sys_table_arg, "Loading kernel to physical address ");
> +       log_hex(sys_table_arg, addr);
> +       efi_printk(sys_table_arg, "\n");
> +}
> +
>  efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>                                         unsigned long *image_addr,
>                                         unsigned long *image_size,
> @@ -90,6 +130,7 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>         unsigned long nr_pages;
>         void *old_image_addr = (void *)*image_addr;
>         unsigned long preferred_offset;
> +       unsigned long kaslr_address = 0;
>
>         if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
>                 if (kaslr) {
> @@ -137,8 +178,9 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>                  * base. Note that this may give suboptimal results on systems
>                  * with discontiguous DRAM regions with large holes between them.
>                  */
> -               *reserve_addr = dram_base +
> +               kaslr_address = dram_base +
>                         ((dram_top - dram_base) >> 16) * (u16)efi_rnd.phys_seed;
> +               *reserve_addr = kaslr_address;
>
>                 status = efi_call_early(allocate_pages, EFI_ALLOCATE_MAX_ADDRESS,
>                                         EFI_LOADER_DATA, nr_pages,
> @@ -179,6 +221,9 @@ efi_status_t __init handle_kernel_image(efi_system_table_t *sys_table_arg,
>                 }
>                 *image_addr = *reserve_addr + TEXT_OFFSET;
>         }
> +
> +       log_kernel_address(sys_table_arg, *image_addr, kaslr_address);
> +
>         memcpy((void *)*image_addr, old_image_addr, kernel_size);
>         *reserve_size = kernel_memsize;
>
>



-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 12/13] arm64: add support for relocatable kernel
  2015-12-30 15:26   ` Ard Biesheuvel
  (?)
@ 2016-01-08 10:17     ` James Morse
  -1 siblings, 0 replies; 156+ messages in thread
From: James Morse @ 2016-01-08 10:17 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	mark.rutland, leif.lindholm, keescook, linux-kernel,
	stuart.yoder, bhupesh.sharma, arnd, marc.zyngier,
	christoffer.dall

Hi Ard!

On 30/12/15 15:26, Ard Biesheuvel wrote:
> This adds support for runtime relocation of the kernel Image, by
> building it as a PIE (ET_DYN) executable and applying the dynamic
> relocations in the early boot code.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---

> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 01a33e42ed70..ab582ee58b58 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -243,6 +253,16 @@ ENDPROC(stext)
>  preserve_boot_args:
>  	mov	x21, x0				// x21=FDT
>  
> +#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
> +	/*
> +	 * Mask off the bits of the random value supplied in x1 so it can serve
> +	 * as a KASLR displacement value which will move the kernel image to a
> +	 * random offset in the lower half of the VMALLOC area.
> +	 */
> +	mov	x23, #(1 << (VA_BITS - 2)) - 1
> +	and	x23, x23, x1, lsl #SWAPPER_BLOCK_SHIFT
> +#endif

I've managed to make this fail to boot by providing a seed that caused
the kernel to overlap a 1G boundary on a 4K system.
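
(For example -- assuming 4K pages, VA_BITS=39, SWAPPER_BLOCK_SHIFT=21 and
a 1G-aligned un-randomized base -- an x1 of 0x1ff gives a displacement of
0x3fe00000, i.e. only 2M below the next 1G boundary, so any Image larger
than 2M straddles it.)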

(It looks like your v3 may have the same issue - but I haven't tested it.)


> +
>  	adr_l	x0, boot_args			// record the contents of
>  	stp	x21, x1, [x0]			// x0 .. x3 at kernel entry
>  	stp	x2, x3, [x0, #16]


Thanks!

James

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 12/13] arm64: add support for relocatable kernel
  2016-01-08 10:17     ` James Morse
  (?)
@ 2016-01-08 10:25       ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-08 10:25 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Mark Rutland, Leif Lindholm, Kees Cook, linux-kernel,
	Stuart Yoder, Sharma Bhupesh, Arnd Bergmann, Marc Zyngier,
	Christoffer Dall

On 8 January 2016 at 11:17, James Morse <james.morse@arm.com> wrote:
> Hi Ard!
>
> On 30/12/15 15:26, Ard Biesheuvel wrote:
>> This adds support for runtime relocation of the kernel Image, by
>> building it as a PIE (ET_DYN) executable and applying the dynamic
>> relocations in the early boot code.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>
>> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
>> index 01a33e42ed70..ab582ee58b58 100644
>> --- a/arch/arm64/kernel/head.S
>> +++ b/arch/arm64/kernel/head.S
>> @@ -243,6 +253,16 @@ ENDPROC(stext)
>>  preserve_boot_args:
>>       mov     x21, x0                         // x21=FDT
>>
>> +#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
>> +     /*
>> +      * Mask off the bits of the random value supplied in x1 so it can serve
>> +      * as a KASLR displacement value which will move the kernel image to a
>> +      * random offset in the lower half of the VMALLOC area.
>> +      */
>> +     mov     x23, #(1 << (VA_BITS - 2)) - 1
>> +     and     x23, x23, x1, lsl #SWAPPER_BLOCK_SHIFT
>> +#endif
>
> I've managed to make this fail to boot by providing a seed that caused
> the kernel to overlap a 1G boundary on a 4K system.
>

Ah, yes. Thanks for spotting that.

> (It looks like your v3 may have the same issue - but I haven't tested it.)
>
>

Yes, it does. It probably makes sense to sacrifice some entropy bits
and simply round the KASLR offset to a multiple of the smallest power
of two that covers the kernel Image size, rather than hacking up some
logic in assembly to test whether we are crossing a PMD/PUD boundary.
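
Something along these lines (illustration only, not actual head.S logic;
it assumes the link-time base is itself aligned to the same power of two):

	/*
	 * Force the displacement to be a multiple of the smallest power of
	 * two that covers the Image, so the Image always sits inside one
	 * naturally aligned window and can never straddle a larger PMD or
	 * PUD boundary.
	 */
	static u64 kaslr_displacement(u64 seed, u64 image_size)
	{
		u64 gran  = roundup_pow_of_two(image_size);
		u64 range = BIT(VA_BITS - 2);	/* lower half of the vmalloc area */

		return (seed % (range / gran)) * gran;
	}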


>> +
>>       adr_l   x0, boot_args                   // record the contents of
>>       stp     x21, x1, [x0]                   // x0 .. x3 at kernel entry
>>       stp     x2, x3, [x0, #16]
>
>
> Thanks!
>
> James
>

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
  2015-12-30 15:26   ` Ard Biesheuvel
  (?)
@ 2016-01-08 11:26     ` Mark Rutland
  -1 siblings, 0 replies; 156+ messages in thread
From: Mark Rutland @ 2016-01-08 11:26 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	leif.lindholm, keescook, linux-kernel, stuart.yoder,
	bhupesh.sharma, arnd, marc.zyngier, christoffer.dall

Hi,

On Wed, Dec 30, 2015 at 04:26:10PM +0100, Ard Biesheuvel wrote:
> This relaxes the kernel Image placement requirements, so that it
> may be placed at any 2 MB aligned offset in physical memory.
> 
> This is accomplished by ignoring PHYS_OFFSET when installing
> memblocks, and accounting for the apparent virtual offset of
> the kernel Image. As a result, virtual address references
> below PAGE_OFFSET are correctly mapped onto physical references
> into the kernel Image regardless of where it sits in memory.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  Documentation/arm64/booting.txt  | 12 ++---
>  arch/arm64/include/asm/boot.h    |  5 ++
>  arch/arm64/include/asm/kvm_mmu.h |  2 +-
>  arch/arm64/include/asm/memory.h  | 15 +++---
>  arch/arm64/kernel/head.S         |  6 ++-
>  arch/arm64/mm/init.c             | 50 +++++++++++++++++++-
>  arch/arm64/mm/mmu.c              | 12 +++++
>  7 files changed, 86 insertions(+), 16 deletions(-)
> 
> diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
> index 701d39d3171a..03e02ebc1b0c 100644
> --- a/Documentation/arm64/booting.txt
> +++ b/Documentation/arm64/booting.txt
> @@ -117,14 +117,14 @@ Header notes:
>    depending on selected features, and is effectively unbound.
>  
>  The Image must be placed text_offset bytes from a 2MB aligned base
> -address near the start of usable system RAM and called there. Memory
> -below that base address is currently unusable by Linux, and therefore it
> -is strongly recommended that this location is the start of system RAM.
> -The region between the 2 MB aligned base address and the start of the
> -image has no special significance to the kernel, and may be used for
> -other purposes.
> +address anywhere in usable system RAM and called there. The region
> +between the 2 MB aligned base address and the start of the image has no
> +special significance to the kernel, and may be used for other purposes.
>  At least image_size bytes from the start of the image must be free for
>  use by the kernel.
> +NOTE: versions prior to v4.6 cannot make use of memory below the
> +physical offset of the Image so it is recommended that the Image be
> +placed as close as possible to the start of system RAM.

We need a head flag for this so that a bootloader can determine whether
it can load the kernel anywhere or should try for the lowest possible
address. Then the note would describe the recommended behaviour in the
absence of the flag.

The flag for KASLR isn't sufficient as you can build without it (and it
only tells the bootloader that the kernel accepts entropy in x1).
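
For illustration only, a loader-side check of such a flag could be as
simple as the sketch below (it assumes the existing 64-bit flags field
at offset 24 of the Image header, and invents bit 3 as the "may be
placed anywhere" bit -- no such bit is defined today):

#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical loader-side sketch: read the little-endian flags field
 * at offset 24 of the arm64 Image header and test a made-up placement
 * bit.
 */
static bool kernel_can_be_loaded_anywhere(const uint8_t *image)
{
	uint64_t flags = 0;
	int i;

	for (i = 0; i < 8; i++)
		flags |= (uint64_t)image[24 + i] << (8 * i);

	return flags & (1ULL << 3);	/* bit 3: assumption, not ABI */
}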

We might also want to consider if we need to determine whether or not
the bootloader actually provided entropy (and if we need a more general
handshake between the bootloader and kernel to determine that kind of
thing).

>  Any memory described to the kernel (even that below the start of the
>  image) which is not marked as reserved from the kernel (e.g., with a
> diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
> index 81151b67b26b..984cb0fa61ce 100644
> --- a/arch/arm64/include/asm/boot.h
> +++ b/arch/arm64/include/asm/boot.h
> @@ -11,4 +11,9 @@
>  #define MIN_FDT_ALIGN		8
>  #define MAX_FDT_SIZE		SZ_2M
>  
> +/*
> + * arm64 requires the kernel image to be 2 MB aligned

Nit: The image is TEXT_OFFSET from that 2M-aligned base.
s/image/mapping/? 

[...]

> +static void __init enforce_memory_limit(void)
> +{
> +	const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
> +	u64 to_remove = memblock_phys_mem_size() - memory_limit;
> +	phys_addr_t max_addr = 0;
> +	struct memblock_region *r;
> +
> +	if (memory_limit == (phys_addr_t)ULLONG_MAX)
> +		return;
> +
> +	/*
> +	 * The kernel may be high up in physical memory, so try to apply the
> +	 * limit below the kernel first, and only let the generic handling
> +	 * take over if it turns out we haven't clipped enough memory yet.
> +	 */

We might want to preserve the low 4GB if possible, for those IOMMU-less
devices which can only do 32-bit addressing.

Otherwise this looks good to me!

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
  2016-01-08 11:26     ` Mark Rutland
  (?)
@ 2016-01-08 11:34       ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-08 11:34 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Leif Lindholm, Kees Cook, linux-kernel, Stuart Yoder,
	Sharma Bhupesh, Arnd Bergmann, Marc Zyngier, Christoffer Dall

On 8 January 2016 at 12:26, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi,
>
> On Wed, Dec 30, 2015 at 04:26:10PM +0100, Ard Biesheuvel wrote:
>> This relaxes the kernel Image placement requirements, so that it
>> may be placed at any 2 MB aligned offset in physical memory.
>>
>> This is accomplished by ignoring PHYS_OFFSET when installing
>> memblocks, and accounting for the apparent virtual offset of
>> the kernel Image. As a result, virtual address references
>> below PAGE_OFFSET are correctly mapped onto physical references
>> into the kernel Image regardless of where it sits in memory.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  Documentation/arm64/booting.txt  | 12 ++---
>>  arch/arm64/include/asm/boot.h    |  5 ++
>>  arch/arm64/include/asm/kvm_mmu.h |  2 +-
>>  arch/arm64/include/asm/memory.h  | 15 +++---
>>  arch/arm64/kernel/head.S         |  6 ++-
>>  arch/arm64/mm/init.c             | 50 +++++++++++++++++++-
>>  arch/arm64/mm/mmu.c              | 12 +++++
>>  7 files changed, 86 insertions(+), 16 deletions(-)
>>
>> diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
>> index 701d39d3171a..03e02ebc1b0c 100644
>> --- a/Documentation/arm64/booting.txt
>> +++ b/Documentation/arm64/booting.txt
>> @@ -117,14 +117,14 @@ Header notes:
>>    depending on selected features, and is effectively unbound.
>>
>>  The Image must be placed text_offset bytes from a 2MB aligned base
>> -address near the start of usable system RAM and called there. Memory
>> -below that base address is currently unusable by Linux, and therefore it
>> -is strongly recommended that this location is the start of system RAM.
>> -The region between the 2 MB aligned base address and the start of the
>> -image has no special significance to the kernel, and may be used for
>> -other purposes.
>> +address anywhere in usable system RAM and called there. The region
>> +between the 2 MB aligned base address and the start of the image has no
>> +special significance to the kernel, and may be used for other purposes.
>>  At least image_size bytes from the start of the image must be free for
>>  use by the kernel.
>> +NOTE: versions prior to v4.6 cannot make use of memory below the
>> +physical offset of the Image so it is recommended that the Image be
>> +placed as close as possible to the start of system RAM.
>
> We need a head flag for this so that a bootloader can determine whether
> it can load the kernel anywhere or should try for the lowest possible
> address. Then the note would describe the recommended behaviour in the
> absence of the flag.
>
> The flag for KASLR isn't sufficient as you can build without it (and it
> only tells the bootloader that the kernel accepts entropy in x1).
>

Indeed, I will change that.

> We might also want to consider if we need to determine whether or not
> the bootloader actually provided entropy (and if we need a more general
> handshake between the bootloader and kernel to determine that kind of
> thing).
>

Yes, that is interesting. We should also think about how to handle
'nokaslr' if it appears on the command line, since in the !EFI case,
we will be way too late to parse this, and a capable kernel will
already be running from a randomized offset. That means it is the
bootloader's responsibility to ensure that the presence of 'nokaslr'
and the entropy in x1 are consistent with each other.
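
In other words, the loader-side logic would boil down to something like
this sketch (all names made up):

#include <stdint.h>
#include <string.h>

/* stand-in for whatever entropy source the loader has */
extern uint64_t loader_get_random_u64(void);

/*
 * Sketch: if 'nokaslr' is on the command line the loader simply does
 * not pass a seed, so the kernel stays at the default offset.
 */
static uint64_t seed_for_x1(const char *cmdline)
{
	if (strstr(cmdline, "nokaslr"))
		return 0;
	return loader_get_random_u64();
}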

>>  Any memory described to the kernel (even that below the start of the
>>  image) which is not marked as reserved from the kernel (e.g., with a
>> diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
>> index 81151b67b26b..984cb0fa61ce 100644
>> --- a/arch/arm64/include/asm/boot.h
>> +++ b/arch/arm64/include/asm/boot.h
>> @@ -11,4 +11,9 @@
>>  #define MIN_FDT_ALIGN                8
>>  #define MAX_FDT_SIZE         SZ_2M
>>
>> +/*
>> + * arm64 requires the kernel image to be 2 MB aligned
>
> Nit: The image is TEXT_OFFSET from that 2M-aligned base.
> s/image/mapping/?
>
> [...]
>

Yep. I hate TEXT_OFFSET, did I mention that?

>> +static void __init enforce_memory_limit(void)
>> +{
>> +     const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
>> +     u64 to_remove = memblock_phys_mem_size() - memory_limit;
>> +     phys_addr_t max_addr = 0;
>> +     struct memblock_region *r;
>> +
>> +     if (memory_limit == (phys_addr_t)ULLONG_MAX)
>> +             return;
>> +
>> +     /*
>> +      * The kernel may be high up in physical memory, so try to apply the
>> +      * limit below the kernel first, and only let the generic handling
>> +      * take over if it turns out we haven't clipped enough memory yet.
>> +      */
>
> We might want to preserve the low 4GB if possible, for those IOMMU-less
> devices which can only do 32-bit addressing.
>
> Otherwise this looks good to me!
>

Thanks,
Ard.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
  2016-01-08 11:34       ` Ard Biesheuvel
  (?)
@ 2016-01-08 11:43         ` Mark Rutland
  -1 siblings, 0 replies; 156+ messages in thread
From: Mark Rutland @ 2016-01-08 11:43 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Leif Lindholm, Kees Cook, linux-kernel, Stuart Yoder,
	Sharma Bhupesh, Arnd Bergmann, Marc Zyngier, Christoffer Dall

On Fri, Jan 08, 2016 at 12:34:18PM +0100, Ard Biesheuvel wrote:
> On 8 January 2016 at 12:26, Mark Rutland <mark.rutland@arm.com> wrote:
> > We might also want to consider if we need to determine whether or not
> > the bootloader actually provided entropy (and if we need a more general
> > handshake between the bootloader and kernel to determine that kind of
> > thing).
> 
> Yes, that is interesting. We should also think about how to handle
> 'nokaslr' if it appears on the command line, since in the !EFI case,
> we will be way too late to parse this, and a capable kernel will
> already be running from a randomized offset. That means it is the
> bootloader's responsibility to ensure that the presence of 'nokaslr'
> and the entropy in x1 are consistent with each other.

Argh, I hadn't considered that. :(

In the absence of a pre-kernel environment, the best thing we can do is
probably to print a giant warning if 'nokaslr' is present but there was
entropy (where that's determined based on some handshake/magic/flag).
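
i.e. something along these lines (sketch only; kaslr_seed_provided
stands for whatever handshake/magic/flag we end up with):

/*
 * Sketch: warn that 'nokaslr' came too late to have any effect.
 */
static void __init warn_nokaslr_too_late(void)
{
	if (kaslr_seed_provided && strstr(boot_command_line, "nokaslr"))
		pr_warn("nokaslr ignored: the bootloader already supplied a seed and the kernel is running at a randomized offset\n");
}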

> >>  Any memory described to the kernel (even that below the start of the
> >>  image) which is not marked as reserved from the kernel (e.g., with a
> >> diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
> >> index 81151b67b26b..984cb0fa61ce 100644
> >> --- a/arch/arm64/include/asm/boot.h
> >> +++ b/arch/arm64/include/asm/boot.h
> >> @@ -11,4 +11,9 @@
> >>  #define MIN_FDT_ALIGN                8
> >>  #define MAX_FDT_SIZE         SZ_2M
> >>
> >> +/*
> >> + * arm64 requires the kernel image to be 2 MB aligned
> >
> > Nit: The image is TEXT_OFFSET from that 2M-aligned base.
> > s/image/mapping/?
> >
> > [...]
> >
> 
> Yep. I hate TEXT_OFFSET, did I mention that?

I would also love to remove it, but I believe it's simply too late. :(

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 04/13] arm64: decouple early fixmap init from linear mapping
  2015-12-30 15:26   ` Ard Biesheuvel
  (?)
@ 2016-01-08 12:00     ` Catalin Marinas
  -1 siblings, 0 replies; 156+ messages in thread
From: Catalin Marinas @ 2016-01-08 12:00 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, mark.rutland,
	leif.lindholm, keescook, linux-kernel, arnd, bhupesh.sharma,
	stuart.yoder, marc.zyngier, christoffer.dall

On Wed, Dec 30, 2015 at 04:26:03PM +0100, Ard Biesheuvel wrote:
> @@ -583,33 +555,42 @@ void __init early_fixmap_init(void)
>  	unsigned long addr = FIXADDR_START;
>  
>  	pgd = pgd_offset_k(addr);
> -	pgd_populate(&init_mm, pgd, bm_pud);
> -	pud = pud_offset(pgd, addr);
> -	pud_populate(&init_mm, pud, bm_pmd);
> -	pmd = pmd_offset(pud, addr);
> -	pmd_populate_kernel(&init_mm, pmd, bm_pte);
> +#if CONFIG_PGTABLE_LEVELS > 3
> +	if (pgd_none(*pgd)) {
> +		static pud_t bm_pud[PTRS_PER_PUD] __pgdir;
> +
> +		pgd_populate(&init_mm, pgd, bm_pud);
> +		memblock_reserve(__pa(bm_pud), sizeof(bm_pud));
> +	}
> +	pud = (pud_t *)__phys_to_kimg(pud_offset_phys(pgd, addr));
> +#else
> +	pud = (pud_t *)pgd;
> +#endif
> +#if CONFIG_PGTABLE_LEVELS > 2
> +	if (pud_none(*pud)) {
> +		static pmd_t bm_pmd[PTRS_PER_PMD] __pgdir;
> +
> +		pud_populate(&init_mm, pud, bm_pmd);
> +		memblock_reserve(__pa(bm_pmd), sizeof(bm_pmd));
> +	}
> +	pmd = (pmd_t *)__phys_to_kimg(pmd_offset_phys(pud, addr));
> +#else
> +	pmd = (pmd_t *)pud;
> +#endif
> +	if (pmd_none(*pmd)) {
> +		static pte_t bm_pte[PTRS_PER_PTE] __pgdir;
> +
> +		pmd_populate_kernel(&init_mm, pmd, bm_pte);
> +		memblock_reserve(__pa(bm_pte), sizeof(bm_pte));
> +	}
> +	__fixmap_pte = (pte_t *)__phys_to_kimg(pmd_page_paddr(*pmd));

I haven't tried but could you not avoid the #if and just rely on the
pud_none() etc. definitions to be 0 and the compiler+linker optimising
the irrelevant code out?
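
i.e. roughly the shape below (untested sketch; the *_offset_phys
accessors would still need folded-level definitions, which may be where
the tinkering mentioned further down comes in):

	pgd = pgd_offset_k(addr);
	if (pgd_none(*pgd)) {		/* constant false when the pud level is folded */
		static pud_t bm_pud[PTRS_PER_PUD] __pgdir;

		pgd_populate(&init_mm, pgd, bm_pud);
		memblock_reserve(__pa(bm_pud), sizeof(bm_pud));
	}
	pud = (pud_t *)__phys_to_kimg(pud_offset_phys(pgd, addr));

	if (pud_none(*pud)) {		/* constant false when the pmd level is folded */
		static pmd_t bm_pmd[PTRS_PER_PMD] __pgdir;

		pud_populate(&init_mm, pud, bm_pmd);
		memblock_reserve(__pa(bm_pmd), sizeof(bm_pmd));
	}
	pmd = (pmd_t *)__phys_to_kimg(pmd_offset_phys(pud, addr));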

-- 
Catalin

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 04/13] arm64: decouple early fixmap init from linear mapping
  2016-01-08 12:00     ` Catalin Marinas
  (?)
@ 2016-01-08 12:05       ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-08 12:05 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Mark Rutland,
	Leif Lindholm, Kees Cook, linux-kernel, Arnd Bergmann,
	Sharma Bhupesh, Stuart Yoder, Marc Zyngier, Christoffer Dall

On 8 January 2016 at 13:00, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Wed, Dec 30, 2015 at 04:26:03PM +0100, Ard Biesheuvel wrote:
>> @@ -583,33 +555,42 @@ void __init early_fixmap_init(void)
>>       unsigned long addr = FIXADDR_START;
>>
>>       pgd = pgd_offset_k(addr);
>> -     pgd_populate(&init_mm, pgd, bm_pud);
>> -     pud = pud_offset(pgd, addr);
>> -     pud_populate(&init_mm, pud, bm_pmd);
>> -     pmd = pmd_offset(pud, addr);
>> -     pmd_populate_kernel(&init_mm, pmd, bm_pte);
>> +#if CONFIG_PGTABLE_LEVELS > 3
>> +     if (pgd_none(*pgd)) {
>> +             static pud_t bm_pud[PTRS_PER_PUD] __pgdir;
>> +
>> +             pgd_populate(&init_mm, pgd, bm_pud);
>> +             memblock_reserve(__pa(bm_pud), sizeof(bm_pud));
>> +     }
>> +     pud = (pud_t *)__phys_to_kimg(pud_offset_phys(pgd, addr));
>> +#else
>> +     pud = (pud_t *)pgd;
>> +#endif
>> +#if CONFIG_PGTABLE_LEVELS > 2
>> +     if (pud_none(*pud)) {
>> +             static pmd_t bm_pmd[PTRS_PER_PMD] __pgdir;
>> +
>> +             pud_populate(&init_mm, pud, bm_pmd);
>> +             memblock_reserve(__pa(bm_pmd), sizeof(bm_pmd));
>> +     }
>> +     pmd = (pmd_t *)__phys_to_kimg(pmd_offset_phys(pud, addr));
>> +#else
>> +     pmd = (pmd_t *)pud;
>> +#endif
>> +     if (pmd_none(*pmd)) {
>> +             static pte_t bm_pte[PTRS_PER_PTE] __pgdir;
>> +
>> +             pmd_populate_kernel(&init_mm, pmd, bm_pte);
>> +             memblock_reserve(__pa(bm_pte), sizeof(bm_pte));
>> +     }
>> +     __fixmap_pte = (pte_t *)__phys_to_kimg(pmd_page_paddr(*pmd));
>
> I haven't tried but could you not avoid the #if and just rely on the
> pud_none() etc. definitions to be 0 and the compiler+linker optimising
> the irrelevant code out?
>

I tried, but it requires some tinkering, so I gave up. I can have
another go based on the latest version (which will look a bit
different).

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 12/13] arm64: add support for relocatable kernel
  2015-12-30 15:26   ` Ard Biesheuvel
  (?)
@ 2016-01-08 12:36     ` Mark Rutland
  -1 siblings, 0 replies; 156+ messages in thread
From: Mark Rutland @ 2016-01-08 12:36 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, catalin.marinas,
	leif.lindholm, keescook, linux-kernel, stuart.yoder,
	bhupesh.sharma, arnd, marc.zyngier, christoffer.dall

On Wed, Dec 30, 2015 at 04:26:11PM +0100, Ard Biesheuvel wrote:
> This adds support for runtime relocation of the kernel Image, by
> building it as a PIE (ET_DYN) executable and applying the dynamic
> relocations in the early boot code.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  Documentation/arm64/booting.txt |  3 +-
>  arch/arm64/Kconfig              | 13 ++++
>  arch/arm64/Makefile             |  6 +-
>  arch/arm64/include/asm/memory.h |  3 +
>  arch/arm64/kernel/head.S        | 75 +++++++++++++++++++-
>  arch/arm64/kernel/setup.c       | 22 +++---
>  arch/arm64/kernel/vmlinux.lds.S |  9 +++
>  scripts/sortextable.c           |  4 +-
>  8 files changed, 117 insertions(+), 18 deletions(-)

[...]

> +#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
> +
> +#define R_AARCH64_RELATIVE	0x403
> +#define R_AARCH64_ABS64		0x101

Let's not duplicate asm/elf.h.

I have a patch to split the reloc types out into a separate header we
can reuse from assembly -- I'll send that momentarily. We can add
R_AARCH64_RELATIVE on top of that.
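
For reference, the shared header only needs plain #defines so head.S
can include it directly -- something like (file name assumed):

/* sketch: arch/arm64/include/asm/elf_relocs.h -- name assumed */
#ifndef __ASM_ELF_RELOCS_H
#define __ASM_ELF_RELOCS_H

#define R_AARCH64_ABS64		0x101
#define R_AARCH64_RELATIVE	0x403

#endif /* __ASM_ELF_RELOCS_H */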

> +
> +	/*
> +	 * Iterate over each entry in the relocation table, and apply the
> +	 * relocations in place.
> +	 */
> +	adr_l	x8, __dynsym_start		// start of symbol table
> +	adr_l	x9, __reloc_start		// start of reloc table
> +	adr_l	x10, __reloc_end		// end of reloc table
> +
> +0:	cmp	x9, x10
> +	b.hs	2f
> +	ldp	x11, x12, [x9], #24
> +	ldr	x13, [x9, #-8]
> +	cmp	w12, #R_AARCH64_RELATIVE
> +	b.ne	1f
> +	add	x13, x13, x23			// relocate
> +	str	x13, [x11, x23]
> +	b	0b
> +
> +1:	cmp	w12, #R_AARCH64_ABS64
> +	b.ne	0b
> +	add	x12, x12, x12, lsl #1		// symtab offset: 24x top word
> +	add	x12, x8, x12, lsr #(32 - 3)	// ... shifted into bottom word
> +	ldrsh	w14, [x12, #6]			// Elf64_Sym::st_shndx
> +	ldr	x15, [x12, #8]			// Elf64_Sym::st_value
> +	cmp	w14, #-0xf			// SHN_ABS (0xfff1) ?
> +	add	x14, x15, x23			// relocate
> +	csel	x15, x14, x15, ne
> +	add	x15, x13, x15
> +	str	x15, [x11, x23]
> +	b	0b

We need to clean each of these relocated locations to the PoU so that
they are visible to I-cache fetches.

As this is normal-cacheable we can post the maintenance with a DC CVAU
immediately after the store (no barriers necessary), and rely on the DSB
at 2f to complete all of those.

> +
> +2:	adr_l	x8, kimage_vaddr		// make relocated kimage_vaddr
> +	dc	cvac, x8			// value visible to secondaries
> +	dsb	sy				// with MMU off

Then we need:

	ic	iallu
	dsb	nsh
	isb

To make sure the I-side is consistent with the PoU.

As secondaries will do similarly in __enable_mmu we don't need to add
any code for them.

> +#endif
> +
>  	adr_l	sp, initial_sp, x4
>  	str_l	x21, __fdt_pointer, x5		// Save FDT pointer
>  
> -	ldr	x0, =KIMAGE_VADDR		// Save the offset between
> +	ldr_l	x0, kimage_vaddr		// Save the offset between
>  	sub	x24, x0, x24			// the kernel virtual and
>  	str_l	x24, kimage_voffset, x0		// physical mappings
>  
> @@ -462,6 +527,10 @@ ENDPROC(__mmap_switched)
>   * hotplug and needs to have the same protections as the text region
>   */
>  	.section ".text","ax"
> +
> +ENTRY(kimage_vaddr)
> +	.quad		_text - TEXT_OFFSET
> +
>  /*
>   * If we're fortunate enough to boot at EL2, ensure that the world is
>   * sane before dropping to EL1.
> @@ -622,7 +691,7 @@ ENTRY(secondary_startup)
>  	adrp	x26, swapper_pg_dir
>  	bl	__cpu_setup			// initialise processor
>  
> -	ldr	x8, =KIMAGE_VADDR
> +	ldr	x8, kimage_vaddr
>  	ldr	w9, 0f
>  	sub	x27, x8, w9, sxtw		// address to jump to after enabling the MMU
>  	b	__enable_mmu
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index 96177a7c0f05..2faee6042e99 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -292,16 +292,15 @@ u64 __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = INVALID_HWID };
>  
>  void __init setup_arch(char **cmdline_p)
>  {
> -	static struct vm_struct vmlinux_vm __initdata = {
> -		.addr		= (void *)KIMAGE_VADDR,
> -		.size		= 0,
> -		.flags		= VM_IOREMAP,
> -		.caller		= setup_arch,
> -	};
> -
> -	vmlinux_vm.size = round_up((unsigned long)_end - KIMAGE_VADDR,
> -				   1 << SWAPPER_BLOCK_SHIFT);
> -	vmlinux_vm.phys_addr = __pa(KIMAGE_VADDR);
> +	static struct vm_struct vmlinux_vm __initdata;
> +
> +	vmlinux_vm.addr = (void *)kimage_vaddr;
> +	vmlinux_vm.size = round_up((u64)_end - kimage_vaddr,
> +				   SWAPPER_BLOCK_SIZE);
> +	vmlinux_vm.phys_addr = __pa(kimage_vaddr);
> +	vmlinux_vm.flags = VM_IOREMAP;
> +	vmlinux_vm.caller = setup_arch;
> +
>  	vm_area_add_early(&vmlinux_vm);
>  
>  	pr_info("Boot CPU: AArch64 Processor [%08x]\n", read_cpuid_id());
> @@ -367,7 +366,8 @@ void __init setup_arch(char **cmdline_p)
>  	conswitchp = &dummy_con;
>  #endif
>  #endif
> -	if (boot_args[1] || boot_args[2] || boot_args[3]) {
> +	if ((!IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && boot_args[1]) ||
> +	    boot_args[2] || boot_args[3]) {
>  		pr_err("WARNING: x1-x3 nonzero in violation of boot protocol:\n"
>  			"\tx1: %016llx\n\tx2: %016llx\n\tx3: %016llx\n"
>  			"This indicates a broken bootloader or old kernel\n",

At this point it may make sense to split this out into a separate
function. If the handshake is more involved we'll need more code to
verify this, and it'd be nice to split that from setup_arch.
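
Something along these lines is what I have in mind (untested sketch;
the helper name and the trailing arguments are made up):

	static void __init verify_boot_args(void)
	{
		bool x1_ok = IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) ||
			     !boot_args[1];

		if (x1_ok && !boot_args[2] && !boot_args[3])
			return;

		pr_err("WARNING: x1-x3 nonzero in violation of boot protocol:\n"
		       "\tx1: %016llx\n\tx2: %016llx\n\tx3: %016llx\n"
		       "This indicates a broken bootloader or old kernel\n",
		       boot_args[1], boot_args[2], boot_args[3]);
	}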

> diff --git a/arch/arm64/kernel/vmlinux.lds.S
> b/arch/arm64/kernel/vmlinux.lds.S
> index f935f082188d..cc1486039338 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -148,6 +148,15 @@ SECTIONS
>  	.altinstr_replacement : {
>  		*(.altinstr_replacement)
>  	}
> +	.rela : ALIGN(8) {
> +		__reloc_start = .;
> +		*(.rela .rela*)
> +		__reloc_end = .;
> +	}
> +	.dynsym : ALIGN(8) {
> +		__dynsym_start = .;
> +		*(.dynsym)
> +	}
>  
>  	. = ALIGN(PAGE_SIZE);
>  	__init_end = .;
> diff --git a/scripts/sortextable.c b/scripts/sortextable.c
> index af247c70fb66..5ecbedefdb0f 100644
> --- a/scripts/sortextable.c
> +++ b/scripts/sortextable.c
> @@ -266,9 +266,9 @@ do_file(char const *const fname)
>  		break;
>  	}  /* end switch */
>  	if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0
> -	||  r2(&ehdr->e_type) != ET_EXEC
> +	|| (r2(&ehdr->e_type) != ET_EXEC && r2(&ehdr->e_type) != ET_DYN)
>  	||  ehdr->e_ident[EI_VERSION] != EV_CURRENT) {
> -		fprintf(stderr, "unrecognized ET_EXEC file %s\n", fname);
> +		fprintf(stderr, "unrecognized ET_EXEC/ET_DYN file %s\n", fname);
>  		fail_file();
>  	}

This change should probably be a preparatory patch.

Otherwise, looks good!

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 12/13] arm64: add support for relocatable kernel
  2016-01-08 12:36     ` Mark Rutland
  (?)
@ 2016-01-08 12:38       ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-08 12:38 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Leif Lindholm, Kees Cook, linux-kernel, Stuart Yoder,
	Sharma Bhupesh, Arnd Bergmann, Marc Zyngier, Christoffer Dall

On 8 January 2016 at 13:36, Mark Rutland <mark.rutland@arm.com> wrote:
> On Wed, Dec 30, 2015 at 04:26:11PM +0100, Ard Biesheuvel wrote:
>> This adds support for runtime relocation of the kernel Image, by
>> building it as a PIE (ET_DYN) executable and applying the dynamic
>> relocations in the early boot code.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  Documentation/arm64/booting.txt |  3 +-
>>  arch/arm64/Kconfig              | 13 ++++
>>  arch/arm64/Makefile             |  6 +-
>>  arch/arm64/include/asm/memory.h |  3 +
>>  arch/arm64/kernel/head.S        | 75 +++++++++++++++++++-
>>  arch/arm64/kernel/setup.c       | 22 +++---
>>  arch/arm64/kernel/vmlinux.lds.S |  9 +++
>>  scripts/sortextable.c           |  4 +-
>>  8 files changed, 117 insertions(+), 18 deletions(-)
>
> [...]
>
>> +#ifdef CONFIG_ARM64_RELOCATABLE_KERNEL
>> +
>> +#define R_AARCH64_RELATIVE   0x403
>> +#define R_AARCH64_ABS64              0x101
>
> Let's not duplicate asm/elf.h.
>
> I have a patch to split the reloc types out into a separate header we
> can reuse from assembly -- I'll send that momentarily. We can add
> R_AARCH64_RELATIVE atop of that.
>

OK.

>> +
>> +     /*
>> +      * Iterate over each entry in the relocation table, and apply the
>> +      * relocations in place.
>> +      */
>> +     adr_l   x8, __dynsym_start              // start of symbol table
>> +     adr_l   x9, __reloc_start               // start of reloc table
>> +     adr_l   x10, __reloc_end                // end of reloc table
>> +
>> +0:   cmp     x9, x10
>> +     b.hs    2f
>> +     ldp     x11, x12, [x9], #24
>> +     ldr     x13, [x9, #-8]
>> +     cmp     w12, #R_AARCH64_RELATIVE
>> +     b.ne    1f
>> +     add     x13, x13, x23                   // relocate
>> +     str     x13, [x11, x23]
>> +     b       0b
>> +
>> +1:   cmp     w12, #R_AARCH64_ABS64
>> +     b.ne    0b
>> +     add     x12, x12, x12, lsl #1           // symtab offset: 24x top word
>> +     add     x12, x8, x12, lsr #(32 - 3)     // ... shifted into bottom word
>> +     ldrsh   w14, [x12, #6]                  // Elf64_Sym::st_shndx
>> +     ldr     x15, [x12, #8]                  // Elf64_Sym::st_value
>> +     cmp     w14, #-0xf                      // SHN_ABS (0xfff1) ?
>> +     add     x14, x15, x23                   // relocate
>> +     csel    x15, x14, x15, ne
>> +     add     x15, x13, x15
>> +     str     x15, [x11, x23]
>> +     b       0b
>
> We need to clean each of these relocated instructions to the PoU to be
> visible for I-cache fetches.
>
> As this is normal-cacheable we can post the maintenance with a DC CVAU
> immediately after the store (no barriers necessary), and rely on the DSB
> at 2f to complete all of those.
>

Dynamic relocations never apply to instructions, so I don't think that
is necessary here.
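
(For the record: what does end up with an R_AARCH64_ABS64 or
R_AARCH64_RELATIVE entry is a pointer-sized data word, e.g. a
statically initialised pointer -- illustrative example only:

	static void some_handler(void) { }

	/* this .data word carries the dynamic relocation ... */
	static void (*handler)(void) = some_handler;

	void call_it(void)
	{
		handler();	/* ... the code here is PC-relative, no fixup needed */
	}

so the patched values are only ever consumed by D-side loads.)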

>> +
>> +2:   adr_l   x8, kimage_vaddr                // make relocated kimage_vaddr
>> +     dc      cvac, x8                        // value visible to secondaries
>> +     dsb     sy                              // with MMU off
>
> Then we need:
>
>         ic      iallu
>         dsb     nsh
>         isb
>
> To make sure the I-side is consistent with the PoU.
>
> As secondaries will do similarly in __enable_mmu we don't need to add
> any code for them.
>
>> +#endif
>> +
>>       adr_l   sp, initial_sp, x4
>>       str_l   x21, __fdt_pointer, x5          // Save FDT pointer
>>
>> -     ldr     x0, =KIMAGE_VADDR               // Save the offset between
>> +     ldr_l   x0, kimage_vaddr                // Save the offset between
>>       sub     x24, x0, x24                    // the kernel virtual and
>>       str_l   x24, kimage_voffset, x0         // physical mappings
>>
>> @@ -462,6 +527,10 @@ ENDPROC(__mmap_switched)
>>   * hotplug and needs to have the same protections as the text region
>>   */
>>       .section ".text","ax"
>> +
>> +ENTRY(kimage_vaddr)
>> +     .quad           _text - TEXT_OFFSET
>> +
>>  /*
>>   * If we're fortunate enough to boot at EL2, ensure that the world is
>>   * sane before dropping to EL1.
>> @@ -622,7 +691,7 @@ ENTRY(secondary_startup)
>>       adrp    x26, swapper_pg_dir
>>       bl      __cpu_setup                     // initialise processor
>>
>> -     ldr     x8, =KIMAGE_VADDR
>> +     ldr     x8, kimage_vaddr
>>       ldr     w9, 0f
>>       sub     x27, x8, w9, sxtw               // address to jump to after enabling the MMU
>>       b       __enable_mmu
>> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
>> index 96177a7c0f05..2faee6042e99 100644
>> --- a/arch/arm64/kernel/setup.c
>> +++ b/arch/arm64/kernel/setup.c
>> @@ -292,16 +292,15 @@ u64 __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = INVALID_HWID };
>>
>>  void __init setup_arch(char **cmdline_p)
>>  {
>> -     static struct vm_struct vmlinux_vm __initdata = {
>> -             .addr           = (void *)KIMAGE_VADDR,
>> -             .size           = 0,
>> -             .flags          = VM_IOREMAP,
>> -             .caller         = setup_arch,
>> -     };
>> -
>> -     vmlinux_vm.size = round_up((unsigned long)_end - KIMAGE_VADDR,
>> -                                1 << SWAPPER_BLOCK_SHIFT);
>> -     vmlinux_vm.phys_addr = __pa(KIMAGE_VADDR);
>> +     static struct vm_struct vmlinux_vm __initdata;
>> +
>> +     vmlinux_vm.addr = (void *)kimage_vaddr;
>> +     vmlinux_vm.size = round_up((u64)_end - kimage_vaddr,
>> +                                SWAPPER_BLOCK_SIZE);
>> +     vmlinux_vm.phys_addr = __pa(kimage_vaddr);
>> +     vmlinux_vm.flags = VM_IOREMAP;
>> +     vmlinux_vm.caller = setup_arch;
>> +
>>       vm_area_add_early(&vmlinux_vm);
>>
>>       pr_info("Boot CPU: AArch64 Processor [%08x]\n", read_cpuid_id());
>> @@ -367,7 +366,8 @@ void __init setup_arch(char **cmdline_p)
>>       conswitchp = &dummy_con;
>>  #endif
>>  #endif
>> -     if (boot_args[1] || boot_args[2] || boot_args[3]) {
>> +     if ((!IS_ENABLED(CONFIG_ARM64_RELOCATABLE_KERNEL) && boot_args[1]) ||
>> +         boot_args[2] || boot_args[3]) {
>>               pr_err("WARNING: x1-x3 nonzero in violation of boot protocol:\n"
>>                       "\tx1: %016llx\n\tx2: %016llx\n\tx3: %016llx\n"
>>                       "This indicates a broken bootloader or old kernel\n",
>
> At this point it may make sense to split this out into a separate
> function. If the handshake is more involved we'll need more code to
> verify this, and it'd be nice to split that from setup_arch.
>

OK

>> diff --git a/arch/arm64/kernel/vmlinux.lds.S
>> b/arch/arm64/kernel/vmlinux.lds.S
>> index f935f082188d..cc1486039338 100644
>> --- a/arch/arm64/kernel/vmlinux.lds.S
>> +++ b/arch/arm64/kernel/vmlinux.lds.S
>> @@ -148,6 +148,15 @@ SECTIONS
>>       .altinstr_replacement : {
>>               *(.altinstr_replacement)
>>       }
>> +     .rela : ALIGN(8) {
>> +             __reloc_start = .;
>> +             *(.rela .rela*)
>> +             __reloc_end = .;
>> +     }
>> +     .dynsym : ALIGN(8) {
>> +             __dynsym_start = .;
>> +             *(.dynsym)
>> +     }
>>
>>       . = ALIGN(PAGE_SIZE);
>>       __init_end = .;
>> diff --git a/scripts/sortextable.c b/scripts/sortextable.c
>> index af247c70fb66..5ecbedefdb0f 100644
>> --- a/scripts/sortextable.c
>> +++ b/scripts/sortextable.c
>> @@ -266,9 +266,9 @@ do_file(char const *const fname)
>>               break;
>>       }  /* end switch */
>>       if (memcmp(ELFMAG, ehdr->e_ident, SELFMAG) != 0
>> -     ||  r2(&ehdr->e_type) != ET_EXEC
>> +     || (r2(&ehdr->e_type) != ET_EXEC && r2(&ehdr->e_type) != ET_DYN)
>>       ||  ehdr->e_ident[EI_VERSION] != EV_CURRENT) {
>> -             fprintf(stderr, "unrecognized ET_EXEC file %s\n", fname);
>> +             fprintf(stderr, "unrecognized ET_EXEC/ET_DYN file %s\n", fname);
>>               fail_file();
>>       }
>
> This change should probably be a preparatory patch.
>
> Otherwise, looks good!
>

Cheers.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 12/13] arm64: add support for relocatable kernel
  2016-01-08 12:38       ` Ard Biesheuvel
  (?)
@ 2016-01-08 12:40         ` Mark Rutland
  -1 siblings, 0 replies; 156+ messages in thread
From: Mark Rutland @ 2016-01-08 12:40 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Catalin Marinas,
	Leif Lindholm, Kees Cook, linux-kernel, Stuart Yoder,
	Sharma Bhupesh, Arnd Bergmann, Marc Zyngier, Christoffer Dall

On Fri, Jan 08, 2016 at 01:38:54PM +0100, Ard Biesheuvel wrote:
> On 8 January 2016 at 13:36, Mark Rutland <mark.rutland@arm.com> wrote:
> > On Wed, Dec 30, 2015 at 04:26:11PM +0100, Ard Biesheuvel wrote:
> >> +
> >> +     /*
> >> +      * Iterate over each entry in the relocation table, and apply the
> >> +      * relocations in place.
> >> +      */
> >> +     adr_l   x8, __dynsym_start              // start of symbol table
> >> +     adr_l   x9, __reloc_start               // start of reloc table
> >> +     adr_l   x10, __reloc_end                // end of reloc table
> >> +
> >> +0:   cmp     x9, x10
> >> +     b.hs    2f
> >> +     ldp     x11, x12, [x9], #24
> >> +     ldr     x13, [x9, #-8]
> >> +     cmp     w12, #R_AARCH64_RELATIVE
> >> +     b.ne    1f
> >> +     add     x13, x13, x23                   // relocate
> >> +     str     x13, [x11, x23]
> >> +     b       0b
> >> +
> >> +1:   cmp     w12, #R_AARCH64_ABS64
> >> +     b.ne    0b
> >> +     add     x12, x12, x12, lsl #1           // symtab offset: 24x top word
> >> +     add     x12, x8, x12, lsr #(32 - 3)     // ... shifted into bottom word
> >> +     ldrsh   w14, [x12, #6]                  // Elf64_Sym::st_shndx
> >> +     ldr     x15, [x12, #8]                  // Elf64_Sym::st_value
> >> +     cmp     w14, #-0xf                      // SHN_ABS (0xfff1) ?
> >> +     add     x14, x15, x23                   // relocate
> >> +     csel    x15, x14, x15, ne
> >> +     add     x15, x13, x15
> >> +     str     x15, [x11, x23]
> >> +     b       0b
> >
> > We need to clean each of these relocated instructions to the PoU to be
> > visible for I-cache fetches.
> >
> > As this is normal-cacheable we can post the maintenance with a DC CVAU
> > immediately after the store (no barriers necessary), and rely on the DSB
> > at 2f to complete all of those.
> >
> 
> Dynamic relocations never apply to instructions, so i don't think that
> is necessary here.

Ah, yes. I was being thick. Sorry for the noise!

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [PATCH] arm64: split elf relocs into a separate header.
  2016-01-08 12:36     ` Mark Rutland
                       ` (2 preceding siblings ...)
  (?)
@ 2016-01-08 12:41     ` Mark Rutland
  2016-01-08 15:59       ` Will Deacon
  -1 siblings, 1 reply; 156+ messages in thread
From: Mark Rutland @ 2016-01-08 12:41 UTC (permalink / raw)
  To: linux-arm-kernel

Currently asm/elf.h contains a mixture of simple constants, C structure
definitions, and some constants defined in terms of constants from other
headers (which are themselves mixtures).

To enable the use of AArch64 ELF reloc constants from assembly code (as
we will need for relocatable kernel support), we need a header without
C structure definitions or includes of other files with such
definitions.

This patch factors out the relocs into a new header specifically for ELF
reloc types.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/elf.h        | 54 +--------------------------
 arch/arm64/include/asm/elf_relocs.h | 73 +++++++++++++++++++++++++++++++++++++
 2 files changed, 74 insertions(+), 53 deletions(-)
 create mode 100644 arch/arm64/include/asm/elf_relocs.h

diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index faad6df..e4b3cdc 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -16,6 +16,7 @@
 #ifndef __ASM_ELF_H
 #define __ASM_ELF_H
 
+#include <asm/elf_relocs.h>
 #include <asm/hwcap.h>
 
 /*
@@ -34,59 +35,6 @@ typedef elf_greg_t elf_gregset_t[ELF_NGREG];
 typedef struct user_fpsimd_state elf_fpregset_t;
 
 /*
- * AArch64 static relocation types.
- */
-
-/* Miscellaneous. */
-#define R_ARM_NONE			0
-#define R_AARCH64_NONE			256
-
-/* Data. */
-#define R_AARCH64_ABS64			257
-#define R_AARCH64_ABS32			258
-#define R_AARCH64_ABS16			259
-#define R_AARCH64_PREL64		260
-#define R_AARCH64_PREL32		261
-#define R_AARCH64_PREL16		262
-
-/* Instructions. */
-#define R_AARCH64_MOVW_UABS_G0		263
-#define R_AARCH64_MOVW_UABS_G0_NC	264
-#define R_AARCH64_MOVW_UABS_G1		265
-#define R_AARCH64_MOVW_UABS_G1_NC	266
-#define R_AARCH64_MOVW_UABS_G2		267
-#define R_AARCH64_MOVW_UABS_G2_NC	268
-#define R_AARCH64_MOVW_UABS_G3		269
-
-#define R_AARCH64_MOVW_SABS_G0		270
-#define R_AARCH64_MOVW_SABS_G1		271
-#define R_AARCH64_MOVW_SABS_G2		272
-
-#define R_AARCH64_LD_PREL_LO19		273
-#define R_AARCH64_ADR_PREL_LO21		274
-#define R_AARCH64_ADR_PREL_PG_HI21	275
-#define R_AARCH64_ADR_PREL_PG_HI21_NC	276
-#define R_AARCH64_ADD_ABS_LO12_NC	277
-#define R_AARCH64_LDST8_ABS_LO12_NC	278
-
-#define R_AARCH64_TSTBR14		279
-#define R_AARCH64_CONDBR19		280
-#define R_AARCH64_JUMP26		282
-#define R_AARCH64_CALL26		283
-#define R_AARCH64_LDST16_ABS_LO12_NC	284
-#define R_AARCH64_LDST32_ABS_LO12_NC	285
-#define R_AARCH64_LDST64_ABS_LO12_NC	286
-#define R_AARCH64_LDST128_ABS_LO12_NC	299
-
-#define R_AARCH64_MOVW_PREL_G0		287
-#define R_AARCH64_MOVW_PREL_G0_NC	288
-#define R_AARCH64_MOVW_PREL_G1		289
-#define R_AARCH64_MOVW_PREL_G1_NC	290
-#define R_AARCH64_MOVW_PREL_G2		291
-#define R_AARCH64_MOVW_PREL_G2_NC	292
-#define R_AARCH64_MOVW_PREL_G3		293
-
-/*
  * These are used to set parameters in the core dumps.
  */
 #define ELF_CLASS	ELFCLASS64
diff --git a/arch/arm64/include/asm/elf_relocs.h b/arch/arm64/include/asm/elf_relocs.h
new file mode 100644
index 0000000..3f6b930
--- /dev/null
+++ b/arch/arm64/include/asm/elf_relocs.h
@@ -0,0 +1,73 @@
+/*
+ * Copyright (C) 2016 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_ELF_RELOCS_H
+#define __ASM_ELF_RELOCS_H
+
+/*
+ * AArch64 static relocation types.
+ */
+
+/* Miscellaneous. */
+#define R_ARM_NONE			0
+#define R_AARCH64_NONE			256
+
+/* Data. */
+#define R_AARCH64_ABS64			257
+#define R_AARCH64_ABS32			258
+#define R_AARCH64_ABS16			259
+#define R_AARCH64_PREL64		260
+#define R_AARCH64_PREL32		261
+#define R_AARCH64_PREL16		262
+
+/* Instructions. */
+#define R_AARCH64_MOVW_UABS_G0		263
+#define R_AARCH64_MOVW_UABS_G0_NC	264
+#define R_AARCH64_MOVW_UABS_G1		265
+#define R_AARCH64_MOVW_UABS_G1_NC	266
+#define R_AARCH64_MOVW_UABS_G2		267
+#define R_AARCH64_MOVW_UABS_G2_NC	268
+#define R_AARCH64_MOVW_UABS_G3		269
+
+#define R_AARCH64_MOVW_SABS_G0		270
+#define R_AARCH64_MOVW_SABS_G1		271
+#define R_AARCH64_MOVW_SABS_G2		272
+
+#define R_AARCH64_LD_PREL_LO19		273
+#define R_AARCH64_ADR_PREL_LO21		274
+#define R_AARCH64_ADR_PREL_PG_HI21	275
+#define R_AARCH64_ADR_PREL_PG_HI21_NC	276
+#define R_AARCH64_ADD_ABS_LO12_NC	277
+#define R_AARCH64_LDST8_ABS_LO12_NC	278
+
+#define R_AARCH64_TSTBR14		279
+#define R_AARCH64_CONDBR19		280
+#define R_AARCH64_JUMP26		282
+#define R_AARCH64_CALL26		283
+#define R_AARCH64_LDST16_ABS_LO12_NC	284
+#define R_AARCH64_LDST32_ABS_LO12_NC	285
+#define R_AARCH64_LDST64_ABS_LO12_NC	286
+#define R_AARCH64_LDST128_ABS_LO12_NC	299
+
+#define R_AARCH64_MOVW_PREL_G0		287
+#define R_AARCH64_MOVW_PREL_G0_NC	288
+#define R_AARCH64_MOVW_PREL_G1		289
+#define R_AARCH64_MOVW_PREL_G1_NC	290
+#define R_AARCH64_MOVW_PREL_G2		291
+#define R_AARCH64_MOVW_PREL_G2_NC	292
+#define R_AARCH64_MOVW_PREL_G3		293
+
+#endif /* __ASM_ELF_RELOCS_H */
+
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
  2015-12-30 15:26   ` Ard Biesheuvel
  (?)
@ 2016-01-08 15:27     ` Catalin Marinas
  -1 siblings, 0 replies; 156+ messages in thread
From: Catalin Marinas @ 2016-01-08 15:27 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, kernel-hardening, will.deacon, mark.rutland,
	leif.lindholm, keescook, linux-kernel, arnd, bhupesh.sharma,
	stuart.yoder, marc.zyngier, christoffer.dall

On Wed, Dec 30, 2015 at 04:26:10PM +0100, Ard Biesheuvel wrote:
> +static void __init enforce_memory_limit(void)
> +{
> +	const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
> +	u64 to_remove = memblock_phys_mem_size() - memory_limit;
> +	phys_addr_t max_addr = 0;
> +	struct memblock_region *r;
> +
> +	if (memory_limit == (phys_addr_t)ULLONG_MAX)
> +		return;
> +
> +	/*
> +	 * The kernel may be high up in physical memory, so try to apply the
> +	 * limit below the kernel first, and only let the generic handling
> +	 * take over if it turns out we haven't clipped enough memory yet.
> +	 */
> +	for_each_memblock(memory, r) {
> +		if (r->base + r->size > kbase) {
> +			u64 rem = min(to_remove, kbase - r->base);
> +
> +			max_addr = r->base + rem;
> +			to_remove -= rem;
> +			break;
> +		}
> +		if (to_remove <= r->size) {
> +			max_addr = r->base + to_remove;
> +			to_remove = 0;
> +			break;
> +		}
> +		to_remove -= r->size;
> +	}
> +
> +	memblock_remove(0, max_addr);
> +
> +	if (to_remove)
> +		memblock_enforce_memory_limit(memory_limit);
> +}

IIUC, this is changing user expectations a bit. There are people
using the mem= limit to hijack some of the RAM at the top for other
needs (though they could do it in a saner way, like changing the DT
memory nodes). Your patch first tries to remove the memory below the
kernel image and only removes memory at the top if additional
limitation is necessary.

Can you not keep removing memory from the top, and simply block the
limit if it would go below the end of the kernel image, with a warning
that the memory limit was not entirely fulfilled?
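
I.e. something along the lines of the below (untested sketch; the
minimum is an approximation that ignores holes below the image, and
the warning text is made up):

	static void __init enforce_memory_limit(void)
	{
		/* keep at least enough memory to cover the kernel image */
		u64 min_limit = __pa(_end) - memblock_start_of_DRAM();

		if (memory_limit == (phys_addr_t)ULLONG_MAX)
			return;

		if (memory_limit < min_limit) {
			pr_warn("mem= limit below the kernel image, clamping to %llu bytes\n",
				min_limit);
			memory_limit = min_limit;
		}

		memblock_enforce_memory_limit(memory_limit);
	}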

-- 
Catalin

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
  2016-01-08 15:27     ` Catalin Marinas
  (?)
@ 2016-01-08 15:30       ` Ard Biesheuvel
  -1 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-08 15:30 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Mark Rutland,
	Leif Lindholm, Kees Cook, linux-kernel, Arnd Bergmann,
	Sharma Bhupesh, Stuart Yoder, Marc Zyngier, Christoffer Dall

On 8 January 2016 at 16:27, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Wed, Dec 30, 2015 at 04:26:10PM +0100, Ard Biesheuvel wrote:
>> +static void __init enforce_memory_limit(void)
>> +{
>> +     const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
>> +     u64 to_remove = memblock_phys_mem_size() - memory_limit;
>> +     phys_addr_t max_addr = 0;
>> +     struct memblock_region *r;
>> +
>> +     if (memory_limit == (phys_addr_t)ULLONG_MAX)
>> +             return;
>> +
>> +     /*
>> +      * The kernel may be high up in physical memory, so try to apply the
>> +      * limit below the kernel first, and only let the generic handling
>> +      * take over if it turns out we haven't clipped enough memory yet.
>> +      */
>> +     for_each_memblock(memory, r) {
>> +             if (r->base + r->size > kbase) {
>> +                     u64 rem = min(to_remove, kbase - r->base);
>> +
>> +                     max_addr = r->base + rem;
>> +                     to_remove -= rem;
>> +                     break;
>> +             }
>> +             if (to_remove <= r->size) {
>> +                     max_addr = r->base + to_remove;
>> +                     to_remove = 0;
>> +                     break;
>> +             }
>> +             to_remove -= r->size;
>> +     }
>> +
>> +     memblock_remove(0, max_addr);
>> +
>> +     if (to_remove)
>> +             memblock_enforce_memory_limit(memory_limit);
>> +}
>
> IIUC, this is changing the user expectations a bit. There are people
> using the mem= limit to hijack some top of the RAM for other needs
> (though they could do it in a saner way like changing the DT memory
> nodes). Your patch first tries to remove the memory below the kernel
> image and only remove the top if additional limitation is necessary.
>
> Can you not remove memory from the top and block the limit if it goes
> below the end of the kernel image, with some warning that memory limit
> was not entirely fulfilled?
>

I'm in the middle of rewriting this code from scratch. The general idea is

static void __init clip_mem_range(u64 min, u64 max);

/*
* Clip memory in order of preference:
* - above the kernel and above 4 GB
* - between 4 GB and the start of the kernel
* - below 4 GB
* Note that tho
*/
clip_mem_range(max(sz_4g, PAGE_ALIGN(__pa(_end))), ULLONG_MAX);
clip_mem_range(sz_4g, round_down(__pa(_text), MIN_KIMG_ALIGN));
clip_mem_range(0, sz_4g);

where clip_mem_range() iterates over the memblocks to remove memory
between min and max iff min < max and the limit has not been met yet.
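
Something along these lines, as a first approximation (simplified sketch
only: it ignores holes inside [min, max), which the real iteration over
the memblocks would have to handle):

static void __init clip_mem_range(u64 min, u64 max)
{
	u64 mem_size = memblock_phys_mem_size();
	u64 excess;

	if (mem_size <= memory_limit)
		return;

	/* Nothing above the end of DRAM to clip. */
	max = min_t(u64, max, memblock_end_of_DRAM());
	if (min >= max)
		return;

	excess = mem_size - memory_limit;

	/* Never clip more of this range than we are over the limit. */
	if (max - min > excess)
		min = max - excess;

	memblock_remove(min, max - min);
}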

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
@ 2016-01-08 15:30       ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-08 15:30 UTC (permalink / raw)
  To: linux-arm-kernel

On 8 January 2016 at 16:27, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Wed, Dec 30, 2015 at 04:26:10PM +0100, Ard Biesheuvel wrote:
>> +static void __init enforce_memory_limit(void)
>> +{
>> +     const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
>> +     u64 to_remove = memblock_phys_mem_size() - memory_limit;
>> +     phys_addr_t max_addr = 0;
>> +     struct memblock_region *r;
>> +
>> +     if (memory_limit == (phys_addr_t)ULLONG_MAX)
>> +             return;
>> +
>> +     /*
>> +      * The kernel may be high up in physical memory, so try to apply the
>> +      * limit below the kernel first, and only let the generic handling
>> +      * take over if it turns out we haven't clipped enough memory yet.
>> +      */
>> +     for_each_memblock(memory, r) {
>> +             if (r->base + r->size > kbase) {
>> +                     u64 rem = min(to_remove, kbase - r->base);
>> +
>> +                     max_addr = r->base + rem;
>> +                     to_remove -= rem;
>> +                     break;
>> +             }
>> +             if (to_remove <= r->size) {
>> +                     max_addr = r->base + to_remove;
>> +                     to_remove = 0;
>> +                     break;
>> +             }
>> +             to_remove -= r->size;
>> +     }
>> +
>> +     memblock_remove(0, max_addr);
>> +
>> +     if (to_remove)
>> +             memblock_enforce_memory_limit(memory_limit);
>> +}
>
> IIUC, this is changing the user expectations a bit. There are people
> using the mem= limit to hijack some top of the RAM for other needs
> (though they could do it in a saner way like changing the DT memory
> nodes). Your patch first tries to remove the memory below the kernel
> image and only remove the top if additional limitation is necessary.
>
> Can you not remove memory from the top and block the limit if it goes
> below the end of the kernel image, with some warning that memory limit
> was not entirely fulfilled?
>

I'm in the middle of rewriting this code from scratch. The general idea is

static void __init clip_mem_range(u64 min, u64 max);

/*
* Clip memory in order of preference:
* - above the kernel and above 4 GB
* - between 4 GB and the start of the kernel
* - below 4 GB
* Note that tho
*/
clip_mem_range(max(sz_4g, PAGE_ALIGN(__pa(_end))), ULLONG_MAX);
clip_mem_range(sz_4g, round_down(__pa(_text), MIN_KIMG_ALIGN));
clip_mem_range(0, sz_4g);

where clip_mem_range() iterates over the memblocks to remove memory
between min and max iff min < max and the limit has not been met yet.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [kernel-hardening] Re: [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
@ 2016-01-08 15:30       ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-08 15:30 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arm-kernel, kernel-hardening, Will Deacon, Mark Rutland,
	Leif Lindholm, Kees Cook, linux-kernel, Arnd Bergmann,
	Sharma Bhupesh, Stuart Yoder, Marc Zyngier, Christoffer Dall

On 8 January 2016 at 16:27, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Wed, Dec 30, 2015 at 04:26:10PM +0100, Ard Biesheuvel wrote:
>> +static void __init enforce_memory_limit(void)
>> +{
>> +     const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
>> +     u64 to_remove = memblock_phys_mem_size() - memory_limit;
>> +     phys_addr_t max_addr = 0;
>> +     struct memblock_region *r;
>> +
>> +     if (memory_limit == (phys_addr_t)ULLONG_MAX)
>> +             return;
>> +
>> +     /*
>> +      * The kernel may be high up in physical memory, so try to apply the
>> +      * limit below the kernel first, and only let the generic handling
>> +      * take over if it turns out we haven't clipped enough memory yet.
>> +      */
>> +     for_each_memblock(memory, r) {
>> +             if (r->base + r->size > kbase) {
>> +                     u64 rem = min(to_remove, kbase - r->base);
>> +
>> +                     max_addr = r->base + rem;
>> +                     to_remove -= rem;
>> +                     break;
>> +             }
>> +             if (to_remove <= r->size) {
>> +                     max_addr = r->base + to_remove;
>> +                     to_remove = 0;
>> +                     break;
>> +             }
>> +             to_remove -= r->size;
>> +     }
>> +
>> +     memblock_remove(0, max_addr);
>> +
>> +     if (to_remove)
>> +             memblock_enforce_memory_limit(memory_limit);
>> +}
>
> IIUC, this is changing the user expectations a bit. There are people
> using the mem= limit to hijack some top of the RAM for other needs
> (though they could do it in a saner way like changing the DT memory
> nodes). Your patch first tries to remove the memory below the kernel
> image and only remove the top if additional limitation is necessary.
>
> Can you not remove memory from the top and block the limit if it goes
> below the end of the kernel image, with some warning that memory limit
> was not entirely fulfilled?
>

I'm in the middle of rewriting this code from scratch. The general idea is

static void __init clip_mem_range(u64 min, u64 max);

/*
* Clip memory in order of preference:
* - above the kernel and above 4 GB
* - between 4 GB and the start of the kernel
* - below 4 GB
* Note that tho
*/
clip_mem_range(max(sz_4g, PAGE_ALIGN(__pa(_end))), ULLONG_MAX);
clip_mem_range(sz_4g, round_down(__pa(_text), MIN_KIMG_ALIGN));
clip_mem_range(0, sz_4g);

where clip_mem_range() iterates over the memblocks to remove memory
between min and max iff min < max and the limit has not been met yet.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
  2016-01-08 15:27     ` Catalin Marinas
  (?)
@ 2016-01-08 15:36       ` Mark Rutland
  -1 siblings, 0 replies; 156+ messages in thread
From: Mark Rutland @ 2016-01-08 15:36 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Ard Biesheuvel, linux-arm-kernel, kernel-hardening, will.deacon,
	leif.lindholm, keescook, linux-kernel, arnd, bhupesh.sharma,
	stuart.yoder, marc.zyngier, christoffer.dall

On Fri, Jan 08, 2016 at 03:27:38PM +0000, Catalin Marinas wrote:
> On Wed, Dec 30, 2015 at 04:26:10PM +0100, Ard Biesheuvel wrote:
> > +static void __init enforce_memory_limit(void)
> > +{
> > +	const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
> > +	u64 to_remove = memblock_phys_mem_size() - memory_limit;
> > +	phys_addr_t max_addr = 0;
> > +	struct memblock_region *r;
> > +
> > +	if (memory_limit == (phys_addr_t)ULLONG_MAX)
> > +		return;
> > +
> > +	/*
> > +	 * The kernel may be high up in physical memory, so try to apply the
> > +	 * limit below the kernel first, and only let the generic handling
> > +	 * take over if it turns out we haven't clipped enough memory yet.
> > +	 */
> > +	for_each_memblock(memory, r) {
> > +		if (r->base + r->size > kbase) {
> > +			u64 rem = min(to_remove, kbase - r->base);
> > +
> > +			max_addr = r->base + rem;
> > +			to_remove -= rem;
> > +			break;
> > +		}
> > +		if (to_remove <= r->size) {
> > +			max_addr = r->base + to_remove;
> > +			to_remove = 0;
> > +			break;
> > +		}
> > +		to_remove -= r->size;
> > +	}
> > +
> > +	memblock_remove(0, max_addr);
> > +
> > +	if (to_remove)
> > +		memblock_enforce_memory_limit(memory_limit);
> > +}
> 
> IIUC, this is changing the user expectations a bit. There are people
> using the mem= limit to hijack some top of the RAM for other needs
> (though they could do it in a saner way like changing the DT memory
> nodes).

Which will be hopelessly broken in the presence of KASLR, the kernel
being loaded at a different address, pages getting reserved differently
due to page size, etc.

I hope that no-one uses this for anything other than testing low-memory
conditions. If they want to steal memory they need to carve it out
explicitly.
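
If the goal really is a fixed carveout, something like a /reserved-memory
node is the supported way to express it (hypothetical addresses and label,
just to illustrate the shape):

	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;
		ranges;

		fw_carveout: carveout@8ff000000 {
			reg = <0x8 0xff000000 0x0 0x1000000>;	/* 16 MiB */
			no-map;
		};
	};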

We can behave as we used to, but we shouldn't give the impression that
such usage is supported.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
@ 2016-01-08 15:36       ` Mark Rutland
  0 siblings, 0 replies; 156+ messages in thread
From: Mark Rutland @ 2016-01-08 15:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 08, 2016 at 03:27:38PM +0000, Catalin Marinas wrote:
> On Wed, Dec 30, 2015 at 04:26:10PM +0100, Ard Biesheuvel wrote:
> > +static void __init enforce_memory_limit(void)
> > +{
> > +	const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
> > +	u64 to_remove = memblock_phys_mem_size() - memory_limit;
> > +	phys_addr_t max_addr = 0;
> > +	struct memblock_region *r;
> > +
> > +	if (memory_limit == (phys_addr_t)ULLONG_MAX)
> > +		return;
> > +
> > +	/*
> > +	 * The kernel may be high up in physical memory, so try to apply the
> > +	 * limit below the kernel first, and only let the generic handling
> > +	 * take over if it turns out we haven't clipped enough memory yet.
> > +	 */
> > +	for_each_memblock(memory, r) {
> > +		if (r->base + r->size > kbase) {
> > +			u64 rem = min(to_remove, kbase - r->base);
> > +
> > +			max_addr = r->base + rem;
> > +			to_remove -= rem;
> > +			break;
> > +		}
> > +		if (to_remove <= r->size) {
> > +			max_addr = r->base + to_remove;
> > +			to_remove = 0;
> > +			break;
> > +		}
> > +		to_remove -= r->size;
> > +	}
> > +
> > +	memblock_remove(0, max_addr);
> > +
> > +	if (to_remove)
> > +		memblock_enforce_memory_limit(memory_limit);
> > +}
> 
> IIUC, this is changing the user expectations a bit. There are people
> using the mem= limit to hijack some top of the RAM for other needs
> (though they could do it in a saner way like changing the DT memory
> nodes).

Which will be hopelessly broken in the presence of KASLR, the kernel
being loaded at a different address, pages getting reserved differently
due to page size, etc.

I hope that no-one uses this for anything other than testing low-memory
conditions. If they want to steal memory they need to carve it out
explicitly.

We can behave as we used to, but we shouldn't give the impression that
such usage is supported.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [kernel-hardening] Re: [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
@ 2016-01-08 15:36       ` Mark Rutland
  0 siblings, 0 replies; 156+ messages in thread
From: Mark Rutland @ 2016-01-08 15:36 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Ard Biesheuvel, linux-arm-kernel, kernel-hardening, will.deacon,
	leif.lindholm, keescook, linux-kernel, arnd, bhupesh.sharma,
	stuart.yoder, marc.zyngier, christoffer.dall

On Fri, Jan 08, 2016 at 03:27:38PM +0000, Catalin Marinas wrote:
> On Wed, Dec 30, 2015 at 04:26:10PM +0100, Ard Biesheuvel wrote:
> > +static void __init enforce_memory_limit(void)
> > +{
> > +	const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
> > +	u64 to_remove = memblock_phys_mem_size() - memory_limit;
> > +	phys_addr_t max_addr = 0;
> > +	struct memblock_region *r;
> > +
> > +	if (memory_limit == (phys_addr_t)ULLONG_MAX)
> > +		return;
> > +
> > +	/*
> > +	 * The kernel may be high up in physical memory, so try to apply the
> > +	 * limit below the kernel first, and only let the generic handling
> > +	 * take over if it turns out we haven't clipped enough memory yet.
> > +	 */
> > +	for_each_memblock(memory, r) {
> > +		if (r->base + r->size > kbase) {
> > +			u64 rem = min(to_remove, kbase - r->base);
> > +
> > +			max_addr = r->base + rem;
> > +			to_remove -= rem;
> > +			break;
> > +		}
> > +		if (to_remove <= r->size) {
> > +			max_addr = r->base + to_remove;
> > +			to_remove = 0;
> > +			break;
> > +		}
> > +		to_remove -= r->size;
> > +	}
> > +
> > +	memblock_remove(0, max_addr);
> > +
> > +	if (to_remove)
> > +		memblock_enforce_memory_limit(memory_limit);
> > +}
> 
> IIUC, this is changing the user expectations a bit. There are people
> using the mem= limit to hijack some top of the RAM for other needs
> (though they could do it in a saner way like changing the DT memory
> nodes).

Which will be hopelessly broken in the presence of KASLR, the kernel
being loaded at a different address, pages getting reserved differently
due to page size, etc.

I hope that no-one uses this for anything other than testing low-memory
conditions. If they want to steal memory they need to carve it out
explicitly.

We can behave as we used to, but we shouldn't give the impression that
such usage is supported.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
  2016-01-08 15:36       ` Mark Rutland
  (?)
@ 2016-01-08 15:48         ` Catalin Marinas
  -1 siblings, 0 replies; 156+ messages in thread
From: Catalin Marinas @ 2016-01-08 15:48 UTC (permalink / raw)
  To: Mark Rutland
  Cc: keescook, arnd, kernel-hardening, bhupesh.sharma, Ard Biesheuvel,
	will.deacon, linux-kernel, leif.lindholm, stuart.yoder,
	marc.zyngier, christoffer.dall, linux-arm-kernel

On Fri, Jan 08, 2016 at 03:36:54PM +0000, Mark Rutland wrote:
> On Fri, Jan 08, 2016 at 03:27:38PM +0000, Catalin Marinas wrote:
> > On Wed, Dec 30, 2015 at 04:26:10PM +0100, Ard Biesheuvel wrote:
> > > +static void __init enforce_memory_limit(void)
> > > +{
> > > +	const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
> > > +	u64 to_remove = memblock_phys_mem_size() - memory_limit;
> > > +	phys_addr_t max_addr = 0;
> > > +	struct memblock_region *r;
> > > +
> > > +	if (memory_limit == (phys_addr_t)ULLONG_MAX)
> > > +		return;
> > > +
> > > +	/*
> > > +	 * The kernel may be high up in physical memory, so try to apply the
> > > +	 * limit below the kernel first, and only let the generic handling
> > > +	 * take over if it turns out we haven't clipped enough memory yet.
> > > +	 */
> > > +	for_each_memblock(memory, r) {
> > > +		if (r->base + r->size > kbase) {
> > > +			u64 rem = min(to_remove, kbase - r->base);
> > > +
> > > +			max_addr = r->base + rem;
> > > +			to_remove -= rem;
> > > +			break;
> > > +		}
> > > +		if (to_remove <= r->size) {
> > > +			max_addr = r->base + to_remove;
> > > +			to_remove = 0;
> > > +			break;
> > > +		}
> > > +		to_remove -= r->size;
> > > +	}
> > > +
> > > +	memblock_remove(0, max_addr);
> > > +
> > > +	if (to_remove)
> > > +		memblock_enforce_memory_limit(memory_limit);
> > > +}
> > 
> > IIUC, this is changing the user expectations a bit. There are people
> > using the mem= limit to hijack some top of the RAM for other needs
> > (though they could do it in a saner way like changing the DT memory
> > nodes).
> 
> Which will be hopelessly broken in the presence of KASLR, the kernel
> being loaded at a different address, pages getting reserved differently
> due to page size, etc.

With KASLR disabled, I think we should aim for the existing behaviour as
much as possible. The original aim of these patches was to relax the
kernel image placement rules, to make it easier for boot loaders rather
than completely randomising it.

With KASLR enabled, I agree it's hard to make any assumptions about what
memory is available. But removing memory only from the top would also
help with the point you already raised - keeping lower memory for
devices with narrower DMA mask.

-- 
Catalin

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
@ 2016-01-08 15:48         ` Catalin Marinas
  0 siblings, 0 replies; 156+ messages in thread
From: Catalin Marinas @ 2016-01-08 15:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 08, 2016 at 03:36:54PM +0000, Mark Rutland wrote:
> On Fri, Jan 08, 2016 at 03:27:38PM +0000, Catalin Marinas wrote:
> > On Wed, Dec 30, 2015 at 04:26:10PM +0100, Ard Biesheuvel wrote:
> > > +static void __init enforce_memory_limit(void)
> > > +{
> > > +	const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
> > > +	u64 to_remove = memblock_phys_mem_size() - memory_limit;
> > > +	phys_addr_t max_addr = 0;
> > > +	struct memblock_region *r;
> > > +
> > > +	if (memory_limit == (phys_addr_t)ULLONG_MAX)
> > > +		return;
> > > +
> > > +	/*
> > > +	 * The kernel may be high up in physical memory, so try to apply the
> > > +	 * limit below the kernel first, and only let the generic handling
> > > +	 * take over if it turns out we haven't clipped enough memory yet.
> > > +	 */
> > > +	for_each_memblock(memory, r) {
> > > +		if (r->base + r->size > kbase) {
> > > +			u64 rem = min(to_remove, kbase - r->base);
> > > +
> > > +			max_addr = r->base + rem;
> > > +			to_remove -= rem;
> > > +			break;
> > > +		}
> > > +		if (to_remove <= r->size) {
> > > +			max_addr = r->base + to_remove;
> > > +			to_remove = 0;
> > > +			break;
> > > +		}
> > > +		to_remove -= r->size;
> > > +	}
> > > +
> > > +	memblock_remove(0, max_addr);
> > > +
> > > +	if (to_remove)
> > > +		memblock_enforce_memory_limit(memory_limit);
> > > +}
> > 
> > IIUC, this is changing the user expectations a bit. There are people
> > using the mem= limit to hijack some top of the RAM for other needs
> > (though they could do it in a saner way like changing the DT memory
> > nodes).
> 
> Which will be hopelessly broken in the presence of KASLR, the kernel
> being loaded at a different address, pages getting reserved differently
> due to page size, etc.

With KASLR disabled, I think we should aim for the existing behaviour as
much as possible. The original aim of these patches was to relax the
kernel image placement rules, to make it easier for boot loaders rather
than completely randomising it.

With KASLR enabled, I agree it's hard to make any assumptions about what
memory is available. But removing memory only from the top would also
help with the point you already raised - keeping lower memory for
devices with narrower DMA mask.

-- 
Catalin

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [kernel-hardening] Re: [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
@ 2016-01-08 15:48         ` Catalin Marinas
  0 siblings, 0 replies; 156+ messages in thread
From: Catalin Marinas @ 2016-01-08 15:48 UTC (permalink / raw)
  To: Mark Rutland
  Cc: keescook, arnd, kernel-hardening, bhupesh.sharma, Ard Biesheuvel,
	will.deacon, linux-kernel, leif.lindholm, stuart.yoder,
	marc.zyngier, christoffer.dall, linux-arm-kernel

On Fri, Jan 08, 2016 at 03:36:54PM +0000, Mark Rutland wrote:
> On Fri, Jan 08, 2016 at 03:27:38PM +0000, Catalin Marinas wrote:
> > On Wed, Dec 30, 2015 at 04:26:10PM +0100, Ard Biesheuvel wrote:
> > > +static void __init enforce_memory_limit(void)
> > > +{
> > > +	const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
> > > +	u64 to_remove = memblock_phys_mem_size() - memory_limit;
> > > +	phys_addr_t max_addr = 0;
> > > +	struct memblock_region *r;
> > > +
> > > +	if (memory_limit == (phys_addr_t)ULLONG_MAX)
> > > +		return;
> > > +
> > > +	/*
> > > +	 * The kernel may be high up in physical memory, so try to apply the
> > > +	 * limit below the kernel first, and only let the generic handling
> > > +	 * take over if it turns out we haven't clipped enough memory yet.
> > > +	 */
> > > +	for_each_memblock(memory, r) {
> > > +		if (r->base + r->size > kbase) {
> > > +			u64 rem = min(to_remove, kbase - r->base);
> > > +
> > > +			max_addr = r->base + rem;
> > > +			to_remove -= rem;
> > > +			break;
> > > +		}
> > > +		if (to_remove <= r->size) {
> > > +			max_addr = r->base + to_remove;
> > > +			to_remove = 0;
> > > +			break;
> > > +		}
> > > +		to_remove -= r->size;
> > > +	}
> > > +
> > > +	memblock_remove(0, max_addr);
> > > +
> > > +	if (to_remove)
> > > +		memblock_enforce_memory_limit(memory_limit);
> > > +}
> > 
> > IIUC, this is changing the user expectations a bit. There are people
> > using the mem= limit to hijack some top of the RAM for other needs
> > (though they could do it in a saner way like changing the DT memory
> > nodes).
> 
> Which will be hopelessly broken in the presence of KASLR, the kernel
> being loaded at a different address, pages getting reserved differently
> due to page size, etc.

With KASLR disabled, I think we should aim for the existing behaviour as
much as possible. The original aim of these patches was to relax the
kernel image placement rules, to make it easier for boot loaders rather
than completely randomising it.

With KASLR enabled, I agree it's hard to make any assumptions about what
memory is available. But removing memory only from the top would also
help with the point you already raised - keeping lower memory for
devices with narrower DMA mask.

-- 
Catalin

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [PATCH] arm64: split elf relocs into a separate header.
  2016-01-08 12:41     ` [PATCH] arm64: split elf relocs into a separate header Mark Rutland
@ 2016-01-08 15:59       ` Will Deacon
  2016-01-12 11:55         ` Ard Biesheuvel
  0 siblings, 1 reply; 156+ messages in thread
From: Will Deacon @ 2016-01-08 15:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 08, 2016 at 12:41:49PM +0000, Mark Rutland wrote:
> Currently asm/elf.h contains a mixture of simple constants, C structure
> definitions, and some constants defined in terms of constants from other
> headers (which are themselves mixtures).
> 
> To enable the use of AArch64 ELF reloc constants from assembly code (as
> we will need for relocatable kernel support), we need an include without
> C structure definitions or includes of other files with such
> definitions.
> 
> This patch factors out the relocs into a new header specifically for ELF
> reloc types.

Does #ifdef __ASSEMBLY__ not do the trick?

Will

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
  2016-01-08 15:48         ` Catalin Marinas
  (?)
@ 2016-01-08 16:14           ` Mark Rutland
  -1 siblings, 0 replies; 156+ messages in thread
From: Mark Rutland @ 2016-01-08 16:14 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: keescook, arnd, kernel-hardening, bhupesh.sharma, Ard Biesheuvel,
	will.deacon, linux-kernel, leif.lindholm, stuart.yoder,
	marc.zyngier, christoffer.dall, linux-arm-kernel

Hi Catalin,

I think we agree w.r.t. the code you suggest. I just disagree with the
suggestion that using mem= for carveouts is something we must, or even
could support -- it's already fragile.

More on that below.

On Fri, Jan 08, 2016 at 03:48:15PM +0000, Catalin Marinas wrote:
> On Fri, Jan 08, 2016 at 03:36:54PM +0000, Mark Rutland wrote:
> > On Fri, Jan 08, 2016 at 03:27:38PM +0000, Catalin Marinas wrote:
> > > On Wed, Dec 30, 2015 at 04:26:10PM +0100, Ard Biesheuvel wrote:
> > > > +static void __init enforce_memory_limit(void)
> > > > +{
> > > > +	const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
> > > > +	u64 to_remove = memblock_phys_mem_size() - memory_limit;
> > > > +	phys_addr_t max_addr = 0;
> > > > +	struct memblock_region *r;
> > > > +
> > > > +	if (memory_limit == (phys_addr_t)ULLONG_MAX)
> > > > +		return;
> > > > +
> > > > +	/*
> > > > +	 * The kernel may be high up in physical memory, so try to apply the
> > > > +	 * limit below the kernel first, and only let the generic handling
> > > > +	 * take over if it turns out we haven't clipped enough memory yet.
> > > > +	 */
> > > > +	for_each_memblock(memory, r) {
> > > > +		if (r->base + r->size > kbase) {
> > > > +			u64 rem = min(to_remove, kbase - r->base);
> > > > +
> > > > +			max_addr = r->base + rem;
> > > > +			to_remove -= rem;
> > > > +			break;
> > > > +		}
> > > > +		if (to_remove <= r->size) {
> > > > +			max_addr = r->base + to_remove;
> > > > +			to_remove = 0;
> > > > +			break;
> > > > +		}
> > > > +		to_remove -= r->size;
> > > > +	}
> > > > +
> > > > +	memblock_remove(0, max_addr);
> > > > +
> > > > +	if (to_remove)
> > > > +		memblock_enforce_memory_limit(memory_limit);
> > > > +}
> > > 
> > > IIUC, this is changing the user expectations a bit. There are people
> > > using the mem= limit to hijack some top of the RAM for other needs
> > > (though they could do it in a saner way like changing the DT memory
> > > nodes).
> > 
> > Which will be hopelessly broken in the presence of KASLR, the kernel
> > being loaded at a different address, pages getting reserved differently
> > due to page size, etc.
> 
> With KASLR disabled, I think we should aim for the existing behaviour as
> much as possible. The original aim of these patches was to relax the
> kernel image placement rules, to make it easier for boot loaders rather
> than completely randomising it.

Sure. My point was that there are other reasons this is extremely fragile
currently, regardless of KASLR. For example, due to reservations
occurring differently.

Consider that when we add memory we may shave off portions of memory due
to page size, as we do in early_init_dt_add_memory_arch. Regions may be
fused or split for other reasons which may change over time, leading to
a different amount of memory being shaved off.

Afterwards memblock_enforce_memory_limit figures out the max address to keep
with:

        /* find out max address */
        for_each_memblock(memory, r) { 
                if (limit <= r->size) {
                        max_addr = r->base + limit;
                        break;
                }    
                limit -= r->size;
        }
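
As a hypothetical walk-through of that loop (made-up region sizes, not
from any real platform): with regions [0x40000000, 0x80000000) and
[0x100000000, 0x180000000) and mem=1536M, the limit is 0x60000000 bytes:

        first region:  0x40000000 bytes < limit, so limit -= 0x40000000
        second region: remaining limit 0x20000000 <= size, so
                       max_addr = 0x100000000 + 0x20000000 = 0x120000000

If early_init_dt_add_memory_arch() trims a few MB off the first region, or
the regions end up split differently, the leftover limit and hence max_addr
shift by the same amount, so a carveout placed by trial and error ends up
straddling the boundary.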

Given all that, you cannot use mem= to prevent use of some memory, except for a
specific kernel binary with some value found by experimentation.

I think we need to make it clear that this is completely and hopelessly broken,
and that we should not pretend to support it.

> With KASLR enabled, I agree it's hard to make any assumptions about what
> memory is available.

As above, I do not think this is safe at all across kernel binaries.

> But removing memory only from the top would also help with the point
> you already raised - keeping lower memory for devices with narrower
> DMA mask.

I'm happy with the logic you suggest for the purpose of keeping low DMA
memory.

I think we must make it clear that mem= cannot be used to protect or
carve out memory -- it's a best effort tool for test purposes.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
@ 2016-01-08 16:14           ` Mark Rutland
  0 siblings, 0 replies; 156+ messages in thread
From: Mark Rutland @ 2016-01-08 16:14 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Catalin,

I think we agree w.r.t. the code you suggest. I just disagree with the
suggestion that using mem= for carveouts is something we must, or even
could support -- it's already fragile.

More on that below.

On Fri, Jan 08, 2016 at 03:48:15PM +0000, Catalin Marinas wrote:
> On Fri, Jan 08, 2016 at 03:36:54PM +0000, Mark Rutland wrote:
> > On Fri, Jan 08, 2016 at 03:27:38PM +0000, Catalin Marinas wrote:
> > > On Wed, Dec 30, 2015 at 04:26:10PM +0100, Ard Biesheuvel wrote:
> > > > +static void __init enforce_memory_limit(void)
> > > > +{
> > > > +	const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
> > > > +	u64 to_remove = memblock_phys_mem_size() - memory_limit;
> > > > +	phys_addr_t max_addr = 0;
> > > > +	struct memblock_region *r;
> > > > +
> > > > +	if (memory_limit == (phys_addr_t)ULLONG_MAX)
> > > > +		return;
> > > > +
> > > > +	/*
> > > > +	 * The kernel may be high up in physical memory, so try to apply the
> > > > +	 * limit below the kernel first, and only let the generic handling
> > > > +	 * take over if it turns out we haven't clipped enough memory yet.
> > > > +	 */
> > > > +	for_each_memblock(memory, r) {
> > > > +		if (r->base + r->size > kbase) {
> > > > +			u64 rem = min(to_remove, kbase - r->base);
> > > > +
> > > > +			max_addr = r->base + rem;
> > > > +			to_remove -= rem;
> > > > +			break;
> > > > +		}
> > > > +		if (to_remove <= r->size) {
> > > > +			max_addr = r->base + to_remove;
> > > > +			to_remove = 0;
> > > > +			break;
> > > > +		}
> > > > +		to_remove -= r->size;
> > > > +	}
> > > > +
> > > > +	memblock_remove(0, max_addr);
> > > > +
> > > > +	if (to_remove)
> > > > +		memblock_enforce_memory_limit(memory_limit);
> > > > +}
> > > 
> > > IIUC, this is changing the user expectations a bit. There are people
> > > using the mem= limit to hijack some top of the RAM for other needs
> > > (though they could do it in a saner way like changing the DT memory
> > > nodes).
> > 
> > Which will be hopelessly broken in the presence of KASLR, the kernel
> > being loaded at a different address, pages getting reserved differently
> > due to page size, etc.
> 
> With KASLR disabled, I think we should aim for the existing behaviour as
> much as possible. The original aim of these patches was to relax the
> kernel image placement rules, to make it easier for boot loaders rather
> than completely randomising it.

Sure. My point was that there are other reasons this is extremely fragile
currently, regardless of KASLR. For example, due to reservations
occurring differently.

Consider that when we add memory we may shave off portions of memory due
to page size, as we do in early_init_dt_add_memory_arch. Regions may be
fused or split for other reasons which may change over time, leading to
a different amount of memory being shaved off.

Afterwards memblock_enforce_memory_limit figures out the max address to keep
with:

        /* find out max address */
        for_each_memblock(memory, r) { 
                if (limit <= r->size) {
                        max_addr = r->base + limit;
                        break;
                }    
                limit -= r->size;
        }

Given all that, you cannot use mem= to prevent use of some memory, except for a
specific kernel binary with some value found by experimentation.

I think we need to make it clear that this is completely and hopelessly broken,
and that we should not pretend to support it.

> With KASLR enabled, I agree it's hard to make any assumptions about what
> memory is available.

As above, I do not think this is safe at all across kernel binaries.

> But removing memory only from the top would also help with the point
> you already raised - keeping lower memory for devices with narrower
> DMA mask.

I'm happy with the logic you suggest for the purpose of keeping low DMA
memory.

I think we must make it clear that mem= cannot be used to protect or
carve out memory -- it's a best effort tool for test purposes.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [kernel-hardening] Re: [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory
@ 2016-01-08 16:14           ` Mark Rutland
  0 siblings, 0 replies; 156+ messages in thread
From: Mark Rutland @ 2016-01-08 16:14 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: keescook, arnd, kernel-hardening, bhupesh.sharma, Ard Biesheuvel,
	will.deacon, linux-kernel, leif.lindholm, stuart.yoder,
	marc.zyngier, christoffer.dall, linux-arm-kernel

Hi Catalin,

I think we agree w.r.t. the code you suggest. I just disagree with the
suggestion that using mem= for carveouts is something we must, or even
could support -- it's already fragile.

More on that below.

On Fri, Jan 08, 2016 at 03:48:15PM +0000, Catalin Marinas wrote:
> On Fri, Jan 08, 2016 at 03:36:54PM +0000, Mark Rutland wrote:
> > On Fri, Jan 08, 2016 at 03:27:38PM +0000, Catalin Marinas wrote:
> > > On Wed, Dec 30, 2015 at 04:26:10PM +0100, Ard Biesheuvel wrote:
> > > > +static void __init enforce_memory_limit(void)
> > > > +{
> > > > +	const phys_addr_t kbase = round_down(__pa(_text), MIN_KIMG_ALIGN);
> > > > +	u64 to_remove = memblock_phys_mem_size() - memory_limit;
> > > > +	phys_addr_t max_addr = 0;
> > > > +	struct memblock_region *r;
> > > > +
> > > > +	if (memory_limit == (phys_addr_t)ULLONG_MAX)
> > > > +		return;
> > > > +
> > > > +	/*
> > > > +	 * The kernel may be high up in physical memory, so try to apply the
> > > > +	 * limit below the kernel first, and only let the generic handling
> > > > +	 * take over if it turns out we haven't clipped enough memory yet.
> > > > +	 */
> > > > +	for_each_memblock(memory, r) {
> > > > +		if (r->base + r->size > kbase) {
> > > > +			u64 rem = min(to_remove, kbase - r->base);
> > > > +
> > > > +			max_addr = r->base + rem;
> > > > +			to_remove -= rem;
> > > > +			break;
> > > > +		}
> > > > +		if (to_remove <= r->size) {
> > > > +			max_addr = r->base + to_remove;
> > > > +			to_remove = 0;
> > > > +			break;
> > > > +		}
> > > > +		to_remove -= r->size;
> > > > +	}
> > > > +
> > > > +	memblock_remove(0, max_addr);
> > > > +
> > > > +	if (to_remove)
> > > > +		memblock_enforce_memory_limit(memory_limit);
> > > > +}
> > > 
> > > IIUC, this is changing the user expectations a bit. There are people
> > > using the mem= limit to hijack some top of the RAM for other needs
> > > (though they could do it in a saner way like changing the DT memory
> > > nodes).
> > 
> > Which will be hopelessly broken in the presence of KASLR, the kernel
> > being loaded at a different address, pages getting reserved differently
> > due to page size, etc.
> 
> With KASLR disabled, I think we should aim for the existing behaviour as
> much as possible. The original aim of these patches was to relax the
> kernel image placement rules, to make it easier for boot loaders rather
> than completely randomising it.

Sure. My point was that there are other reasons this is extremely fragile
currently, regardless of KASLR. For example, due to reservations
occurring differently.

Consider that when we add memory we may shave off portions of memory due
to page size, as we do in early_init_dt_add_memory_arch. Regions may be
fused or split for other reasons which may change over time, leading to
a different amount of memory being shaved off.

Afterwards memblock_enforce_memory_limit figures out the max address to keep
with:

        /* find out max address */
        for_each_memblock(memory, r) { 
                if (limit <= r->size) {
                        max_addr = r->base + limit;
                        break;
                }    
                limit -= r->size;
        }

Given all that, you cannot use mem= to prevent use of some memory, except for a
specific kernel binary with some value found by experimentation.

I think we need to make it clear that this is completely and hopelessly broken,
and that we should not pretend to support it.

> With KASLR enabled, I agree it's hard to make any assumptions about what
> memory is available.

As above, I do not think this is safe at all across kernel binaries.

> But removing memory only from the top would also help with the point
> you already raised - keeping lower memory for devices with narrower
> DMA mask.

I'm happy with the logic you suggest for the purpose of keeping low DMA
memory.

I think we must make it clear that mem= cannot be used to protect or
carve out memory -- it's a best effort tool for test purposes.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* [PATCH] arm64: split elf relocs into a separate header.
  2016-01-08 15:59       ` Will Deacon
@ 2016-01-12 11:55         ` Ard Biesheuvel
  0 siblings, 0 replies; 156+ messages in thread
From: Ard Biesheuvel @ 2016-01-12 11:55 UTC (permalink / raw)
  To: linux-arm-kernel

On 8 January 2016 at 16:59, Will Deacon <will.deacon@arm.com> wrote:
> On Fri, Jan 08, 2016 at 12:41:49PM +0000, Mark Rutland wrote:
>> Currently asm/elf.h contains a mixture of simple constants, C structure
>> definitions, and some constants defined in terms of constants from other
>> headers (which are themselves mixtures).
>>
>> To enable the use of AArch64 ELF reloc constants from assembly code (as
>> we will need for relocatable kernel support), we need an include without
>> C structure definitions or includes of other files with such
>> definitions.
>>
>> This patch factors out the relocs into a new header specifically for ELF
>> reloc types.
>
> Does #ifdef __ASSEMBLY__ not do the trick?
>

Actually, it does. The includes in asm/elf.h are guarded that way
themselves, so it is simply a matter of moving the C declarations
inside a #ifndef __ASSEMBLY__ block.
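
A minimal sketch of that layout (illustrative only; the included header
and the C-only declarations shown are placeholders, not the actual
contents of asm/elf.h):

#ifndef __ASM_ELF_H
#define __ASM_ELF_H

/* Reloc types stay visible to both C and assembly */
#define R_AARCH64_ABS64		257
#define R_AARCH64_RELATIVE	1027

#ifndef __ASSEMBLY__

#include <asm/ptrace.h>

typedef unsigned long elf_greg_t;
/* ... remaining C-only declarations and includes ... */

#endif	/* !__ASSEMBLY__ */

#endif	/* __ASM_ELF_H */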

^ permalink raw reply	[flat|nested] 156+ messages in thread

end of thread, other threads:[~2016-01-12 11:55 UTC | newest]

Thread overview: 156+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-30 15:25 [PATCH v2 00/13] arm64: implement support for KASLR Ard Biesheuvel
2015-12-30 15:25 ` [kernel-hardening] " Ard Biesheuvel
2015-12-30 15:25 ` Ard Biesheuvel
2015-12-30 15:26 ` [PATCH v2 01/13] of/fdt: make memblock minimum physical address arch configurable Ard Biesheuvel
2015-12-30 15:26   ` [kernel-hardening] " Ard Biesheuvel
2015-12-30 15:26   ` Ard Biesheuvel
2015-12-30 15:26 ` [PATCH v2 02/13] arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region Ard Biesheuvel
2015-12-30 15:26   ` [kernel-hardening] " Ard Biesheuvel
2015-12-30 15:26   ` Ard Biesheuvel
2016-01-05 14:36   ` Christoffer Dall
2016-01-05 14:36     ` [kernel-hardening] " Christoffer Dall
2016-01-05 14:36     ` Christoffer Dall
2016-01-05 14:46     ` Mark Rutland
2016-01-05 14:46       ` [kernel-hardening] " Mark Rutland
2016-01-05 14:46       ` Mark Rutland
2016-01-05 14:58       ` Christoffer Dall
2016-01-05 14:58         ` [kernel-hardening] " Christoffer Dall
2016-01-05 14:58         ` Christoffer Dall
2015-12-30 15:26 ` [PATCH v2 03/13] arm64: use more granular reservations for static page table allocations Ard Biesheuvel
2015-12-30 15:26   ` [kernel-hardening] " Ard Biesheuvel
2015-12-30 15:26   ` Ard Biesheuvel
2016-01-07 13:55   ` Mark Rutland
2016-01-07 13:55     ` [kernel-hardening] " Mark Rutland
2016-01-07 13:55     ` Mark Rutland
2016-01-07 14:02     ` Ard Biesheuvel
2016-01-07 14:02       ` [kernel-hardening] " Ard Biesheuvel
2016-01-07 14:02       ` Ard Biesheuvel
2016-01-07 14:25       ` Mark Rutland
2016-01-07 14:25         ` [kernel-hardening] " Mark Rutland
2016-01-07 14:25         ` Mark Rutland
2015-12-30 15:26 ` [PATCH v2 04/13] arm64: decouple early fixmap init from linear mapping Ard Biesheuvel
2015-12-30 15:26   ` [kernel-hardening] " Ard Biesheuvel
2015-12-30 15:26   ` Ard Biesheuvel
2016-01-06 16:35   ` James Morse
2016-01-06 16:35     ` [kernel-hardening] " James Morse
2016-01-06 16:35     ` James Morse
2016-01-06 16:42     ` Ard Biesheuvel
2016-01-06 16:42       ` [kernel-hardening] " Ard Biesheuvel
2016-01-06 16:42       ` Ard Biesheuvel
2016-01-08 12:00   ` Catalin Marinas
2016-01-08 12:00     ` [kernel-hardening] " Catalin Marinas
2016-01-08 12:00     ` Catalin Marinas
2016-01-08 12:05     ` Ard Biesheuvel
2016-01-08 12:05       ` [kernel-hardening] " Ard Biesheuvel
2016-01-08 12:05       ` Ard Biesheuvel
2015-12-30 15:26 ` [PATCH v2 05/13] arm64: kvm: deal with kernel symbols outside of " Ard Biesheuvel
2015-12-30 15:26   ` [kernel-hardening] " Ard Biesheuvel
2015-12-30 15:26   ` Ard Biesheuvel
2016-01-04 10:08   ` Marc Zyngier
2016-01-04 10:08     ` [kernel-hardening] " Marc Zyngier
2016-01-04 10:08     ` Marc Zyngier
2016-01-04 10:31     ` Ard Biesheuvel
2016-01-04 10:31       ` [kernel-hardening] " Ard Biesheuvel
2016-01-04 10:31       ` Ard Biesheuvel
2016-01-04 11:02       ` Marc Zyngier
2016-01-04 11:02         ` [kernel-hardening] " Marc Zyngier
2016-01-04 11:02         ` Marc Zyngier
2016-01-05 14:41   ` Christoffer Dall
2016-01-05 14:41     ` [kernel-hardening] " Christoffer Dall
2016-01-05 14:41     ` Christoffer Dall
2016-01-05 14:51     ` Ard Biesheuvel
2016-01-05 14:51       ` [kernel-hardening] " Ard Biesheuvel
2016-01-05 14:51       ` Ard Biesheuvel
2016-01-05 14:56       ` Christoffer Dall
2016-01-05 14:56         ` [kernel-hardening] " Christoffer Dall
2016-01-05 14:56         ` Christoffer Dall
2015-12-30 15:26 ` [PATCH v2 06/13] arm64: move kernel image to base of vmalloc area Ard Biesheuvel
2015-12-30 15:26   ` [kernel-hardening] " Ard Biesheuvel
2015-12-30 15:26   ` Ard Biesheuvel
2015-12-30 15:26 ` [PATCH v2 07/13] arm64: add support for module PLTs Ard Biesheuvel
2015-12-30 15:26   ` [kernel-hardening] " Ard Biesheuvel
2015-12-30 15:26   ` Ard Biesheuvel
2015-12-30 15:26 ` [PATCH v2 08/13] arm64: use relative references in exception tables Ard Biesheuvel
2015-12-30 15:26   ` [kernel-hardening] " Ard Biesheuvel
2015-12-30 15:26   ` Ard Biesheuvel
2015-12-30 15:26 ` [PATCH v2 09/13] arm64: avoid R_AARCH64_ABS64 relocations for Image header fields Ard Biesheuvel
2015-12-30 15:26   ` [kernel-hardening] " Ard Biesheuvel
2015-12-30 15:26   ` Ard Biesheuvel
2015-12-30 15:26 ` [PATCH v2 10/13] arm64: avoid dynamic relocations in early boot code Ard Biesheuvel
2015-12-30 15:26   ` [kernel-hardening] " Ard Biesheuvel
2015-12-30 15:26   ` Ard Biesheuvel
2015-12-30 15:26 ` [PATCH v2 11/13] arm64: allow kernel Image to be loaded anywhere in physical memory Ard Biesheuvel
2015-12-30 15:26   ` [kernel-hardening] " Ard Biesheuvel
2015-12-30 15:26   ` Ard Biesheuvel
2016-01-08 11:26   ` Mark Rutland
2016-01-08 11:26     ` [kernel-hardening] " Mark Rutland
2016-01-08 11:26     ` Mark Rutland
2016-01-08 11:34     ` Ard Biesheuvel
2016-01-08 11:34       ` [kernel-hardening] " Ard Biesheuvel
2016-01-08 11:34       ` Ard Biesheuvel
2016-01-08 11:43       ` Mark Rutland
2016-01-08 11:43         ` [kernel-hardening] " Mark Rutland
2016-01-08 11:43         ` Mark Rutland
2016-01-08 15:27   ` Catalin Marinas
2016-01-08 15:27     ` [kernel-hardening] " Catalin Marinas
2016-01-08 15:27     ` Catalin Marinas
2016-01-08 15:30     ` Ard Biesheuvel
2016-01-08 15:30       ` [kernel-hardening] " Ard Biesheuvel
2016-01-08 15:30       ` Ard Biesheuvel
2016-01-08 15:36     ` Mark Rutland
2016-01-08 15:36       ` [kernel-hardening] " Mark Rutland
2016-01-08 15:36       ` Mark Rutland
2016-01-08 15:48       ` Catalin Marinas
2016-01-08 15:48         ` [kernel-hardening] " Catalin Marinas
2016-01-08 15:48         ` Catalin Marinas
2016-01-08 16:14         ` Mark Rutland
2016-01-08 16:14           ` [kernel-hardening] " Mark Rutland
2016-01-08 16:14           ` Mark Rutland
2015-12-30 15:26 ` [PATCH v2 12/13] arm64: add support for relocatable kernel Ard Biesheuvel
2015-12-30 15:26   ` [kernel-hardening] " Ard Biesheuvel
2015-12-30 15:26   ` Ard Biesheuvel
2016-01-05 19:51   ` Kees Cook
2016-01-05 19:51     ` [kernel-hardening] " Kees Cook
2016-01-05 19:51     ` Kees Cook
2016-01-06  7:51     ` Ard Biesheuvel
2016-01-06  7:51       ` [kernel-hardening] " Ard Biesheuvel
2016-01-06  7:51       ` Ard Biesheuvel
2016-01-08 10:17   ` James Morse
2016-01-08 10:17     ` [kernel-hardening] " James Morse
2016-01-08 10:17     ` James Morse
2016-01-08 10:25     ` Ard Biesheuvel
2016-01-08 10:25       ` [kernel-hardening] " Ard Biesheuvel
2016-01-08 10:25       ` Ard Biesheuvel
2016-01-08 12:36   ` Mark Rutland
2016-01-08 12:36     ` [kernel-hardening] " Mark Rutland
2016-01-08 12:36     ` Mark Rutland
2016-01-08 12:38     ` Ard Biesheuvel
2016-01-08 12:38       ` [kernel-hardening] " Ard Biesheuvel
2016-01-08 12:38       ` Ard Biesheuvel
2016-01-08 12:40       ` Mark Rutland
2016-01-08 12:40         ` [kernel-hardening] " Mark Rutland
2016-01-08 12:40         ` Mark Rutland
2016-01-08 12:41     ` [PATCH] arm64: split elf relocs into a separate header Mark Rutland
2016-01-08 15:59       ` Will Deacon
2016-01-12 11:55         ` Ard Biesheuvel
2015-12-30 15:26 ` [PATCH v2 13/13] arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness Ard Biesheuvel
2015-12-30 15:26   ` [kernel-hardening] " Ard Biesheuvel
2015-12-30 15:26   ` Ard Biesheuvel
2016-01-05 19:53   ` Kees Cook
2016-01-05 19:53     ` [kernel-hardening] " Kees Cook
2016-01-05 19:53     ` Kees Cook
2016-01-06  7:51     ` Ard Biesheuvel
2016-01-06  7:51       ` [kernel-hardening] " Ard Biesheuvel
2016-01-06  7:51       ` Ard Biesheuvel
2016-01-07 18:46   ` Mark Rutland
2016-01-07 18:46     ` [kernel-hardening] " Mark Rutland
2016-01-07 18:46     ` Mark Rutland
2016-01-07 19:07     ` Kees Cook
2016-01-07 19:07       ` [kernel-hardening] " Kees Cook
2016-01-07 19:07       ` Kees Cook
2016-01-05 20:08 ` [PATCH v2 00/13] arm64: implement support for KASLR Kees Cook
2016-01-05 20:08   ` [kernel-hardening] " Kees Cook
2016-01-05 20:08   ` Kees Cook
2016-01-05 21:24   ` Ard Biesheuvel
2016-01-05 21:24     ` [kernel-hardening] " Ard Biesheuvel
2016-01-05 21:24     ` Ard Biesheuvel
