* [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization
@ 2017-10-11 20:30 Thomas Garnier
From: Thomas Garnier @ 2017-10-11 20:30 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Thomas Garnier, Kees Cook, Andrey Ryabinin, Matthias Kaehlcke,
	Tom Lendacky, Andy Lutomirski, Kirill A . Shutemov,
	Borislav Petkov, Rafael J . Wysocki, Len Brown, Pavel Machek,
	Juergen Gross, Chris Wright, Alok Kataria, Rusty Russell,
	Tejun Heo
  Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
	virtualization, linux-sparse, linux-crypto, kernel-hardening,
	xen-devel

Changes:
 - patch v1:
   - Simplify ftrace implementation.
   - Use gcc -mstack-protector-guard-reg=%gs with PIE when possible.
 - rfc v3:
   - Use --emit-relocs instead of -pie to reduce dynamic relocation space on
     mapped memory. It also simplifies the relocation process.
   - Move the start of the module section next to the kernel. This removes the
     need for -mcmodel=large on modules and extends module space from 1G to
     2G maximum.
   - Support for XEN PVH as 32-bit relocations can be ignored with
     --emit-relocs.
   - Support for GOT relocations previously done automatically with -pie.
   - Remove need for dynamic PLT in modules.
   - Support dynamic GOT for modules.
 - rfc v2:
   - Add support for a global stack cookie, as the compiler defaults to fs
     without mcmodel=kernel.
   - Change patch 7 to correctly jump out of the identity mapping on kexec load
     preserve.

These patches make the changes necessary to build the kernel as Position
Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
the top 2G of the virtual address space. This makes it possible to optionally
extend the KASLR randomization range from 1G to 3G.
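
To illustrate the 2G constraint: with mcmodel=kernel, absolute symbol
addresses are encoded as 32-bit values that are sign-extended to 64 bits, so
they can only reach the top 2G (or bottom 2G) of the address space. The C
sketch below is not part of the series. fits_in_s32() is a hypothetical helper
and the addresses are illustrative, assuming the usual x86_64 kernel mapping
base of 0xffffffff80000000.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when addr can be encoded as a 32-bit value that sign-extends back to
 * the same 64-bit address. That is the bottom 2G or the top 2G of the
 * address space.
 */
static bool fits_in_s32(uint64_t addr)
{
	return addr <= UINT64_C(0x7fffffff) ||
	       addr >= UINT64_C(0xffffffff80000000);
}

/* Illustrative addresses only; they are assumptions, not taken from the series. */
static void fits_in_s32_examples(void)
{
	/* Inside the top 2G: reachable with a sign-extended 32-bit value. */
	assert(fits_in_s32(UINT64_C(0xffffffff80000000)));
	assert(fits_in_s32(UINT64_C(0xffffffff81000000)));

	/* Just below the top 2G: no longer representable. */
	assert(!fits_in_s32(UINT64_C(0xffffffff7fffffff)));
	assert(!fits_in_s32(UINT64_C(0xffffffff00000000)));
}
```

A kernel placed at one of the last two addresses cannot use such encodings,
which is the kind of placement the extended randomization range would allow.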

Thanks a lot to Ard Biesheuvel & Kees Cook for their feedback on compiler
changes, PIE support and KASLR in general. Thanks to Roland McGrath for his
feedback on using -pie versus --emit-relocs and details on compiler code
generation.

The patches:
 - 1-3, 5-13, 17-18: Changes in assembly code to be PIE compliant.
 - 4: Add a new _ASM_GET_PTR macro to fetch a symbol address generically.
 - 14: Adapt percpu design to work correctly when PIE is enabled.
 - 15: Provide an option to default visibility to hidden except for key symbols.
       It removes errors between compilation units.
 - 16: Adapt relocation tool to handle PIE binary correctly.
 - 19: Add support for global cookie.
 - 20: Support ftrace with PIE (used on Ubuntu config).
 - 21: Fix incorrect address marker on dump_pagetables.
 - 22: Add option to move the module section just after the kernel.
 - 23: Adapt module loading to support PIE with dynamic GOT.
 - 24: Make the GOT read-only.
 - 25: Add the CONFIG_X86_PIE option (off by default).
 - 26: Adapt relocation tool to generate a 64-bit relocation table.
 - 27: Add the CONFIG_RANDOMIZE_BASE_LARGE option to increase relocation range
       from 1G to 3G (off by default).
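
As a rough illustration of what a 64-bit relocation table (patch 26) is used
for, the sketch below adds a randomization delta to every 64-bit absolute
address listed in a table. It is a simplified model, not the code from the
series: apply_relocs64(), the table layout (byte offsets into the image) and
the example values are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Add delta to each 64-bit absolute address stored in the image.
 *
 * image:  loaded kernel image, viewed as raw bytes
 * relocs: byte offsets into image where a 64-bit address is stored
 *         (hypothetical table layout)
 * count:  number of entries in relocs
 * delta:  difference between the run-time and link-time base, modulo 2^64
 */
static void apply_relocs64(unsigned char *image, const uint64_t *relocs,
			   size_t count, uint64_t delta)
{
	for (size_t i = 0; i < count; i++) {
		uint64_t value;

		/* memcpy avoids unaligned or aliasing-unsafe accesses. */
		memcpy(&value, image + relocs[i], sizeof(value));
		value += delta;
		memcpy(image + relocs[i], &value, sizeof(value));
	}
}
```

For example, moving an address of 0xffffffff81000000 down by 2G (a delta of
0xffffffff80000000 modulo 2^64) yields 0xffffffff01000000, which no longer
fits in a sign-extended 32-bit value and therefore needs a full 64-bit entry.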

Performance/Size impact:

Size of vmlinux (Default configuration):
 File size:
 - PIE disabled: +0.000031%
 - PIE enabled: -3.210% (fewer relocations)
 .text section:
 - PIE disabled: +0.000644%
 - PIE enabled: +0.837%

Size of vmlinux (Ubuntu configuration):
 File size:
 - PIE disabled: -0.201%
 - PIE enabled: -0.082%
 .text section:
 - PIE disabled: same
 - PIE enabled: +1.319%

Size of vmlinux (Default configuration + ORC):
 File size:
 - PIE enabled: -3.167%
 .text section:
 - PIE enabled: +0.814%

Size of vmlinux (Ubuntu configuration + ORC):
 File size:
 - PIE enabled: -3.167%
 .text section:
 - PIE enabled: +1.26%

The size increase is mainly due to not having access to the 32-bit signed
relocation that can be used with mcmodel=kernel. A small part is due to reduced
optimization for PIE code. A gcc bug [1] was opened to request better code
generation for kernel PIE.
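
For background, PIE code typically reaches symbols through RIP-relative
operands instead of absolute 32-bit signed addresses. The displacement depends
only on the distance between the instruction and its target, so it stays valid
wherever the image is placed. The C sketch below models that arithmetic.
rip_rel_disp32() is a hypothetical helper and the addresses and instruction
length are made-up values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Compute the 32-bit displacement of a RIP-relative operand: the target
 * address minus the address of the next instruction. Returns false when
 * the distance does not fit in a signed 32-bit value (more than 2G away).
 */
static bool rip_rel_disp32(uint64_t insn_addr, unsigned int insn_len,
			   uint64_t target, int32_t *disp)
{
	uint64_t next = insn_addr + insn_len;
	uint64_t diff = target - next;	/* wraps modulo 2^64 */

	if (diff <= UINT64_C(0x7fffffff)) {
		*disp = (int32_t)diff;			/* forward reference */
		return true;
	}
	if (diff >= UINT64_C(0xffffffff80000000)) {
		/* Backward reference: negate the wrapped distance. */
		*disp = (int32_t)-(int64_t)(UINT64_C(0) - diff);
		return true;
	}
	return false;
}

/* Moving the instruction and its target together leaves disp32 unchanged. */
static void rip_rel_examples(void)
{
	const uint64_t insn = UINT64_C(0xffffffff81000100);	/* made up */
	const uint64_t sym = UINT64_C(0xffffffff81002000);	/* made up */
	const uint64_t shift = UINT64_C(0xffffffff80000000);	/* -2G mod 2^64 */
	int32_t before = 0, after = 0;
	bool ok;

	ok = rip_rel_disp32(insn, 7, sym, &before);
	assert(ok);
	ok = rip_rel_disp32(insn + shift, 7, sym + shift, &after);
	assert(ok);
	assert(before == after);
}
```

Addressing forms that cannot be expressed RIP-relatively, such as an absolute
symbol combined with an index register, generally need an extra instruction
to load the address first, which is one plausible source of the .text growth.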

Hackbench (50% and 1600% on thread/process for pipe/sockets):
 - PIE disabled: no significant change (avg +0.1% on latest test).
 - PIE enabled: between -0.50% and +0.86% on average (default and Ubuntu config).

slab_test (average of 10 runs):
 - PIE disabled: no significant change (-2% on latest run, likely noise).
 - PIE enabled: between -1% and +0.8% on latest runs.

Kernbench (average of 10 Half and Optimal runs):
 Elapsed Time:
 - PIE disabled: no significant change (avg -0.239%)
 - PIE enabled: average +0.07%
 System Time:
 - PIE disabled: no significant change (avg -0.277%)
 - PIE enabled: average +0.7%

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82303

diffstat:
 Documentation/x86/x86_64/mm.txt              |    3 
 arch/x86/Kconfig                             |   43 ++++++
 arch/x86/Makefile                            |   40 +++++
 arch/x86/boot/boot.h                         |    2 
 arch/x86/boot/compressed/Makefile            |    5 
 arch/x86/boot/compressed/misc.c              |   10 +
 arch/x86/crypto/aes-x86_64-asm_64.S          |   45 ++++--
 arch/x86/crypto/aesni-intel_asm.S            |   14 +-
 arch/x86/crypto/aesni-intel_avx-x86_64.S     |    6 
 arch/x86/crypto/camellia-aesni-avx-asm_64.S  |   42 +++---
 arch/x86/crypto/camellia-aesni-avx2-asm_64.S |   44 +++---
 arch/x86/crypto/camellia-x86_64-asm_64.S     |    8 -
 arch/x86/crypto/cast5-avx-x86_64-asm_64.S    |   50 ++++---
 arch/x86/crypto/cast6-avx-x86_64-asm_64.S    |   44 +++---
 arch/x86/crypto/des3_ede-asm_64.S            |   96 +++++++++-----
 arch/x86/crypto/ghash-clmulni-intel_asm.S    |    4 
 arch/x86/crypto/glue_helper-asm-avx.S        |    4 
 arch/x86/crypto/glue_helper-asm-avx2.S       |    6 
 arch/x86/entry/entry_32.S                    |    3 
 arch/x86/entry/entry_64.S                    |   29 ++--
 arch/x86/include/asm/asm.h                   |   13 +
 arch/x86/include/asm/bug.h                   |    2 
 arch/x86/include/asm/ftrace.h                |    6 
 arch/x86/include/asm/jump_label.h            |    8 -
 arch/x86/include/asm/kvm_host.h              |    6 
 arch/x86/include/asm/module.h                |   11 +
 arch/x86/include/asm/page_64_types.h         |    9 +
 arch/x86/include/asm/paravirt_types.h        |   12 +
 arch/x86/include/asm/percpu.h                |   25 ++-
 arch/x86/include/asm/pgtable_64_types.h      |    6 
 arch/x86/include/asm/pm-trace.h              |    2 
 arch/x86/include/asm/processor.h             |   12 +
 arch/x86/include/asm/sections.h              |    8 +
 arch/x86/include/asm/setup.h                 |    2 
 arch/x86/include/asm/stackprotector.h        |   19 ++
 arch/x86/kernel/acpi/wakeup_64.S             |   31 ++--
 arch/x86/kernel/asm-offsets.c                |    3 
 arch/x86/kernel/asm-offsets_32.c             |    3 
 arch/x86/kernel/asm-offsets_64.c             |    3 
 arch/x86/kernel/cpu/common.c                 |    7 -
 arch/x86/kernel/cpu/microcode/core.c         |    4 
 arch/x86/kernel/ftrace.c                     |   42 +++++-
 arch/x86/kernel/head64.c                     |   32 +++-
 arch/x86/kernel/head_32.S                    |    3 
 arch/x86/kernel/head_64.S                    |   41 +++++-
 arch/x86/kernel/kvm.c                        |    6 
 arch/x86/kernel/module.c                     |  182 ++++++++++++++++++++++++++-
 arch/x86/kernel/module.lds                   |    3 
 arch/x86/kernel/process.c                    |    5 
 arch/x86/kernel/relocate_kernel_64.S         |    8 -
 arch/x86/kernel/setup_percpu.c               |    2 
 arch/x86/kernel/vmlinux.lds.S                |   13 +
 arch/x86/kvm/svm.c                           |    4 
 arch/x86/lib/cmpxchg16b_emu.S                |    8 -
 arch/x86/mm/dump_pagetables.c                |   11 +
 arch/x86/power/hibernate_asm_64.S            |    4 
 arch/x86/tools/relocs.c                      |  170 +++++++++++++++++++++++--
 arch/x86/tools/relocs.h                      |    4 
 arch/x86/tools/relocs_common.c               |   15 +-
 arch/x86/xen/xen-asm.S                       |   12 -
 arch/x86/xen/xen-head.S                      |    9 -
 arch/x86/xen/xen-pvh.S                       |   13 +
 drivers/base/firmware_class.c                |    4 
 include/asm-generic/sections.h               |    6 
 include/asm-generic/vmlinux.lds.h            |   12 +
 include/linux/compiler.h                     |    8 +
 init/Kconfig                                 |    9 +
 kernel/kallsyms.c                            |   16 +-
 kernel/trace/trace.h                         |    4 
 lib/dynamic_debug.c                          |    4 
 70 files changed, 1032 insertions(+), 308 deletions(-)



* Re: [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization
From: Thomas Garnier @ 2017-10-18 23:17 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Kees Cook, Andrey Ryabinin, Matthias Kaehlcke, Andy Lutomirski,
	Kirill A . Shutemov, Borislav Petkov, Rafael J . Wysocki,
	Len Brown, Pavel Machek, Juergen Gross, Chris Wright,
	Alok Kataria, Rusty Russell, Tejun Heo, Christoph Lameter,
	Boris Ostrovsky

On Thu, Oct 12, 2017 at 9:28 AM, Tom Lendacky <thomas.lendacky@amd.com> wrote:
> On 10/12/2017 10:34 AM, Thomas Garnier wrote:
>>
>> On Wed, Oct 11, 2017 at 2:34 PM, Tom Lendacky <thomas.lendacky@amd.com>
>> wrote:
>>>
>>> On 10/11/2017 3:30 PM, Thomas Garnier wrote:
>>>>
>>>> Changes:
>>>>    - patch v1:
>>>>      - Simplify ftrace implementation.
>>>>      - Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
>>>>    - rfc v3:
>>>>      - Use --emit-relocs instead of -pie to reduce dynamic relocation
>>>> space on
>>>>        mapped memory. It also simplifies the relocation process.
>>>>      - Move the start the module section next to the kernel. Remove the
>>>> need for
>>>>        -mcmodel=large on modules. Extends module space from 1 to 2G
>>>> maximum.
>>>>      - Support for XEN PVH as 32-bit relocations can be ignored with
>>>>        --emit-relocs.
>>>>      - Support for GOT relocations previously done automatically with
>>>> -pie.
>>>>      - Remove need for dynamic PLT in modules.
>>>>      - Support dymamic GOT for modules.
>>>>    - rfc v2:
>>>>      - Add support for global stack cookie while compiler default to fs
>>>> without
>>>>        mcmodel=kernel
>>>>      - Change patch 7 to correctly jump out of the identity mapping on
>>>> kexec load
>>>>        preserve.
>>>>
>>>> These patches make the changes necessary to build the kernel as Position
>>>> Independent Executable (PIE) on x86_64. A PIE kernel can be relocated
>>>> below
>>>> the top 2G of the virtual address space. It allows to optionally extend
>>>> the
>>>> KASLR randomization range from 1G to 3G.
>>>
>>>
>>> Hi Thomas,
>>>
>>> I've applied your patches so that I can verify that SME works with PIE.
>>> Unfortunately, I'm running into build warnings and errors when I enable
>>> PIE.
>>>
>>> With CONFIG_STACK_VALIDATION=y I receive lots of messages like this:
>>>
>>>    drivers/scsi/libfc/fc_exch.o: warning: objtool:
>>> fc_destroy_exch_mgr()+0x0: call without frame pointer save/setup
>>>
>>> Disabling CONFIG_STACK_VALIDATION suppresses those.
>>
>>
>> I ran into that, I plan to fix it in the next iteration.
>>
>>>
>>> But near the end of the build, I receive errors like this:
>>>
>>>    arch/x86/kernel/setup.o: In function `dump_kernel_offset':
>>>    .../arch/x86/kernel/setup.c:801:(.text+0x32): relocation truncated to
>>> fit: R_X86_64_32S against symbol `_text' defined in .text section in
>>> .tmp_vmlinux1
>>>    .
>>>    . about 10 more of the above type messages
>>>    .
>>>    make: *** [vmlinux] Error 1
>>>    Error building kernel, exiting
>>>
>>> Are there any config options that should or should not be enabled when
>>> building with PIE enabled?  Is there a compiler requirement for PIE (I'm
>>> using gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5))?
>>
>>
>> I never ran into these ones and I tested compilers older and newer.
>> What was your exact configuration?
>
>
> I'll send you the config in a separate email.
>
> Thanks,
> Tom

Thanks for your feedback (Tom and Markus). The issue was linked to
using a modern gcc with a modern linker. I managed to reproduce it and
fix it on my current version.

I will create a v1.5 for Kees Cook to keep on one of his branches for a
few weeks so I can collect as much feedback as possible from 0day. After
that I will send v2.




-- 
Thomas

* Re: [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization
  2017-10-12 16:28     ` Tom Lendacky
@ 2017-10-18 23:17       ` Thomas Garnier via Virtualization
  2017-10-18 23:17       ` Thomas Garnier
  2017-10-18 23:17         ` Thomas Garnier
  2 siblings, 0 replies; 19+ messages in thread
From: Thomas Garnier via Virtualization @ 2017-10-18 23:17 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Nicolas Pitre, Michal Hocko, linux-doc, Daniel Micay,
	Radim Krčmář,
	Peter Zijlstra, Christopher Li, Jan H . Schönherr,
	Alexei Starovoitov, virtualization, David Howells,
	Paul Gortmaker, Waiman Long, Pavel Machek, H . Peter Anvin,
	Kernel Hardening, Christoph Lameter, Thomas Gleixner,
	the arch/x86 maintainers, Herbert Xu, Daniel Borkmann,
	Jonathan Corbet

On Thu, Oct 12, 2017 at 9:28 AM, Tom Lendacky <thomas.lendacky@amd.com> wrote:
> On 10/12/2017 10:34 AM, Thomas Garnier wrote:
>>
>> On Wed, Oct 11, 2017 at 2:34 PM, Tom Lendacky <thomas.lendacky@amd.com>
>> wrote:
>>>
>>> On 10/11/2017 3:30 PM, Thomas Garnier wrote:
>>>>
>>>> Changes:
>>>>    - patch v1:
>>>>      - Simplify ftrace implementation.
>>>>      - Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
>>>>    - rfc v3:
>>>>      - Use --emit-relocs instead of -pie to reduce dynamic relocation space on
>>>>        mapped memory. It also simplifies the relocation process.
>>>>      - Move the start of the module section next to the kernel. Remove the need for
>>>>        -mcmodel=large on modules. Extends module space from 1 to 2G maximum.
>>>>      - Support for XEN PVH as 32-bit relocations can be ignored with
>>>>        --emit-relocs.
>>>>      - Support for GOT relocations previously done automatically with -pie.
>>>>      - Remove need for dynamic PLT in modules.
>>>>      - Support dynamic GOT for modules.
>>>>    - rfc v2:
>>>>      - Add support for global stack cookie while compiler default to fs without
>>>>        mcmodel=kernel
>>>>      - Change patch 7 to correctly jump out of the identity mapping on kexec load
>>>>        preserve.
>>>>
>>>> These patches make the changes necessary to build the kernel as Position
>>>> Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
>>>> the top 2G of the virtual address space. It allows to optionally extend the
>>>> KASLR randomization range from 1G to 3G.
>>>
>>>
>>> Hi Thomas,
>>>
>>> I've applied your patches so that I can verify that SME works with PIE.
>>> Unfortunately, I'm running into build warnings and errors when I enable
>>> PIE.
>>>
>>> With CONFIG_STACK_VALIDATION=y I receive lots of messages like this:
>>>
>>>    drivers/scsi/libfc/fc_exch.o: warning: objtool: fc_destroy_exch_mgr()+0x0: call without frame pointer save/setup
>>>
>>> Disabling CONFIG_STACK_VALIDATION suppresses those.
>>
>>
>> I ran into that, I plan to fix it in the next iteration.
>>
>>>
>>> But near the end of the build, I receive errors like this:
>>>
>>>    arch/x86/kernel/setup.o: In function `dump_kernel_offset':
>>>    .../arch/x86/kernel/setup.c:801:(.text+0x32): relocation truncated to fit: R_X86_64_32S against symbol `_text' defined in .text section in .tmp_vmlinux1
>>>    .
>>>    . about 10 more of the above type messages
>>>    .
>>>    make: *** [vmlinux] Error 1
>>>    Error building kernel, exiting
>>>
>>> Are there any config options that should or should not be enabled when
>>> building with PIE enabled?  Is there a compiler requirement for PIE (I'm
>>> using gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5))?
>>
>>
>> I never ran into these ones and I tested compilers older and newer.
>> What was your exact configuration?
>
>
> I'll send you the config in a separate email.
>
> Thanks,
> Tom

Thanks for your feedback (Tom and Markus). The issue was linked to
using a modern gcc with a modern linker; I managed to reproduce and
fix it on my current version.

I will create a v1.5 for Kees Cook to keep on one of his branches for
a few weeks so I can collect as much feedback as possible from 0day.
After that I will send v2.
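
As general background on the "relocation truncated to fit: R_X86_64_32S"
errors quoted above (this is not a diagnosis of Tom's build): an
R_X86_64_32S relocation stores a 32-bit value that is sign-extended to
64 bits, so it can only encode addresses in the lowest or the topmost 2G
of the address space. The Python sketch below checks that property. The
function name and the sample addresses are hypothetical and chosen only
for illustration.

```python
MASK64 = (1 << 64) - 1


def fits_r_x86_64_32s(addr: int) -> bool:
    """Return True if a 64-bit address survives truncation to 32 bits
    followed by sign extension back to 64 bits."""
    addr &= MASK64
    low = addr & 0xFFFFFFFF
    # Sign-extend the low 32 bits, as the CPU does for a 32S operand.
    if low & 0x80000000:
        low |= 0xFFFFFFFF00000000
    return low == addr


# Start of the top 2G of the 64-bit virtual address space.
TOP_2G_START = 0xFFFFFFFF80000000
# Hypothetical address 1G below the top 2G (illustration only).
BELOW_TOP_2G = TOP_2G_START - (1 << 30)

if __name__ == "__main__":
    for name, addr in (("top 2G start", TOP_2G_START),
                       ("1G below top 2G", BELOW_TOP_2G)):
        print(f"{name}: {addr:#018x} fits 32S: {fits_r_x86_64_32s(addr)}")
```

An address inside the top 2G passes the check, while one below it does
not, which is why code that may run below the top 2G cannot rely on this
relocation type.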

>
>
>>
>>>
>>> Thanks,
>>> Tom
>>>
>>>>
>>>> Thanks a lot to Ard Biesheuvel & Kees Cook on their feedback on compiler
>>>> changes, PIE support and KASLR in general. Thanks to Roland McGrath on his
>>>> feedback for using -pie versus --emit-relocs and details on compiler code
>>>> generation.
>>>>
>>>> The patches:
>>>>    - 1-3, 5-1#, 17-18: Change in assembly code to be PIE compliant.
>>>>    - 4: Add a new _ASM_GET_PTR macro to fetch a symbol address generically.
>>>>    - 14: Adapt percpu design to work correctly when PIE is enabled.
>>>>    - 15: Provide an option to default visibility to hidden except for key symbols.
>>>>          It removes errors between compilation units.
>>>>    - 16: Adapt relocation tool to handle PIE binary correctly.
>>>>    - 19: Add support for global cookie.
>>>>    - 20: Support ftrace with PIE (used on Ubuntu config).
>>>>    - 21: Fix incorrect address marker on dump_pagetables.
>>>>    - 22: Add option to move the module section just after the kernel.
>>>>    - 23: Adapt module loading to support PIE with dynamic GOT.
>>>>    - 24: Make the GOT read-only.
>>>>    - 25: Add the CONFIG_X86_PIE option (off by default).
>>>>    - 26: Adapt relocation tool to generate a 64-bit relocation table.
>>>>    - 27: Add the CONFIG_RANDOMIZE_BASE_LARGE option to increase relocation range
>>>>          from 1G to 3G (off by default).
>>>>
>>>> Performance/Size impact:
>>>>
>>>> Size of vmlinux (Default configuration):
>>>>    File size:
>>>>    - PIE disabled: +0.000031%
>>>>    - PIE enabled: -3.210% (less relocations)
>>>>    .text section:
>>>>    - PIE disabled: +0.000644%
>>>>    - PIE enabled: +0.837%
>>>>
>>>> Size of vmlinux (Ubuntu configuration):
>>>>    File size:
>>>>    - PIE disabled: -0.201%
>>>>    - PIE enabled: -0.082%
>>>>    .text section:
>>>>    - PIE disabled: same
>>>>    - PIE enabled: +1.319%
>>>>
>>>> Size of vmlinux (Default configuration + ORC):
>>>>    File size:
>>>>    - PIE enabled: -3.167%
>>>>    .text section:
>>>>    - PIE enabled: +0.814%
>>>>
>>>> Size of vmlinux (Ubuntu configuration + ORC):
>>>>    File size:
>>>>    - PIE enabled: -3.167%
>>>>    .text section:
>>>>    - PIE enabled: +1.26%
>>>>
>>>> The size increase is mainly due to not having access to the 32-bit signed
>>>> relocation that can be used with mcmodel=kernel. A small part is due to reduced
>>>> optimization for PIE code. This bug [1] was opened with gcc to provide a better
>>>> code generation for kernel PIE.
>>>>
>>>> Hackbench (50% and 1600% on thread/process for pipe/sockets):
>>>>    - PIE disabled: no significant change (avg +0.1% on latest test).
>>>>    - PIE enabled: between -0.50% to +0.86% in average (default and Ubuntu config).
>>>>
>>>> slab_test (average of 10 runs):
>>>>    - PIE disabled: no significant change (-2% on latest run, likely noise).
>>>>    - PIE enabled: between -1% and +0.8% on latest runs.
>>>>
>>>> Kernbench (average of 10 Half and Optimal runs):
>>>>    Elapsed Time:
>>>>    - PIE disabled: no significant change (avg -0.239%)
>>>>    - PIE enabled: average +0.07%
>>>>    System Time:
>>>>    - PIE disabled: no significant change (avg -0.277%)
>>>>    - PIE enabled: average +0.7%
>>>>
>>>> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82303
>>>>
>>>> diffstat:
>>>>    Documentation/x86/x86_64/mm.txt              |    3
>>>>    arch/x86/Kconfig                             |   43 ++++++
>>>>    arch/x86/Makefile                            |   40 +++++
>>>>    arch/x86/boot/boot.h                         |    2
>>>>    arch/x86/boot/compressed/Makefile            |    5
>>>>    arch/x86/boot/compressed/misc.c              |   10 +
>>>>    arch/x86/crypto/aes-x86_64-asm_64.S          |   45 ++++--
>>>>    arch/x86/crypto/aesni-intel_asm.S            |   14 +-
>>>>    arch/x86/crypto/aesni-intel_avx-x86_64.S     |    6
>>>>    arch/x86/crypto/camellia-aesni-avx-asm_64.S  |   42 +++---
>>>>    arch/x86/crypto/camellia-aesni-avx2-asm_64.S |   44 +++---
>>>>    arch/x86/crypto/camellia-x86_64-asm_64.S     |    8 -
>>>>    arch/x86/crypto/cast5-avx-x86_64-asm_64.S    |   50 ++++---
>>>>    arch/x86/crypto/cast6-avx-x86_64-asm_64.S    |   44 +++---
>>>>    arch/x86/crypto/des3_ede-asm_64.S            |   96 +++++++++-----
>>>>    arch/x86/crypto/ghash-clmulni-intel_asm.S    |    4
>>>>    arch/x86/crypto/glue_helper-asm-avx.S        |    4
>>>>    arch/x86/crypto/glue_helper-asm-avx2.S       |    6
>>>>    arch/x86/entry/entry_32.S                    |    3
>>>>    arch/x86/entry/entry_64.S                    |   29 ++--
>>>>    arch/x86/include/asm/asm.h                   |   13 +
>>>>    arch/x86/include/asm/bug.h                   |    2
>>>>    arch/x86/include/asm/ftrace.h                |    6
>>>>    arch/x86/include/asm/jump_label.h            |    8 -
>>>>    arch/x86/include/asm/kvm_host.h              |    6
>>>>    arch/x86/include/asm/module.h                |   11 +
>>>>    arch/x86/include/asm/page_64_types.h         |    9 +
>>>>    arch/x86/include/asm/paravirt_types.h        |   12 +
>>>>    arch/x86/include/asm/percpu.h                |   25 ++-
>>>>    arch/x86/include/asm/pgtable_64_types.h      |    6
>>>>    arch/x86/include/asm/pm-trace.h              |    2
>>>>    arch/x86/include/asm/processor.h             |   12 +
>>>>    arch/x86/include/asm/sections.h              |    8 +
>>>>    arch/x86/include/asm/setup.h                 |    2
>>>>    arch/x86/include/asm/stackprotector.h        |   19 ++
>>>>    arch/x86/kernel/acpi/wakeup_64.S             |   31 ++--
>>>>    arch/x86/kernel/asm-offsets.c                |    3
>>>>    arch/x86/kernel/asm-offsets_32.c             |    3
>>>>    arch/x86/kernel/asm-offsets_64.c             |    3
>>>>    arch/x86/kernel/cpu/common.c                 |    7 -
>>>>    arch/x86/kernel/cpu/microcode/core.c         |    4
>>>>    arch/x86/kernel/ftrace.c                     |   42 +++++-
>>>>    arch/x86/kernel/head64.c                     |   32 +++-
>>>>    arch/x86/kernel/head_32.S                    |    3
>>>>    arch/x86/kernel/head_64.S                    |   41 +++++-
>>>>    arch/x86/kernel/kvm.c                        |    6
>>>>    arch/x86/kernel/module.c                     |  182 ++++++++++++++++++++++++++-
>>>>    arch/x86/kernel/module.lds                   |    3
>>>>    arch/x86/kernel/process.c                    |    5
>>>>    arch/x86/kernel/relocate_kernel_64.S         |    8 -
>>>>    arch/x86/kernel/setup_percpu.c               |    2
>>>>    arch/x86/kernel/vmlinux.lds.S                |   13 +
>>>>    arch/x86/kvm/svm.c                           |    4
>>>>    arch/x86/lib/cmpxchg16b_emu.S                |    8 -
>>>>    arch/x86/mm/dump_pagetables.c                |   11 +
>>>>    arch/x86/power/hibernate_asm_64.S            |    4
>>>>    arch/x86/tools/relocs.c                      |  170 +++++++++++++++++++++++--
>>>>    arch/x86/tools/relocs.h                      |    4
>>>>    arch/x86/tools/relocs_common.c               |   15 +-
>>>>    arch/x86/xen/xen-asm.S                       |   12 -
>>>>    arch/x86/xen/xen-head.S                      |    9 -
>>>>    arch/x86/xen/xen-pvh.S                       |   13 +
>>>>    drivers/base/firmware_class.c                |    4
>>>>    include/asm-generic/sections.h               |    6
>>>>    include/asm-generic/vmlinux.lds.h            |   12 +
>>>>    include/linux/compiler.h                     |    8 +
>>>>    init/Kconfig                                 |    9 +
>>>>    kernel/kallsyms.c                            |   16 +-
>>>>    kernel/trace/trace.h                         |    4
>>>>    lib/dynamic_debug.c                          |    4
>>>>    70 files changed, 1032 insertions(+), 308 deletions(-)
>>>>
>>
>>
>>
>



-- 
Thomas

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization
  2017-10-12 15:34     ` Thomas Garnier
                       ` (3 preceding siblings ...)
  (?)
@ 2017-10-12 16:28     ` Tom Lendacky
  2017-10-18 23:17       ` Thomas Garnier via Virtualization
                         ` (2 more replies)
  -1 siblings, 3 replies; 19+ messages in thread
From: Tom Lendacky @ 2017-10-12 16:28 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Nicolas Pitre, Michal Hocko, linux-doc, Daniel Micay,
	Radim Krčmář,
	Peter Zijlstra, Christopher Li, Jan H . Schönherr,
	Alexei Starovoitov, virtualization, David Howells,
	Paul Gortmaker, Waiman Long, Pavel Machek, H . Peter Anvin,
	Kernel Hardening, Christoph Lameter, Thomas Gleixner,
	the arch/x86 maintainers, Herbert Xu, Daniel Borkmann,
	Jonathan Corbet

On 10/12/2017 10:34 AM, Thomas Garnier wrote:
> On Wed, Oct 11, 2017 at 2:34 PM, Tom Lendacky <thomas.lendacky@amd.com> wrote:
>> On 10/11/2017 3:30 PM, Thomas Garnier wrote:
>>> Changes:
>>>    - patch v1:
>>>      - Simplify ftrace implementation.
>>>      - Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
>>>    - rfc v3:
>>>      - Use --emit-relocs instead of -pie to reduce dynamic relocation space on
>>>        mapped memory. It also simplifies the relocation process.
>>>      - Move the start of the module section next to the kernel. Remove the need for
>>>        -mcmodel=large on modules. Extends module space from 1 to 2G maximum.
>>>      - Support for XEN PVH as 32-bit relocations can be ignored with
>>>        --emit-relocs.
>>>      - Support for GOT relocations previously done automatically with -pie.
>>>      - Remove need for dynamic PLT in modules.
>>>      - Support dynamic GOT for modules.
>>>    - rfc v2:
>>>      - Add support for global stack cookie while compiler default to fs without
>>>        mcmodel=kernel
>>>      - Change patch 7 to correctly jump out of the identity mapping on kexec load
>>>        preserve.
>>>
>>> These patches make the changes necessary to build the kernel as Position
>>> Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
>>> the top 2G of the virtual address space. It allows to optionally extend the
>>> KASLR randomization range from 1G to 3G.
>>
>> Hi Thomas,
>>
>> I've applied your patches so that I can verify that SME works with PIE.
>> Unfortunately, I'm running into build warnings and errors when I enable
>> PIE.
>>
>> With CONFIG_STACK_VALIDATION=y I receive lots of messages like this:
>>
>>    drivers/scsi/libfc/fc_exch.o: warning: objtool: fc_destroy_exch_mgr()+0x0: call without frame pointer save/setup
>>
>> Disabling CONFIG_STACK_VALIDATION suppresses those.
> 
> I ran into that, I plan to fix it in the next iteration.
> 
>>
>> But near the end of the build, I receive errors like this:
>>
>>    arch/x86/kernel/setup.o: In function `dump_kernel_offset':
>>    .../arch/x86/kernel/setup.c:801:(.text+0x32): relocation truncated to fit: R_X86_64_32S against symbol `_text' defined in .text section in .tmp_vmlinux1
>>    .
>>    . about 10 more of the above type messages
>>    .
>>    make: *** [vmlinux] Error 1
>>    Error building kernel, exiting
>>
>> Are there any config options that should or should not be enabled when
>> building with PIE enabled?  Is there a compiler requirement for PIE (I'm
>> using gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5))?
> 
> I never ran into these ones and I tested compilers older and newer.
> What was your exact configuration?

I'll send you the config in a separate email.

Thanks,
Tom

> 
>>
>> Thanks,
>> Tom
>>
>>>
>>> Thanks a lot to Ard Biesheuvel & Kees Cook on their feedback on compiler
>>> changes, PIE support and KASLR in general. Thanks to Roland McGrath on his
>>> feedback for using -pie versus --emit-relocs and details on compiler code
>>> generation.
>>>
>>> The patches:
>>>    - 1-3, 5-1#, 17-18: Change in assembly code to be PIE compliant.
>>>    - 4: Add a new _ASM_GET_PTR macro to fetch a symbol address generically.
>>>    - 14: Adapt percpu design to work correctly when PIE is enabled.
>>>    - 15: Provide an option to default visibility to hidden except for key symbols.
>>>          It removes errors between compilation units.
>>>    - 16: Adapt relocation tool to handle PIE binary correctly.
>>>    - 19: Add support for global cookie.
>>>    - 20: Support ftrace with PIE (used on Ubuntu config).
>>>    - 21: Fix incorrect address marker on dump_pagetables.
>>>    - 22: Add option to move the module section just after the kernel.
>>>    - 23: Adapt module loading to support PIE with dynamic GOT.
>>>    - 24: Make the GOT read-only.
>>>    - 25: Add the CONFIG_X86_PIE option (off by default).
>>>    - 26: Adapt relocation tool to generate a 64-bit relocation table.
>>>    - 27: Add the CONFIG_RANDOMIZE_BASE_LARGE option to increase relocation range
>>>          from 1G to 3G (off by default).
>>>
>>> Performance/Size impact:
>>>
>>> Size of vmlinux (Default configuration):
>>>    File size:
>>>    - PIE disabled: +0.000031%
>>>    - PIE enabled: -3.210% (less relocations)
>>>    .text section:
>>>    - PIE disabled: +0.000644%
>>>    - PIE enabled: +0.837%
>>>
>>> Size of vmlinux (Ubuntu configuration):
>>>    File size:
>>>    - PIE disabled: -0.201%
>>>    - PIE enabled: -0.082%
>>>    .text section:
>>>    - PIE disabled: same
>>>    - PIE enabled: +1.319%
>>>
>>> Size of vmlinux (Default configuration + ORC):
>>>    File size:
>>>    - PIE enabled: -3.167%
>>>    .text section:
>>>    - PIE enabled: +0.814%
>>>
>>> Size of vmlinux (Ubuntu configuration + ORC):
>>>    File size:
>>>    - PIE enabled: -3.167%
>>>    .text section:
>>>    - PIE enabled: +1.26%
>>>
>>> The size increase is mainly due to not having access to the 32-bit signed
>>> relocation that can be used with mcmodel=kernel. A small part is due to reduced
>>> optimization for PIE code. This bug [1] was opened with gcc to provide a better
>>> code generation for kernel PIE.
>>>
>>> Hackbench (50% and 1600% on thread/process for pipe/sockets):
>>>    - PIE disabled: no significant change (avg +0.1% on latest test).
>>>    - PIE enabled: between -0.50% to +0.86% in average (default and Ubuntu config).
>>>
>>> slab_test (average of 10 runs):
>>>    - PIE disabled: no significant change (-2% on latest run, likely noise).
>>>    - PIE enabled: between -1% and +0.8% on latest runs.
>>>
>>> Kernbench (average of 10 Half and Optimal runs):
>>>    Elapsed Time:
>>>    - PIE disabled: no significant change (avg -0.239%)
>>>    - PIE enabled: average +0.07%
>>>    System Time:
>>>    - PIE disabled: no significant change (avg -0.277%)
>>>    - PIE enabled: average +0.7%
>>>
>>> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82303
>>>
>>> diffstat:
>>>    Documentation/x86/x86_64/mm.txt              |    3
>>>    arch/x86/Kconfig                             |   43 ++++++
>>>    arch/x86/Makefile                            |   40 +++++
>>>    arch/x86/boot/boot.h                         |    2
>>>    arch/x86/boot/compressed/Makefile            |    5
>>>    arch/x86/boot/compressed/misc.c              |   10 +
>>>    arch/x86/crypto/aes-x86_64-asm_64.S          |   45 ++++--
>>>    arch/x86/crypto/aesni-intel_asm.S            |   14 +-
>>>    arch/x86/crypto/aesni-intel_avx-x86_64.S     |    6
>>>    arch/x86/crypto/camellia-aesni-avx-asm_64.S  |   42 +++---
>>>    arch/x86/crypto/camellia-aesni-avx2-asm_64.S |   44 +++---
>>>    arch/x86/crypto/camellia-x86_64-asm_64.S     |    8 -
>>>    arch/x86/crypto/cast5-avx-x86_64-asm_64.S    |   50 ++++---
>>>    arch/x86/crypto/cast6-avx-x86_64-asm_64.S    |   44 +++---
>>>    arch/x86/crypto/des3_ede-asm_64.S            |   96 +++++++++-----
>>>    arch/x86/crypto/ghash-clmulni-intel_asm.S    |    4
>>>    arch/x86/crypto/glue_helper-asm-avx.S        |    4
>>>    arch/x86/crypto/glue_helper-asm-avx2.S       |    6
>>>    arch/x86/entry/entry_32.S                    |    3
>>>    arch/x86/entry/entry_64.S                    |   29 ++--
>>>    arch/x86/include/asm/asm.h                   |   13 +
>>>    arch/x86/include/asm/bug.h                   |    2
>>>    arch/x86/include/asm/ftrace.h                |    6
>>>    arch/x86/include/asm/jump_label.h            |    8 -
>>>    arch/x86/include/asm/kvm_host.h              |    6
>>>    arch/x86/include/asm/module.h                |   11 +
>>>    arch/x86/include/asm/page_64_types.h         |    9 +
>>>    arch/x86/include/asm/paravirt_types.h        |   12 +
>>>    arch/x86/include/asm/percpu.h                |   25 ++-
>>>    arch/x86/include/asm/pgtable_64_types.h      |    6
>>>    arch/x86/include/asm/pm-trace.h              |    2
>>>    arch/x86/include/asm/processor.h             |   12 +
>>>    arch/x86/include/asm/sections.h              |    8 +
>>>    arch/x86/include/asm/setup.h                 |    2
>>>    arch/x86/include/asm/stackprotector.h        |   19 ++
>>>    arch/x86/kernel/acpi/wakeup_64.S             |   31 ++--
>>>    arch/x86/kernel/asm-offsets.c                |    3
>>>    arch/x86/kernel/asm-offsets_32.c             |    3
>>>    arch/x86/kernel/asm-offsets_64.c             |    3
>>>    arch/x86/kernel/cpu/common.c                 |    7 -
>>>    arch/x86/kernel/cpu/microcode/core.c         |    4
>>>    arch/x86/kernel/ftrace.c                     |   42 +++++-
>>>    arch/x86/kernel/head64.c                     |   32 +++-
>>>    arch/x86/kernel/head_32.S                    |    3
>>>    arch/x86/kernel/head_64.S                    |   41 +++++-
>>>    arch/x86/kernel/kvm.c                        |    6
>>>    arch/x86/kernel/module.c                     |  182 ++++++++++++++++++++++++++-
>>>    arch/x86/kernel/module.lds                   |    3
>>>    arch/x86/kernel/process.c                    |    5
>>>    arch/x86/kernel/relocate_kernel_64.S         |    8 -
>>>    arch/x86/kernel/setup_percpu.c               |    2
>>>    arch/x86/kernel/vmlinux.lds.S                |   13 +
>>>    arch/x86/kvm/svm.c                           |    4
>>>    arch/x86/lib/cmpxchg16b_emu.S                |    8 -
>>>    arch/x86/mm/dump_pagetables.c                |   11 +
>>>    arch/x86/power/hibernate_asm_64.S            |    4
>>>    arch/x86/tools/relocs.c                      |  170 +++++++++++++++++++++++--
>>>    arch/x86/tools/relocs.h                      |    4
>>>    arch/x86/tools/relocs_common.c               |   15 +-
>>>    arch/x86/xen/xen-asm.S                       |   12 -
>>>    arch/x86/xen/xen-head.S                      |    9 -
>>>    arch/x86/xen/xen-pvh.S                       |   13 +
>>>    drivers/base/firmware_class.c                |    4
>>>    include/asm-generic/sections.h               |    6
>>>    include/asm-generic/vmlinux.lds.h            |   12 +
>>>    include/linux/compiler.h                     |    8 +
>>>    init/Kconfig                                 |    9 +
>>>    kernel/kallsyms.c                            |   16 +-
>>>    kernel/trace/trace.h                         |    4
>>>    lib/dynamic_debug.c                          |    4
>>>    70 files changed, 1032 insertions(+), 308 deletions(-)
>>>
> 
> 
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization
  2017-10-12 15:34     ` Thomas Garnier
                       ` (2 preceding siblings ...)
  (?)
@ 2017-10-12 16:28     ` Tom Lendacky
  -1 siblings, 0 replies; 19+ messages in thread
From: Tom Lendacky @ 2017-10-12 16:28 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Nicolas Pitre, Michal Hocko, linux-doc, Daniel Micay,
	Radim Krčmář,
	Peter Zijlstra, Christopher Li, Jan H . Schönherr,
	Alexei Starovoitov, virtualization, David Howells,
	Paul Gortmaker, Waiman Long, Pavel Machek, H . Peter Anvin,
	Kernel Hardening, Christoph Lameter, Thomas Gleixner,
	the arch/x86 maintainers, Herbert Xu, Daniel Borkmann,
	Jonathan Corbet

On 10/12/2017 10:34 AM, Thomas Garnier wrote:
> On Wed, Oct 11, 2017 at 2:34 PM, Tom Lendacky <thomas.lendacky@amd.com> wrote:
>> On 10/11/2017 3:30 PM, Thomas Garnier wrote:
>>> Changes:
>>>    - patch v1:
>>>      - Simplify ftrace implementation.
>>>      - Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
>>>    - rfc v3:
>>>      - Use --emit-relocs instead of -pie to reduce dynamic relocation space on
>>>        mapped memory. It also simplifies the relocation process.
>>>      - Move the start of the module section next to the kernel. Remove the need for
>>>        -mcmodel=large on modules. Extends module space from 1 to 2G maximum.
>>>      - Support for XEN PVH as 32-bit relocations can be ignored with
>>>        --emit-relocs.
>>>      - Support for GOT relocations previously done automatically with -pie.
>>>      - Remove need for dynamic PLT in modules.
>>>      - Support dynamic GOT for modules.
>>>    - rfc v2:
>>>      - Add support for global stack cookie while the compiler defaults to fs
>>>        without mcmodel=kernel.
>>>      - Change patch 7 to correctly jump out of the identity mapping on kexec load
>>>        preserve.
>>>
>>> These patches make the changes necessary to build the kernel as a Position
>>> Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
>>> the top 2G of the virtual address space. This makes it possible to optionally
>>> extend the KASLR randomization range from 1G to 3G.
>>
>> Hi Thomas,
>>
>> I've applied your patches so that I can verify that SME works with PIE.
>> Unfortunately, I'm running into build warnings and errors when I enable
>> PIE.
>>
>> With CONFIG_STACK_VALIDATION=y I receive lots of messages like this:
>>
>>    drivers/scsi/libfc/fc_exch.o: warning: objtool: fc_destroy_exch_mgr()+0x0: call without frame pointer save/setup
>>
>> Disabling CONFIG_STACK_VALIDATION suppresses those.
> 
> I ran into that, I plan to fix it in the next iteration.
> 
>>
>> But near the end of the build, I receive errors like this:
>>
>>    arch/x86/kernel/setup.o: In function `dump_kernel_offset':
>>    .../arch/x86/kernel/setup.c:801:(.text+0x32): relocation truncated to fit: R_X86_64_32S against symbol `_text' defined in .text section in .tmp_vmlinux1
>>    .
>>    . about 10 more of the above type messages
>>    .
>>    make: *** [vmlinux] Error 1
>>    Error building kernel, exiting
>>
>> Are there any config options that should or should not be enabled when
>> building with PIE enabled?  Is there a compiler requirement for PIE (I'm
>> using gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5))?
> 
> I never ran into these ones and I tested compilers older and newer.
> What was your exact configuration?

I'll send you the config in a separate email.

Thanks,
Tom

> 
>>
>> Thanks,
>> Tom
>>
>>>
>>> Thanks a lot to Ard Biesheuvel & Kees Cook for their feedback on compiler
>>> changes, PIE support and KASLR in general. Thanks to Roland McGrath for his
>>> feedback for using -pie versus --emit-relocs and details on compiler code
>>> generation.
>>>
>>> The patches:
>>>    - 1-3, 5-13, 17-18: Change in assembly code to be PIE compliant.
>>>    - 4: Add a new _ASM_GET_PTR macro to fetch a symbol address generically.
>>>    - 14: Adapt percpu design to work correctly when PIE is enabled.
>>>    - 15: Provide an option to default visibility to hidden except for key symbols.
>>>          It removes errors between compilation units.
>>>    - 16: Adapt relocation tool to handle PIE binary correctly.
>>>    - 19: Add support for global cookie.
>>>    - 20: Support ftrace with PIE (used on Ubuntu config).
>>>    - 21: Fix incorrect address marker on dump_pagetables.
>>>    - 22: Add option to move the module section just after the kernel.
>>>    - 23: Adapt module loading to support PIE with dynamic GOT.
>>>    - 24: Make the GOT read-only.
>>>    - 25: Add the CONFIG_X86_PIE option (off by default).
>>>    - 26: Adapt relocation tool to generate a 64-bit relocation table.
>>>    - 27: Add the CONFIG_RANDOMIZE_BASE_LARGE option to increase relocation range
>>>          from 1G to 3G (off by default).
>>>
>>> Performance/Size impact:
>>>
>>> Size of vmlinux (Default configuration):
>>>    File size:
>>>    - PIE disabled: +0.000031%
>>>    - PIE enabled: -3.210% (fewer relocations)
>>>    .text section:
>>>    - PIE disabled: +0.000644%
>>>    - PIE enabled: +0.837%
>>>
>>> Size of vmlinux (Ubuntu configuration):
>>>    File size:
>>>    - PIE disabled: -0.201%
>>>    - PIE enabled: -0.082%
>>>    .text section:
>>>    - PIE disabled: same
>>>    - PIE enabled: +1.319%
>>>
>>> Size of vmlinux (Default configuration + ORC):
>>>    File size:
>>>    - PIE enabled: -3.167%
>>>    .text section:
>>>    - PIE enabled: +0.814%
>>>
>>> Size of vmlinux (Ubuntu configuration + ORC):
>>>    File size:
>>>    - PIE enabled: -3.167%
>>>    .text section:
>>>    - PIE enabled: +1.26%
>>>
>>> The size increase is mainly due to not having access to the 32-bit signed
>>> relocation that can be used with mcmodel=kernel. A small part is due to reduced
>>> optimization for PIE code. This bug [1] was opened with gcc to provide a better
>>> code generation for kernel PIE.
>>>
>>> Hackbench (50% and 1600% on thread/process for pipe/sockets):
>>>    - PIE disabled: no significant change (avg +0.1% on latest test).
>>>    - PIE enabled: between -0.50% and +0.86% on average (default and Ubuntu config).
>>>
>>> slab_test (average of 10 runs):
>>>    - PIE disabled: no significant change (-2% on latest run, likely noise).
>>>    - PIE enabled: between -1% and +0.8% on latest runs.
>>>
>>> Kernbench (average of 10 Half and Optimal runs):
>>>    Elapsed Time:
>>>    - PIE disabled: no significant change (avg -0.239%)
>>>    - PIE enabled: average +0.07%
>>>    System Time:
>>>    - PIE disabled: no significant change (avg -0.277%)
>>>    - PIE enabled: average +0.7%
>>>
>>> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82303
>>>
>>> diffstat:
>>>    Documentation/x86/x86_64/mm.txt              |    3
>>>    arch/x86/Kconfig                             |   43 ++++++
>>>    arch/x86/Makefile                            |   40 +++++
>>>    arch/x86/boot/boot.h                         |    2
>>>    arch/x86/boot/compressed/Makefile            |    5
>>>    arch/x86/boot/compressed/misc.c              |   10 +
>>>    arch/x86/crypto/aes-x86_64-asm_64.S          |   45 ++++--
>>>    arch/x86/crypto/aesni-intel_asm.S            |   14 +-
>>>    arch/x86/crypto/aesni-intel_avx-x86_64.S     |    6
>>>    arch/x86/crypto/camellia-aesni-avx-asm_64.S  |   42 +++---
>>>    arch/x86/crypto/camellia-aesni-avx2-asm_64.S |   44 +++---
>>>    arch/x86/crypto/camellia-x86_64-asm_64.S     |    8 -
>>>    arch/x86/crypto/cast5-avx-x86_64-asm_64.S    |   50 ++++---
>>>    arch/x86/crypto/cast6-avx-x86_64-asm_64.S    |   44 +++---
>>>    arch/x86/crypto/des3_ede-asm_64.S            |   96 +++++++++-----
>>>    arch/x86/crypto/ghash-clmulni-intel_asm.S    |    4
>>>    arch/x86/crypto/glue_helper-asm-avx.S        |    4
>>>    arch/x86/crypto/glue_helper-asm-avx2.S       |    6
>>>    arch/x86/entry/entry_32.S                    |    3
>>>    arch/x86/entry/entry_64.S                    |   29 ++--
>>>    arch/x86/include/asm/asm.h                   |   13 +
>>>    arch/x86/include/asm/bug.h                   |    2
>>>    arch/x86/include/asm/ftrace.h                |    6
>>>    arch/x86/include/asm/jump_label.h            |    8 -
>>>    arch/x86/include/asm/kvm_host.h              |    6
>>>    arch/x86/include/asm/module.h                |   11 +
>>>    arch/x86/include/asm/page_64_types.h         |    9 +
>>>    arch/x86/include/asm/paravirt_types.h        |   12 +
>>>    arch/x86/include/asm/percpu.h                |   25 ++-
>>>    arch/x86/include/asm/pgtable_64_types.h      |    6
>>>    arch/x86/include/asm/pm-trace.h              |    2
>>>    arch/x86/include/asm/processor.h             |   12 +
>>>    arch/x86/include/asm/sections.h              |    8 +
>>>    arch/x86/include/asm/setup.h                 |    2
>>>    arch/x86/include/asm/stackprotector.h        |   19 ++
>>>    arch/x86/kernel/acpi/wakeup_64.S             |   31 ++--
>>>    arch/x86/kernel/asm-offsets.c                |    3
>>>    arch/x86/kernel/asm-offsets_32.c             |    3
>>>    arch/x86/kernel/asm-offsets_64.c             |    3
>>>    arch/x86/kernel/cpu/common.c                 |    7 -
>>>    arch/x86/kernel/cpu/microcode/core.c         |    4
>>>    arch/x86/kernel/ftrace.c                     |   42 +++++-
>>>    arch/x86/kernel/head64.c                     |   32 +++-
>>>    arch/x86/kernel/head_32.S                    |    3
>>>    arch/x86/kernel/head_64.S                    |   41 +++++-
>>>    arch/x86/kernel/kvm.c                        |    6
>>>    arch/x86/kernel/module.c                     |  182 ++++++++++++++++++++++++++-
>>>    arch/x86/kernel/module.lds                   |    3
>>>    arch/x86/kernel/process.c                    |    5
>>>    arch/x86/kernel/relocate_kernel_64.S         |    8 -
>>>    arch/x86/kernel/setup_percpu.c               |    2
>>>    arch/x86/kernel/vmlinux.lds.S                |   13 +
>>>    arch/x86/kvm/svm.c                           |    4
>>>    arch/x86/lib/cmpxchg16b_emu.S                |    8 -
>>>    arch/x86/mm/dump_pagetables.c                |   11 +
>>>    arch/x86/power/hibernate_asm_64.S            |    4
>>>    arch/x86/tools/relocs.c                      |  170 +++++++++++++++++++++++--
>>>    arch/x86/tools/relocs.h                      |    4
>>>    arch/x86/tools/relocs_common.c               |   15 +-
>>>    arch/x86/xen/xen-asm.S                       |   12 -
>>>    arch/x86/xen/xen-head.S                      |    9 -
>>>    arch/x86/xen/xen-pvh.S                       |   13 +
>>>    drivers/base/firmware_class.c                |    4
>>>    include/asm-generic/sections.h               |    6
>>>    include/asm-generic/vmlinux.lds.h            |   12 +
>>>    include/linux/compiler.h                     |    8 +
>>>    init/Kconfig                                 |    9 +
>>>    kernel/kallsyms.c                            |   16 +-
>>>    kernel/trace/trace.h                         |    4
>>>    lib/dynamic_debug.c                          |    4
>>>    70 files changed, 1032 insertions(+), 308 deletions(-)
>>>
> 
> 
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization
  2017-10-12 15:34     ` Thomas Garnier
  (?)
  (?)
@ 2017-10-12 15:51     ` Markus Trippelsdorf
  -1 siblings, 0 replies; 19+ messages in thread
From: Markus Trippelsdorf @ 2017-10-12 15:51 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Nicolas Pitre, Michal Hocko, Radim Krčmář,
	linux-doc, Daniel Micay, Len Brown, Peter Zijlstra,
	Christopher Li, Jan H . Schönherr, Alexei Starovoitov,
	virtualization, David Howells, Paul Gortmaker, Waiman Long,
	Pavel Machek, H . Peter Anvin, Kernel Hardening,
	Christoph Lameter, Thomas Gleixner, the arch/x86 maintainers,
	Herbert Xu, Daniel Borkmann

[-- Attachment #1: Type: text/plain, Size: 4354 bytes --]

On 2017.10.12 at 08:34 -0700, Thomas Garnier wrote:
> On Wed, Oct 11, 2017 at 2:34 PM, Tom Lendacky <thomas.lendacky@amd.com> wrote:
> > On 10/11/2017 3:30 PM, Thomas Garnier wrote:
> >> Changes:
> >>   - patch v1:
> >>     - Simplify ftrace implementation.
> >>     - Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
> >>   - rfc v3:
> >>     - Use --emit-relocs instead of -pie to reduce dynamic relocation space on
> >>       mapped memory. It also simplifies the relocation process.
> >>     - Move the start of the module section next to the kernel. Remove the need for
> >>       -mcmodel=large on modules. Extends module space from 1 to 2G maximum.
> >>     - Support for XEN PVH as 32-bit relocations can be ignored with
> >>       --emit-relocs.
> >>     - Support for GOT relocations previously done automatically with -pie.
> >>     - Remove need for dynamic PLT in modules.
> >>     - Support dynamic GOT for modules.
> >>   - rfc v2:
> >>     - Add support for global stack cookie while the compiler defaults to fs
> >>       without mcmodel=kernel.
> >>     - Change patch 7 to correctly jump out of the identity mapping on kexec load
> >>       preserve.
> >>
> >> These patches make the changes necessary to build the kernel as a Position
> >> Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
> >> the top 2G of the virtual address space. This makes it possible to optionally
> >> extend the KASLR randomization range from 1G to 3G.
> >
> > Hi Thomas,
> >
> > I've applied your patches so that I can verify that SME works with PIE.
> > Unfortunately, I'm running into build warnings and errors when I enable
> > PIE.
> >
> > With CONFIG_STACK_VALIDATION=y I receive lots of messages like this:
> >
> >   drivers/scsi/libfc/fc_exch.o: warning: objtool: fc_destroy_exch_mgr()+0x0: call without frame pointer save/setup
> >
> > Disabling CONFIG_STACK_VALIDATION suppresses those.
> 
> I ran into that, I plan to fix it in the next iteration.
> 
> >
> > But near the end of the build, I receive errors like this:
> >
> >   arch/x86/kernel/setup.o: In function `dump_kernel_offset':
> >   .../arch/x86/kernel/setup.c:801:(.text+0x32): relocation truncated to fit: R_X86_64_32S against symbol `_text' defined in .text section in .tmp_vmlinux1
> >   .
> >   . about 10 more of the above type messages
> >   .
> >   make: *** [vmlinux] Error 1
> >   Error building kernel, exiting
> >
> > Are there any config options that should or should not be enabled when
> > building with PIE enabled?  Is there a compiler requirement for PIE (I'm
> > using gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5))?
> 
> I never ran into these ones and I tested compilers older and newer.
> What was your exact configuration?

With gcc trunk and CONFIG_RANDOMIZE_BASE_LARGE=y I get:

...
  MODPOST vmlinux.o                         
  ld: failed to convert GOTPCREL relocation; relink with --no-relax

and after adding --no-relax to vmlinux_link() in scripts/link-vmlinux.sh:

  MODPOST vmlinux.o
virt/kvm/vfio.o: In function `kvm_vfio_update_coherency.isra.4':
vfio.c:(.text+0x63): relocation truncated to fit: R_X86_64_PLT32 against undefined symbol `vfio_external_check_extension'
virt/kvm/vfio.o: In function `kvm_vfio_destroy':
vfio.c:(.text+0xf7): relocation truncated to fit: R_X86_64_PLT32 against undefined symbol `vfio_group_set_kvm'
vfio.c:(.text+0x10a): relocation truncated to fit: R_X86_64_PLT32 against undefined symbol `vfio_group_put_external_user'
virt/kvm/vfio.o: In function `kvm_vfio_set_attr':
vfio.c:(.text+0x2bc): relocation truncated to fit: R_X86_64_PLT32 against undefined symbol `vfio_external_group_match_file'
vfio.c:(.text+0x307): relocation truncated to fit: R_X86_64_PLT32 against undefined symbol `vfio_group_set_kvm'
vfio.c:(.text+0x31a): relocation truncated to fit: R_X86_64_PLT32 against undefined symbol `vfio_group_put_external_user'
vfio.c:(.text+0x3b9): relocation truncated to fit: R_X86_64_PLT32 against undefined symbol `vfio_group_get_external_user'
vfio.c:(.text+0x462): relocation truncated to fit: R_X86_64_PLT32 against undefined symbol `vfio_group_set_kvm'
vfio.c:(.text+0x4bd): relocation truncated to fit: R_X86_64_PLT32 against undefined symbol `vfio_group_put_external_user'
make: *** [Makefile:1000: vmlinux] Error 1

Works fine with CONFIG_RANDOMIZE_BASE_LARGE unset.
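
For readers unfamiliar with the R_X86_64_PLT32 messages above: that
relocation stores a 32-bit signed PC-relative displacement, so the linker
reports "truncated to fit" when target + addend - place falls outside
+/-2G. The following is only a minimal sketch of that range check, not the
linker's actual code. The function names and all addresses are
hypothetical and are not taken from this build.

```python
MASK64 = (1 << 64) - 1


def pc32_displacement(symbol: int, addend: int, place: int) -> int:
    """Compute S + A - P in wrapping 64-bit arithmetic, as a signed value."""
    raw = (symbol + addend - place) & MASK64
    return raw - (1 << 64) if raw & (1 << 63) else raw


def fits_pc32(symbol: int, addend: int, place: int) -> bool:
    """True if the displacement fits in a signed 32-bit field."""
    return -(1 << 31) <= pc32_displacement(symbol, addend, place) < (1 << 31)


# Hypothetical call site inside the top 2G of the address space.
CALL_SITE = 0xFFFFFFFF81000063

# Hypothetical target 16M above the call site: within +/-2G.
NEAR_TARGET = 0xFFFFFFFF82000000

# Hypothetical target more than 2G below the call site: out of range.
FAR_TARGET = 0xFFFFFFFE00000000

print(fits_pc32(NEAR_TARGET, -4, CALL_SITE))
print(fits_pc32(FAR_TARGET, -4, CALL_SITE))
```

The addend of -4 reflects the usual encoding of a call, where the
displacement is relative to the end of the 4-byte field. Whether this is
what goes wrong in the build above depends on where the symbols end up,
which the log alone does not show.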

-- 
Markus

[-- Attachment #2: config.gz --]
[-- Type: application/x-gunzip, Size: 19847 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization
  2017-10-11 21:34   ` Tom Lendacky
@ 2017-10-12 15:34     ` Thomas Garnier
  -1 siblings, 0 replies; 19+ messages in thread
From: Thomas Garnier @ 2017-10-12 15:34 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Kees Cook, Andrey Ryabinin, Matthias Kaehlcke, Andy Lutomirski,
	Kirill A . Shutemov, Borislav Petkov, Rafael J . Wysocki,
	Len Brown, Pavel Machek, Juergen Gross, Chris Wright,
	Alok Kataria, Rusty Russell, Tejun Heo, Christoph Lameter,
	Boris Ostrovsky

On Wed, Oct 11, 2017 at 2:34 PM, Tom Lendacky <thomas.lendacky@amd.com> wrote:
> On 10/11/2017 3:30 PM, Thomas Garnier wrote:
>> Changes:
>>   - patch v1:
>>     - Simplify ftrace implementation.
>>     - Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
>>   - rfc v3:
>>     - Use --emit-relocs instead of -pie to reduce dynamic relocation space on
>>       mapped memory. It also simplifies the relocation process.
>>     - Move the start of the module section next to the kernel. Remove the need for
>>       -mcmodel=large on modules. Extends module space from 1 to 2G maximum.
>>     - Support for XEN PVH as 32-bit relocations can be ignored with
>>       --emit-relocs.
>>     - Support for GOT relocations previously done automatically with -pie.
>>     - Remove need for dynamic PLT in modules.
>>     - Support dynamic GOT for modules.
>>   - rfc v2:
>>     - Add support for global stack cookie while the compiler defaults to fs
>>       without mcmodel=kernel.
>>     - Change patch 7 to correctly jump out of the identity mapping on kexec load
>>       preserve.
>>
>> These patches make the changes necessary to build the kernel as a Position
>> Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
>> the top 2G of the virtual address space. This makes it possible to optionally
>> extend the KASLR randomization range from 1G to 3G.
>
> Hi Thomas,
>
> I've applied your patches so that I can verify that SME works with PIE.
> Unfortunately, I'm running into build warnings and errors when I enable
> PIE.
>
> With CONFIG_STACK_VALIDATION=y I receive lots of messages like this:
>
>   drivers/scsi/libfc/fc_exch.o: warning: objtool: fc_destroy_exch_mgr()+0x0: call without frame pointer save/setup
>
> Disabling CONFIG_STACK_VALIDATION suppresses those.

I ran into that, I plan to fix it in the next iteration.

>
> But near the end of the build, I receive errors like this:
>
>   arch/x86/kernel/setup.o: In function `dump_kernel_offset':
>   .../arch/x86/kernel/setup.c:801:(.text+0x32): relocation truncated to fit: R_X86_64_32S against symbol `_text' defined in .text section in .tmp_vmlinux1
>   .
>   . about 10 more of the above type messages
>   .
>   make: *** [vmlinux] Error 1
>   Error building kernel, exiting
>
> Are there any config options that should or should not be enabled when
> building with PIE enabled?  Is there a compiler requirement for PIE (I'm
> using gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5))?

I never ran into these ones and I tested compilers older and newer.
What was your exact configuration?
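
As background on the error quoted above: R_X86_64_32S stores only the low
32 bits of an absolute address, and the linker requires that sign-extending
those 32 bits gives back the full 64-bit value. That holds for addresses in
the top 2G (the range mcmodel=kernel relies on) and fails for anything
lower, which is when "relocation truncated to fit" is reported. The
following is a minimal sketch of that rule. The helper names and the second
address are hypothetical and only illustrate the check.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign_extend_32(value: int) -> int:
    """Sign-extend the low 32 bits of value to an unsigned 64-bit value."""
    low = value & 0xFFFFFFFF
    if low & 0x80000000:
        return low | 0xFFFFFFFF00000000
    return low


def fits_32s(addr: int) -> bool:
    """True if addr survives truncation to 32 bits plus sign extension."""
    return sign_extend_32(addr) == (addr & MASK64)


# Lowest address of the top 2G of the 64-bit address space.
TOP_2G_START = 0xFFFFFFFF80000000

# Hypothetical address 2G lower, outside the top 2G.
BELOW_TOP_2G = 0xFFFFFFFF00000000

print(fits_32s(TOP_2G_START))
print(fits_32s(BELOW_TOP_2G))
```

Any object still carrying an R_X86_64_32S reference to a symbol such as
_text would hit this limit once the link address drops below the top 2G.
Which code path emits it in this particular configuration is not something
the sketch can tell.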

>
> Thanks,
> Tom
>
>>
>> Thanks a lot to Ard Biesheuvel & Kees Cook for their feedback on compiler
>> changes, PIE support and KASLR in general. Thanks to Roland McGrath for his
>> feedback for using -pie versus --emit-relocs and details on compiler code
>> generation.
>>
>> The patches:
>>   - 1-3, 5-13, 17-18: Change in assembly code to be PIE compliant.
>>   - 4: Add a new _ASM_GET_PTR macro to fetch a symbol address generically.
>>   - 14: Adapt percpu design to work correctly when PIE is enabled.
>>   - 15: Provide an option to default visibility to hidden except for key symbols.
>>         It removes errors between compilation units.
>>   - 16: Adapt relocation tool to handle PIE binary correctly.
>>   - 19: Add support for global cookie.
>>   - 20: Support ftrace with PIE (used on Ubuntu config).
>>   - 21: Fix incorrect address marker on dump_pagetables.
>>   - 22: Add option to move the module section just after the kernel.
>>   - 23: Adapt module loading to support PIE with dynamic GOT.
>>   - 24: Make the GOT read-only.
>>   - 25: Add the CONFIG_X86_PIE option (off by default).
>>   - 26: Adapt relocation tool to generate a 64-bit relocation table.
>>   - 27: Add the CONFIG_RANDOMIZE_BASE_LARGE option to increase relocation range
>>         from 1G to 3G (off by default).
>>
>> Performance/Size impact:
>>
>> Size of vmlinux (Default configuration):
>>   File size:
>>   - PIE disabled: +0.000031%
>>   - PIE enabled: -3.210% (fewer relocations)
>>   .text section:
>>   - PIE disabled: +0.000644%
>>   - PIE enabled: +0.837%
>>
>> Size of vmlinux (Ubuntu configuration):
>>   File size:
>>   - PIE disabled: -0.201%
>>   - PIE enabled: -0.082%
>>   .text section:
>>   - PIE disabled: same
>>   - PIE enabled: +1.319%
>>
>> Size of vmlinux (Default configuration + ORC):
>>   File size:
>>   - PIE enabled: -3.167%
>>   .text section:
>>   - PIE enabled: +0.814%
>>
>> Size of vmlinux (Ubuntu configuration + ORC):
>>   File size:
>>   - PIE enabled: -3.167%
>>   .text section:
>>   - PIE enabled: +1.26%
>>
>> The size increase is mainly due to not having access to the 32-bit signed
>> relocation that can be used with mcmodel=kernel. A small part is due to reduced
>> optimization for PIE code. This bug [1] was opened with gcc to provide a better
>> code generation for kernel PIE.
>>
>> Hackbench (50% and 1600% on thread/process for pipe/sockets):
>>   - PIE disabled: no significant change (avg +0.1% on latest test).
>>   - PIE enabled: between -0.50% and +0.86% on average (default and Ubuntu config).
>>
>> slab_test (average of 10 runs):
>>   - PIE disabled: no significant change (-2% on latest run, likely noise).
>>   - PIE enabled: between -1% and +0.8% on latest runs.
>>
>> Kernbench (average of 10 Half and Optimal runs):
>>   Elapsed Time:
>>   - PIE disabled: no significant change (avg -0.239%)
>>   - PIE enabled: average +0.07%
>>   System Time:
>>   - PIE disabled: no significant change (avg -0.277%)
>>   - PIE enabled: average +0.7%
>>
>> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82303
>>
>> diffstat:
>>   Documentation/x86/x86_64/mm.txt              |    3
>>   arch/x86/Kconfig                             |   43 ++++++
>>   arch/x86/Makefile                            |   40 +++++
>>   arch/x86/boot/boot.h                         |    2
>>   arch/x86/boot/compressed/Makefile            |    5
>>   arch/x86/boot/compressed/misc.c              |   10 +
>>   arch/x86/crypto/aes-x86_64-asm_64.S          |   45 ++++--
>>   arch/x86/crypto/aesni-intel_asm.S            |   14 +-
>>   arch/x86/crypto/aesni-intel_avx-x86_64.S     |    6
>>   arch/x86/crypto/camellia-aesni-avx-asm_64.S  |   42 +++---
>>   arch/x86/crypto/camellia-aesni-avx2-asm_64.S |   44 +++---
>>   arch/x86/crypto/camellia-x86_64-asm_64.S     |    8 -
>>   arch/x86/crypto/cast5-avx-x86_64-asm_64.S    |   50 ++++---
>>   arch/x86/crypto/cast6-avx-x86_64-asm_64.S    |   44 +++---
>>   arch/x86/crypto/des3_ede-asm_64.S            |   96 +++++++++-----
>>   arch/x86/crypto/ghash-clmulni-intel_asm.S    |    4
>>   arch/x86/crypto/glue_helper-asm-avx.S        |    4
>>   arch/x86/crypto/glue_helper-asm-avx2.S       |    6
>>   arch/x86/entry/entry_32.S                    |    3
>>   arch/x86/entry/entry_64.S                    |   29 ++--
>>   arch/x86/include/asm/asm.h                   |   13 +
>>   arch/x86/include/asm/bug.h                   |    2
>>   arch/x86/include/asm/ftrace.h                |    6
>>   arch/x86/include/asm/jump_label.h            |    8 -
>>   arch/x86/include/asm/kvm_host.h              |    6
>>   arch/x86/include/asm/module.h                |   11 +
>>   arch/x86/include/asm/page_64_types.h         |    9 +
>>   arch/x86/include/asm/paravirt_types.h        |   12 +
>>   arch/x86/include/asm/percpu.h                |   25 ++-
>>   arch/x86/include/asm/pgtable_64_types.h      |    6
>>   arch/x86/include/asm/pm-trace.h              |    2
>>   arch/x86/include/asm/processor.h             |   12 +
>>   arch/x86/include/asm/sections.h              |    8 +
>>   arch/x86/include/asm/setup.h                 |    2
>>   arch/x86/include/asm/stackprotector.h        |   19 ++
>>   arch/x86/kernel/acpi/wakeup_64.S             |   31 ++--
>>   arch/x86/kernel/asm-offsets.c                |    3
>>   arch/x86/kernel/asm-offsets_32.c             |    3
>>   arch/x86/kernel/asm-offsets_64.c             |    3
>>   arch/x86/kernel/cpu/common.c                 |    7 -
>>   arch/x86/kernel/cpu/microcode/core.c         |    4
>>   arch/x86/kernel/ftrace.c                     |   42 +++++-
>>   arch/x86/kernel/head64.c                     |   32 +++-
>>   arch/x86/kernel/head_32.S                    |    3
>>   arch/x86/kernel/head_64.S                    |   41 +++++-
>>   arch/x86/kernel/kvm.c                        |    6
>>   arch/x86/kernel/module.c                     |  182 ++++++++++++++++++++++++++-
>>   arch/x86/kernel/module.lds                   |    3
>>   arch/x86/kernel/process.c                    |    5
>>   arch/x86/kernel/relocate_kernel_64.S         |    8 -
>>   arch/x86/kernel/setup_percpu.c               |    2
>>   arch/x86/kernel/vmlinux.lds.S                |   13 +
>>   arch/x86/kvm/svm.c                           |    4
>>   arch/x86/lib/cmpxchg16b_emu.S                |    8 -
>>   arch/x86/mm/dump_pagetables.c                |   11 +
>>   arch/x86/power/hibernate_asm_64.S            |    4
>>   arch/x86/tools/relocs.c                      |  170 +++++++++++++++++++++++--
>>   arch/x86/tools/relocs.h                      |    4
>>   arch/x86/tools/relocs_common.c               |   15 +-
>>   arch/x86/xen/xen-asm.S                       |   12 -
>>   arch/x86/xen/xen-head.S                      |    9 -
>>   arch/x86/xen/xen-pvh.S                       |   13 +
>>   drivers/base/firmware_class.c                |    4
>>   include/asm-generic/sections.h               |    6
>>   include/asm-generic/vmlinux.lds.h            |   12 +
>>   include/linux/compiler.h                     |    8 +
>>   init/Kconfig                                 |    9 +
>>   kernel/kallsyms.c                            |   16 +-
>>   kernel/trace/trace.h                         |    4
>>   lib/dynamic_debug.c                          |    4
>>   70 files changed, 1032 insertions(+), 308 deletions(-)
>>



-- 
Thomas

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization
@ 2017-10-12 15:34     ` Thomas Garnier
  0 siblings, 0 replies; 19+ messages in thread
From: Thomas Garnier @ 2017-10-12 15:34 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Kees Cook, Andrey Ryabinin, Matthias Kaehlcke, Andy Lutomirski,
	Kirill A . Shutemov, Borislav Petkov, Rafael J . Wysocki,
	Len Brown, Pavel Machek, Juergen Gross, Chris Wright,
	Alok Kataria, Rusty Russell, Tejun Heo, Christoph Lameter,
	Boris Ostrovsky

On Wed, Oct 11, 2017 at 2:34 PM, Tom Lendacky <thomas.lendacky@amd.com> wrote:
> On 10/11/2017 3:30 PM, Thomas Garnier wrote:
>> Changes:
>>   - patch v1:
>>     - Simplify ftrace implementation.
>>     - Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
>>   - rfc v3:
>>     - Use --emit-relocs instead of -pie to reduce dynamic relocation space on
>>       mapped memory. It also simplifies the relocation process.
>>     - Move the start of the module section next to the kernel. Remove the need for
>>       -mcmodel=large on modules. Extends module space from 1 to 2G maximum.
>>     - Support for XEN PVH as 32-bit relocations can be ignored with
>>       --emit-relocs.
>>     - Support for GOT relocations previously done automatically with -pie.
>>     - Remove need for dynamic PLT in modules.
>>     - Support dynamic GOT for modules.
>>   - rfc v2:
>>     - Add support for a global stack cookie while the compiler defaults to fs
>>       without mcmodel=kernel.
>>     - Change patch 7 to correctly jump out of the identity mapping on kexec load
>>       preserve.
>>
>> These patches make the changes necessary to build the kernel as Position
>> Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
>> the top 2G of the virtual address space. This makes it possible to optionally
>> extend the KASLR randomization range from 1G to 3G.
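
As a rough illustration of what the larger range buys, the sketch below counts
candidate base addresses for a given randomization range. It assumes a 2 MiB
alignment (the usual x86_64 CONFIG_PHYSICAL_ALIGN default, taken here as an
assumption) and ignores the kernel image size and the minimum load offset, so
the numbers are upper bounds rather than what the kernel actually computes.
The function names are hypothetical.

```python
import math

# Assumed alignment of the kernel base address (2 MiB); an assumption for
# illustration, not a value taken from this patch series.
ASSUMED_ALIGN = 2 * 1024 * 1024


def slot_count(range_bytes, align=ASSUMED_ALIGN):
    """Upper bound on the number of aligned base addresses in the range."""
    return range_bytes // align


def entropy_bits(range_bytes, align=ASSUMED_ALIGN):
    """Bits of entropy if every aligned slot were equally likely."""
    return math.log2(slot_count(range_bytes, align))


if __name__ == "__main__":
    for gib in (1, 3):
        size = gib << 30
        print(f"{gib}G range: {slot_count(size)} slots, "
              f"{entropy_bits(size):.2f} bits")
```

Under these assumptions, tripling the range triples the slot count, which adds
about 1.58 bits of entropy.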
>
> Hi Thomas,
>
> I've applied your patches so that I can verify that SME works with PIE.
> Unfortunately, I'm running into build warnings and errors when I enable
> PIE.
>
> With CONFIG_STACK_VALIDATION=y I receive lots of messages like this:
>
>   drivers/scsi/libfc/fc_exch.o: warning: objtool: fc_destroy_exch_mgr()+0x0: call without frame pointer save/setup
>
> Disabling CONFIG_STACK_VALIDATION suppresses those.

I ran into that as well and plan to fix it in the next iteration.

>
> But near the end of the build, I receive errors like this:
>
>   arch/x86/kernel/setup.o: In function `dump_kernel_offset':
>   .../arch/x86/kernel/setup.c:801:(.text+0x32): relocation truncated to fit: R_X86_64_32S against symbol `_text' defined in .text section in .tmp_vmlinux1
>   .
>   . about 10 more of the above type messages
>   .
>   make: *** [vmlinux] Error 1
>   Error building kernel, exiting
>
> Are there any config options that should or should not be enabled when
> building with PIE enabled?  Is there a compiler requirement for PIE (I'm
> using gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5))?

I never ran into these, and I tested with both older and newer compilers.
What was your exact configuration?
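
For context on the message itself: R_X86_64_32S requires that the resolved
value survives truncation to 32 bits followed by sign extension back to 64
bits. That only holds for addresses in the top 2G (or the bottom 2G) of the
address space. A minimal Python sketch of that check follows. The helper names
and sample addresses are illustrative assumptions, and the sketch does not
diagnose this particular build failure.

```python
MASK64 = (1 << 64) - 1


def sign_extend_32(value):
    """Keep the low 32 bits of value and sign-extend them to 64 bits."""
    low = value & 0xFFFFFFFF
    if low & 0x80000000:
        low |= 0xFFFFFFFF00000000
    return low


def fits_r_x86_64_32s(value):
    """True if a 64-bit value can be encoded by a 32-bit signed relocation."""
    return sign_extend_32(value) == (value & MASK64)


if __name__ == "__main__":
    # Sample addresses chosen for illustration only.
    for addr in (0xFFFFFFFF81000000,   # inside the top 2G
                 0xFFFFFFFF80000000,   # exactly 2G below the top
                 0xFFFFFFFF7FFFFFFF):  # one byte below the top 2G
        print(hex(addr), fits_r_x86_64_32s(addr))
```

Any symbol placed below the top 2G fails this check, so a remaining
R_X86_64_32S reference to it cannot be resolved by the linker.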

>
> Thanks,
> Tom
>
>>
>> Thanks a lot to Ard Biesheuvel & Kees Cook for their feedback on compiler
>> changes, PIE support and KASLR in general. Thanks to Roland McGrath for his
>> feedback on using -pie versus --emit-relocs and details on compiler code
>> generation.
>>
>> The patches:
>>   - 1-3, 5-13, 17-18: Change in assembly code to be PIE compliant.
>>   - 4: Add a new _ASM_GET_PTR macro to fetch a symbol address generically.
>>   - 14: Adapt percpu design to work correctly when PIE is enabled.
>>   - 15: Provide an option to default visibility to hidden except for key symbols.
>>         It removes errors between compilation units.
>>   - 16: Adapt relocation tool to handle PIE binary correctly.
>>   - 19: Add support for global cookie.
>>   - 20: Support ftrace with PIE (used on Ubuntu config).
>>   - 21: Fix incorrect address marker on dump_pagetables.
>>   - 22: Add option to move the module section just after the kernel.
>>   - 23: Adapt module loading to support PIE with dynamic GOT.
>>   - 24: Make the GOT read-only.
>>   - 25: Add the CONFIG_X86_PIE option (off by default).
>>   - 26: Adapt relocation tool to generate a 64-bit relocation table.
>>   - 27: Add the CONFIG_RANDOMIZE_BASE_LARGE option to increase relocation range
>>         from 1G to 3G (off by default).
>>
>> Performance/Size impact:
>>
>> Size of vmlinux (Default configuration):
>>   File size:
>>   - PIE disabled: +0.000031%
>>   - PIE enabled: -3.210% (fewer relocations)
>>   .text section:
>>   - PIE disabled: +0.000644%
>>   - PIE enabled: +0.837%
>>
>> Size of vmlinux (Ubuntu configuration):
>>   File size:
>>   - PIE disabled: -0.201%
>>   - PIE enabled: -0.082%
>>   .text section:
>>   - PIE disabled: same
>>   - PIE enabled: +1.319%
>>
>> Size of vmlinux (Default configuration + ORC):
>>   File size:
>>   - PIE enabled: -3.167%
>>   .text section:
>>   - PIE enabled: +0.814%
>>
>> Size of vmlinux (Ubuntu configuration + ORC):
>>   File size:
>>   - PIE enabled: -3.167%
>>   .text section:
>>   - PIE enabled: +1.26%
>>
>> The size increase is mainly due to not having access to the 32-bit signed
>> relocation that can be used with mcmodel=kernel. A small part is due to reduced
>> optimization for PIE code. Bug [1] was opened against gcc to improve code
>> generation for kernel PIE.
>>
>> Hackbench (50% and 1600% on thread/process for pipe/sockets):
>>   - PIE disabled: no significant change (avg +0.1% on latest test).
>>   - PIE enabled: between -0.50% and +0.86% on average (default and Ubuntu config).
>>
>> slab_test (average of 10 runs):
>>   - PIE disabled: no significant change (-2% on latest run, likely noise).
>>   - PIE enabled: between -1% and +0.8% on latest runs.
>>
>> Kernbench (average of 10 Half and Optimal runs):
>>   Elapsed Time:
>>   - PIE disabled: no significant change (avg -0.239%)
>>   - PIE enabled: average +0.07%
>>   System Time:
>>   - PIE disabled: no significant change (avg -0.277%)
>>   - PIE enabled: average +0.7%
>>
>> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82303
>>
>> diffstat:
>>   Documentation/x86/x86_64/mm.txt              |    3
>>   arch/x86/Kconfig                             |   43 ++++++
>>   arch/x86/Makefile                            |   40 +++++
>>   arch/x86/boot/boot.h                         |    2
>>   arch/x86/boot/compressed/Makefile            |    5
>>   arch/x86/boot/compressed/misc.c              |   10 +
>>   arch/x86/crypto/aes-x86_64-asm_64.S          |   45 ++++--
>>   arch/x86/crypto/aesni-intel_asm.S            |   14 +-
>>   arch/x86/crypto/aesni-intel_avx-x86_64.S     |    6
>>   arch/x86/crypto/camellia-aesni-avx-asm_64.S  |   42 +++---
>>   arch/x86/crypto/camellia-aesni-avx2-asm_64.S |   44 +++---
>>   arch/x86/crypto/camellia-x86_64-asm_64.S     |    8 -
>>   arch/x86/crypto/cast5-avx-x86_64-asm_64.S    |   50 ++++---
>>   arch/x86/crypto/cast6-avx-x86_64-asm_64.S    |   44 +++---
>>   arch/x86/crypto/des3_ede-asm_64.S            |   96 +++++++++-----
>>   arch/x86/crypto/ghash-clmulni-intel_asm.S    |    4
>>   arch/x86/crypto/glue_helper-asm-avx.S        |    4
>>   arch/x86/crypto/glue_helper-asm-avx2.S       |    6
>>   arch/x86/entry/entry_32.S                    |    3
>>   arch/x86/entry/entry_64.S                    |   29 ++--
>>   arch/x86/include/asm/asm.h                   |   13 +
>>   arch/x86/include/asm/bug.h                   |    2
>>   arch/x86/include/asm/ftrace.h                |    6
>>   arch/x86/include/asm/jump_label.h            |    8 -
>>   arch/x86/include/asm/kvm_host.h              |    6
>>   arch/x86/include/asm/module.h                |   11 +
>>   arch/x86/include/asm/page_64_types.h         |    9 +
>>   arch/x86/include/asm/paravirt_types.h        |   12 +
>>   arch/x86/include/asm/percpu.h                |   25 ++-
>>   arch/x86/include/asm/pgtable_64_types.h      |    6
>>   arch/x86/include/asm/pm-trace.h              |    2
>>   arch/x86/include/asm/processor.h             |   12 +
>>   arch/x86/include/asm/sections.h              |    8 +
>>   arch/x86/include/asm/setup.h                 |    2
>>   arch/x86/include/asm/stackprotector.h        |   19 ++
>>   arch/x86/kernel/acpi/wakeup_64.S             |   31 ++--
>>   arch/x86/kernel/asm-offsets.c                |    3
>>   arch/x86/kernel/asm-offsets_32.c             |    3
>>   arch/x86/kernel/asm-offsets_64.c             |    3
>>   arch/x86/kernel/cpu/common.c                 |    7 -
>>   arch/x86/kernel/cpu/microcode/core.c         |    4
>>   arch/x86/kernel/ftrace.c                     |   42 +++++-
>>   arch/x86/kernel/head64.c                     |   32 +++-
>>   arch/x86/kernel/head_32.S                    |    3
>>   arch/x86/kernel/head_64.S                    |   41 +++++-
>>   arch/x86/kernel/kvm.c                        |    6
>>   arch/x86/kernel/module.c                     |  182 ++++++++++++++++++++++++++-
>>   arch/x86/kernel/module.lds                   |    3
>>   arch/x86/kernel/process.c                    |    5
>>   arch/x86/kernel/relocate_kernel_64.S         |    8 -
>>   arch/x86/kernel/setup_percpu.c               |    2
>>   arch/x86/kernel/vmlinux.lds.S                |   13 +
>>   arch/x86/kvm/svm.c                           |    4
>>   arch/x86/lib/cmpxchg16b_emu.S                |    8 -
>>   arch/x86/mm/dump_pagetables.c                |   11 +
>>   arch/x86/power/hibernate_asm_64.S            |    4
>>   arch/x86/tools/relocs.c                      |  170 +++++++++++++++++++++++--
>>   arch/x86/tools/relocs.h                      |    4
>>   arch/x86/tools/relocs_common.c               |   15 +-
>>   arch/x86/xen/xen-asm.S                       |   12 -
>>   arch/x86/xen/xen-head.S                      |    9 -
>>   arch/x86/xen/xen-pvh.S                       |   13 +
>>   drivers/base/firmware_class.c                |    4
>>   include/asm-generic/sections.h               |    6
>>   include/asm-generic/vmlinux.lds.h            |   12 +
>>   include/linux/compiler.h                     |    8 +
>>   init/Kconfig                                 |    9 +
>>   kernel/kallsyms.c                            |   16 +-
>>   kernel/trace/trace.h                         |    4
>>   lib/dynamic_debug.c                          |    4
>>   70 files changed, 1032 insertions(+), 308 deletions(-)
>>



-- 
Thomas

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization
  2017-10-11 21:34   ` Tom Lendacky
  (?)
  (?)
@ 2017-10-12 15:34   ` Thomas Garnier
  -1 siblings, 0 replies; 19+ messages in thread
From: Thomas Garnier @ 2017-10-12 15:34 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Nicolas Pitre, Michal Hocko, linux-doc, Daniel Micay,
	Radim Krčmář,
	Peter Zijlstra, Christopher Li, Jan H . Schönherr,
	Alexei Starovoitov, virtualization, David Howells,
	Paul Gortmaker, Waiman Long, Pavel Machek, H . Peter Anvin,
	Kernel Hardening, Christoph Lameter, Thomas Gleixner,
	the arch/x86 maintainers, Herbert Xu, Daniel Borkmann,
	Jonathan Corbet

On Wed, Oct 11, 2017 at 2:34 PM, Tom Lendacky <thomas.lendacky@amd.com> wrote:
> On 10/11/2017 3:30 PM, Thomas Garnier wrote:
>> Changes:
>>   - patch v1:
>>     - Simplify ftrace implementation.
>>     - Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
>>   - rfc v3:
>>     - Use --emit-relocs instead of -pie to reduce dynamic relocation space on
>>       mapped memory. It also simplifies the relocation process.
>>     - Move the start of the module section next to the kernel. Remove the need for
>>       -mcmodel=large on modules. Extends module space from 1 to 2G maximum.
>>     - Support for XEN PVH as 32-bit relocations can be ignored with
>>       --emit-relocs.
>>     - Support for GOT relocations previously done automatically with -pie.
>>     - Remove need for dynamic PLT in modules.
>>     - Support dynamic GOT for modules.
>>   - rfc v2:
>>     - Add support for a global stack cookie while the compiler defaults to fs
>>       without mcmodel=kernel.
>>     - Change patch 7 to correctly jump out of the identity mapping on kexec load
>>       preserve.
>>
>> These patches make the changes necessary to build the kernel as Position
>> Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
>> the top 2G of the virtual address space. This makes it possible to optionally
>> extend the KASLR randomization range from 1G to 3G.
>
> Hi Thomas,
>
> I've applied your patches so that I can verify that SME works with PIE.
> Unfortunately, I'm running into build warnings and errors when I enable
> PIE.
>
> With CONFIG_STACK_VALIDATION=y I receive lots of messages like this:
>
>   drivers/scsi/libfc/fc_exch.o: warning: objtool: fc_destroy_exch_mgr()+0x0: call without frame pointer save/setup
>
> Disabling CONFIG_STACK_VALIDATION suppresses those.

I ran into that as well and plan to fix it in the next iteration.

>
> But near the end of the build, I receive errors like this:
>
>   arch/x86/kernel/setup.o: In function `dump_kernel_offset':
>   .../arch/x86/kernel/setup.c:801:(.text+0x32): relocation truncated to fit: R_X86_64_32S against symbol `_text' defined in .text section in .tmp_vmlinux1
>   .
>   . about 10 more of the above type messages
>   .
>   make: *** [vmlinux] Error 1
>   Error building kernel, exiting
>
> Are there any config options that should or should not be enabled when
> building with PIE enabled?  Is there a compiler requirement for PIE (I'm
> using gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5))?

I never ran into these, and I tested with both older and newer compilers.
What was your exact configuration?

>
> Thanks,
> Tom
>
>>
>> Thanks a lot to Ard Biesheuvel & Kees Cook for their feedback on compiler
>> changes, PIE support and KASLR in general. Thanks to Roland McGrath for his
>> feedback on using -pie versus --emit-relocs and details on compiler code
>> generation.
>>
>> The patches:
>>   - 1-3, 5-13, 17-18: Change in assembly code to be PIE compliant.
>>   - 4: Add a new _ASM_GET_PTR macro to fetch a symbol address generically.
>>   - 14: Adapt percpu design to work correctly when PIE is enabled.
>>   - 15: Provide an option to default visibility to hidden except for key symbols.
>>         It removes errors between compilation units.
>>   - 16: Adapt relocation tool to handle PIE binary correctly.
>>   - 19: Add support for global cookie.
>>   - 20: Support ftrace with PIE (used on Ubuntu config).
>>   - 21: Fix incorrect address marker on dump_pagetables.
>>   - 22: Add option to move the module section just after the kernel.
>>   - 23: Adapt module loading to support PIE with dynamic GOT.
>>   - 24: Make the GOT read-only.
>>   - 25: Add the CONFIG_X86_PIE option (off by default).
>>   - 26: Adapt relocation tool to generate a 64-bit relocation table.
>>   - 27: Add the CONFIG_RANDOMIZE_BASE_LARGE option to increase relocation range
>>         from 1G to 3G (off by default).
>>
>> Performance/Size impact:
>>
>> Size of vmlinux (Default configuration):
>>   File size:
>>   - PIE disabled: +0.000031%
>>   - PIE enabled: -3.210% (fewer relocations)
>>   .text section:
>>   - PIE disabled: +0.000644%
>>   - PIE enabled: +0.837%
>>
>> Size of vmlinux (Ubuntu configuration):
>>   File size:
>>   - PIE disabled: -0.201%
>>   - PIE enabled: -0.082%
>>   .text section:
>>   - PIE disabled: same
>>   - PIE enabled: +1.319%
>>
>> Size of vmlinux (Default configuration + ORC):
>>   File size:
>>   - PIE enabled: -3.167%
>>   .text section:
>>   - PIE enabled: +0.814%
>>
>> Size of vmlinux (Ubuntu configuration + ORC):
>>   File size:
>>   - PIE enabled: -3.167%
>>   .text section:
>>   - PIE enabled: +1.26%
>>
>> The size increase is mainly due to not having access to the 32-bit signed
>> relocation that can be used with mcmodel=kernel. A small part is due to reduced
>> optimization for PIE code. Bug [1] was opened against gcc to improve code
>> generation for kernel PIE.
>>
>> Hackbench (50% and 1600% on thread/process for pipe/sockets):
>>   - PIE disabled: no significant change (avg +0.1% on latest test).
>>   - PIE enabled: between -0.50% and +0.86% on average (default and Ubuntu config).
>>
>> slab_test (average of 10 runs):
>>   - PIE disabled: no significant change (-2% on latest run, likely noise).
>>   - PIE enabled: between -1% and +0.8% on latest runs.
>>
>> Kernbench (average of 10 Half and Optimal runs):
>>   Elapsed Time:
>>   - PIE disabled: no significant change (avg -0.239%)
>>   - PIE enabled: average +0.07%
>>   System Time:
>>   - PIE disabled: no significant change (avg -0.277%)
>>   - PIE enabled: average +0.7%
>>
>> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82303
>>
>> diffstat:
>>   Documentation/x86/x86_64/mm.txt              |    3
>>   arch/x86/Kconfig                             |   43 ++++++
>>   arch/x86/Makefile                            |   40 +++++
>>   arch/x86/boot/boot.h                         |    2
>>   arch/x86/boot/compressed/Makefile            |    5
>>   arch/x86/boot/compressed/misc.c              |   10 +
>>   arch/x86/crypto/aes-x86_64-asm_64.S          |   45 ++++--
>>   arch/x86/crypto/aesni-intel_asm.S            |   14 +-
>>   arch/x86/crypto/aesni-intel_avx-x86_64.S     |    6
>>   arch/x86/crypto/camellia-aesni-avx-asm_64.S  |   42 +++---
>>   arch/x86/crypto/camellia-aesni-avx2-asm_64.S |   44 +++---
>>   arch/x86/crypto/camellia-x86_64-asm_64.S     |    8 -
>>   arch/x86/crypto/cast5-avx-x86_64-asm_64.S    |   50 ++++---
>>   arch/x86/crypto/cast6-avx-x86_64-asm_64.S    |   44 +++---
>>   arch/x86/crypto/des3_ede-asm_64.S            |   96 +++++++++-----
>>   arch/x86/crypto/ghash-clmulni-intel_asm.S    |    4
>>   arch/x86/crypto/glue_helper-asm-avx.S        |    4
>>   arch/x86/crypto/glue_helper-asm-avx2.S       |    6
>>   arch/x86/entry/entry_32.S                    |    3
>>   arch/x86/entry/entry_64.S                    |   29 ++--
>>   arch/x86/include/asm/asm.h                   |   13 +
>>   arch/x86/include/asm/bug.h                   |    2
>>   arch/x86/include/asm/ftrace.h                |    6
>>   arch/x86/include/asm/jump_label.h            |    8 -
>>   arch/x86/include/asm/kvm_host.h              |    6
>>   arch/x86/include/asm/module.h                |   11 +
>>   arch/x86/include/asm/page_64_types.h         |    9 +
>>   arch/x86/include/asm/paravirt_types.h        |   12 +
>>   arch/x86/include/asm/percpu.h                |   25 ++-
>>   arch/x86/include/asm/pgtable_64_types.h      |    6
>>   arch/x86/include/asm/pm-trace.h              |    2
>>   arch/x86/include/asm/processor.h             |   12 +
>>   arch/x86/include/asm/sections.h              |    8 +
>>   arch/x86/include/asm/setup.h                 |    2
>>   arch/x86/include/asm/stackprotector.h        |   19 ++
>>   arch/x86/kernel/acpi/wakeup_64.S             |   31 ++--
>>   arch/x86/kernel/asm-offsets.c                |    3
>>   arch/x86/kernel/asm-offsets_32.c             |    3
>>   arch/x86/kernel/asm-offsets_64.c             |    3
>>   arch/x86/kernel/cpu/common.c                 |    7 -
>>   arch/x86/kernel/cpu/microcode/core.c         |    4
>>   arch/x86/kernel/ftrace.c                     |   42 +++++-
>>   arch/x86/kernel/head64.c                     |   32 +++-
>>   arch/x86/kernel/head_32.S                    |    3
>>   arch/x86/kernel/head_64.S                    |   41 +++++-
>>   arch/x86/kernel/kvm.c                        |    6
>>   arch/x86/kernel/module.c                     |  182 ++++++++++++++++++++++++++-
>>   arch/x86/kernel/module.lds                   |    3
>>   arch/x86/kernel/process.c                    |    5
>>   arch/x86/kernel/relocate_kernel_64.S         |    8 -
>>   arch/x86/kernel/setup_percpu.c               |    2
>>   arch/x86/kernel/vmlinux.lds.S                |   13 +
>>   arch/x86/kvm/svm.c                           |    4
>>   arch/x86/lib/cmpxchg16b_emu.S                |    8 -
>>   arch/x86/mm/dump_pagetables.c                |   11 +
>>   arch/x86/power/hibernate_asm_64.S            |    4
>>   arch/x86/tools/relocs.c                      |  170 +++++++++++++++++++++++--
>>   arch/x86/tools/relocs.h                      |    4
>>   arch/x86/tools/relocs_common.c               |   15 +-
>>   arch/x86/xen/xen-asm.S                       |   12 -
>>   arch/x86/xen/xen-head.S                      |    9 -
>>   arch/x86/xen/xen-pvh.S                       |   13 +
>>   drivers/base/firmware_class.c                |    4
>>   include/asm-generic/sections.h               |    6
>>   include/asm-generic/vmlinux.lds.h            |   12 +
>>   include/linux/compiler.h                     |    8 +
>>   init/Kconfig                                 |    9 +
>>   kernel/kallsyms.c                            |   16 +-
>>   kernel/trace/trace.h                         |    4
>>   lib/dynamic_debug.c                          |    4
>>   70 files changed, 1032 insertions(+), 308 deletions(-)
>>



-- 
Thomas

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization
  2017-10-11 20:30 ` Thomas Garnier
@ 2017-10-11 21:34   ` Tom Lendacky
  -1 siblings, 0 replies; 19+ messages in thread
From: Tom Lendacky @ 2017-10-11 21:34 UTC (permalink / raw)
  To: Thomas Garnier, Herbert Xu, David S . Miller, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
	Arnd Bergmann, Kees Cook, Andrey Ryabinin, Matthias Kaehlcke,
	Andy Lutomirski, Kirill A . Shutemov, Borislav Petkov,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Juergen Gross,
	Chris Wright, Alok Kataria, Rusty Russell, Tejun Heo,
	Christoph Lameter
  Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
	virtualization, linux-sparse, linux-crypto, kernel-hardening,
	xen-devel

On 10/11/2017 3:30 PM, Thomas Garnier wrote:
> Changes:
>   - patch v1:
>     - Simplify ftrace implementation.
>     - Use gcc -mstack-protector-guard-reg=%gs with PIE when possible.
>   - rfc v3:
>     - Use --emit-relocs instead of -pie to reduce dynamic relocation space on
>       mapped memory. It also simplifies the relocation process.
>     - Move the start of the module section next to the kernel. Remove the need for
>       -mcmodel=large on modules. Extend module space from 1G to 2G maximum.
>     - Support for XEN PVH as 32-bit relocations can be ignored with
>       --emit-relocs.
>     - Support for GOT relocations previously done automatically with -pie.
>     - Remove need for dynamic PLT in modules.
>     - Support dynamic GOT for modules.
>   - rfc v2:
>     - Add support for a global stack cookie while the compiler defaults to fs
>       without -mcmodel=kernel.
>     - Change patch 7 to correctly jump out of the identity mapping on kexec load
>       preserve.
> 
> These patches make the changes necessary to build the kernel as Position
> Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
> the top 2G of the virtual address space. This optionally allows extending the
> KASLR randomization range from 1G to 3G.

Hi Thomas,

I've applied your patches so that I can verify that SME works with PIE.
Unfortunately, I'm running into build warnings and errors when I enable
PIE.

With CONFIG_STACK_VALIDATION=y I receive lots of messages like this:

  drivers/scsi/libfc/fc_exch.o: warning: objtool: fc_destroy_exch_mgr()+0x0: call without frame pointer save/setup

Disabling CONFIG_STACK_VALIDATION suppresses those.

But near the end of the build, I receive errors like this:

  arch/x86/kernel/setup.o: In function `dump_kernel_offset':
  .../arch/x86/kernel/setup.c:801:(.text+0x32): relocation truncated to fit: R_X86_64_32S against symbol `_text' defined in .text section in .tmp_vmlinux1
  .
  . about 10 more of the above type messages
  .
  make: *** [vmlinux] Error 1
  Error building kernel, exiting

Are there any config options that should or should not be enabled when
building with PIE enabled?  Is there a compiler requirement for PIE (I'm
using gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5))?

Thanks,
Tom

> 
> Thanks a lot to Ard Biesheuvel & Kees Cook for their feedback on compiler
> changes, PIE support and KASLR in general. Thanks to Roland McGrath for his
> feedback on using -pie versus --emit-relocs and details on compiler code
> generation.
> 
> The patches:
>   - 1-3, 5-13, 17-18: Change in assembly code to be PIE compliant.
>   - 4: Add a new _ASM_GET_PTR macro to fetch a symbol address generically.
>   - 14: Adapt percpu design to work correctly when PIE is enabled.
>   - 15: Provide an option to default visibility to hidden except for key symbols.
>         It removes errors between compilation units.
>   - 16: Adapt relocation tool to handle PIE binary correctly.
>   - 19: Add support for global cookie.
>   - 20: Support ftrace with PIE (used on Ubuntu config).
>   - 21: Fix incorrect address marker on dump_pagetables.
>   - 22: Add option to move the module section just after the kernel.
>   - 23: Adapt module loading to support PIE with dynamic GOT.
>   - 24: Make the GOT read-only.
>   - 25: Add the CONFIG_X86_PIE option (off by default).
>   - 26: Adapt relocation tool to generate a 64-bit relocation table.
>   - 27: Add the CONFIG_RANDOMIZE_BASE_LARGE option to increase relocation range
>         from 1G to 3G (off by default).
> 
> Performance/Size impact:
> 
> Size of vmlinux (Default configuration):
>   File size:
>   - PIE disabled: +0.000031%
>   - PIE enabled: -3.210% (fewer relocations)
>   .text section:
>   - PIE disabled: +0.000644%
>   - PIE enabled: +0.837%
> 
> Size of vmlinux (Ubuntu configuration):
>   File size:
>   - PIE disabled: -0.201%
>   - PIE enabled: -0.082%
>   .text section:
>   - PIE disabled: same
>   - PIE enabled: +1.319%
> 
> Size of vmlinux (Default configuration + ORC):
>   File size:
>   - PIE enabled: -3.167%
>   .text section:
>   - PIE enabled: +0.814%
> 
> Size of vmlinux (Ubuntu configuration + ORC):
>   File size:
>   - PIE enabled: -3.167%
>   .text section:
>   - PIE enabled: +1.26%
> 
> The size increase is mainly due to not having access to the 32-bit signed
> relocation that can be used with -mcmodel=kernel. A small part is due to reduced
> optimization for PIE code. This bug [1] was opened with gcc to provide better
> code generation for kernel PIE.
> 
> Hackbench (50% and 1600% on thread/process for pipe/sockets):
>   - PIE disabled: no significant change (avg +0.1% on latest test).
>   - PIE enabled: between -0.50% and +0.86% on average (default and Ubuntu config).
> 
> slab_test (average of 10 runs):
>   - PIE disabled: no significant change (-2% on latest run, likely noise).
>   - PIE enabled: between -1% and +0.8% on latest runs.
> 
> Kernbench (average of 10 Half and Optimal runs):
>   Elapsed Time:
>   - PIE disabled: no significant change (avg -0.239%)
>   - PIE enabled: average +0.07%
>   System Time:
>   - PIE disabled: no significant change (avg -0.277%)
>   - PIE enabled: average +0.7%
> 
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82303
> 
> diffstat:
>   Documentation/x86/x86_64/mm.txt              |    3
>   arch/x86/Kconfig                             |   43 ++++++
>   arch/x86/Makefile                            |   40 +++++
>   arch/x86/boot/boot.h                         |    2
>   arch/x86/boot/compressed/Makefile            |    5
>   arch/x86/boot/compressed/misc.c              |   10 +
>   arch/x86/crypto/aes-x86_64-asm_64.S          |   45 ++++--
>   arch/x86/crypto/aesni-intel_asm.S            |   14 +-
>   arch/x86/crypto/aesni-intel_avx-x86_64.S     |    6
>   arch/x86/crypto/camellia-aesni-avx-asm_64.S  |   42 +++---
>   arch/x86/crypto/camellia-aesni-avx2-asm_64.S |   44 +++---
>   arch/x86/crypto/camellia-x86_64-asm_64.S     |    8 -
>   arch/x86/crypto/cast5-avx-x86_64-asm_64.S    |   50 ++++---
>   arch/x86/crypto/cast6-avx-x86_64-asm_64.S    |   44 +++---
>   arch/x86/crypto/des3_ede-asm_64.S            |   96 +++++++++-----
>   arch/x86/crypto/ghash-clmulni-intel_asm.S    |    4
>   arch/x86/crypto/glue_helper-asm-avx.S        |    4
>   arch/x86/crypto/glue_helper-asm-avx2.S       |    6
>   arch/x86/entry/entry_32.S                    |    3
>   arch/x86/entry/entry_64.S                    |   29 ++--
>   arch/x86/include/asm/asm.h                   |   13 +
>   arch/x86/include/asm/bug.h                   |    2
>   arch/x86/include/asm/ftrace.h                |    6
>   arch/x86/include/asm/jump_label.h            |    8 -
>   arch/x86/include/asm/kvm_host.h              |    6
>   arch/x86/include/asm/module.h                |   11 +
>   arch/x86/include/asm/page_64_types.h         |    9 +
>   arch/x86/include/asm/paravirt_types.h        |   12 +
>   arch/x86/include/asm/percpu.h                |   25 ++-
>   arch/x86/include/asm/pgtable_64_types.h      |    6
>   arch/x86/include/asm/pm-trace.h              |    2
>   arch/x86/include/asm/processor.h             |   12 +
>   arch/x86/include/asm/sections.h              |    8 +
>   arch/x86/include/asm/setup.h                 |    2
>   arch/x86/include/asm/stackprotector.h        |   19 ++
>   arch/x86/kernel/acpi/wakeup_64.S             |   31 ++--
>   arch/x86/kernel/asm-offsets.c                |    3
>   arch/x86/kernel/asm-offsets_32.c             |    3
>   arch/x86/kernel/asm-offsets_64.c             |    3
>   arch/x86/kernel/cpu/common.c                 |    7 -
>   arch/x86/kernel/cpu/microcode/core.c         |    4
>   arch/x86/kernel/ftrace.c                     |   42 +++++-
>   arch/x86/kernel/head64.c                     |   32 +++-
>   arch/x86/kernel/head_32.S                    |    3
>   arch/x86/kernel/head_64.S                    |   41 +++++-
>   arch/x86/kernel/kvm.c                        |    6
>   arch/x86/kernel/module.c                     |  182 ++++++++++++++++++++++++++-
>   arch/x86/kernel/module.lds                   |    3
>   arch/x86/kernel/process.c                    |    5
>   arch/x86/kernel/relocate_kernel_64.S         |    8 -
>   arch/x86/kernel/setup_percpu.c               |    2
>   arch/x86/kernel/vmlinux.lds.S                |   13 +
>   arch/x86/kvm/svm.c                           |    4
>   arch/x86/lib/cmpxchg16b_emu.S                |    8 -
>   arch/x86/mm/dump_pagetables.c                |   11 +
>   arch/x86/power/hibernate_asm_64.S            |    4
>   arch/x86/tools/relocs.c                      |  170 +++++++++++++++++++++++--
>   arch/x86/tools/relocs.h                      |    4
>   arch/x86/tools/relocs_common.c               |   15 +-
>   arch/x86/xen/xen-asm.S                       |   12 -
>   arch/x86/xen/xen-head.S                      |    9 -
>   arch/x86/xen/xen-pvh.S                       |   13 +
>   drivers/base/firmware_class.c                |    4
>   include/asm-generic/sections.h               |    6
>   include/asm-generic/vmlinux.lds.h            |   12 +
>   include/linux/compiler.h                     |    8 +
>   init/Kconfig                                 |    9 +
>   kernel/kallsyms.c                            |   16 +-
>   kernel/trace/trace.h                         |    4
>   lib/dynamic_debug.c                          |    4
>   70 files changed, 1032 insertions(+), 308 deletions(-)
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization
  2017-10-11 20:30 ` Thomas Garnier
  (?)
  (?)
@ 2017-10-11 21:34 ` Tom Lendacky
  -1 siblings, 0 replies; 19+ messages in thread
From: Tom Lendacky @ 2017-10-11 21:34 UTC (permalink / raw)
  To: Thomas Garnier, Herbert Xu, David S . Miller, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
	Arnd Bergmann, Kees Cook, Andrey Ryabinin, Matthias Kaehlcke,
	Andy Lutomirski, Kirill A . Shutemov, Borislav Petkov,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Juergen Gross,
	Chris Wright, Alok Kataria, Rusty Russell, Tejun Heo,
	Christoph Lameter
  Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
	virtualization, linux-sparse, linux-crypto, kernel-hardening,
	xen-devel

On 10/11/2017 3:30 PM, Thomas Garnier wrote:
> Changes:
>   - patch v1:
>     - Simplify ftrace implementation.
>     - Use gcc -mstack-protector-guard-reg=%gs with PIE when possible.
>   - rfc v3:
>     - Use --emit-relocs instead of -pie to reduce dynamic relocation space on
>       mapped memory. It also simplifies the relocation process.
>     - Move the start of the module section next to the kernel. Remove the need for
>       -mcmodel=large on modules. Extend module space from 1G to 2G maximum.
>     - Support for XEN PVH as 32-bit relocations can be ignored with
>       --emit-relocs.
>     - Support for GOT relocations previously done automatically with -pie.
>     - Remove need for dynamic PLT in modules.
>     - Support dynamic GOT for modules.
>   - rfc v2:
>     - Add support for a global stack cookie while the compiler defaults to fs
>       without -mcmodel=kernel.
>     - Change patch 7 to correctly jump out of the identity mapping on kexec load
>       preserve.
> 
> These patches make the changes necessary to build the kernel as Position
> Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
> the top 2G of the virtual address space. This optionally allows extending the
> KASLR randomization range from 1G to 3G.

Hi Thomas,

I've applied your patches so that I can verify that SME works with PIE.
Unfortunately, I'm running into build warnings and errors when I enable
PIE.

With CONFIG_STACK_VALIDATION=y I receive lots of messages like this:

  drivers/scsi/libfc/fc_exch.o: warning: objtool: fc_destroy_exch_mgr()+0x0: call without frame pointer save/setup

Disabling CONFIG_STACK_VALIDATION suppresses those.

But near the end of the build, I receive errors like this:

  arch/x86/kernel/setup.o: In function `dump_kernel_offset':
  .../arch/x86/kernel/setup.c:801:(.text+0x32): relocation truncated to fit: R_X86_64_32S against symbol `_text' defined in .text section in .tmp_vmlinux1
  .
  . about 10 more of the above type messages
  .
  make: *** [vmlinux] Error 1
  Error building kernel, exiting

Are there any config options that should or should not be enabled when
building with PIE enabled?  Is there a compiler requirement for PIE (I'm
using gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5))?

Thanks,
Tom

> 
> Thanks a lot to Ard Biesheuvel & Kees Cook for their feedback on compiler
> changes, PIE support and KASLR in general. Thanks to Roland McGrath for his
> feedback on using -pie versus --emit-relocs and details on compiler code
> generation.
> 
> The patches:
>   - 1-3, 5-13, 17-18: Change in assembly code to be PIE compliant.
>   - 4: Add a new _ASM_GET_PTR macro to fetch a symbol address generically.
>   - 14: Adapt percpu design to work correctly when PIE is enabled.
>   - 15: Provide an option to default visibility to hidden except for key symbols.
>         It removes errors between compilation units.
>   - 16: Adapt relocation tool to handle PIE binary correctly.
>   - 19: Add support for global cookie.
>   - 20: Support ftrace with PIE (used on Ubuntu config).
>   - 21: Fix incorrect address marker on dump_pagetables.
>   - 22: Add option to move the module section just after the kernel.
>   - 23: Adapt module loading to support PIE with dynamic GOT.
>   - 24: Make the GOT read-only.
>   - 25: Add the CONFIG_X86_PIE option (off by default).
>   - 26: Adapt relocation tool to generate a 64-bit relocation table.
>   - 27: Add the CONFIG_RANDOMIZE_BASE_LARGE option to increase relocation range
>         from 1G to 3G (off by default).
> 
> Performance/Size impact:
> 
> Size of vmlinux (Default configuration):
>   File size:
>   - PIE disabled: +0.000031%
>   - PIE enabled: -3.210% (fewer relocations)
>   .text section:
>   - PIE disabled: +0.000644%
>   - PIE enabled: +0.837%
> 
> Size of vmlinux (Ubuntu configuration):
>   File size:
>   - PIE disabled: -0.201%
>   - PIE enabled: -0.082%
>   .text section:
>   - PIE disabled: same
>   - PIE enabled: +1.319%
> 
> Size of vmlinux (Default configuration + ORC):
>   File size:
>   - PIE enabled: -3.167%
>   .text section:
>   - PIE enabled: +0.814%
> 
> Size of vmlinux (Ubuntu configuration + ORC):
>   File size:
>   - PIE enabled: -3.167%
>   .text section:
>   - PIE enabled: +1.26%
> 
> The size increase is mainly due to not having access to the 32-bit signed
> relocation that can be used with -mcmodel=kernel. A small part is due to reduced
> optimization for PIE code. This bug [1] was opened with gcc to provide better
> code generation for kernel PIE.
> 
> Hackbench (50% and 1600% on thread/process for pipe/sockets):
>   - PIE disabled: no significant change (avg +0.1% on latest test).
>   - PIE enabled: between -0.50% and +0.86% on average (default and Ubuntu config).
> 
> slab_test (average of 10 runs):
>   - PIE disabled: no significant change (-2% on latest run, likely noise).
>   - PIE enabled: between -1% and +0.8% on latest runs.
> 
> Kernbench (average of 10 Half and Optimal runs):
>   Elapsed Time:
>   - PIE disabled: no significant change (avg -0.239%)
>   - PIE enabled: average +0.07%
>   System Time:
>   - PIE disabled: no significant change (avg -0.277%)
>   - PIE enabled: average +0.7%
> 
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82303
> 
> diffstat:
>   Documentation/x86/x86_64/mm.txt              |    3
>   arch/x86/Kconfig                             |   43 ++++++
>   arch/x86/Makefile                            |   40 +++++
>   arch/x86/boot/boot.h                         |    2
>   arch/x86/boot/compressed/Makefile            |    5
>   arch/x86/boot/compressed/misc.c              |   10 +
>   arch/x86/crypto/aes-x86_64-asm_64.S          |   45 ++++--
>   arch/x86/crypto/aesni-intel_asm.S            |   14 +-
>   arch/x86/crypto/aesni-intel_avx-x86_64.S     |    6
>   arch/x86/crypto/camellia-aesni-avx-asm_64.S  |   42 +++---
>   arch/x86/crypto/camellia-aesni-avx2-asm_64.S |   44 +++---
>   arch/x86/crypto/camellia-x86_64-asm_64.S     |    8 -
>   arch/x86/crypto/cast5-avx-x86_64-asm_64.S    |   50 ++++---
>   arch/x86/crypto/cast6-avx-x86_64-asm_64.S    |   44 +++---
>   arch/x86/crypto/des3_ede-asm_64.S            |   96 +++++++++-----
>   arch/x86/crypto/ghash-clmulni-intel_asm.S    |    4
>   arch/x86/crypto/glue_helper-asm-avx.S        |    4
>   arch/x86/crypto/glue_helper-asm-avx2.S       |    6
>   arch/x86/entry/entry_32.S                    |    3
>   arch/x86/entry/entry_64.S                    |   29 ++--
>   arch/x86/include/asm/asm.h                   |   13 +
>   arch/x86/include/asm/bug.h                   |    2
>   arch/x86/include/asm/ftrace.h                |    6
>   arch/x86/include/asm/jump_label.h            |    8 -
>   arch/x86/include/asm/kvm_host.h              |    6
>   arch/x86/include/asm/module.h                |   11 +
>   arch/x86/include/asm/page_64_types.h         |    9 +
>   arch/x86/include/asm/paravirt_types.h        |   12 +
>   arch/x86/include/asm/percpu.h                |   25 ++-
>   arch/x86/include/asm/pgtable_64_types.h      |    6
>   arch/x86/include/asm/pm-trace.h              |    2
>   arch/x86/include/asm/processor.h             |   12 +
>   arch/x86/include/asm/sections.h              |    8 +
>   arch/x86/include/asm/setup.h                 |    2
>   arch/x86/include/asm/stackprotector.h        |   19 ++
>   arch/x86/kernel/acpi/wakeup_64.S             |   31 ++--
>   arch/x86/kernel/asm-offsets.c                |    3
>   arch/x86/kernel/asm-offsets_32.c             |    3
>   arch/x86/kernel/asm-offsets_64.c             |    3
>   arch/x86/kernel/cpu/common.c                 |    7 -
>   arch/x86/kernel/cpu/microcode/core.c         |    4
>   arch/x86/kernel/ftrace.c                     |   42 +++++-
>   arch/x86/kernel/head64.c                     |   32 +++-
>   arch/x86/kernel/head_32.S                    |    3
>   arch/x86/kernel/head_64.S                    |   41 +++++-
>   arch/x86/kernel/kvm.c                        |    6
>   arch/x86/kernel/module.c                     |  182 ++++++++++++++++++++++++++-
>   arch/x86/kernel/module.lds                   |    3
>   arch/x86/kernel/process.c                    |    5
>   arch/x86/kernel/relocate_kernel_64.S         |    8 -
>   arch/x86/kernel/setup_percpu.c               |    2
>   arch/x86/kernel/vmlinux.lds.S                |   13 +
>   arch/x86/kvm/svm.c                           |    4
>   arch/x86/lib/cmpxchg16b_emu.S                |    8 -
>   arch/x86/mm/dump_pagetables.c                |   11 +
>   arch/x86/power/hibernate_asm_64.S            |    4
>   arch/x86/tools/relocs.c                      |  170 +++++++++++++++++++++++--
>   arch/x86/tools/relocs.h                      |    4
>   arch/x86/tools/relocs_common.c               |   15 +-
>   arch/x86/xen/xen-asm.S                       |   12 -
>   arch/x86/xen/xen-head.S                      |    9 -
>   arch/x86/xen/xen-pvh.S                       |   13 +
>   drivers/base/firmware_class.c                |    4
>   include/asm-generic/sections.h               |    6
>   include/asm-generic/vmlinux.lds.h            |   12 +
>   include/linux/compiler.h                     |    8 +
>   init/Kconfig                                 |    9 +
>   kernel/kallsyms.c                            |   16 +-
>   kernel/trace/trace.h                         |    4
>   lib/dynamic_debug.c                          |    4
>   70 files changed, 1032 insertions(+), 308 deletions(-)
> 


* [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization
@ 2017-10-11 20:30 ` Thomas Garnier
  0 siblings, 0 replies; 19+ messages in thread
From: Thomas Garnier @ 2017-10-11 20:30 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Thomas Garnier, Kees Cook, Andrey Ryabinin, Matthias Kaehlcke,
	Tom Lendacky, Andy Lutomirski, Kirill A . Shutemov,
	Borislav Petkov, Rafael J . Wysocki, Len Brown, Pavel Machek,
	Juergen Gross, Chris Wright, Alok Kataria, Rusty Russell,
	Tejun Heo
  Cc: x86, linux-crypto, linux-kernel, linux-pm, virtualization,
	xen-devel, linux-arch, linux-sparse, kvm, linux-doc,
	kernel-hardening

Changes:
 - patch v1:
   - Simplify ftrace implementation.
   - Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
 - rfc v3:
   - Use --emit-relocs instead of -pie to reduce dynamic relocation space on
     mapped memory. It also simplifies the relocation process.
   - Move the start of the module section next to the kernel. This removes the
     need for -mcmodel=large on modules and extends module space from 1G to 2G
     maximum.
   - Support for XEN PVH as 32-bit relocations can be ignored with
     --emit-relocs.
   - Support for GOT relocations previously done automatically with -pie.
   - Remove need for dynamic PLT in modules.
   - Support dynamic GOT for modules.
 - rfc v2:
   - Add support for a global stack cookie, as the compiler defaults to fs
     without mcmodel=kernel.
   - Change patch 7 to correctly jump out of the identity mapping on kexec load
     preserve.

These patches make the changes necessary to build the kernel as Position
Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
the top 2G of the virtual address space. This makes it possible to optionally
extend the KASLR randomization range from 1G to 3G.
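
To put the 1G-to-3G extension in perspective, here is a rough Python sketch of
how many distinct load addresses each range could offer. The 2 MiB alignment
and the choice to ignore the kernel image size are assumptions made for
illustration, not values taken from this series; kaslr_slots and entropy_bits
are hypothetical helpers.

```python
import math

GIB = 1 << 30
MIB = 1 << 20

def kaslr_slots(range_bytes, align_bytes=2 * MIB):
    """Number of aligned base addresses in a randomization range.

    Assumption: every aligned address in the range is usable, i.e. the
    kernel image size is ignored. Real slot counts are smaller.
    """
    return range_bytes // align_bytes

def entropy_bits(slots):
    """Bits of entropy if each slot is equally likely."""
    return math.log2(slots)

for label, size in (("1G", 1 * GIB), ("3G", 3 * GIB)):
    slots = kaslr_slots(size)
    print(f"{label}: {slots} slots, {entropy_bits(slots):.2f} bits")
```

Under these assumptions the range grows from 512 to 1536 candidate bases,
which is about 1.6 additional bits of entropy.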

Thanks a lot to Ard Biesheuvel & Kees Cook for their feedback on compiler
changes, PIE support and KASLR in general. Thanks to Roland McGrath for his
feedback on using -pie versus --emit-relocs and details on compiler code
generation.

The patches:
 - 1-3, 5-13, 17-18: Change assembly code to be PIE compliant.
 - 4: Add a new _ASM_GET_PTR macro to fetch a symbol address generically.
 - 14: Adapt percpu design to work correctly when PIE is enabled.
 - 15: Provide an option to default visibility to hidden except for key symbols.
       It removes errors between compilation units.
 - 16: Adapt relocation tool to handle PIE binary correctly.
 - 19: Add support for global cookie.
 - 20: Support ftrace with PIE (used on Ubuntu config).
 - 21: Fix incorrect address marker on dump_pagetables.
 - 22: Add option to move the module section just after the kernel.
 - 23: Adapt module loading to support PIE with dynamic GOT.
 - 24: Make the GOT read-only.
 - 25: Add the CONFIG_X86_PIE option (off by default).
 - 26: Adapt relocation tool to generate a 64-bit relocation table.
 - 27: Add the CONFIG_RANDOMIZE_BASE_LARGE option to increase relocation range
       from 1G to 3G (off by default).

Performance/Size impact:

Size of vmlinux (Default configuration):
 File size:
 - PIE disabled: +0.000031%
 - PIE enabled: -3.210% (fewer relocations)
 .text section:
 - PIE disabled: +0.000644%
 - PIE enabled: +0.837%

Size of vmlinux (Ubuntu configuration):
 File size:
 - PIE disabled: -0.201%
 - PIE enabled: -0.082%
 .text section:
 - PIE disabled: same
 - PIE enabled: +1.319%

Size of vmlinux (Default configuration + ORC):
 File size:
 - PIE enabled: -3.167%
 .text section:
 - PIE enabled: +0.814%

Size of vmlinux (Ubuntu configuration + ORC):
 File size:
 - PIE enabled: -3.167%
 .text section:
 - PIE enabled: +1.26%

The size increase is mainly due to not having access to the 32-bit signed
relocation that can be used with mcmodel=kernel. A small part is due to reduced
optimization for PIE code. A bug [1] was opened against gcc to request better
code generation for kernel PIE.
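
The 32-bit signed relocation constraint can be illustrated with a small Python
sketch. It checks whether a 64-bit address survives truncation to 32 bits
followed by sign extension, which is the property the kernel code model relies
on for addresses in the top 2G. The helper name and the sample addresses are
chosen for illustration only.

```python
MASK64 = (1 << 64) - 1

def fits_signed_32(addr):
    """True if a 64-bit address equals the sign extension of its low 32 bits."""
    low = addr & 0xFFFFFFFF
    # Sign-extend the low 32 bits to 64 bits.
    if low & 0x80000000:
        extended = low | 0xFFFFFFFF00000000
    else:
        extended = low
    return extended == (addr & MASK64)

# Illustrative addresses: one inside the top 2G, one just below it.
top_2g = 0xFFFFFFFF80000000
below_top_2g = 0xFFFFFFFF7FE00000

print(fits_signed_32(top_2g))        # True
print(fits_signed_32(below_top_2g))  # False
```

An address outside the top 2G cannot be encoded this way, so code that must
run below that boundary needs RIP-relative or full 64-bit references instead,
which is consistent with the .text growth reported above.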

Hackbench (50% and 1600% on thread/process for pipe/sockets):
 - PIE disabled: no significant change (avg +0.1% on latest test).
 - PIE enabled: between -0.50% and +0.86% on average (default and Ubuntu config).

slab_test (average of 10 runs):
 - PIE disabled: no significant change (-2% on latest run, likely noise).
 - PIE enabled: between -1% and +0.8% on latest runs.

Kernbench (average of 10 Half and Optimal runs):
 Elapsed Time:
 - PIE disabled: no significant change (avg -0.239%)
 - PIE enabled: average +0.07%
 System Time:
 - PIE disabled: no significant change (avg -0.277%)
 - PIE enabled: average +0.7%

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82303

diffstat:
 Documentation/x86/x86_64/mm.txt              |    3 
 arch/x86/Kconfig                             |   43 ++++++
 arch/x86/Makefile                            |   40 +++++
 arch/x86/boot/boot.h                         |    2 
 arch/x86/boot/compressed/Makefile            |    5 
 arch/x86/boot/compressed/misc.c              |   10 +
 arch/x86/crypto/aes-x86_64-asm_64.S          |   45 ++++--
 arch/x86/crypto/aesni-intel_asm.S            |   14 +-
 arch/x86/crypto/aesni-intel_avx-x86_64.S     |    6 
 arch/x86/crypto/camellia-aesni-avx-asm_64.S  |   42 +++---
 arch/x86/crypto/camellia-aesni-avx2-asm_64.S |   44 +++---
 arch/x86/crypto/camellia-x86_64-asm_64.S     |    8 -
 arch/x86/crypto/cast5-avx-x86_64-asm_64.S    |   50 ++++---
 arch/x86/crypto/cast6-avx-x86_64-asm_64.S    |   44 +++---
 arch/x86/crypto/des3_ede-asm_64.S            |   96 +++++++++-----
 arch/x86/crypto/ghash-clmulni-intel_asm.S    |    4 
 arch/x86/crypto/glue_helper-asm-avx.S        |    4 
 arch/x86/crypto/glue_helper-asm-avx2.S       |    6 
 arch/x86/entry/entry_32.S                    |    3 
 arch/x86/entry/entry_64.S                    |   29 ++--
 arch/x86/include/asm/asm.h                   |   13 +
 arch/x86/include/asm/bug.h                   |    2 
 arch/x86/include/asm/ftrace.h                |    6 
 arch/x86/include/asm/jump_label.h            |    8 -
 arch/x86/include/asm/kvm_host.h              |    6 
 arch/x86/include/asm/module.h                |   11 +
 arch/x86/include/asm/page_64_types.h         |    9 +
 arch/x86/include/asm/paravirt_types.h        |   12 +
 arch/x86/include/asm/percpu.h                |   25 ++-
 arch/x86/include/asm/pgtable_64_types.h      |    6 
 arch/x86/include/asm/pm-trace.h              |    2 
 arch/x86/include/asm/processor.h             |   12 +
 arch/x86/include/asm/sections.h              |    8 +
 arch/x86/include/asm/setup.h                 |    2 
 arch/x86/include/asm/stackprotector.h        |   19 ++
 arch/x86/kernel/acpi/wakeup_64.S             |   31 ++--
 arch/x86/kernel/asm-offsets.c                |    3 
 arch/x86/kernel/asm-offsets_32.c             |    3 
 arch/x86/kernel/asm-offsets_64.c             |    3 
 arch/x86/kernel/cpu/common.c                 |    7 -
 arch/x86/kernel/cpu/microcode/core.c         |    4 
 arch/x86/kernel/ftrace.c                     |   42 +++++-
 arch/x86/kernel/head64.c                     |   32 +++-
 arch/x86/kernel/head_32.S                    |    3 
 arch/x86/kernel/head_64.S                    |   41 +++++-
 arch/x86/kernel/kvm.c                        |    6 
 arch/x86/kernel/module.c                     |  182 ++++++++++++++++++++++++++-
 arch/x86/kernel/module.lds                   |    3 
 arch/x86/kernel/process.c                    |    5 
 arch/x86/kernel/relocate_kernel_64.S         |    8 -
 arch/x86/kernel/setup_percpu.c               |    2 
 arch/x86/kernel/vmlinux.lds.S                |   13 +
 arch/x86/kvm/svm.c                           |    4 
 arch/x86/lib/cmpxchg16b_emu.S                |    8 -
 arch/x86/mm/dump_pagetables.c                |   11 +
 arch/x86/power/hibernate_asm_64.S            |    4 
 arch/x86/tools/relocs.c                      |  170 +++++++++++++++++++++++--
 arch/x86/tools/relocs.h                      |    4 
 arch/x86/tools/relocs_common.c               |   15 +-
 arch/x86/xen/xen-asm.S                       |   12 -
 arch/x86/xen/xen-head.S                      |    9 -
 arch/x86/xen/xen-pvh.S                       |   13 +
 drivers/base/firmware_class.c                |    4 
 include/asm-generic/sections.h               |    6 
 include/asm-generic/vmlinux.lds.h            |   12 +
 include/linux/compiler.h                     |    8 +
 init/Kconfig                                 |    9 +
 kernel/kallsyms.c                            |   16 +-
 kernel/trace/trace.h                         |    4 
 lib/dynamic_debug.c                          |    4 
 70 files changed, 1032 insertions(+), 308 deletions(-)


end of thread, other threads:[~2017-10-18 23:17 UTC | newest]

Thread overview: 19+ messages
2017-10-11 20:30 [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization Thomas Garnier
  -- strict thread matches above, loose matches on Subject: below --
2017-10-11 20:30 Thomas Garnier
2017-10-11 20:30 ` Thomas Garnier
2017-10-11 21:34 ` Tom Lendacky
2017-10-11 21:34   ` Tom Lendacky
2017-10-12 15:34   ` Thomas Garnier via Virtualization
2017-10-12 15:34   ` Thomas Garnier
2017-10-12 15:34   ` Thomas Garnier
2017-10-12 15:34     ` Thomas Garnier
2017-10-12 15:51     ` Markus Trippelsdorf
2017-10-12 15:51     ` Markus Trippelsdorf
2017-10-12 16:28     ` Tom Lendacky
2017-10-12 16:28     ` Tom Lendacky
2017-10-18 23:17       ` Thomas Garnier via Virtualization
2017-10-18 23:17       ` Thomas Garnier
2017-10-18 23:17       ` Thomas Garnier
2017-10-18 23:17         ` Thomas Garnier
2017-10-11 21:34 ` Tom Lendacky
2017-10-11 20:30 Thomas Garnier via Virtualization
