* x86: PIE support and option to extend KASLR randomization
@ 2017-07-18 22:33 Thomas Garnier
  2017-07-18 22:33 ` [RFC 01/22] x86/crypto: Adapt assembly for PIE support Thomas Garnier
                   ` (22 more replies)
  0 siblings, 23 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: x86, linux-crypto, linux-kernel, xen-devel, kvm, linux-pm,
	linux-arch, linux-sparse, kernel-hardening

These patches make the changes necessary to build the kernel as a Position
Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
the top 2G of the virtual address space, which makes it possible to optionally
extend the KASLR randomization range from 1G to 3G.
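
Most of the assembly changes in the series boil down to variations of the
pattern sketched below (the symbol and destination register are placeholders,
not code from the series): an absolute reference, which assumes the kernel is
linked within the top 2G, is replaced by a RIP-relative one that works at any
load address.

	/* Absolute: symbol must be reachable as a sign-extended 32-bit value */
	movq	$example_symbol, %rax

	/* PIE compatible: RIP-relative, valid wherever the kernel is placed */
	leaq	example_symbol(%rip), %rax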

Thanks a lot to Ard Biesheuvel & Kees Cook for their feedback on compiler
changes, PIE support and KASLR in general.

The patches:
 - 1-3, 5-15: Changes to assembly code to make it PIE compliant.
 - 4: Add a new _ASM_GET_PTR macro to fetch a symbol address generically.
 - 16: Adapt the percpu design to work correctly when PIE is enabled.
 - 17: Provide an option to make symbol visibility hidden by default except
       for key symbols. It removes errors between compilation units.
 - 18: Adapt the relocation tool to handle a PIE binary correctly.
 - 19: Add the CONFIG_X86_PIE option (off by default).
 - 20: Adapt relocation tool to generate a 64-bit relocation table.
 - 21: Add options to build modules as mcmodel=large and dynamically create a
       PLT for relative references out of range (adapted from arm64).
 - 22: Add the CONFIG_RANDOMIZE_BASE_LARGE option to increase relocation range
       from 1G to 3G (off by default).

Performance/Size impact:

Hackbench (50% and 1600% loads):
 - PIE disabled: no significant change (-0.50% / +0.50%)
 - PIE enabled: 7% to 8% overhead on half load, 10% on heavy load.

These results are in line with prior research on the impact of user-mode PIE
on CPU-intensive benchmarks (around 10% on x86_64).

slab_test (average of 10 runs):
 - PIE disabled: no significant change (-1% / +1%)
 - PIE enabled: 3% to 4%

Kernbench (average of 10 Half and Optimal runs):
 Elapsed Time:
 - PIE disabled: no significant change (-0.22% / +0.06%)
 - PIE enabled: around 0.50%
 System Time:
 - PIE disabled: no significant change (-0.99% / -1.28%)
 - PIE enabled: 5% to 6%

Size of vmlinux (Ubuntu configuration):
 File size:
 - PIE disabled: 472928672 bytes (-0.000169% from baseline)
 - PIE enabled: 216878461 bytes (-54.14% from baseline)
 .text sections:
 - PIE disabled: 9373572 bytes (+0.04% from baseline)
 - PIE enabled: 9499138 bytes (+1.38% from baseline)

The big decrease in vmlinux file size is due to the lower number of
relocations appended to the file.

diffstat:
 arch/x86/Kconfig                             |   37 +++++
 arch/x86/Makefile                            |   17 ++
 arch/x86/boot/boot.h                         |    2 
 arch/x86/boot/compressed/Makefile            |    5 
 arch/x86/boot/compressed/misc.c              |   10 +
 arch/x86/crypto/aes-x86_64-asm_64.S          |   45 +++---
 arch/x86/crypto/aesni-intel_asm.S            |   14 +
 arch/x86/crypto/aesni-intel_avx-x86_64.S     |    6 
 arch/x86/crypto/camellia-aesni-avx-asm_64.S  |   42 ++---
 arch/x86/crypto/camellia-aesni-avx2-asm_64.S |   44 +++---
 arch/x86/crypto/camellia-x86_64-asm_64.S     |    8 -
 arch/x86/crypto/cast5-avx-x86_64-asm_64.S    |   50 +++---
 arch/x86/crypto/cast6-avx-x86_64-asm_64.S    |   44 +++---
 arch/x86/crypto/des3_ede-asm_64.S            |   96 ++++++++-----
 arch/x86/crypto/ghash-clmulni-intel_asm.S    |    4 
 arch/x86/crypto/glue_helper-asm-avx.S        |    4 
 arch/x86/crypto/glue_helper-asm-avx2.S       |    6 
 arch/x86/entry/entry_64.S                    |   26 ++-
 arch/x86/include/asm/asm.h                   |   13 +
 arch/x86/include/asm/bug.h                   |    2 
 arch/x86/include/asm/jump_label.h            |    8 -
 arch/x86/include/asm/kvm_host.h              |    6 
 arch/x86/include/asm/module.h                |   16 ++
 arch/x86/include/asm/page_64_types.h         |    9 +
 arch/x86/include/asm/paravirt_types.h        |   12 +
 arch/x86/include/asm/percpu.h                |   25 ++-
 arch/x86/include/asm/pm-trace.h              |    2 
 arch/x86/include/asm/processor.h             |    8 -
 arch/x86/include/asm/setup.h                 |    2 
 arch/x86/kernel/Makefile                     |    2 
 arch/x86/kernel/acpi/wakeup_64.S             |   31 ++--
 arch/x86/kernel/cpu/common.c                 |    4 
 arch/x86/kernel/head64.c                     |   28 +++
 arch/x86/kernel/head_64.S                    |   47 +++++-
 arch/x86/kernel/kvm.c                        |    6 
 arch/x86/kernel/module-plts.c                |  198 +++++++++++++++++++++++++++
 arch/x86/kernel/module.c                     |   18 +-
 arch/x86/kernel/module.lds                   |    4 
 arch/x86/kernel/relocate_kernel_64.S         |    2 
 arch/x86/kernel/setup_percpu.c               |    2 
 arch/x86/kernel/vmlinux.lds.S                |   13 +
 arch/x86/kvm/svm.c                           |    4 
 arch/x86/lib/cmpxchg16b_emu.S                |    8 -
 arch/x86/power/hibernate_asm_64.S            |    4 
 arch/x86/tools/relocs.c                      |  134 +++++++++++++++---
 arch/x86/tools/relocs.h                      |    4 
 arch/x86/tools/relocs_common.c               |   15 +-
 arch/x86/xen/xen-asm.S                       |   12 -
 arch/x86/xen/xen-asm.h                       |    3 
 arch/x86/xen/xen-head.S                      |    9 -
 include/asm-generic/sections.h               |    6 
 include/linux/compiler.h                     |    8 +
 init/Kconfig                                 |    9 +
 kernel/kallsyms.c                            |   16 +-
 54 files changed, 868 insertions(+), 282 deletions(-)


* [RFC 01/22] x86/crypto: Adapt assembly for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-18 22:33 ` [RFC 02/22] x86: Use symbol name on bug table " Thomas Garnier
                   ` (21 subsequent siblings)
  22 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: x86, linux-crypto, linux-kernel, xen-devel, kvm, linux-pm,
	linux-arch, linux-sparse, kernel-hardening

Change the assembly code to use only relative references to symbols so that
the kernel can be built as PIE.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.
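
The less obvious cases in this patch are the table lookups: x86-64 has no
addressing mode that combines a RIP-relative base with an index register, so
a scratch register (RBASE) is used to hold the table address first. A rough
sketch of the pattern (table, index and destination registers are
illustrative):

	/* Before: absolute table base plus scaled index, not PIE friendly */
	movl	example_table(, %rcx, 4), %eax

	/* After: load the base RIP-relative, then index off the register */
	leaq	example_table(%rip), %r12
	movl	(%r12, %rcx, 4), %eax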

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/crypto/aes-x86_64-asm_64.S          | 45 ++++++++-----
 arch/x86/crypto/aesni-intel_asm.S            | 14 ++--
 arch/x86/crypto/aesni-intel_avx-x86_64.S     |  6 +-
 arch/x86/crypto/camellia-aesni-avx-asm_64.S  | 42 ++++++------
 arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 44 ++++++-------
 arch/x86/crypto/camellia-x86_64-asm_64.S     |  8 ++-
 arch/x86/crypto/cast5-avx-x86_64-asm_64.S    | 50 ++++++++-------
 arch/x86/crypto/cast6-avx-x86_64-asm_64.S    | 44 +++++++------
 arch/x86/crypto/des3_ede-asm_64.S            | 96 ++++++++++++++++++----------
 arch/x86/crypto/ghash-clmulni-intel_asm.S    |  4 +-
 arch/x86/crypto/glue_helper-asm-avx.S        |  4 +-
 arch/x86/crypto/glue_helper-asm-avx2.S       |  6 +-
 12 files changed, 211 insertions(+), 152 deletions(-)

diff --git a/arch/x86/crypto/aes-x86_64-asm_64.S b/arch/x86/crypto/aes-x86_64-asm_64.S
index 8739cf7795de..86fa068e5e81 100644
--- a/arch/x86/crypto/aes-x86_64-asm_64.S
+++ b/arch/x86/crypto/aes-x86_64-asm_64.S
@@ -48,8 +48,12 @@
 #define R10	%r10
 #define R11	%r11
 
+/* Hold global for PIE support */
+#define RBASE	%r12
+
 #define prologue(FUNC,KEY,B128,B192,r1,r2,r5,r6,r7,r8,r9,r10,r11) \
 	ENTRY(FUNC);			\
+	pushq	RBASE;			\
 	movq	r1,r2;			\
 	leaq	KEY+48(r8),r9;		\
 	movq	r10,r11;		\
@@ -74,54 +78,63 @@
 	movl	r6 ## E,4(r9);		\
 	movl	r7 ## E,8(r9);		\
 	movl	r8 ## E,12(r9);		\
+	popq	RBASE;			\
 	ret;				\
 	ENDPROC(FUNC);
 
+#define round_mov(tab_off, reg_i, reg_o) \
+	leaq	tab_off(%rip), RBASE; \
+	movl	(RBASE,reg_i,4), reg_o;
+
+#define round_xor(tab_off, reg_i, reg_o) \
+	leaq	tab_off(%rip), RBASE; \
+	xorl	(RBASE,reg_i,4), reg_o;
+
 #define round(TAB,OFFSET,r1,r2,r3,r4,r5,r6,r7,r8,ra,rb,rc,rd) \
 	movzbl	r2 ## H,r5 ## E;	\
 	movzbl	r2 ## L,r6 ## E;	\
-	movl	TAB+1024(,r5,4),r5 ## E;\
+	round_mov(TAB+1024, r5, r5 ## E)\
 	movw	r4 ## X,r2 ## X;	\
-	movl	TAB(,r6,4),r6 ## E;	\
+	round_mov(TAB, r6, r6 ## E)	\
 	roll	$16,r2 ## E;		\
 	shrl	$16,r4 ## E;		\
 	movzbl	r4 ## L,r7 ## E;	\
 	movzbl	r4 ## H,r4 ## E;	\
 	xorl	OFFSET(r8),ra ## E;	\
 	xorl	OFFSET+4(r8),rb ## E;	\
-	xorl	TAB+3072(,r4,4),r5 ## E;\
-	xorl	TAB+2048(,r7,4),r6 ## E;\
+	round_xor(TAB+3072, r4, r5 ## E)\
+	round_xor(TAB+2048, r7, r6 ## E)\
 	movzbl	r1 ## L,r7 ## E;	\
 	movzbl	r1 ## H,r4 ## E;	\
-	movl	TAB+1024(,r4,4),r4 ## E;\
+	round_mov(TAB+1024, r4, r4 ## E)\
 	movw	r3 ## X,r1 ## X;	\
 	roll	$16,r1 ## E;		\
 	shrl	$16,r3 ## E;		\
-	xorl	TAB(,r7,4),r5 ## E;	\
+	round_xor(TAB, r7, r5 ## E)	\
 	movzbl	r3 ## L,r7 ## E;	\
 	movzbl	r3 ## H,r3 ## E;	\
-	xorl	TAB+3072(,r3,4),r4 ## E;\
-	xorl	TAB+2048(,r7,4),r5 ## E;\
+	round_xor(TAB+3072, r3, r4 ## E)\
+	round_xor(TAB+2048, r7, r5 ## E)\
 	movzbl	r1 ## L,r7 ## E;	\
 	movzbl	r1 ## H,r3 ## E;	\
 	shrl	$16,r1 ## E;		\
-	xorl	TAB+3072(,r3,4),r6 ## E;\
-	movl	TAB+2048(,r7,4),r3 ## E;\
+	round_xor(TAB+3072, r3, r6 ## E)\
+	round_mov(TAB+2048, r7, r3 ## E)\
 	movzbl	r1 ## L,r7 ## E;	\
 	movzbl	r1 ## H,r1 ## E;	\
-	xorl	TAB+1024(,r1,4),r6 ## E;\
-	xorl	TAB(,r7,4),r3 ## E;	\
+	round_xor(TAB+1024, r1, r6 ## E)\
+	round_xor(TAB, r7, r3 ## E)	\
 	movzbl	r2 ## H,r1 ## E;	\
 	movzbl	r2 ## L,r7 ## E;	\
 	shrl	$16,r2 ## E;		\
-	xorl	TAB+3072(,r1,4),r3 ## E;\
-	xorl	TAB+2048(,r7,4),r4 ## E;\
+	round_xor(TAB+3072, r1, r3 ## E)\
+	round_xor(TAB+2048, r7, r4 ## E)\
 	movzbl	r2 ## H,r1 ## E;	\
 	movzbl	r2 ## L,r2 ## E;	\
 	xorl	OFFSET+8(r8),rc ## E;	\
 	xorl	OFFSET+12(r8),rd ## E;	\
-	xorl	TAB+1024(,r1,4),r3 ## E;\
-	xorl	TAB(,r2,4),r4 ## E;
+	round_xor(TAB+1024, r1, r3 ## E)\
+	round_xor(TAB, r2, r4 ## E)
 
 #define move_regs(r1,r2,r3,r4) \
 	movl	r3 ## E,r1 ## E;	\
diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index 16627fec80b2..5f73201dff32 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -325,7 +325,8 @@ _get_AAD_rest0\num_initial_blocks\operation:
 	vpshufb and an array of shuffle masks */
 	movq	   %r12, %r11
 	salq	   $4, %r11
-	movdqu	   aad_shift_arr(%r11), \TMP1
+	leaq	   aad_shift_arr(%rip), %rax
+	movdqu	   (%rax,%r11,), \TMP1
 	PSHUFB_XMM \TMP1, %xmm\i
 _get_AAD_rest_final\num_initial_blocks\operation:
 	PSHUFB_XMM   %xmm14, %xmm\i # byte-reflect the AAD data
@@ -584,7 +585,8 @@ _get_AAD_rest0\num_initial_blocks\operation:
 	vpshufb and an array of shuffle masks */
 	movq	   %r12, %r11
 	salq	   $4, %r11
-	movdqu	   aad_shift_arr(%r11), \TMP1
+	leaq	   aad_shift_arr(%rip), %rax
+	movdqu	   (%rax,%r11,), \TMP1
 	PSHUFB_XMM \TMP1, %xmm\i
 _get_AAD_rest_final\num_initial_blocks\operation:
 	PSHUFB_XMM   %xmm14, %xmm\i # byte-reflect the AAD data
@@ -2722,7 +2724,7 @@ ENDPROC(aesni_cbc_dec)
  */
 .align 4
 _aesni_inc_init:
-	movaps .Lbswap_mask, BSWAP_MASK
+	movaps .Lbswap_mask(%rip), BSWAP_MASK
 	movaps IV, CTR
 	PSHUFB_XMM BSWAP_MASK CTR
 	mov $1, TCTR_LOW
@@ -2850,12 +2852,12 @@ ENTRY(aesni_xts_crypt8)
 	cmpb $0, %cl
 	movl $0, %ecx
 	movl $240, %r10d
-	leaq _aesni_enc4, %r11
-	leaq _aesni_dec4, %rax
+	leaq _aesni_enc4(%rip), %r11
+	leaq _aesni_dec4(%rip), %rax
 	cmovel %r10d, %ecx
 	cmoveq %rax, %r11
 
-	movdqa .Lgf128mul_x_ble_mask, GF128MUL_MASK
+	movdqa .Lgf128mul_x_ble_mask(%rip), GF128MUL_MASK
 	movups (IVP), IV
 
 	mov 480(KEYP), KLEN
diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index faecb1518bf8..488605b19fe8 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -454,7 +454,8 @@ _get_AAD_rest0\@:
 	vpshufb and an array of shuffle masks */
 	movq    %r12, %r11
 	salq    $4, %r11
-	movdqu  aad_shift_arr(%r11), \T1
+	leaq	aad_shift_arr(%rip), %rax
+	movdqu  (%rax,%r11,), \T1
 	vpshufb \T1, reg_i, reg_i
 _get_AAD_rest_final\@:
 	vpshufb SHUF_MASK(%rip), reg_i, reg_i
@@ -1761,7 +1762,8 @@ _get_AAD_rest0\@:
 	vpshufb and an array of shuffle masks */
 	movq    %r12, %r11
 	salq    $4, %r11
-	movdqu  aad_shift_arr(%r11), \T1
+	leaq	aad_shift_arr(%rip), %rax
+	movdqu  (%rax,%r11,), \T1
 	vpshufb \T1, reg_i, reg_i
 _get_AAD_rest_final\@:
 	vpshufb SHUF_MASK(%rip), reg_i, reg_i
diff --git a/arch/x86/crypto/camellia-aesni-avx-asm_64.S b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
index f7c495e2863c..46feaea52632 100644
--- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
@@ -52,10 +52,10 @@
 	/* \
 	 * S-function with AES subbytes \
 	 */ \
-	vmovdqa .Linv_shift_row, t4; \
-	vbroadcastss .L0f0f0f0f, t7; \
-	vmovdqa .Lpre_tf_lo_s1, t0; \
-	vmovdqa .Lpre_tf_hi_s1, t1; \
+	vmovdqa .Linv_shift_row(%rip), t4; \
+	vbroadcastss .L0f0f0f0f(%rip), t7; \
+	vmovdqa .Lpre_tf_lo_s1(%rip), t0; \
+	vmovdqa .Lpre_tf_hi_s1(%rip), t1; \
 	\
 	/* AES inverse shift rows */ \
 	vpshufb t4, x0, x0; \
@@ -68,8 +68,8 @@
 	vpshufb t4, x6, x6; \
 	\
 	/* prefilter sboxes 1, 2 and 3 */ \
-	vmovdqa .Lpre_tf_lo_s4, t2; \
-	vmovdqa .Lpre_tf_hi_s4, t3; \
+	vmovdqa .Lpre_tf_lo_s4(%rip), t2; \
+	vmovdqa .Lpre_tf_hi_s4(%rip), t3; \
 	filter_8bit(x0, t0, t1, t7, t6); \
 	filter_8bit(x7, t0, t1, t7, t6); \
 	filter_8bit(x1, t0, t1, t7, t6); \
@@ -83,8 +83,8 @@
 	filter_8bit(x6, t2, t3, t7, t6); \
 	\
 	/* AES subbytes + AES shift rows */ \
-	vmovdqa .Lpost_tf_lo_s1, t0; \
-	vmovdqa .Lpost_tf_hi_s1, t1; \
+	vmovdqa .Lpost_tf_lo_s1(%rip), t0; \
+	vmovdqa .Lpost_tf_hi_s1(%rip), t1; \
 	vaesenclast t4, x0, x0; \
 	vaesenclast t4, x7, x7; \
 	vaesenclast t4, x1, x1; \
@@ -95,16 +95,16 @@
 	vaesenclast t4, x6, x6; \
 	\
 	/* postfilter sboxes 1 and 4 */ \
-	vmovdqa .Lpost_tf_lo_s3, t2; \
-	vmovdqa .Lpost_tf_hi_s3, t3; \
+	vmovdqa .Lpost_tf_lo_s3(%rip), t2; \
+	vmovdqa .Lpost_tf_hi_s3(%rip), t3; \
 	filter_8bit(x0, t0, t1, t7, t6); \
 	filter_8bit(x7, t0, t1, t7, t6); \
 	filter_8bit(x3, t0, t1, t7, t6); \
 	filter_8bit(x6, t0, t1, t7, t6); \
 	\
 	/* postfilter sbox 3 */ \
-	vmovdqa .Lpost_tf_lo_s2, t4; \
-	vmovdqa .Lpost_tf_hi_s2, t5; \
+	vmovdqa .Lpost_tf_lo_s2(%rip), t4; \
+	vmovdqa .Lpost_tf_hi_s2(%rip), t5; \
 	filter_8bit(x2, t2, t3, t7, t6); \
 	filter_8bit(x5, t2, t3, t7, t6); \
 	\
@@ -443,7 +443,7 @@ ENDPROC(roundsm16_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
 	transpose_4x4(c0, c1, c2, c3, a0, a1); \
 	transpose_4x4(d0, d1, d2, d3, a0, a1); \
 	\
-	vmovdqu .Lshufb_16x16b, a0; \
+	vmovdqu .Lshufb_16x16b(%rip), a0; \
 	vmovdqu st1, a1; \
 	vpshufb a0, a2, a2; \
 	vpshufb a0, a3, a3; \
@@ -482,7 +482,7 @@ ENDPROC(roundsm16_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
 #define inpack16_pre(x0, x1, x2, x3, x4, x5, x6, x7, y0, y1, y2, y3, y4, y5, \
 		     y6, y7, rio, key) \
 	vmovq key, x0; \
-	vpshufb .Lpack_bswap, x0, x0; \
+	vpshufb .Lpack_bswap(%rip), x0, x0; \
 	\
 	vpxor 0 * 16(rio), x0, y7; \
 	vpxor 1 * 16(rio), x0, y6; \
@@ -533,7 +533,7 @@ ENDPROC(roundsm16_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
 	vmovdqu x0, stack_tmp0; \
 	\
 	vmovq key, x0; \
-	vpshufb .Lpack_bswap, x0, x0; \
+	vpshufb .Lpack_bswap(%rip), x0, x0; \
 	\
 	vpxor x0, y7, y7; \
 	vpxor x0, y6, y6; \
@@ -1016,7 +1016,7 @@ ENTRY(camellia_ctr_16way)
 	subq $(16 * 16), %rsp;
 	movq %rsp, %rax;
 
-	vmovdqa .Lbswap128_mask, %xmm14;
+	vmovdqa .Lbswap128_mask(%rip), %xmm14;
 
 	/* load IV and byteswap */
 	vmovdqu (%rcx), %xmm0;
@@ -1065,7 +1065,7 @@ ENTRY(camellia_ctr_16way)
 
 	/* inpack16_pre: */
 	vmovq (key_table)(CTX), %xmm15;
-	vpshufb .Lpack_bswap, %xmm15, %xmm15;
+	vpshufb .Lpack_bswap(%rip), %xmm15, %xmm15;
 	vpxor %xmm0, %xmm15, %xmm0;
 	vpxor %xmm1, %xmm15, %xmm1;
 	vpxor %xmm2, %xmm15, %xmm2;
@@ -1133,7 +1133,7 @@ camellia_xts_crypt_16way:
 	subq $(16 * 16), %rsp;
 	movq %rsp, %rax;
 
-	vmovdqa .Lxts_gf128mul_and_shl1_mask, %xmm14;
+	vmovdqa .Lxts_gf128mul_and_shl1_mask(%rip), %xmm14;
 
 	/* load IV */
 	vmovdqu (%rcx), %xmm0;
@@ -1209,7 +1209,7 @@ camellia_xts_crypt_16way:
 
 	/* inpack16_pre: */
 	vmovq (key_table)(CTX, %r8, 8), %xmm15;
-	vpshufb .Lpack_bswap, %xmm15, %xmm15;
+	vpshufb .Lpack_bswap(%rip), %xmm15, %xmm15;
 	vpxor 0 * 16(%rax), %xmm15, %xmm0;
 	vpxor %xmm1, %xmm15, %xmm1;
 	vpxor %xmm2, %xmm15, %xmm2;
@@ -1264,7 +1264,7 @@ ENTRY(camellia_xts_enc_16way)
 	 */
 	xorl %r8d, %r8d; /* input whitening key, 0 for enc */
 
-	leaq __camellia_enc_blk16, %r9;
+	leaq __camellia_enc_blk16(%rip), %r9;
 
 	jmp camellia_xts_crypt_16way;
 ENDPROC(camellia_xts_enc_16way)
@@ -1282,7 +1282,7 @@ ENTRY(camellia_xts_dec_16way)
 	movl $24, %eax;
 	cmovel %eax, %r8d;  /* input whitening key, last for dec */
 
-	leaq __camellia_dec_blk16, %r9;
+	leaq __camellia_dec_blk16(%rip), %r9;
 
 	jmp camellia_xts_crypt_16way;
 ENDPROC(camellia_xts_dec_16way)
diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
index eee5b3982cfd..93da327fec83 100644
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
@@ -69,12 +69,12 @@
 	/* \
 	 * S-function with AES subbytes \
 	 */ \
-	vbroadcasti128 .Linv_shift_row, t4; \
-	vpbroadcastd .L0f0f0f0f, t7; \
-	vbroadcasti128 .Lpre_tf_lo_s1, t5; \
-	vbroadcasti128 .Lpre_tf_hi_s1, t6; \
-	vbroadcasti128 .Lpre_tf_lo_s4, t2; \
-	vbroadcasti128 .Lpre_tf_hi_s4, t3; \
+	vbroadcasti128 .Linv_shift_row(%rip), t4; \
+	vpbroadcastd .L0f0f0f0f(%rip), t7; \
+	vbroadcasti128 .Lpre_tf_lo_s1(%rip), t5; \
+	vbroadcasti128 .Lpre_tf_hi_s1(%rip), t6; \
+	vbroadcasti128 .Lpre_tf_lo_s4(%rip), t2; \
+	vbroadcasti128 .Lpre_tf_hi_s4(%rip), t3; \
 	\
 	/* AES inverse shift rows */ \
 	vpshufb t4, x0, x0; \
@@ -120,8 +120,8 @@
 	vinserti128 $1, t2##_x, x6, x6; \
 	vextracti128 $1, x1, t3##_x; \
 	vextracti128 $1, x4, t2##_x; \
-	vbroadcasti128 .Lpost_tf_lo_s1, t0; \
-	vbroadcasti128 .Lpost_tf_hi_s1, t1; \
+	vbroadcasti128 .Lpost_tf_lo_s1(%rip), t0; \
+	vbroadcasti128 .Lpost_tf_hi_s1(%rip), t1; \
 	vaesenclast t4##_x, x2##_x, x2##_x; \
 	vaesenclast t4##_x, t6##_x, t6##_x; \
 	vinserti128 $1, t6##_x, x2, x2; \
@@ -136,16 +136,16 @@
 	vinserti128 $1, t2##_x, x4, x4; \
 	\
 	/* postfilter sboxes 1 and 4 */ \
-	vbroadcasti128 .Lpost_tf_lo_s3, t2; \
-	vbroadcasti128 .Lpost_tf_hi_s3, t3; \
+	vbroadcasti128 .Lpost_tf_lo_s3(%rip), t2; \
+	vbroadcasti128 .Lpost_tf_hi_s3(%rip), t3; \
 	filter_8bit(x0, t0, t1, t7, t6); \
 	filter_8bit(x7, t0, t1, t7, t6); \
 	filter_8bit(x3, t0, t1, t7, t6); \
 	filter_8bit(x6, t0, t1, t7, t6); \
 	\
 	/* postfilter sbox 3 */ \
-	vbroadcasti128 .Lpost_tf_lo_s2, t4; \
-	vbroadcasti128 .Lpost_tf_hi_s2, t5; \
+	vbroadcasti128 .Lpost_tf_lo_s2(%rip), t4; \
+	vbroadcasti128 .Lpost_tf_hi_s2(%rip), t5; \
 	filter_8bit(x2, t2, t3, t7, t6); \
 	filter_8bit(x5, t2, t3, t7, t6); \
 	\
@@ -482,7 +482,7 @@ ENDPROC(roundsm32_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
 	transpose_4x4(c0, c1, c2, c3, a0, a1); \
 	transpose_4x4(d0, d1, d2, d3, a0, a1); \
 	\
-	vbroadcasti128 .Lshufb_16x16b, a0; \
+	vbroadcasti128 .Lshufb_16x16b(%rip), a0; \
 	vmovdqu st1, a1; \
 	vpshufb a0, a2, a2; \
 	vpshufb a0, a3, a3; \
@@ -521,7 +521,7 @@ ENDPROC(roundsm32_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
 #define inpack32_pre(x0, x1, x2, x3, x4, x5, x6, x7, y0, y1, y2, y3, y4, y5, \
 		     y6, y7, rio, key) \
 	vpbroadcastq key, x0; \
-	vpshufb .Lpack_bswap, x0, x0; \
+	vpshufb .Lpack_bswap(%rip), x0, x0; \
 	\
 	vpxor 0 * 32(rio), x0, y7; \
 	vpxor 1 * 32(rio), x0, y6; \
@@ -572,7 +572,7 @@ ENDPROC(roundsm32_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
 	vmovdqu x0, stack_tmp0; \
 	\
 	vpbroadcastq key, x0; \
-	vpshufb .Lpack_bswap, x0, x0; \
+	vpshufb .Lpack_bswap(%rip), x0, x0; \
 	\
 	vpxor x0, y7, y7; \
 	vpxor x0, y6, y6; \
@@ -1112,7 +1112,7 @@ ENTRY(camellia_ctr_32way)
 	vmovdqu (%rcx), %xmm0;
 	vmovdqa %xmm0, %xmm1;
 	inc_le128(%xmm0, %xmm15, %xmm14);
-	vbroadcasti128 .Lbswap128_mask, %ymm14;
+	vbroadcasti128 .Lbswap128_mask(%rip), %ymm14;
 	vinserti128 $1, %xmm0, %ymm1, %ymm0;
 	vpshufb %ymm14, %ymm0, %ymm13;
 	vmovdqu %ymm13, 15 * 32(%rax);
@@ -1158,7 +1158,7 @@ ENTRY(camellia_ctr_32way)
 
 	/* inpack32_pre: */
 	vpbroadcastq (key_table)(CTX), %ymm15;
-	vpshufb .Lpack_bswap, %ymm15, %ymm15;
+	vpshufb .Lpack_bswap(%rip), %ymm15, %ymm15;
 	vpxor %ymm0, %ymm15, %ymm0;
 	vpxor %ymm1, %ymm15, %ymm1;
 	vpxor %ymm2, %ymm15, %ymm2;
@@ -1242,13 +1242,13 @@ camellia_xts_crypt_32way:
 	subq $(16 * 32), %rsp;
 	movq %rsp, %rax;
 
-	vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_0, %ymm12;
+	vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_0(%rip), %ymm12;
 
 	/* load IV and construct second IV */
 	vmovdqu (%rcx), %xmm0;
 	vmovdqa %xmm0, %xmm15;
 	gf128mul_x_ble(%xmm0, %xmm12, %xmm13);
-	vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_1, %ymm13;
+	vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_1(%rip), %ymm13;
 	vinserti128 $1, %xmm0, %ymm15, %ymm0;
 	vpxor 0 * 32(%rdx), %ymm0, %ymm15;
 	vmovdqu %ymm15, 15 * 32(%rax);
@@ -1325,7 +1325,7 @@ camellia_xts_crypt_32way:
 
 	/* inpack32_pre: */
 	vpbroadcastq (key_table)(CTX, %r8, 8), %ymm15;
-	vpshufb .Lpack_bswap, %ymm15, %ymm15;
+	vpshufb .Lpack_bswap(%rip), %ymm15, %ymm15;
 	vpxor 0 * 32(%rax), %ymm15, %ymm0;
 	vpxor %ymm1, %ymm15, %ymm1;
 	vpxor %ymm2, %ymm15, %ymm2;
@@ -1383,7 +1383,7 @@ ENTRY(camellia_xts_enc_32way)
 
 	xorl %r8d, %r8d; /* input whitening key, 0 for enc */
 
-	leaq __camellia_enc_blk32, %r9;
+	leaq __camellia_enc_blk32(%rip), %r9;
 
 	jmp camellia_xts_crypt_32way;
 ENDPROC(camellia_xts_enc_32way)
@@ -1401,7 +1401,7 @@ ENTRY(camellia_xts_dec_32way)
 	movl $24, %eax;
 	cmovel %eax, %r8d;  /* input whitening key, last for dec */
 
-	leaq __camellia_dec_blk32, %r9;
+	leaq __camellia_dec_blk32(%rip), %r9;
 
 	jmp camellia_xts_crypt_32way;
 ENDPROC(camellia_xts_dec_32way)
diff --git a/arch/x86/crypto/camellia-x86_64-asm_64.S b/arch/x86/crypto/camellia-x86_64-asm_64.S
index 310319c601ed..b8c81e2f9973 100644
--- a/arch/x86/crypto/camellia-x86_64-asm_64.S
+++ b/arch/x86/crypto/camellia-x86_64-asm_64.S
@@ -92,11 +92,13 @@
 #define RXORbl %r9b
 
 #define xor2ror16(T0, T1, tmp1, tmp2, ab, dst) \
+	leaq T0(%rip), 			tmp1; \
 	movzbl ab ## bl,		tmp2 ## d; \
+	xorq (tmp1, tmp2, 8),		dst; \
+	leaq T1(%rip), 			tmp2; \
 	movzbl ab ## bh,		tmp1 ## d; \
-	rorq $16,			ab; \
-	xorq T0(, tmp2, 8),		dst; \
-	xorq T1(, tmp1, 8),		dst;
+	xorq (tmp2, tmp1, 8),		dst; \
+	rorq $16,			ab;
 
 /**********************************************************************
   1-way camellia
diff --git a/arch/x86/crypto/cast5-avx-x86_64-asm_64.S b/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
index b4a8806234ea..ae2976b56b27 100644
--- a/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
@@ -98,16 +98,20 @@
 
 
 #define lookup_32bit(src, dst, op1, op2, op3, interleave_op, il_reg) \
-	movzbl		src ## bh,     RID1d;    \
-	movzbl		src ## bl,     RID2d;    \
-	shrq $16,	src;                     \
-	movl		s1(, RID1, 4), dst ## d; \
-	op1		s2(, RID2, 4), dst ## d; \
-	movzbl		src ## bh,     RID1d;    \
-	movzbl		src ## bl,     RID2d;    \
-	interleave_op(il_reg);			 \
-	op2		s3(, RID1, 4), dst ## d; \
-	op3		s4(, RID2, 4), dst ## d;
+	movzbl		src ## bh,       RID1d;    \
+	leaq		s1(%rip),        RID2;     \
+	movl		(RID2, RID1, 4), dst ## d; \
+	movzbl		src ## bl,       RID2d;    \
+	leaq		s2(%rip),        RID1;     \
+	op1		(RID1, RID2, 4), dst ## d; \
+	shrq $16,	src;                       \
+	movzbl		src ## bh,     RID1d;      \
+	leaq		s3(%rip),        RID2;     \
+	op2		(RID2, RID1, 4), dst ## d; \
+	movzbl		src ## bl,     RID2d;      \
+	leaq		s4(%rip),        RID1;     \
+	op3		(RID1, RID2, 4), dst ## d; \
+	interleave_op(il_reg);
 
 #define dummy(d) /* do nothing */
 
@@ -166,15 +170,15 @@
 	subround(l ## 3, r ## 3, l ## 4, r ## 4, f);
 
 #define enc_preload_rkr() \
-	vbroadcastss	.L16_mask,                RKR;      \
+	vbroadcastss	.L16_mask(%rip),          RKR;      \
 	/* add 16-bit rotation to key rotations (mod 32) */ \
 	vpxor		kr(CTX),                  RKR, RKR;
 
 #define dec_preload_rkr() \
-	vbroadcastss	.L16_mask,                RKR;      \
+	vbroadcastss	.L16_mask(%rip),          RKR;      \
 	/* add 16-bit rotation to key rotations (mod 32) */ \
 	vpxor		kr(CTX),                  RKR, RKR; \
-	vpshufb		.Lbswap128_mask,          RKR, RKR;
+	vpshufb		.Lbswap128_mask(%rip),    RKR, RKR;
 
 #define transpose_2x4(x0, x1, t0, t1) \
 	vpunpckldq		x1, x0, t0; \
@@ -249,9 +253,9 @@ __cast5_enc_blk16:
 	pushq %rbp;
 	pushq %rbx;
 
-	vmovdqa .Lbswap_mask, RKM;
-	vmovd .Lfirst_mask, R1ST;
-	vmovd .L32_mask, R32;
+	vmovdqa .Lbswap_mask(%rip), RKM;
+	vmovd .Lfirst_mask(%rip), R1ST;
+	vmovd .L32_mask(%rip), R32;
 	enc_preload_rkr();
 
 	inpack_blocks(RL1, RR1, RTMP, RX, RKM);
@@ -285,7 +289,7 @@ __cast5_enc_blk16:
 	popq %rbx;
 	popq %rbp;
 
-	vmovdqa .Lbswap_mask, RKM;
+	vmovdqa .Lbswap_mask(%rip), RKM;
 
 	outunpack_blocks(RR1, RL1, RTMP, RX, RKM);
 	outunpack_blocks(RR2, RL2, RTMP, RX, RKM);
@@ -321,9 +325,9 @@ __cast5_dec_blk16:
 	pushq %rbp;
 	pushq %rbx;
 
-	vmovdqa .Lbswap_mask, RKM;
-	vmovd .Lfirst_mask, R1ST;
-	vmovd .L32_mask, R32;
+	vmovdqa .Lbswap_mask(%rip), RKM;
+	vmovd .Lfirst_mask(%rip), R1ST;
+	vmovd .L32_mask(%rip), R32;
 	dec_preload_rkr();
 
 	inpack_blocks(RL1, RR1, RTMP, RX, RKM);
@@ -354,7 +358,7 @@ __cast5_dec_blk16:
 	round(RL, RR, 1, 2);
 	round(RR, RL, 0, 1);
 
-	vmovdqa .Lbswap_mask, RKM;
+	vmovdqa .Lbswap_mask(%rip), RKM;
 	popq %rbx;
 	popq %rbp;
 
@@ -508,8 +512,8 @@ ENTRY(cast5_ctr_16way)
 
 	vpcmpeqd RKR, RKR, RKR;
 	vpaddq RKR, RKR, RKR; /* low: -2, high: -2 */
-	vmovdqa .Lbswap_iv_mask, R1ST;
-	vmovdqa .Lbswap128_mask, RKM;
+	vmovdqa .Lbswap_iv_mask(%rip), R1ST;
+	vmovdqa .Lbswap128_mask(%rip), RKM;
 
 	/* load IV and byteswap */
 	vmovq (%rcx), RX;
diff --git a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
index 952d3156a933..6bd52210a3c1 100644
--- a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
@@ -98,16 +98,20 @@
 
 
 #define lookup_32bit(src, dst, op1, op2, op3, interleave_op, il_reg) \
-	movzbl		src ## bh,     RID1d;    \
-	movzbl		src ## bl,     RID2d;    \
-	shrq $16,	src;                     \
-	movl		s1(, RID1, 4), dst ## d; \
-	op1		s2(, RID2, 4), dst ## d; \
-	movzbl		src ## bh,     RID1d;    \
-	movzbl		src ## bl,     RID2d;    \
-	interleave_op(il_reg);			 \
-	op2		s3(, RID1, 4), dst ## d; \
-	op3		s4(, RID2, 4), dst ## d;
+	movzbl		src ## bh,       RID1d;    \
+	leaq		s1(%rip),        RID2;     \
+	movl		(RID2, RID1, 4), dst ## d; \
+	movzbl		src ## bl,       RID2d;    \
+	leaq		s2(%rip),        RID1;     \
+	op1		(RID1, RID2, 4), dst ## d; \
+	shrq $16,	src;                       \
+	movzbl		src ## bh,     RID1d;      \
+	leaq		s3(%rip),        RID2;     \
+	op2		(RID2, RID1, 4), dst ## d; \
+	movzbl		src ## bl,     RID2d;      \
+	leaq		s4(%rip),        RID1;     \
+	op3		(RID1, RID2, 4), dst ## d; \
+	interleave_op(il_reg);
 
 #define dummy(d) /* do nothing */
 
@@ -190,10 +194,10 @@
 	qop(RD, RC, 1);
 
 #define shuffle(mask) \
-	vpshufb		mask,            RKR, RKR;
+	vpshufb		mask(%rip),            RKR, RKR;
 
 #define preload_rkr(n, do_mask, mask) \
-	vbroadcastss	.L16_mask,                RKR;      \
+	vbroadcastss	.L16_mask(%rip),          RKR;      \
 	/* add 16-bit rotation to key rotations (mod 32) */ \
 	vpxor		(kr+n*16)(CTX),           RKR, RKR; \
 	do_mask(mask);
@@ -273,9 +277,9 @@ __cast6_enc_blk8:
 	pushq %rbp;
 	pushq %rbx;
 
-	vmovdqa .Lbswap_mask, RKM;
-	vmovd .Lfirst_mask, R1ST;
-	vmovd .L32_mask, R32;
+	vmovdqa .Lbswap_mask(%rip), RKM;
+	vmovd .Lfirst_mask(%rip), R1ST;
+	vmovd .L32_mask(%rip), R32;
 
 	inpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
 	inpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
@@ -299,7 +303,7 @@ __cast6_enc_blk8:
 	popq %rbx;
 	popq %rbp;
 
-	vmovdqa .Lbswap_mask, RKM;
+	vmovdqa .Lbswap_mask(%rip), RKM;
 
 	outunpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
 	outunpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
@@ -319,9 +323,9 @@ __cast6_dec_blk8:
 	pushq %rbp;
 	pushq %rbx;
 
-	vmovdqa .Lbswap_mask, RKM;
-	vmovd .Lfirst_mask, R1ST;
-	vmovd .L32_mask, R32;
+	vmovdqa .Lbswap_mask(%rip), RKM;
+	vmovd .Lfirst_mask(%rip), R1ST;
+	vmovd .L32_mask(%rip), R32;
 
 	inpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
 	inpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
@@ -345,7 +349,7 @@ __cast6_dec_blk8:
 	popq %rbx;
 	popq %rbp;
 
-	vmovdqa .Lbswap_mask, RKM;
+	vmovdqa .Lbswap_mask(%rip), RKM;
 	outunpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
 	outunpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
 
diff --git a/arch/x86/crypto/des3_ede-asm_64.S b/arch/x86/crypto/des3_ede-asm_64.S
index f3e91647ca27..d532ff94b70a 100644
--- a/arch/x86/crypto/des3_ede-asm_64.S
+++ b/arch/x86/crypto/des3_ede-asm_64.S
@@ -138,21 +138,29 @@
 	movzbl RW0bl, RT2d; \
 	movzbl RW0bh, RT3d; \
 	shrq $16, RW0; \
-	movq s8(, RT0, 8), RT0; \
-	xorq s6(, RT1, 8), to; \
+	leaq s8(%rip), RW1; \
+	movq (RW1, RT0, 8), RT0; \
+	leaq s6(%rip), RW1; \
+	xorq (RW1, RT1, 8), to; \
 	movzbl RW0bl, RL1d; \
 	movzbl RW0bh, RT1d; \
 	shrl $16, RW0d; \
-	xorq s4(, RT2, 8), RT0; \
-	xorq s2(, RT3, 8), to; \
+	leaq s4(%rip), RW1; \
+	xorq (RW1, RT2, 8), RT0; \
+	leaq s2(%rip), RW1; \
+	xorq (RW1, RT3, 8), to; \
 	movzbl RW0bl, RT2d; \
 	movzbl RW0bh, RT3d; \
-	xorq s7(, RL1, 8), RT0; \
-	xorq s5(, RT1, 8), to; \
-	xorq s3(, RT2, 8), RT0; \
+	leaq s7(%rip), RW1; \
+	xorq (RW1, RL1, 8), RT0; \
+	leaq s5(%rip), RW1; \
+	xorq (RW1, RT1, 8), to; \
+	leaq s3(%rip), RW1; \
+	xorq (RW1, RT2, 8), RT0; \
 	load_next_key(n, RW0); \
 	xorq RT0, to; \
-	xorq s1(, RT3, 8), to; \
+	leaq s1(%rip), RW1; \
+	xorq (RW1, RT3, 8), to; \
 
 #define load_next_key(n, RWx) \
 	movq (((n) + 1) * 8)(CTX), RWx;
@@ -362,65 +370,89 @@ ENDPROC(des3_ede_x86_64_crypt_blk)
 	movzbl RW0bl, RT3d; \
 	movzbl RW0bh, RT1d; \
 	shrq $16, RW0; \
-	xorq s8(, RT3, 8), to##0; \
-	xorq s6(, RT1, 8), to##0; \
+	leaq s8(%rip), RT2; \
+	xorq (RT2, RT3, 8), to##0; \
+	leaq s6(%rip), RT2; \
+	xorq (RT2, RT1, 8), to##0; \
 	movzbl RW0bl, RT3d; \
 	movzbl RW0bh, RT1d; \
 	shrq $16, RW0; \
-	xorq s4(, RT3, 8), to##0; \
-	xorq s2(, RT1, 8), to##0; \
+	leaq s4(%rip), RT2; \
+	xorq (RT2, RT3, 8), to##0; \
+	leaq s2(%rip), RT2; \
+	xorq (RT2, RT1, 8), to##0; \
 	movzbl RW0bl, RT3d; \
 	movzbl RW0bh, RT1d; \
 	shrl $16, RW0d; \
-	xorq s7(, RT3, 8), to##0; \
-	xorq s5(, RT1, 8), to##0; \
+	leaq s7(%rip), RT2; \
+	xorq (RT2, RT3, 8), to##0; \
+	leaq s5(%rip), RT2; \
+	xorq (RT2, RT1, 8), to##0; \
 	movzbl RW0bl, RT3d; \
 	movzbl RW0bh, RT1d; \
 	load_next_key(n, RW0); \
-	xorq s3(, RT3, 8), to##0; \
-	xorq s1(, RT1, 8), to##0; \
+	leaq s3(%rip), RT2; \
+	xorq (RT2, RT3, 8), to##0; \
+	leaq s1(%rip), RT2; \
+	xorq (RT2, RT1, 8), to##0; \
 		xorq from##1, RW1; \
 		movzbl RW1bl, RT3d; \
 		movzbl RW1bh, RT1d; \
 		shrq $16, RW1; \
-		xorq s8(, RT3, 8), to##1; \
-		xorq s6(, RT1, 8), to##1; \
+		leaq s8(%rip), RT2; \
+		xorq (RT2, RT3, 8), to##1; \
+		leaq s6(%rip), RT2; \
+		xorq (RT2, RT1, 8), to##1; \
 		movzbl RW1bl, RT3d; \
 		movzbl RW1bh, RT1d; \
 		shrq $16, RW1; \
-		xorq s4(, RT3, 8), to##1; \
-		xorq s2(, RT1, 8), to##1; \
+		leaq s4(%rip), RT2; \
+		xorq (RT2, RT3, 8), to##1; \
+		leaq s2(%rip), RT2; \
+		xorq (RT2, RT1, 8), to##1; \
 		movzbl RW1bl, RT3d; \
 		movzbl RW1bh, RT1d; \
 		shrl $16, RW1d; \
-		xorq s7(, RT3, 8), to##1; \
-		xorq s5(, RT1, 8), to##1; \
+		leaq s7(%rip), RT2; \
+		xorq (RT2, RT3, 8), to##1; \
+		leaq s5(%rip), RT2; \
+		xorq (RT2, RT1, 8), to##1; \
 		movzbl RW1bl, RT3d; \
 		movzbl RW1bh, RT1d; \
 		do_movq(RW0, RW1); \
-		xorq s3(, RT3, 8), to##1; \
-		xorq s1(, RT1, 8), to##1; \
+		leaq s3(%rip), RT2; \
+		xorq (RT2, RT3, 8), to##1; \
+		leaq s1(%rip), RT2; \
+		xorq (RT2, RT1, 8), to##1; \
 			xorq from##2, RW2; \
 			movzbl RW2bl, RT3d; \
 			movzbl RW2bh, RT1d; \
 			shrq $16, RW2; \
-			xorq s8(, RT3, 8), to##2; \
-			xorq s6(, RT1, 8), to##2; \
+			leaq s8(%rip), RT2; \
+			xorq (RT2, RT3, 8), to##2; \
+			leaq s6(%rip), RT2; \
+			xorq (RT2, RT1, 8), to##2; \
 			movzbl RW2bl, RT3d; \
 			movzbl RW2bh, RT1d; \
 			shrq $16, RW2; \
-			xorq s4(, RT3, 8), to##2; \
-			xorq s2(, RT1, 8), to##2; \
+			leaq s4(%rip), RT2; \
+			xorq (RT2, RT3, 8), to##2; \
+			leaq s2(%rip), RT2; \
+			xorq (RT2, RT1, 8), to##2; \
 			movzbl RW2bl, RT3d; \
 			movzbl RW2bh, RT1d; \
 			shrl $16, RW2d; \
-			xorq s7(, RT3, 8), to##2; \
-			xorq s5(, RT1, 8), to##2; \
+			leaq s7(%rip), RT2; \
+			xorq (RT2, RT3, 8), to##2; \
+			leaq s5(%rip), RT2; \
+			xorq (RT2, RT1, 8), to##2; \
 			movzbl RW2bl, RT3d; \
 			movzbl RW2bh, RT1d; \
 			do_movq(RW0, RW2); \
-			xorq s3(, RT3, 8), to##2; \
-			xorq s1(, RT1, 8), to##2;
+			leaq s3(%rip), RT2; \
+			xorq (RT2, RT3, 8), to##2; \
+			leaq s1(%rip), RT2; \
+			xorq (RT2, RT1, 8), to##2;
 
 #define __movq(src, dst) \
 	movq src, dst;
diff --git a/arch/x86/crypto/ghash-clmulni-intel_asm.S b/arch/x86/crypto/ghash-clmulni-intel_asm.S
index f94375a8dcd1..d56a281221fb 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_asm.S
+++ b/arch/x86/crypto/ghash-clmulni-intel_asm.S
@@ -97,7 +97,7 @@ ENTRY(clmul_ghash_mul)
 	FRAME_BEGIN
 	movups (%rdi), DATA
 	movups (%rsi), SHASH
-	movaps .Lbswap_mask, BSWAP
+	movaps .Lbswap_mask(%rip), BSWAP
 	PSHUFB_XMM BSWAP DATA
 	call __clmul_gf128mul_ble
 	PSHUFB_XMM BSWAP DATA
@@ -114,7 +114,7 @@ ENTRY(clmul_ghash_update)
 	FRAME_BEGIN
 	cmp $16, %rdx
 	jb .Lupdate_just_ret	# check length
-	movaps .Lbswap_mask, BSWAP
+	movaps .Lbswap_mask(%rip), BSWAP
 	movups (%rdi), DATA
 	movups (%rcx), SHASH
 	PSHUFB_XMM BSWAP DATA
diff --git a/arch/x86/crypto/glue_helper-asm-avx.S b/arch/x86/crypto/glue_helper-asm-avx.S
index 02ee2308fb38..8a49ab1699ef 100644
--- a/arch/x86/crypto/glue_helper-asm-avx.S
+++ b/arch/x86/crypto/glue_helper-asm-avx.S
@@ -54,7 +54,7 @@
 #define load_ctr_8way(iv, bswap, x0, x1, x2, x3, x4, x5, x6, x7, t0, t1, t2) \
 	vpcmpeqd t0, t0, t0; \
 	vpsrldq $8, t0, t0; /* low: -1, high: 0 */ \
-	vmovdqa bswap, t1; \
+	vmovdqa bswap(%rip), t1; \
 	\
 	/* load IV and byteswap */ \
 	vmovdqu (iv), x7; \
@@ -99,7 +99,7 @@
 
 #define load_xts_8way(iv, src, dst, x0, x1, x2, x3, x4, x5, x6, x7, tiv, t0, \
 		      t1, xts_gf128mul_and_shl1_mask) \
-	vmovdqa xts_gf128mul_and_shl1_mask, t0; \
+	vmovdqa xts_gf128mul_and_shl1_mask(%rip), t0; \
 	\
 	/* load IV */ \
 	vmovdqu (iv), tiv; \
diff --git a/arch/x86/crypto/glue_helper-asm-avx2.S b/arch/x86/crypto/glue_helper-asm-avx2.S
index a53ac11dd385..e04c80467bd2 100644
--- a/arch/x86/crypto/glue_helper-asm-avx2.S
+++ b/arch/x86/crypto/glue_helper-asm-avx2.S
@@ -67,7 +67,7 @@
 	vmovdqu (iv), t2x; \
 	vmovdqa t2x, t3x; \
 	inc_le128(t2x, t0x, t1x); \
-	vbroadcasti128 bswap, t1; \
+	vbroadcasti128 bswap(%rip), t1; \
 	vinserti128 $1, t2x, t3, t2; /* ab: le0 ; cd: le1 */ \
 	vpshufb t1, t2, x0; \
 	\
@@ -124,13 +124,13 @@
 		       tivx, t0, t0x, t1, t1x, t2, t2x, t3, \
 		       xts_gf128mul_and_shl1_mask_0, \
 		       xts_gf128mul_and_shl1_mask_1) \
-	vbroadcasti128 xts_gf128mul_and_shl1_mask_0, t1; \
+	vbroadcasti128 xts_gf128mul_and_shl1_mask_0(%rip), t1; \
 	\
 	/* load IV and construct second IV */ \
 	vmovdqu (iv), tivx; \
 	vmovdqa tivx, t0x; \
 	gf128mul_x_ble(tivx, t1x, t2x); \
-	vbroadcasti128 xts_gf128mul_and_shl1_mask_1, t2; \
+	vbroadcasti128 xts_gf128mul_and_shl1_mask_1(%rip), t2; \
 	vinserti128 $1, tivx, t0, tiv; \
 	vpxor (0*32)(src), tiv, x0; \
 	vmovdqu tiv, (0*32)(dst); \
-- 
2.13.2.932.g7449e964c-goog


* [RFC 02/22] x86: Use symbol name on bug table for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
  2017-07-18 22:33 ` [RFC 01/22] x86/crypto: Adapt assembly for PIE support Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-18 22:33 ` [RFC 03/22] x86: Use symbol name in jump " Thomas Garnier
                   ` (20 subsequent siblings)
  22 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Replace the %c constraint with %P. The %c is incompatible with PIE
because it implies an immediate value, whereas %P references a symbol.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/include/asm/bug.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h
index 39e702d90cdb..2307e2aceb00 100644
--- a/arch/x86/include/asm/bug.h
+++ b/arch/x86/include/asm/bug.h
@@ -37,7 +37,7 @@ do {									\
 	asm volatile("1:\t" ins "\n"					\
 		     ".pushsection __bug_table,\"a\"\n"			\
 		     "2:\t" __BUG_REL(1b) "\t# bug_entry::bug_addr\n"	\
-		     "\t"  __BUG_REL(%c0) "\t# bug_entry::file\n"	\
+		     "\t"  __BUG_REL(%P0) "\t# bug_entry::file\n"	\
 		     "\t.word %c1"        "\t# bug_entry::line\n"	\
 		     "\t.word %c2"        "\t# bug_entry::flags\n"	\
 		     "\t.org 2b+%c3\n"					\
-- 
2.13.2.932.g7449e964c-goog



* [RFC 03/22] x86: Use symbol name in jump table for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
  2017-07-18 22:33 ` [RFC 01/22] x86/crypto: Adapt assembly for PIE support Thomas Garnier
  2017-07-18 22:33 ` [RFC 02/22] x86: Use symbol name on bug table " Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-18 22:33 ` [RFC 04/22] x86: Add macro to get symbol address " Thomas Garnier
                   ` (19 subsequent siblings)
  22 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Replace the %c constraint with %P. The %c is incompatible with PIE
because it implies an immediate value, whereas %P references a symbol.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/include/asm/jump_label.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
index adc54c12cbd1..6e558e4524dc 100644
--- a/arch/x86/include/asm/jump_label.h
+++ b/arch/x86/include/asm/jump_label.h
@@ -36,9 +36,9 @@ static __always_inline bool arch_static_branch(struct static_key *key, bool bran
 		".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t"
 		".pushsection __jump_table,  \"aw\" \n\t"
 		_ASM_ALIGN "\n\t"
-		_ASM_PTR "1b, %l[l_yes], %c0 + %c1 \n\t"
+		_ASM_PTR "1b, %l[l_yes], %P0 \n\t"
 		".popsection \n\t"
-		: :  "i" (key), "i" (branch) : : l_yes);
+		: :  "X" (&((char *)key)[branch]) : : l_yes);
 
 	return false;
 l_yes:
@@ -52,9 +52,9 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool
 		"2:\n\t"
 		".pushsection __jump_table,  \"aw\" \n\t"
 		_ASM_ALIGN "\n\t"
-		_ASM_PTR "1b, %l[l_yes], %c0 + %c1 \n\t"
+		_ASM_PTR "1b, %l[l_yes], %P0 \n\t"
 		".popsection \n\t"
-		: :  "i" (key), "i" (branch) : : l_yes);
+		: :  "X" (&((char *)key)[branch]) : : l_yes);
 
 	return false;
 l_yes:
-- 
2.13.2.932.g7449e964c-goog



* [RFC 04/22] x86: Add macro to get symbol address for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (2 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 03/22] x86: Use symbol name in jump " Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-18 22:33 ` [RFC 05/22] xen: Adapt assembly " Thomas Garnier
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Add a new _ASM_GET_PTR macro to fetch a symbol address. It will be used
to replace "_ASM_MOV $<symbol>, %dst" constructs, which are not compatible
with PIE.
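
As a usage sketch (mirroring how later patches in this series use it), on
x86_64 the macro expands to a RIP-relative lea while 32-bit keeps the
absolute mov:

	_ASM_GET_PTR(__bss_start, %_ASM_DI)
	/* x86_64: leaq (__bss_start)(%rip), %rdi */
	/* 32-bit: movl $__bss_start, %edi */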

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/include/asm/asm.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index 7a9df3beb89b..bf2842cfb583 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -55,6 +55,19 @@
 # define CC_OUT(c) [_cc_ ## c] "=qm"
 #endif
 
+/* Macros to get a global variable address with PIE support on 64-bit */
+#ifdef CONFIG_X86_32
+#define __ASM_GET_PTR_PRE(_src) __ASM_FORM_COMMA(movl $##_src)
+#else
+#ifdef __ASSEMBLY__
+#define __ASM_GET_PTR_PRE(_src) __ASM_FORM_COMMA(leaq (_src)(%rip))
+#else
+#define __ASM_GET_PTR_PRE(_src) __ASM_FORM_COMMA(leaq (_src)(%%rip))
+#endif
+#endif
+#define _ASM_GET_PTR(_src, _dst) \
+		__ASM_GET_PTR_PRE(_src) __ASM_FORM(_dst)
+
 /* Exception table entry */
 #ifdef __ASSEMBLY__
 # define _ASM_EXTABLE_HANDLE(from, to, handler)			\
-- 
2.13.2.932.g7449e964c-goog



* [RFC 05/22] xen: Adapt assembly for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (3 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 04/22] x86: Add macro to get symbol address " Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-18 22:33 ` [RFC 06/22] kvm: " Thomas Garnier
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Change the assembly code to use the new _ASM_GET_PTR macro, which gets a
symbol reference while staying PIE compatible. Modify the RELOC macro, which
used an assignment that generated a non-relative reference.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/xen/xen-asm.h  | 3 ++-
 arch/x86/xen/xen-head.S | 9 +++++----
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/xen/xen-asm.h b/arch/x86/xen/xen-asm.h
index 465276467a47..3b1c8a2e77d8 100644
--- a/arch/x86/xen/xen-asm.h
+++ b/arch/x86/xen/xen-asm.h
@@ -2,8 +2,9 @@
 #define _XEN_XEN_ASM_H
 
 #include <linux/linkage.h>
+#include <asm/asm.h>
 
-#define RELOC(x, v)	.globl x##_reloc; x##_reloc=v
+#define RELOC(x, v)	.globl x##_reloc; x##_reloc: _ASM_PTR v
 #define ENDPATCH(x)	.globl x##_end; x##_end=.
 
 /* Pseudo-flag used for virtual NMI, which we don't implement yet */
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 72a8e6adebe6..ab2462396bd8 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -23,14 +23,15 @@ ENTRY(startup_xen)
 
 	/* Clear .bss */
 	xor %eax,%eax
-	mov $__bss_start, %_ASM_DI
-	mov $__bss_stop, %_ASM_CX
+	_ASM_GET_PTR(__bss_start, %_ASM_DI)
+	_ASM_GET_PTR(__bss_stop, %_ASM_CX)
 	sub %_ASM_DI, %_ASM_CX
 	shr $__ASM_SEL(2, 3), %_ASM_CX
 	rep __ASM_SIZE(stos)
 
-	mov %_ASM_SI, xen_start_info
-	mov $init_thread_union+THREAD_SIZE, %_ASM_SP
+	_ASM_GET_PTR(xen_start_info, %_ASM_AX)
+	mov %_ASM_SI, (%_ASM_AX)
+	_ASM_GET_PTR(init_thread_union+THREAD_SIZE, %_ASM_SP)
 
 	jmp xen_start_kernel
 
-- 
2.13.2.932.g7449e964c-goog



* [RFC 06/22] kvm: Adapt assembly for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (4 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 05/22] xen: Adapt assembly " Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-19  2:49   ` Brian Gerst
  2017-07-18 22:33 ` [RFC 07/22] x86: relocate_kernel - " Thomas Garnier
                   ` (16 subsequent siblings)
  22 siblings, 1 reply; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Change the assembly code to use only relative references to symbols so that
the kernel can be built as PIE. The new __ASM_GET_PTR_PRE macro is used to
get the address of a symbol on both 32-bit and 64-bit with PIE support.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.
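
The fixup code in kvm_host.h also needs to push the address of a local label,
previously done with an absolute "push $666b". A rough sketch of the
replacement pattern (the label name is illustrative):

	pushq	%rax			  /* free up a scratch register */
	leaq	example_label(%rip), %rax /* build the address RIP-relative */
	xchgq	%rax, (%rsp)		  /* place it on the stack, restore %rax */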

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/include/asm/kvm_host.h | 6 ++++--
 arch/x86/kernel/kvm.c           | 6 ++++--
 arch/x86/kvm/svm.c              | 4 ++--
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 87ac4fba6d8e..3041201a3aeb 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1352,9 +1352,11 @@ asmlinkage void kvm_spurious_fault(void);
 	".pushsection .fixup, \"ax\" \n" \
 	"667: \n\t" \
 	cleanup_insn "\n\t"		      \
-	"cmpb $0, kvm_rebooting \n\t"	      \
+	"cmpb $0, kvm_rebooting" __ASM_SEL(,(%%rip)) " \n\t" \
 	"jne 668b \n\t"      		      \
-	__ASM_SIZE(push) " $666b \n\t"	      \
+	__ASM_SIZE(push) "%%" _ASM_AX " \n\t"		\
+	__ASM_GET_PTR_PRE(666b) "%%" _ASM_AX "\n\t"	\
+	"xchg %%" _ASM_AX ", (%%" _ASM_SP ") \n\t"	\
 	"call kvm_spurious_fault \n\t"	      \
 	".popsection \n\t" \
 	_ASM_EXTABLE(666b, 667b)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 71c17a5be983..53b8ad162589 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -618,8 +618,10 @@ asm(
 ".global __raw_callee_save___kvm_vcpu_is_preempted;"
 ".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
 "__raw_callee_save___kvm_vcpu_is_preempted:"
-"movq	__per_cpu_offset(,%rdi,8), %rax;"
-"cmpb	$0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
+"leaq	__per_cpu_offset(%rip), %rax;"
+"movq	(%rax,%rdi,8), %rax;"
+"addq	" __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rip), %rax;"
+"cmpb	$0, (%rax);"
 "setne	%al;"
 "ret;"
 ".popsection");
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 4d8141e533c3..8b718c6d6729 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -554,12 +554,12 @@ static u32 svm_msrpm_offset(u32 msr)
 
 static inline void clgi(void)
 {
-	asm volatile (__ex(SVM_CLGI));
+	asm volatile (__ex(SVM_CLGI) : :);
 }
 
 static inline void stgi(void)
 {
-	asm volatile (__ex(SVM_STGI));
+	asm volatile (__ex(SVM_STGI) : :);
 }
 
 static inline void invlpga(unsigned long addr, u32 asid)
-- 
2.13.2.932.g7449e964c-goog



* [RFC 07/22] x86: relocate_kernel - Adapt assembly for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (5 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 06/22] kvm: " Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-19 22:58   ` H. Peter Anvin
  2017-07-18 22:33 ` [RFC 08/22] x86/entry/64: " Thomas Garnier
                   ` (15 subsequent siblings)
  22 siblings, 1 reply; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Change the assembly code to use only relative references to symbols so that
the kernel can be built as PIE.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/kernel/relocate_kernel_64.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 98111b38ebfd..da817d1628ac 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -186,7 +186,7 @@ identity_mapped:
 	movq	%rax, %cr3
 	lea	PAGE_SIZE(%r8), %rsp
 	call	swap_pages
-	movq	$virtual_mapped, %rax
+	leaq	virtual_mapped(%rip), %rax
 	pushq	%rax
 	ret
 
-- 
2.13.2.932.g7449e964c-goog



* [RFC 08/22] x86/entry/64: Adapt assembly for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (6 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 07/22] x86: relocate_kernel - " Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-18 22:33 ` [RFC 09/22] x86: pm-trace - " Thomas Garnier
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Change the assembly code to use only relative references to symbols so that
the kernel can be built as PIE.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/entry/entry_64.S | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index a9a8027a6c0e..691c4755269b 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -195,12 +195,15 @@ entry_SYSCALL_64_fastpath:
 	ja	1f				/* return -ENOSYS (already in pt_regs->ax) */
 	movq	%r10, %rcx
 
+	/* Ensures the call is position independent */
+	leaq	sys_call_table(%rip), %r11
+
 	/*
 	 * This call instruction is handled specially in stub_ptregs_64.
 	 * It might end up jumping to the slow path.  If it jumps, RAX
 	 * and all argument registers are clobbered.
 	 */
-	call	*sys_call_table(, %rax, 8)
+	call	*(%r11, %rax, 8)
 .Lentry_SYSCALL_64_after_fastpath_call:
 
 	movq	%rax, RAX(%rsp)
@@ -333,7 +336,8 @@ ENTRY(stub_ptregs_64)
 	 * RAX stores a pointer to the C function implementing the syscall.
 	 * IRQs are on.
 	 */
-	cmpq	$.Lentry_SYSCALL_64_after_fastpath_call, (%rsp)
+	leaq	.Lentry_SYSCALL_64_after_fastpath_call(%rip), %r11
+	cmpq	%r11, (%rsp)
 	jne	1f
 
 	/*
@@ -1109,7 +1113,8 @@ ENTRY(error_entry)
 	movl	%ecx, %eax			/* zero extend */
 	cmpq	%rax, RIP+8(%rsp)
 	je	.Lbstep_iret
-	cmpq	$.Lgs_change, RIP+8(%rsp)
+	leaq	.Lgs_change(%rip), %rcx
+	cmpq	%rcx, RIP+8(%rsp)
 	jne	.Lerror_entry_done
 
 	/*
@@ -1324,10 +1329,10 @@ ENTRY(nmi)
 	 * resume the outer NMI.
 	 */
 
-	movq	$repeat_nmi, %rdx
+	leaq	repeat_nmi(%rip), %rdx
 	cmpq	8(%rsp), %rdx
 	ja	1f
-	movq	$end_repeat_nmi, %rdx
+	leaq	end_repeat_nmi(%rip), %rdx
 	cmpq	8(%rsp), %rdx
 	ja	nested_nmi_out
 1:
@@ -1381,7 +1386,8 @@ nested_nmi:
 	pushq	%rdx
 	pushfq
 	pushq	$__KERNEL_CS
-	pushq	$repeat_nmi
+	leaq	repeat_nmi(%rip), %rdx
+	pushq	%rdx
 
 	/* Put stack back */
 	addq	$(6*8), %rsp
@@ -1419,7 +1425,9 @@ first_nmi:
 	addq	$8, (%rsp)	/* Fix up RSP */
 	pushfq			/* RFLAGS */
 	pushq	$__KERNEL_CS	/* CS */
-	pushq	$1f		/* RIP */
+	pushq	%rax		/* Support Position Independent Code */
+	leaq	1f(%rip), %rax	/* RIP */
+	xchgq	%rax, (%rsp)	/* Restore RAX, put 1f */
 	INTERRUPT_RETURN	/* continues at repeat_nmi below */
 1:
 #endif
-- 
2.13.2.932.g7449e964c-goog



* [RFC 09/22] x86: pm-trace - Adapt assembly for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (7 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 08/22] x86/entry/64: " Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-18 22:33 ` [RFC 10/22] x86/CPU: " Thomas Garnier
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Change the assembly to use the new _ASM_GET_PTR macro instead of
_ASM_MOV so that the code is PIE compatible.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.
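
_ASM_GET_PTR itself is introduced earlier in the series and is not part
of this excerpt; the definition below is only a guess at its likely
shape, based on its stated purpose of fetching a symbol address
generically, and should not be read as the actual macro:

  /* Assumed shape, for illustration only. */
  #ifdef CONFIG_X86_64
  /* RIP-relative load of the address: position independent */
  #define _ASM_GET_PTR(sym, dst)  "leaq " #sym "(%%rip), " #dst
  #else
  /* 32-bit keeps the absolute move */
  #define _ASM_GET_PTR(sym, dst)  "movl $" #sym ", " #dst
  #endif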

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/include/asm/pm-trace.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pm-trace.h b/arch/x86/include/asm/pm-trace.h
index 7b7ac42c3661..a3801261f0dd 100644
--- a/arch/x86/include/asm/pm-trace.h
+++ b/arch/x86/include/asm/pm-trace.h
@@ -7,7 +7,7 @@
 do {								\
 	if (pm_trace_enabled) {					\
 		const void *tracedata;				\
-		asm volatile(_ASM_MOV " $1f,%0\n"		\
+		asm volatile(_ASM_GET_PTR(1f, %0) "\n"		\
 			     ".section .tracedata,\"a\"\n"	\
 			     "1:\t.word %c1\n\t"		\
 			     _ASM_PTR " %c2\n"			\
-- 
2.13.2.932.g7449e964c-goog


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC 10/22] x86/CPU: Adapt assembly for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (8 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 09/22] x86: pm-trace - " Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-18 22:33 ` [RFC 11/22] x86/acpi: " Thomas Garnier
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Change the assembly code to use only relative references to symbols so
that the kernel is PIE compatible. Use the new _ASM_GET_PTR macro
instead of the 'mov $symbol, %dst' construct to avoid an absolute
reference.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/include/asm/processor.h | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 028245e1c42b..1adea8c4436e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -47,7 +47,7 @@ static inline void *current_text_addr(void)
 {
 	void *pc;
 
-	asm volatile("mov $1f, %0; 1:":"=r" (pc));
+	asm volatile(_ASM_GET_PTR(1f, %0) "; 1:":"=r" (pc));
 
 	return pc;
 }
@@ -682,6 +682,7 @@ static inline void sync_core(void)
 		: "+r" (__sp) : : "memory");
 #else
 	unsigned int tmp;
+	unsigned long tmp2;
 
 	asm volatile (
 		"mov %%ss, %0\n\t"
@@ -691,10 +692,11 @@ static inline void sync_core(void)
 		"pushfq\n\t"
 		"mov %%cs, %0\n\t"
 		"pushq %q0\n\t"
-		"pushq $1f\n\t"
+		"leaq 1f(%%rip), %1\n\t"
+		"pushq %1\n\t"
 		"iretq\n\t"
 		"1:"
-		: "=&r" (tmp), "+r" (__sp) : : "cc", "memory");
+		: "=&r" (tmp), "=&r" (tmp2), "+r" (__sp) : : "cc", "memory");
 #endif
 }
 
-- 
2.13.2.932.g7449e964c-goog


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC 11/22] x86/acpi: Adapt assembly for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (9 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 10/22] x86/CPU: " Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-18 22:33 ` [RFC 12/22] x86/boot/64: " Thomas Garnier
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Change the assembly code to use only relative references to symbols so
that the kernel is PIE compatible.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/kernel/acpi/wakeup_64.S | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S
index 50b8ed0317a3..472659c0f811 100644
--- a/arch/x86/kernel/acpi/wakeup_64.S
+++ b/arch/x86/kernel/acpi/wakeup_64.S
@@ -14,7 +14,7 @@
 	 * Hooray, we are in Long 64-bit mode (but still running in low memory)
 	 */
 ENTRY(wakeup_long64)
-	movq	saved_magic, %rax
+	movq	saved_magic(%rip), %rax
 	movq	$0x123456789abcdef0, %rdx
 	cmpq	%rdx, %rax
 	jne	bogus_64_magic
@@ -25,14 +25,14 @@ ENTRY(wakeup_long64)
 	movw	%ax, %es
 	movw	%ax, %fs
 	movw	%ax, %gs
-	movq	saved_rsp, %rsp
+	movq	saved_rsp(%rip), %rsp
 
-	movq	saved_rbx, %rbx
-	movq	saved_rdi, %rdi
-	movq	saved_rsi, %rsi
-	movq	saved_rbp, %rbp
+	movq	saved_rbx(%rip), %rbx
+	movq	saved_rdi(%rip), %rdi
+	movq	saved_rsi(%rip), %rsi
+	movq	saved_rbp(%rip), %rbp
 
-	movq	saved_rip, %rax
+	movq	saved_rip(%rip), %rax
 	jmp	*%rax
 ENDPROC(wakeup_long64)
 
@@ -45,7 +45,7 @@ ENTRY(do_suspend_lowlevel)
 	xorl	%eax, %eax
 	call	save_processor_state
 
-	movq	$saved_context, %rax
+	leaq	saved_context(%rip), %rax
 	movq	%rsp, pt_regs_sp(%rax)
 	movq	%rbp, pt_regs_bp(%rax)
 	movq	%rsi, pt_regs_si(%rax)
@@ -64,13 +64,14 @@ ENTRY(do_suspend_lowlevel)
 	pushfq
 	popq	pt_regs_flags(%rax)
 
-	movq	$.Lresume_point, saved_rip(%rip)
+	leaq	.Lresume_point(%rip), %rax
+	movq	%rax, saved_rip(%rip)
 
-	movq	%rsp, saved_rsp
-	movq	%rbp, saved_rbp
-	movq	%rbx, saved_rbx
-	movq	%rdi, saved_rdi
-	movq	%rsi, saved_rsi
+	movq	%rsp, saved_rsp(%rip)
+	movq	%rbp, saved_rbp(%rip)
+	movq	%rbx, saved_rbx(%rip)
+	movq	%rdi, saved_rdi(%rip)
+	movq	%rsi, saved_rsi(%rip)
 
 	addq	$8, %rsp
 	movl	$3, %edi
@@ -82,7 +83,7 @@ ENTRY(do_suspend_lowlevel)
 	.align 4
 .Lresume_point:
 	/* We don't restore %rax, it must be 0 anyway */
-	movq	$saved_context, %rax
+	leaq	saved_context(%rip), %rax
 	movq	saved_context_cr4(%rax), %rbx
 	movq	%rbx, %cr4
 	movq	saved_context_cr3(%rax), %rbx
-- 
2.13.2.932.g7449e964c-goog


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC 12/22] x86/boot/64: Adapt assembly for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (10 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 11/22] x86/acpi: " Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-18 22:33 ` [RFC 13/22] x86/power/64: " Thomas Garnier
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Change the assembly code to use only relative references to symbols so
that the kernel is PIE compatible.

Early at boot, the kernel is mapped at a temporary address while
preparing the page table. To know the changes needed for the page table
with KASLR, the boot code calculates the difference between the expected
address of the kernel and the one chosen by KASLR. This does not work
with PIE because all symbol references in code are relative: instead of
the future relocated virtual address, they yield the current temporary
mapping. The solution is to use global variables that are relocated as
expected.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.
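
A minimal C sketch of that distinction (illustrative only, not kernel
code; 'some_symbol' stands in for early_top_pgt and friends):

  char some_symbol[16];

  /*
   * Computed in code: under -fPIC this becomes a RIP-relative lea, so
   * at early boot it yields the temporary address the code currently
   * runs at.
   */
  static unsigned long symbol_addr_from_code(void)
  {
          return (unsigned long)some_symbol;
  }

  /*
   * Stored as initialized data: the 64-bit slot carries a relocation
   * that is processed against the final mapping, so reading it gives
   * the expected (relocated) virtual address.
   */
  static unsigned long some_symbol_addr = (unsigned long)some_symbol;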

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/kernel/head_64.S | 32 ++++++++++++++++++++++++--------
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 6225550883df..7e4f7a83a15a 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -78,8 +78,23 @@ startup_64:
 	call	__startup_64
 	popq	%rsi
 
-	movq	$(early_top_pgt - __START_KERNEL_map), %rax
+	movq    _early_top_pgt_offset(%rip), %rax
 	jmp 1f
+
+	/*
+	 * Position Independent Code takes only relative references in code
+	 * meaning a global variable address is relative to RIP and not its
+	 * future virtual address. Global variables can be used instead as they
+	 * are still relocated on the expected kernel mapping address.
+	 */
+	.align 8
+_early_top_pgt_offset:
+	.quad early_top_pgt - __START_KERNEL_map
+_init_top_offset:
+	.quad init_top_pgt - __START_KERNEL_map
+_va_jump:
+	.quad 2f
+
 ENTRY(secondary_startup_64)
 	/*
 	 * At this point the CPU runs in 64bit mode CS.L = 1 CS.D = 0,
@@ -98,7 +113,8 @@ ENTRY(secondary_startup_64)
 	/* Sanitize CPU configuration */
 	call verify_cpu
 
-	movq	$(init_top_pgt - __START_KERNEL_map), %rax
+	movq    _init_top_offset(%rip), %rax
+
 1:
 
 	/* Enable PAE mode, PGE and LA57 */
@@ -113,9 +129,8 @@ ENTRY(secondary_startup_64)
 	movq	%rax, %cr3
 
 	/* Ensure I am executing from virtual addresses */
-	movq	$1f, %rax
-	jmp	*%rax
-1:
+	jmp	*_va_jump(%rip)
+2:
 
 	/* Check if nx is implemented */
 	movl	$0x80000001, %eax
@@ -211,11 +226,12 @@ ENTRY(secondary_startup_64)
 	 *	REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
 	 *		address given in m16:64.
 	 */
-	pushq	$.Lafter_lret	# put return address on stack for unwinder
+	leaq	.Lafter_lret(%rip), %rax
+	pushq	%rax		# put return address on stack for unwinder
 	xorq	%rbp, %rbp	# clear frame pointer
-	movq	initial_code(%rip), %rax
+	leaq	initial_code(%rip), %rax
 	pushq	$__KERNEL_CS	# set correct cs
-	pushq	%rax		# target address in negative space
+	pushq	(%rax)		# target address in negative space
 	lretq
 .Lafter_lret:
 ENDPROC(secondary_startup_64)
-- 
2.13.2.932.g7449e964c-goog


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC 13/22] x86/power/64: Adapt assembly for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (11 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 12/22] x86/boot/64: " Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-19 18:41   ` Pavel Machek
  2017-07-18 22:33 ` [RFC 14/22] x86/paravirt: " Thomas Garnier
                   ` (9 subsequent siblings)
  22 siblings, 1 reply; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Change the assembly code to use only relative references to symbols so
that the kernel is PIE compatible.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/power/hibernate_asm_64.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S
index ce8da3a0412c..6fdd7bbc3c33 100644
--- a/arch/x86/power/hibernate_asm_64.S
+++ b/arch/x86/power/hibernate_asm_64.S
@@ -24,7 +24,7 @@
 #include <asm/frame.h>
 
 ENTRY(swsusp_arch_suspend)
-	movq	$saved_context, %rax
+	leaq	saved_context(%rip), %rax
 	movq	%rsp, pt_regs_sp(%rax)
 	movq	%rbp, pt_regs_bp(%rax)
 	movq	%rsi, pt_regs_si(%rax)
@@ -115,7 +115,7 @@ ENTRY(restore_registers)
 	movq	%rax, %cr4;  # turn PGE back on
 
 	/* We don't restore %rax, it must be 0 anyway */
-	movq	$saved_context, %rax
+	leaq	saved_context(%rip), %rax
 	movq	pt_regs_sp(%rax), %rsp
 	movq	pt_regs_bp(%rax), %rbp
 	movq	pt_regs_si(%rax), %rsi
-- 
2.13.2.932.g7449e964c-goog


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC 14/22] x86/paravirt: Adapt assembly for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (12 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 13/22] x86/power/64: " Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-18 22:33 ` [RFC 15/22] x86/boot/64: Use _text in a global " Thomas Garnier
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

If PIE is enabled, switch the paravirt assembly constraints to be
compatible. The %c/i constraints generate smaller code, so they are
kept by default.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.
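
A rough illustration of the difference between the two constraint pairs
(a standalone sketch, not the paravirt code; 'op' is a made-up function
pointer):

  static void (*op)(void);

  /*
   * "i"/%c requires a compile-time constant address and emits an
   * absolute reference; it no longer builds with -fPIC.
   */
  static inline void call_op_default(void)
  {
          asm volatile("call *%c[opptr];" : : [opptr] "i" (&op));
  }

  /*
   * "p"/%a treats the operand as an address expression, which lets the
   * compiler emit a RIP-relative reference under -fPIC.
   */
  static inline void call_op_pie(void)
  {
          asm volatile("call *%a[opptr];" : : [opptr] "p" (&op));
  }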

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/include/asm/paravirt_types.h | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 9ffc36bfe4cd..6f67c10672ec 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -347,9 +347,17 @@ extern struct pv_lock_ops pv_lock_ops;
 #define PARAVIRT_PATCH(x)					\
 	(offsetof(struct paravirt_patch_template, x) / sizeof(void *))
 
+#ifdef CONFIG_X86_PIE
+#define paravirt_opptr_call "a"
+#define paravirt_opptr_type "p"
+#else
+#define paravirt_opptr_call "c"
+#define paravirt_opptr_type "i"
+#endif
+
 #define paravirt_type(op)				\
 	[paravirt_typenum] "i" (PARAVIRT_PATCH(op)),	\
-	[paravirt_opptr] "i" (&(op))
+	[paravirt_opptr] paravirt_opptr_type (&(op))
 #define paravirt_clobber(clobber)		\
 	[paravirt_clobber] "i" (clobber)
 
@@ -403,7 +411,7 @@ int paravirt_disable_iospace(void);
  * offset into the paravirt_patch_template structure, and can therefore be
  * freely converted back into a structure offset.
  */
-#define PARAVIRT_CALL	"call *%c[paravirt_opptr];"
+#define PARAVIRT_CALL	"call *%" paravirt_opptr_call "[paravirt_opptr];"
 
 /*
  * These macros are intended to wrap calls through one of the paravirt
-- 
2.13.2.932.g7449e964c-goog


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC 15/22] x86/boot/64: Use _text in a global for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (13 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 14/22] x86/paravirt: " Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-18 22:33 ` [RFC 16/22] x86/percpu: Adapt percpu " Thomas Garnier
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

By default, PIE-generated code creates only relative references, so
_text points to the temporary virtual address. Instead, use a global
variable so the relocation is done as expected.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/kernel/head64.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 46c3c73e7f43..4103e90ff128 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -45,7 +45,13 @@ static void __head *fixup_pointer(void *ptr, unsigned long physaddr)
 	return ptr - (void *)_text + (void *)physaddr;
 }
 
-void __head __startup_64(unsigned long physaddr)
+/*
+ * Use a global variable to properly calculate _text delta on PIE. By default
+ * a PIE binary do a RIP relative difference instead of the relocated address.
+ */
+unsigned long _text_offset = (unsigned long)(_text - __START_KERNEL_map);
+
+void __head notrace __startup_64(unsigned long physaddr)
 {
 	unsigned long load_delta, *p;
 	pgdval_t *pgd;
@@ -62,7 +68,7 @@ void __head __startup_64(unsigned long physaddr)
 	 * Compute the delta between the address I am compiled to run at
 	 * and the address I am actually running at.
 	 */
-	load_delta = physaddr - (unsigned long)(_text - __START_KERNEL_map);
+	load_delta = physaddr - _text_offset;
 
 	/* Is the address not 2M aligned? */
 	if (load_delta & ~PMD_PAGE_MASK)
-- 
2.13.2.932.g7449e964c-goog


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC 16/22] x86/percpu: Adapt percpu for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (14 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 15/22] x86/boot/64: Use _text in a global " Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-19  3:08   ` Brian Gerst
  2017-07-18 22:33 ` [RFC 17/22] compiler: Option to default to hidden symbols Thomas Garnier
                   ` (6 subsequent siblings)
  22 siblings, 1 reply; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Percpu uses a clever design where the .data..percpu ELF section has a
virtual address of zero and the relocation code avoids relocating
specific symbols. It makes the code simple and easily adaptable with or
without SMP support.

This design is incompatible with PIE because the generated code always
tries to access the zero virtual address relative to the default mapping
address, which becomes impossible when KASLR is configured to go below
-2G. This patch solves the problem by removing the zero mapping and
making the GS base relative to the expected address. These changes are
done only when PIE is enabled. The original implementation is kept as-is
by default.

The assembly and PER_CPU macros are changed to use relative references
when PIE is enabled.

The KALLSYMS_ABSOLUTE_PERCPU configuration is disabled with PIE because
percpu symbols are not absolute in this case.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.
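
A minimal sketch of the resulting access patterns (illustrative only;
'dummy_percpu_var' stands in for a real percpu variable):

  unsigned long dummy_percpu_var;

  /*
   * Zero-based layout: the symbol value is the offset itself, encoded
   * as an absolute address against the %gs segment base.
   */
  static inline unsigned long read_percpu_absolute(void)
  {
          unsigned long val;

          asm("movq %%gs:dummy_percpu_var, %0" : "=r" (val));
          return val;
  }

  /*
   * PIE layout: the symbol is addressed RIP-relative and MSR_GS_BASE
   * holds the delta from __per_cpu_start instead of the full address.
   */
  static inline unsigned long read_percpu_pie(void)
  {
          unsigned long val;

          asm("movq %%gs:dummy_percpu_var(%%rip), %0" : "=r" (val));
          return val;
  }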

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/entry/entry_64.S      |  4 ++--
 arch/x86/include/asm/percpu.h  | 25 +++++++++++++++++++------
 arch/x86/kernel/cpu/common.c   |  4 +++-
 arch/x86/kernel/head_64.S      |  4 ++++
 arch/x86/kernel/setup_percpu.c |  2 +-
 arch/x86/kernel/vmlinux.lds.S  | 13 +++++++++++--
 arch/x86/lib/cmpxchg16b_emu.S  |  8 ++++----
 arch/x86/xen/xen-asm.S         | 12 ++++++------
 init/Kconfig                   |  2 +-
 9 files changed, 51 insertions(+), 23 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 691c4755269b..be198c0a2a8c 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -388,7 +388,7 @@ ENTRY(__switch_to_asm)
 
 #ifdef CONFIG_CC_STACKPROTECTOR
 	movq	TASK_stack_canary(%rsi), %rbx
-	movq	%rbx, PER_CPU_VAR(irq_stack_union)+stack_canary_offset
+	movq	%rbx, PER_CPU_VAR(irq_stack_union + stack_canary_offset)
 #endif
 
 	/* restore callee-saved registers */
@@ -739,7 +739,7 @@ apicinterrupt IRQ_WORK_VECTOR			irq_work_interrupt		smp_irq_work_interrupt
 /*
  * Exception entry points.
  */
-#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
+#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss + (TSS_ist + ((x) - 1) * 8))
 
 .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
 ENTRY(\sym)
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 9fa03604b2b3..862eb771f0e5 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -4,9 +4,11 @@
 #ifdef CONFIG_X86_64
 #define __percpu_seg		gs
 #define __percpu_mov_op		movq
+#define __percpu_rel		(%rip)
 #else
 #define __percpu_seg		fs
 #define __percpu_mov_op		movl
+#define __percpu_rel
 #endif
 
 #ifdef __ASSEMBLY__
@@ -27,10 +29,14 @@
 #define PER_CPU(var, reg)						\
 	__percpu_mov_op %__percpu_seg:this_cpu_off, reg;		\
 	lea var(reg), reg
-#define PER_CPU_VAR(var)	%__percpu_seg:var
+/* Compatible with Position Independent Code */
+#define PER_CPU_VAR(var)		%__percpu_seg:(var)##__percpu_rel
+/* Rare absolute reference */
+#define PER_CPU_VAR_ABS(var)		%__percpu_seg:var
 #else /* ! SMP */
 #define PER_CPU(var, reg)	__percpu_mov_op $var, reg
-#define PER_CPU_VAR(var)	var
+#define PER_CPU_VAR(var)	(var)##__percpu_rel
+#define PER_CPU_VAR_ABS(var)	var
 #endif	/* SMP */
 
 #ifdef CONFIG_X86_64_SMP
@@ -208,27 +214,34 @@ do {									\
 	pfo_ret__;					\
 })
 
+/* Position Independent code uses relative addresses only */
+#ifdef CONFIG_X86_PIE
+#define __percpu_stable_arg __percpu_arg(a1)
+#else
+#define __percpu_stable_arg __percpu_arg(P1)
+#endif
+
 #define percpu_stable_op(op, var)			\
 ({							\
 	typeof(var) pfo_ret__;				\
 	switch (sizeof(var)) {				\
 	case 1:						\
-		asm(op "b "__percpu_arg(P1)",%0"	\
+		asm(op "b "__percpu_stable_arg ",%0"	\
 		    : "=q" (pfo_ret__)			\
 		    : "p" (&(var)));			\
 		break;					\
 	case 2:						\
-		asm(op "w "__percpu_arg(P1)",%0"	\
+		asm(op "w "__percpu_stable_arg ",%0"	\
 		    : "=r" (pfo_ret__)			\
 		    : "p" (&(var)));			\
 		break;					\
 	case 4:						\
-		asm(op "l "__percpu_arg(P1)",%0"	\
+		asm(op "l "__percpu_stable_arg ",%0"	\
 		    : "=r" (pfo_ret__)			\
 		    : "p" (&(var)));			\
 		break;					\
 	case 8:						\
-		asm(op "q "__percpu_arg(P1)",%0"	\
+		asm(op "q "__percpu_stable_arg ",%0"	\
 		    : "=r" (pfo_ret__)			\
 		    : "p" (&(var)));			\
 		break;					\
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index b95cd94ca97b..31300767ec0f 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -480,7 +480,9 @@ void load_percpu_segment(int cpu)
 	loadsegment(fs, __KERNEL_PERCPU);
 #else
 	__loadsegment_simple(gs, 0);
-	wrmsrl(MSR_GS_BASE, (unsigned long)per_cpu(irq_stack_union.gs_base, cpu));
+	wrmsrl(MSR_GS_BASE,
+	       (unsigned long)per_cpu(irq_stack_union.gs_base, cpu) -
+	       (unsigned long)__per_cpu_start);
 #endif
 	load_stack_canary_segment();
 }
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 7e4f7a83a15a..4d0a7e68bfe8 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -256,7 +256,11 @@ ENDPROC(start_cpu0)
 	GLOBAL(initial_code)
 	.quad	x86_64_start_kernel
 	GLOBAL(initial_gs)
+#ifdef CONFIG_X86_PIE
+	.quad	0
+#else
 	.quad	INIT_PER_CPU_VAR(irq_stack_union)
+#endif
 	GLOBAL(initial_stack)
 	/*
 	 * The SIZEOF_PTREGS gap is a convention which helps the in-kernel
diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
index 10edd1e69a68..ce1c58a29def 100644
--- a/arch/x86/kernel/setup_percpu.c
+++ b/arch/x86/kernel/setup_percpu.c
@@ -25,7 +25,7 @@
 DEFINE_PER_CPU_READ_MOSTLY(int, cpu_number);
 EXPORT_PER_CPU_SYMBOL(cpu_number);
 
-#ifdef CONFIG_X86_64
+#if defined(CONFIG_X86_64) && !defined(CONFIG_X86_PIE)
 #define BOOT_PERCPU_OFFSET ((unsigned long)__per_cpu_load)
 #else
 #define BOOT_PERCPU_OFFSET 0
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index c8a3b61be0aa..77f1b0622539 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -183,9 +183,14 @@ SECTIONS
 	/*
 	 * percpu offsets are zero-based on SMP.  PERCPU_VADDR() changes the
 	 * output PHDR, so the next output section - .init.text - should
-	 * start another segment - init.
+	 * start another segment - init. For Position Independent Code, the
+	 * per-cpu section cannot be zero-based because everything is relative.
 	 */
+#ifdef CONFIG_X86_PIE
+	PERCPU_SECTION(INTERNODE_CACHE_BYTES)
+#else
 	PERCPU_VADDR(INTERNODE_CACHE_BYTES, 0, :percpu)
+#endif
 	ASSERT(SIZEOF(.data..percpu) < CONFIG_PHYSICAL_START,
 	       "per-CPU data too large - increase CONFIG_PHYSICAL_START")
 #endif
@@ -361,7 +366,11 @@ SECTIONS
  * Per-cpu symbols which need to be offset from __per_cpu_load
  * for the boot processor.
  */
+#ifdef CONFIG_X86_PIE
+#define INIT_PER_CPU(x) init_per_cpu__##x = x
+#else
 #define INIT_PER_CPU(x) init_per_cpu__##x = x + __per_cpu_load
+#endif
 INIT_PER_CPU(gdt_page);
 INIT_PER_CPU(irq_stack_union);
 
@@ -371,7 +380,7 @@ INIT_PER_CPU(irq_stack_union);
 . = ASSERT((_end - _text <= KERNEL_IMAGE_SIZE),
 	   "kernel image bigger than KERNEL_IMAGE_SIZE");
 
-#ifdef CONFIG_SMP
+#if defined(CONFIG_SMP) && !defined(CONFIG_X86_PIE)
 . = ASSERT((irq_stack_union == 0),
            "irq_stack_union is not at start of per-cpu area");
 #endif
diff --git a/arch/x86/lib/cmpxchg16b_emu.S b/arch/x86/lib/cmpxchg16b_emu.S
index 9b330242e740..254950604ae4 100644
--- a/arch/x86/lib/cmpxchg16b_emu.S
+++ b/arch/x86/lib/cmpxchg16b_emu.S
@@ -33,13 +33,13 @@ ENTRY(this_cpu_cmpxchg16b_emu)
 	pushfq
 	cli
 
-	cmpq PER_CPU_VAR((%rsi)), %rax
+	cmpq PER_CPU_VAR_ABS((%rsi)), %rax
 	jne .Lnot_same
-	cmpq PER_CPU_VAR(8(%rsi)), %rdx
+	cmpq PER_CPU_VAR_ABS(8(%rsi)), %rdx
 	jne .Lnot_same
 
-	movq %rbx, PER_CPU_VAR((%rsi))
-	movq %rcx, PER_CPU_VAR(8(%rsi))
+	movq %rbx, PER_CPU_VAR_ABS((%rsi))
+	movq %rcx, PER_CPU_VAR_ABS(8(%rsi))
 
 	popfq
 	mov $1, %al
diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S
index eff224df813f..40410969fd3c 100644
--- a/arch/x86/xen/xen-asm.S
+++ b/arch/x86/xen/xen-asm.S
@@ -26,7 +26,7 @@
 ENTRY(xen_irq_enable_direct)
 	FRAME_BEGIN
 	/* Unmask events */
-	movb $0, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+	movb $0, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
 
 	/*
 	 * Preempt here doesn't matter because that will deal with any
@@ -35,7 +35,7 @@ ENTRY(xen_irq_enable_direct)
 	 */
 
 	/* Test for pending */
-	testb $0xff, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_pending
+	testb $0xff, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_pending)
 	jz 1f
 
 2:	call check_events
@@ -52,7 +52,7 @@ ENDPATCH(xen_irq_enable_direct)
  * non-zero.
  */
 ENTRY(xen_irq_disable_direct)
-	movb $1, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+	movb $1, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
 ENDPATCH(xen_irq_disable_direct)
 	ret
 	ENDPROC(xen_irq_disable_direct)
@@ -68,7 +68,7 @@ ENDPATCH(xen_irq_disable_direct)
  * x86 use opposite senses (mask vs enable).
  */
 ENTRY(xen_save_fl_direct)
-	testb $0xff, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+	testb $0xff, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
 	setz %ah
 	addb %ah, %ah
 ENDPATCH(xen_save_fl_direct)
@@ -91,7 +91,7 @@ ENTRY(xen_restore_fl_direct)
 #else
 	testb $X86_EFLAGS_IF>>8, %ah
 #endif
-	setz PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+	setz PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
 	/*
 	 * Preempt here doesn't matter because that will deal with any
 	 * pending interrupts.  The pending check may end up being run
@@ -99,7 +99,7 @@ ENTRY(xen_restore_fl_direct)
 	 */
 
 	/* check for unmasked and pending */
-	cmpw $0x0001, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_pending
+	cmpw $0x0001, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_pending)
 	jnz 1f
 2:	call check_events
 1:
diff --git a/init/Kconfig b/init/Kconfig
index 8514b25db21c..4fb5d6fc2c4f 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1201,7 +1201,7 @@ config KALLSYMS_ALL
 config KALLSYMS_ABSOLUTE_PERCPU
 	bool
 	depends on KALLSYMS
-	default X86_64 && SMP
+	default X86_64 && SMP && !X86_PIE
 
 config KALLSYMS_BASE_RELATIVE
 	bool
-- 
2.13.2.932.g7449e964c-goog


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC 17/22] compiler: Option to default to hidden symbols
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (15 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 16/22] x86/percpu: Adapt percpu " Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-18 22:33 ` [RFC 18/22] x86/relocs: Handle DYN relocations for PIE support Thomas Garnier
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Provide an option to default symbol visibility to hidden, except for
key symbols. This option is disabled by default and will be used by
x86_64 PIE support to remove errors between compilation units.
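
A short illustration of the mechanism (sketch only; the symbol names
are made up):

  #pragma GCC visibility push(hidden)

  /* Hidden by default: references can be direct and PC-relative. */
  extern int normal_symbol;

  /* Key symbols opt back in to default visibility. */
  extern int key_symbol __attribute__((visibility("default")));

  #pragma GCC visibility pop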

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/boot/boot.h           |  2 +-
 arch/x86/include/asm/setup.h   |  2 +-
 include/asm-generic/sections.h |  6 ++++++
 include/linux/compiler.h       |  8 ++++++++
 init/Kconfig                   |  7 +++++++
 kernel/kallsyms.c              | 16 ++++++++--------
 6 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
index ef5a9cc66fb8..d726c35bdd96 100644
--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -193,7 +193,7 @@ static inline bool memcmp_gs(const void *s1, addr_t s2, size_t len)
 }
 
 /* Heap -- available for dynamic lists. */
-extern char _end[];
+extern char _end[] __default_visibility;
 extern char *HEAP;
 extern char *heap_end;
 #define RESET_HEAP() ((void *)( HEAP = _end ))
diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index e4585a393965..f3ffad82bdc0 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -66,7 +66,7 @@ static inline void x86_ce4100_early_setup(void) { }
  * This is set up by the setup-routine at boot-time
  */
 extern struct boot_params boot_params;
-extern char _text[];
+extern char _text[] __default_visibility;
 
 static inline bool kaslr_enabled(void)
 {
diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
index 532372c6cf15..27c12f6dd6e2 100644
--- a/include/asm-generic/sections.h
+++ b/include/asm-generic/sections.h
@@ -28,6 +28,9 @@
  *	__entry_text_start, __entry_text_end
  *	__ctors_start, __ctors_end
  */
+#ifdef CONFIG_DEFAULT_HIDDEN
+#pragma GCC visibility push(default)
+#endif
 extern char _text[], _stext[], _etext[];
 extern char _data[], _sdata[], _edata[];
 extern char __bss_start[], __bss_stop[];
@@ -42,6 +45,9 @@ extern char __start_rodata[], __end_rodata[];
 
 /* Start and end of .ctors section - used for constructor calls. */
 extern char __ctors_start[], __ctors_end[];
+#ifdef CONFIG_DEFAULT_HIDDEN
+#pragma GCC visibility pop
+#endif
 
 extern __visible const void __nosave_begin, __nosave_end;
 
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index eca8ad75e28b..876b827fe4a7 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -78,6 +78,14 @@ extern void __chk_io_ptr(const volatile void __iomem *);
 #include <linux/compiler-clang.h>
 #endif
 
+/* Useful for Position Independent Code to reduce global references */
+#ifdef CONFIG_DEFAULT_HIDDEN
+#pragma GCC visibility push(hidden)
+#define __default_visibility  __attribute__((visibility ("default")))
+#else
+#define __default_visibility
+#endif
+
 /*
  * Generic compiler-dependent macros required for kernel
  * build go below this comment. Actual compiler/compiler version
diff --git a/init/Kconfig b/init/Kconfig
index 4fb5d6fc2c4f..a93626d40355 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1635,6 +1635,13 @@ config PROFILING
 config TRACEPOINTS
 	bool
 
+#
+# Default to hidden visibility for all symbols.
+# Useful for Position Independent Code to reduce global references.
+#
+config DEFAULT_HIDDEN
+	bool
+
 source "arch/Kconfig"
 
 endmenu		# General setup
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 127e7cfafa55..252019c8c3a9 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -32,24 +32,24 @@
  * These will be re-linked against their real values
  * during the second link stage.
  */
-extern const unsigned long kallsyms_addresses[] __weak;
-extern const int kallsyms_offsets[] __weak;
-extern const u8 kallsyms_names[] __weak;
+extern const unsigned long kallsyms_addresses[] __weak __default_visibility;
+extern const int kallsyms_offsets[] __weak __default_visibility;
+extern const u8 kallsyms_names[] __weak __default_visibility;
 
 /*
  * Tell the compiler that the count isn't in the small data section if the arch
  * has one (eg: FRV).
  */
 extern const unsigned long kallsyms_num_syms
-__attribute__((weak, section(".rodata")));
+__attribute__((weak, section(".rodata"))) __default_visibility;
 
 extern const unsigned long kallsyms_relative_base
-__attribute__((weak, section(".rodata")));
+__attribute__((weak, section(".rodata"))) __default_visibility;
 
-extern const u8 kallsyms_token_table[] __weak;
-extern const u16 kallsyms_token_index[] __weak;
+extern const u8 kallsyms_token_table[] __weak __default_visibility;
+extern const u16 kallsyms_token_index[] __weak __default_visibility;
 
-extern const unsigned long kallsyms_markers[] __weak;
+extern const unsigned long kallsyms_markers[] __weak __default_visibility;
 
 static inline int is_kernel_inittext(unsigned long addr)
 {
-- 
2.13.2.932.g7449e964c-goog


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC 18/22] x86/relocs: Handle DYN relocations for PIE support
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (16 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 17/22] compiler: Option to default to hidden symbols Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-18 22:33 ` [RFC 19/22] x86/pie: Add option to build the kernel as PIE for x86_64 Thomas Garnier
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Change the relocation tool to correctly handle a DYN/PIE kernel, where
the relocation table does not reference symbols and percpu support is
not needed. Also add support for R_X86_64_RELATIVE relocations, which
can be handled like 64-bit relocations thanks to the use of -Bsymbolic.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.
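
Conceptually (a sketch, not the in-kernel relocation code), the 64-bit
path applies KASLR by adding the displacement from the link-time base at
each recorded location, which is why, as noted above, a
R_X86_64_RELATIVE entry under -Bsymbolic can reuse the existing relocs64
handling:

  /* Hedged sketch: assumes loc already holds the link-time value. */
  static void apply_64bit_reloc(unsigned long *loc, unsigned long delta)
  {
          *loc += delta;
  }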

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/tools/relocs.c | 74 +++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 65 insertions(+), 9 deletions(-)

diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index 73eb7fd4aec4..70f523dd68ff 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -642,6 +642,13 @@ static void add_reloc(struct relocs *r, uint32_t offset)
 	r->offset[r->count++] = offset;
 }
 
+/* Relocation found in a DYN binary, support only for 64-bit PIE */
+static int is_dyn_reloc(struct section *sec)
+{
+	return ELF_BITS == 64 && ehdr.e_type == ET_DYN &&
+		sec->shdr.sh_info == SHT_NULL;
+}
+
 static void walk_relocs(int (*process)(struct section *sec, Elf_Rel *rel,
 			Elf_Sym *sym, const char *symname))
 {
@@ -652,6 +659,7 @@ static void walk_relocs(int (*process)(struct section *sec, Elf_Rel *rel,
 		Elf_Sym *sh_symtab;
 		struct section *sec_applies, *sec_symtab;
 		int j;
+		int dyn_reloc = 0;
 		struct section *sec = &secs[i];
 
 		if (sec->shdr.sh_type != SHT_REL_TYPE) {
@@ -660,14 +668,20 @@ static void walk_relocs(int (*process)(struct section *sec, Elf_Rel *rel,
 		sec_symtab  = sec->link;
 		sec_applies = &secs[sec->shdr.sh_info];
 		if (!(sec_applies->shdr.sh_flags & SHF_ALLOC)) {
-			continue;
+			if (!is_dyn_reloc(sec_applies))
+				continue;
+			dyn_reloc = 1;
 		}
 		sh_symtab = sec_symtab->symtab;
 		sym_strtab = sec_symtab->link->strtab;
 		for (j = 0; j < sec->shdr.sh_size/sizeof(Elf_Rel); j++) {
 			Elf_Rel *rel = &sec->reltab[j];
-			Elf_Sym *sym = &sh_symtab[ELF_R_SYM(rel->r_info)];
-			const char *symname = sym_name(sym_strtab, sym);
+			Elf_Sym *sym = NULL;
+			const char *symname = NULL;
+			if (!dyn_reloc) {
+				sym = &sh_symtab[ELF_R_SYM(rel->r_info)];
+				symname = sym_name(sym_strtab, sym);
+			}
 
 			process(sec, rel, sym, symname);
 		}
@@ -746,16 +760,21 @@ static int is_percpu_sym(ElfW(Sym) *sym, const char *symname)
 		strncmp(symname, "init_per_cpu_", 13);
 }
 
-
 static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
 		      const char *symname)
 {
 	unsigned r_type = ELF64_R_TYPE(rel->r_info);
 	ElfW(Addr) offset = rel->r_offset;
-	int shn_abs = (sym->st_shndx == SHN_ABS) && !is_reloc(S_REL, symname);
+	int shn_abs = 0;
+	int dyn_reloc = is_dyn_reloc(sec);
 
-	if (sym->st_shndx == SHN_UNDEF)
-		return 0;
+	if (!dyn_reloc) {
+		shn_abs = (sym->st_shndx == SHN_ABS) &&
+			!is_reloc(S_REL, symname);
+
+		if (sym->st_shndx == SHN_UNDEF)
+			return 0;
+	}
 
 	/*
 	 * Adjust the offset if this reloc applies to the percpu section.
@@ -769,6 +788,9 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
 		break;
 
 	case R_X86_64_PC32:
+		if (dyn_reloc)
+			die("PC32 reloc in PIE DYN binary");
+
 		/*
 		 * PC relative relocations don't need to be adjusted unless
 		 * referencing a percpu symbol.
@@ -783,7 +805,7 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
 		/*
 		 * References to the percpu area don't need to be adjusted.
 		 */
-		if (is_percpu_sym(sym, symname))
+		if (!dyn_reloc && is_percpu_sym(sym, symname))
 			break;
 
 		if (shn_abs) {
@@ -814,6 +836,14 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
 			add_reloc(&relocs32, offset);
 		break;
 
+	case R_X86_64_RELATIVE:
+		/*
+		 * -Bsymbolic means we don't need the addend and we can reuse
+		 * the original relocs64.
+		 */
+		add_reloc(&relocs64, offset);
+		break;
+
 	default:
 		die("Unsupported relocation type: %s (%d)\n",
 		    rel_type(r_type), r_type);
@@ -1044,6 +1074,21 @@ static void emit_relocs(int as_text, int use_real_mode)
 	}
 }
 
+/* Print a different header based on the type of relocation */
+static void print_reloc_header(struct section *sec) {
+	static int header_printed = 0;
+	int header_type = is_dyn_reloc(sec) ? 2 : 1;
+
+	if (header_printed == header_type)
+		return;
+	header_printed = header_type;
+
+	if (header_type == 2)
+		printf("reloc type\toffset\tvalue\n");
+	else
+		printf("reloc section\treloc type\tsymbol\tsymbol section\n");
+}
+
 /*
  * As an aid to debugging problems with different linkers
  * print summary information about the relocs.
@@ -1053,6 +1098,18 @@ static void emit_relocs(int as_text, int use_real_mode)
 static int do_reloc_info(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
 				const char *symname)
 {
+
+	print_reloc_header(sec);
+
+#if ELF_BITS == 64
+	if (is_dyn_reloc(sec)) {
+		printf("%s\t0x%lx\t0x%lx\n",
+		       rel_type(ELF_R_TYPE(rel->r_info)),
+		       rel->r_offset,
+		       rel->r_addend);
+		return 0;
+	}
+#endif
 	printf("%s\t%s\t%s\t%s\n",
 		sec_name(sec->shdr.sh_info),
 		rel_type(ELF_R_TYPE(rel->r_info)),
@@ -1063,7 +1120,6 @@ static int do_reloc_info(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
 
 static void print_reloc_info(void)
 {
-	printf("reloc section\treloc type\tsymbol\tsymbol section\n");
 	walk_relocs(do_reloc_info);
 }
 
-- 
2.13.2.932.g7449e964c-goog


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC 19/22] x86/pie: Add option to build the kernel as PIE for x86_64
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (17 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 18/22] x86/relocs: Handle DYN relocations for PIE support Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-18 22:33 ` [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations Thomas Garnier
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Add the CONFIG_X86_PIE option which builds the kernel as a Position
Independent Executable (PIE). The kernel is currently built with the
mcmodel=kernel option which forces it to stay in the top 2G of the
virtual address space. With PIE, the kernel will be able to move below
the -2G limit, increasing the KASLR range from 1GB to 3GB.

The modules do not support PIE due to how they are linked. Disable PIE
for them and default to mcmodel=kernel for now.

The PIE configuration is not yet compatible with XEN_PVH. Xen PVH
generates 32-bit assembly and uses a long jump to transition to 64-bit.
A long jump requires an absolute reference that is not compatible with
PIE.

Performance/Size impact:

Hackbench (50% and 1600% loads):
 - PIE disabled: no significant change (-0.50% / +0.50%)
 - PIE enabled: 7% to 8% on half load, 10% on heavy load.

These results are aligned with the different research on user-mode PIE
impact on cpu intensive benchmarks (around 10% on x86_64).

slab_test (average of 10 runs):
 - PIE disabled: no significant change (-1% / +1%)
 - PIE enabled: 3% to 4%

Kernbench (average of 10 Half and Optimal runs):
 Elapsed Time:
 - PIE disabled: no significant change (-0.22% / +0.06%)
 - PIE enabled: around 0.50%
 System Time:
 - PIE disabled: no significant change (-0.99% / -1.28%)
 - PIE enabled: 5% to 6%

Size of vmlinux (Ubuntu configuration):
 File size:
 - PIE disabled: 472928672 bytes (-0.000169% from baseline)
 - PIE enabled: 216878461 bytes (-54.14% from baseline)
 .text sections:
 - PIE disabled: 9373572 bytes (+0.04% from baseline)
 - PIE enabled: 9499138 bytes (+1.38% from baseline)

The big decrease in vmlinux file size is due to the lower number of
relocations appended to the file.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/Kconfig  | 6 ++++++
 arch/x86/Makefile | 9 +++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 781521b7cf9e..b26ee6751021 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2080,6 +2080,12 @@ config RANDOMIZE_MEMORY_PHYSICAL_PADDING
 
 	   If unsure, leave at the default value.
 
+config X86_PIE
+	bool
+	depends on X86_64 && !XEN_PVH
+	select DEFAULT_HIDDEN
+	select MODULE_REL_CRCS if MODVERSIONS
+
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
 	depends on SMP
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 1e902f926be3..452a9621af8f 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -45,8 +45,12 @@ export REALMODE_CFLAGS
 export BITS
 
 ifdef CONFIG_X86_NEED_RELOCS
+ifdef CONFIG_X86_PIE
+        LDFLAGS_vmlinux := -pie -shared -Bsymbolic
+else
         LDFLAGS_vmlinux := --emit-relocs
 endif
+endif
 
 #
 # Prevent GCC from generating any FP code by mistake.
@@ -132,7 +136,12 @@ else
         KBUILD_CFLAGS += $(cflags-y)
 
         KBUILD_CFLAGS += -mno-red-zone
+ifdef CONFIG_X86_PIE
+        KBUILD_CFLAGS += -fPIC
+        KBUILD_CFLAGS_MODULE += -fno-PIC -mcmodel=kernel
+else
         KBUILD_CFLAGS += -mcmodel=kernel
+endif
 
         # -funit-at-a-time shrinks the kernel .text considerably
         # unfortunately it makes reading oopses harder.
-- 
2.13.2.932.g7449e964c-goog


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (18 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 19/22] x86/pie: Add option to build the kernel as PIE for x86_64 Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-19 22:33   ` H. Peter Anvin
  2017-07-18 22:33 ` [RFC 21/22] x86/module: Add support for mcmodel large and PLTs Thomas Garnier
                   ` (2 subsequent siblings)
  22 siblings, 1 reply; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

The x86 relocation tool generates a list of 32-bit signed integers.
There was no need for 64-bit integers because all addresses were above
the 2G top of the memory.

This change adds a large-reloc option to generate 64-bit unsigned
integers instead. It can be used when the kernel plans to go below the
top 2G and 32-bit integers are no longer enough.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/tools/relocs.c        | 60 +++++++++++++++++++++++++++++++++---------
 arch/x86/tools/relocs.h        |  4 +--
 arch/x86/tools/relocs_common.c | 15 +++++++----
 3 files changed, 60 insertions(+), 19 deletions(-)

diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index 70f523dd68ff..19b3e6c594b1 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -12,8 +12,14 @@
 
 static Elf_Ehdr ehdr;
 
+#if ELF_BITS == 64
+typedef uint64_t rel_off_t;
+#else
+typedef uint32_t rel_off_t;
+#endif
+
 struct relocs {
-	uint32_t	*offset;
+	rel_off_t	*offset;
 	unsigned long	count;
 	unsigned long	size;
 };
@@ -627,7 +633,7 @@ static void print_absolute_relocs(void)
 		printf("\n");
 }
 
-static void add_reloc(struct relocs *r, uint32_t offset)
+static void add_reloc(struct relocs *r, rel_off_t offset)
 {
 	if (r->count == r->size) {
 		unsigned long newsize = r->size + 50000;
@@ -983,26 +989,48 @@ static void sort_relocs(struct relocs *r)
 	qsort(r->offset, r->count, sizeof(r->offset[0]), cmp_relocs);
 }
 
-static int write32(uint32_t v, FILE *f)
+static int write32(rel_off_t rel, FILE *f)
 {
-	unsigned char buf[4];
+	unsigned char buf[sizeof(uint32_t)];
+	uint32_t v = (uint32_t)rel;
 
 	put_unaligned_le32(v, buf);
-	return fwrite(buf, 1, 4, f) == 4 ? 0 : -1;
+	return fwrite(buf, 1, sizeof(buf), f) == sizeof(buf) ? 0 : -1;
 }
 
-static int write32_as_text(uint32_t v, FILE *f)
+static int write32_as_text(rel_off_t rel, FILE *f)
 {
+	uint32_t v = (uint32_t)rel;
 	return fprintf(f, "\t.long 0x%08"PRIx32"\n", v) > 0 ? 0 : -1;
 }
 
-static void emit_relocs(int as_text, int use_real_mode)
+static int write64(rel_off_t rel, FILE *f)
+{
+	unsigned char buf[sizeof(uint64_t)];
+	uint64_t v = (uint64_t)rel;
+
+	put_unaligned_le64(v, buf);
+	return fwrite(buf, 1, sizeof(buf), f) == sizeof(buf) ? 0 : -1;
+}
+
+static int write64_as_text(rel_off_t rel, FILE *f)
+{
+	uint64_t v = (uint64_t)rel;
+	return fprintf(f, "\t.quad 0x%016"PRIx64"\n", v) > 0 ? 0 : -1;
+}
+
+static void emit_relocs(int as_text, int use_real_mode, int use_large_reloc)
 {
 	int i;
-	int (*write_reloc)(uint32_t, FILE *) = write32;
+	int (*write_reloc)(rel_off_t, FILE *);
 	int (*do_reloc)(struct section *sec, Elf_Rel *rel, Elf_Sym *sym,
 			const char *symname);
 
+	if (use_large_reloc)
+		write_reloc = write64;
+	else
+		write_reloc = write32;
+
 #if ELF_BITS == 64
 	if (!use_real_mode)
 		do_reloc = do_reloc64;
@@ -1013,6 +1041,9 @@ static void emit_relocs(int as_text, int use_real_mode)
 		do_reloc = do_reloc32;
 	else
 		do_reloc = do_reloc_real;
+
+	/* Large relocations only for 64-bit */
+	use_large_reloc = 0;
 #endif
 
 	/* Collect up the relocations */
@@ -1036,8 +1067,13 @@ static void emit_relocs(int as_text, int use_real_mode)
 		 * gas will like.
 		 */
 		printf(".section \".data.reloc\",\"a\"\n");
-		printf(".balign 4\n");
-		write_reloc = write32_as_text;
+		if (use_large_reloc) {
+			printf(".balign 8\n");
+			write_reloc = write64_as_text;
+		} else {
+			printf(".balign 4\n");
+			write_reloc = write32_as_text;
+		}
 	}
 
 	if (use_real_mode) {
@@ -1131,7 +1167,7 @@ static void print_reloc_info(void)
 
 void process(FILE *fp, int use_real_mode, int as_text,
 	     int show_absolute_syms, int show_absolute_relocs,
-	     int show_reloc_info)
+	     int show_reloc_info, int use_large_reloc)
 {
 	regex_init(use_real_mode);
 	read_ehdr(fp);
@@ -1153,5 +1189,5 @@ void process(FILE *fp, int use_real_mode, int as_text,
 		print_reloc_info();
 		return;
 	}
-	emit_relocs(as_text, use_real_mode);
+	emit_relocs(as_text, use_real_mode, use_large_reloc);
 }
diff --git a/arch/x86/tools/relocs.h b/arch/x86/tools/relocs.h
index 1d23bf953a4a..cb771cc4412d 100644
--- a/arch/x86/tools/relocs.h
+++ b/arch/x86/tools/relocs.h
@@ -30,8 +30,8 @@ enum symtype {
 
 void process_32(FILE *fp, int use_real_mode, int as_text,
 		int show_absolute_syms, int show_absolute_relocs,
-		int show_reloc_info);
+		int show_reloc_info, int use_large_reloc);
 void process_64(FILE *fp, int use_real_mode, int as_text,
 		int show_absolute_syms, int show_absolute_relocs,
-		int show_reloc_info);
+		int show_reloc_info, int use_large_reloc);
 #endif /* RELOCS_H */
diff --git a/arch/x86/tools/relocs_common.c b/arch/x86/tools/relocs_common.c
index acab636bcb34..9cf1391af50a 100644
--- a/arch/x86/tools/relocs_common.c
+++ b/arch/x86/tools/relocs_common.c
@@ -11,14 +11,14 @@ void die(char *fmt, ...)
 
 static void usage(void)
 {
-	die("relocs [--abs-syms|--abs-relocs|--reloc-info|--text|--realmode]" \
-	    " vmlinux\n");
+	die("relocs [--abs-syms|--abs-relocs|--reloc-info|--text|--realmode|" \
+	    "--large-reloc]  vmlinux\n");
 }
 
 int main(int argc, char **argv)
 {
 	int show_absolute_syms, show_absolute_relocs, show_reloc_info;
-	int as_text, use_real_mode;
+	int as_text, use_real_mode, use_large_reloc;
 	const char *fname;
 	FILE *fp;
 	int i;
@@ -29,6 +29,7 @@ int main(int argc, char **argv)
 	show_reloc_info = 0;
 	as_text = 0;
 	use_real_mode = 0;
+	use_large_reloc = 0;
 	fname = NULL;
 	for (i = 1; i < argc; i++) {
 		char *arg = argv[i];
@@ -53,6 +54,10 @@ int main(int argc, char **argv)
 				use_real_mode = 1;
 				continue;
 			}
+			if (strcmp(arg, "--large-reloc") == 0) {
+				use_large_reloc = 1;
+				continue;
+			}
 		}
 		else if (!fname) {
 			fname = arg;
@@ -74,11 +79,11 @@ int main(int argc, char **argv)
 	if (e_ident[EI_CLASS] == ELFCLASS64)
 		process_64(fp, use_real_mode, as_text,
 			   show_absolute_syms, show_absolute_relocs,
-			   show_reloc_info);
+			   show_reloc_info, use_large_reloc);
 	else
 		process_32(fp, use_real_mode, as_text,
 			   show_absolute_syms, show_absolute_relocs,
-			   show_reloc_info);
+			   show_reloc_info, use_large_reloc);
 	fclose(fp);
 	return 0;
 }
-- 
2.13.2.932.g7449e964c-goog


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC 21/22] x86/module: Add support for mcmodel large and PLTs
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (19 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-19  1:35   ` H. Peter Anvin
  2017-07-18 22:33 ` [RFC 22/22] x86/kaslr: Add option to extend KASLR range from 1GB to 3GB Thomas Garnier
  2017-07-19 14:08 ` x86: PIE support and option to extend KASLR randomization Christopher Lameter
  22 siblings, 1 reply; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

With PIE support and the extended KASLR range, modules may be located further
away from the kernel than before, breaking mcmodel=kernel expectations.

Add an option to build modules with mcmodel=large. The generated module code
makes no assumptions about its placement in memory.

Despite this option, modules still expect kernel functions to be within 2G and
generate relative calls. To solve this issue, the arm64 PLT code was adapted
for x86_64. When a relative relocation goes outside its range, a dynamic PLT
entry is used to correctly jump to the destination.
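
To illustrate the approach, here is a minimal sketch (not taken from the
patch; type and function names are hypothetical) of the overflow test that
decides when a relative reference must be routed through a PLT entry, and
the shape of such an entry:

  #include <linux/types.h>

  /*
   * One dynamically generated PLT slot: an absolute target followed by an
   * indirect jump that loads it, i.e. "jmp QWORD PTR [rip-14]".
   */
  struct x86_plt_entry_sketch {
	u64 target;
	u8  jmp[6];
  };

  /*
   * True if 'dest' cannot be reached from 'loc' with a signed 32-bit
   * displacement, in which case the call is redirected through the PLT.
   */
  static bool rel32_out_of_range(void *loc, u64 dest)
  {
	s64 disp = dest - (u64)loc;

	return disp != (s64)(s32)disp;
  }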

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/Kconfig              |  10 +++
 arch/x86/Makefile             |  10 ++-
 arch/x86/include/asm/module.h |  16 ++++
 arch/x86/kernel/Makefile      |   2 +
 arch/x86/kernel/module-plts.c | 198 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/module.c      |  18 ++--
 arch/x86/kernel/module.lds    |   4 +
 7 files changed, 251 insertions(+), 7 deletions(-)
 create mode 100644 arch/x86/kernel/module-plts.c
 create mode 100644 arch/x86/kernel/module.lds

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b26ee6751021..60d161391d5a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2086,6 +2086,16 @@ config X86_PIE
 	select DEFAULT_HIDDEN
 	select MODULE_REL_CRCS if MODVERSIONS
 
+config X86_MODULE_MODEL_LARGE
+	bool
+	depends on X86_64 && X86_PIE
+
+config X86_MODULE_PLTS
+	bool
+	depends on X86_64
+	select X86_MODULE_MODEL_LARGE
+	select HAVE_MOD_ARCH_SPECIFIC
+
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
 	depends on SMP
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 452a9621af8f..72a90da0149a 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -138,10 +138,18 @@ else
         KBUILD_CFLAGS += -mno-red-zone
 ifdef CONFIG_X86_PIE
         KBUILD_CFLAGS += -fPIC
-        KBUILD_CFLAGS_MODULE += -fno-PIC -mcmodel=kernel
+        KBUILD_CFLAGS_MODULE += -fno-PIC
 else
         KBUILD_CFLAGS += -mcmodel=kernel
 endif
+ifdef CONFIG_X86_MODULE_MODEL_LARGE
+        KBUILD_CFLAGS_MODULE += -mcmodel=large
+else
+        KBUILD_CFLAGS_MODULE += -mcmodel=kernel
+endif
+ifdef CONFIG_X86_MODULE_PLTS
+        KBUILD_LDFLAGS_MODULE += -T $(srctree)/arch/x86/kernel/module.lds
+endif
 
         # -funit-at-a-time shrinks the kernel .text considerably
         # unfortunately it makes reading oopses harder.
diff --git a/arch/x86/include/asm/module.h b/arch/x86/include/asm/module.h
index e3b7819caeef..d054c37656ea 100644
--- a/arch/x86/include/asm/module.h
+++ b/arch/x86/include/asm/module.h
@@ -61,4 +61,20 @@
 # define MODULE_ARCH_VERMAGIC MODULE_PROC_FAMILY
 #endif
 
+#ifdef CONFIG_X86_MODULE_PLTS
+struct mod_plt_sec {
+	struct elf64_shdr	*plt;
+	int			plt_num_entries;
+	int			plt_max_entries;
+};
+
+struct mod_arch_specific {
+	struct mod_plt_sec	core;
+	struct mod_plt_sec	init;
+};
+#endif
+
+u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela *rela,
+			  Elf64_Sym *sym);
+
 #endif /* _ASM_X86_MODULE_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index a01892bdd61a..e294aefb747c 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -142,4 +142,6 @@ ifeq ($(CONFIG_X86_64),y)
 
 	obj-$(CONFIG_PCI_MMCONFIG)	+= mmconf-fam10h_64.o
 	obj-y				+= vsmp_64.o
+
+	obj-$(CONFIG_X86_MODULE_PLTS)	+= module-plts.o
 endif
diff --git a/arch/x86/kernel/module-plts.c b/arch/x86/kernel/module-plts.c
new file mode 100644
index 000000000000..bbf11771f424
--- /dev/null
+++ b/arch/x86/kernel/module-plts.c
@@ -0,0 +1,198 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Generate PLT entries for out-of-range PC-relative relocations. This is
+ * required when a module can be mapped more than 2G away from the kernel.
+ *
+ * Based on arm64 module-plts implementation.
+ */
+
+#include <linux/elf.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/sort.h>
+
+/* jmp    QWORD PTR [rip+0xfffffffffffffff2] */
+const u8 jmp_target[] = { 0xFF, 0x25, 0xF2, 0xFF, 0xFF, 0xFF };
+
+struct plt_entry {
+	u64 target;			/* Hold the target address */
+	u8 jmp[sizeof(jmp_target)];	/* jmp opcode to target */
+};
+
+static bool in_init(const struct module *mod, void *loc)
+{
+	return (u64)loc - (u64)mod->init_layout.base < mod->init_layout.size;
+}
+
+u64 module_emit_plt_entry(struct module *mod, void *loc, const Elf64_Rela *rela,
+			  Elf64_Sym *sym)
+{
+	struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :
+							  &mod->arch.init;
+	struct plt_entry *plt = (struct plt_entry *)pltsec->plt->sh_addr;
+	int i = pltsec->plt_num_entries;
+	u64 ret;
+
+	/*
+	 * <target address>
+	 * jmp    QWORD PTR [rip+0xfffffffffffffff2] # Target address
+	 */
+	plt[i].target = sym->st_value;
+	memcpy(plt[i].jmp, jmp_target, sizeof(jmp_target));
+
+	/*
+	 * Check if the entry we just created is a duplicate. Given that the
+	 * relocations are sorted, a duplicate (if one exists) will be the
+	 * last entry we allocated.
+	 */
+	if (i > 0 && plt[i].target == plt[i - 1].target) {
+		ret = (u64)&plt[i - 1].jmp;
+	} else {
+		pltsec->plt_num_entries++;
+		BUG_ON(pltsec->plt_num_entries > pltsec->plt_max_entries);
+		ret = (u64)&plt[i].jmp;
+	}
+
+	return ret + rela->r_addend;
+}
+
+#define cmp_3way(a,b)	((a) < (b) ? -1 : (a) > (b))
+
+static int cmp_rela(const void *a, const void *b)
+{
+	const Elf64_Rela *x = a, *y = b;
+	int i;
+
+	/* sort by type, symbol index and addend */
+	i = cmp_3way(ELF64_R_TYPE(x->r_info), ELF64_R_TYPE(y->r_info));
+	if (i == 0)
+		i = cmp_3way(ELF64_R_SYM(x->r_info), ELF64_R_SYM(y->r_info));
+	if (i == 0)
+		i = cmp_3way(x->r_addend, y->r_addend);
+	return i;
+}
+
+static bool duplicate_rel(const Elf64_Rela *rela, int num)
+{
+	/*
+	 * Entries are sorted by type, symbol index and addend. That means
+	 * that, if a duplicate entry exists, it must be in the preceding
+	 * slot.
+	 */
+	return num > 0 && cmp_rela(rela + num, rela + num - 1) == 0;
+}
+
+static unsigned int count_plts(Elf64_Sym *syms, Elf64_Rela *rela, int num,
+			       Elf64_Word dstidx)
+{
+	unsigned int ret = 0;
+	Elf64_Sym *s;
+	int i;
+
+	for (i = 0; i < num; i++) {
+		switch (ELF64_R_TYPE(rela[i].r_info)) {
+		case R_X86_64_PC32:
+			/*
+			 * We only have to consider branch targets that resolve
+			 * to symbols that are defined in a different section.
+			 * This is not simply a heuristic, it is a fundamental
+			 * limitation, since there is no guaranteed way to emit
+			 * PLT entries sufficiently close to the branch if the
+			 * section size exceeds the range of a branch
+			 * instruction. So ignore relocations against defined
+			 * symbols if they live in the same section as the
+			 * relocation target.
+			 */
+			s = syms + ELF64_R_SYM(rela[i].r_info);
+			if (s->st_shndx == dstidx)
+				break;
+
+			/*
+			 * Jump relocations with non-zero addends against
+			 * undefined symbols are supported by the ELF spec, but
+			 * do not occur in practice (e.g., 'jump n bytes past
+			 * the entry point of undefined function symbol f').
+			 * So we need to support them, but there is no need to
+			 * take them into consideration when trying to optimize
+			 * this code. So let's only check for duplicates when
+			 * the addend is zero: this allows us to record the PLT
+			 * entry address in the symbol table itself, rather than
+			 * having to search the list for duplicates each time we
+			 * emit one.
+			 */
+			if (rela[i].r_addend != 0 || !duplicate_rel(rela, i))
+				ret++;
+			break;
+		}
+	}
+	return ret;
+}
+
+int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
+			      char *secstrings, struct module *mod)
+{
+	unsigned long core_plts = 0;
+	unsigned long init_plts = 0;
+	Elf64_Sym *syms = NULL;
+	int i;
+
+	/*
+	 * Find the empty .plt section so we can expand it to store the PLT
+	 * entries. Record the symtab address as well.
+	 */
+	for (i = 0; i < ehdr->e_shnum; i++) {
+		if (!strcmp(secstrings + sechdrs[i].sh_name, ".plt"))
+			mod->arch.core.plt = sechdrs + i;
+		else if (!strcmp(secstrings + sechdrs[i].sh_name, ".init.plt"))
+			mod->arch.init.plt = sechdrs + i;
+		else if (sechdrs[i].sh_type == SHT_SYMTAB)
+			syms = (Elf64_Sym *)sechdrs[i].sh_addr;
+	}
+
+	if (!mod->arch.core.plt || !mod->arch.init.plt) {
+		pr_err("%s: module PLT section(s) missing\n", mod->name);
+		return -ENOEXEC;
+	}
+	if (!syms) {
+		pr_err("%s: module symtab section missing\n", mod->name);
+		return -ENOEXEC;
+	}
+
+	for (i = 0; i < ehdr->e_shnum; i++) {
+		Elf64_Rela *rels = (void *)ehdr + sechdrs[i].sh_offset;
+		int numrels = sechdrs[i].sh_size / sizeof(Elf64_Rela);
+		Elf64_Shdr *dstsec = sechdrs + sechdrs[i].sh_info;
+
+		if (sechdrs[i].sh_type != SHT_RELA)
+			continue;
+
+		/* sort by type, symbol index and addend */
+		sort(rels, numrels, sizeof(Elf64_Rela), cmp_rela, NULL);
+
+		if (strncmp(secstrings + dstsec->sh_name, ".init", 5) != 0)
+			core_plts += count_plts(syms, rels, numrels,
+						sechdrs[i].sh_info);
+		else
+			init_plts += count_plts(syms, rels, numrels,
+						sechdrs[i].sh_info);
+	}
+
+	mod->arch.core.plt->sh_type = SHT_NOBITS;
+	mod->arch.core.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+	mod->arch.core.plt->sh_addralign = L1_CACHE_BYTES;
+	mod->arch.core.plt->sh_size = (core_plts  + 1) * sizeof(struct plt_entry);
+	mod->arch.core.plt_num_entries = 0;
+	mod->arch.core.plt_max_entries = core_plts;
+
+	mod->arch.init.plt->sh_type = SHT_NOBITS;
+	mod->arch.init.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+	mod->arch.init.plt->sh_addralign = L1_CACHE_BYTES;
+	mod->arch.init.plt->sh_size = (init_plts + 1) * sizeof(struct plt_entry);
+	mod->arch.init.plt_num_entries = 0;
+	mod->arch.init.plt_max_entries = init_plts;
+
+	return 0;
+}
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index f67bd3205df7..a2b31973572b 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -186,10 +186,15 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 		case R_X86_64_PC32:
 			val -= (u64)loc;
 			*(u32 *)loc = val;
-#if 0
-			if ((s64)val != *(s32 *)loc)
-				goto overflow;
-#endif
+			if (IS_ENABLED(CONFIG_X86_MODULE_MODEL_LARGE) &&
+			    (s64)val != *(s32 *)loc) {
+				val = module_emit_plt_entry(me, loc, &rel[i],
+							    sym);
+				val -= (u64)loc;
+				*(u32 *)loc = val;
+				if ((s64)val != *(s32 *)loc)
+					goto overflow;
+			}
 			break;
 		default:
 			pr_err("%s: Unknown rela relocation: %llu\n",
@@ -202,8 +207,9 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 overflow:
 	pr_err("overflow in relocation type %d val %Lx\n",
 	       (int)ELF64_R_TYPE(rel[i].r_info), val);
-	pr_err("`%s' likely not compiled with -mcmodel=kernel\n",
-	       me->name);
+	pr_err("`%s' likely not compiled with -mcmodel=%s\n",
+	       me->name,
+	       IS_ENABLED(CONFIG_X86_MODULE_MODEL_LARGE) ? "large" : "kernel");
 	return -ENOEXEC;
 }
 #endif
diff --git a/arch/x86/kernel/module.lds b/arch/x86/kernel/module.lds
new file mode 100644
index 000000000000..f7c9781a9d48
--- /dev/null
+++ b/arch/x86/kernel/module.lds
@@ -0,0 +1,4 @@
+SECTIONS {
+	.plt (NOLOAD) : { BYTE(0) }
+	.init.plt (NOLOAD) : { BYTE(0) }
+}
-- 
2.13.2.932.g7449e964c-goog


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC 22/22] x86/kaslr: Add option to extend KASLR range from 1GB to 3GB
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (20 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 21/22] x86/module: Add support for mcmodel large and PLTs Thomas Garnier
@ 2017-07-18 22:33 ` Thomas Garnier
  2017-07-19 12:10   ` Baoquan He
  2017-07-19 14:08 ` x86: PIE support and option to extend KASLR randomization Christopher Lameter
  22 siblings, 1 reply; 55+ messages in thread
From: Thomas Garnier @ 2017-07-18 22:33 UTC (permalink / raw)
  To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Thomas Garnier,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki
  Cc: linux-arch, kvm, linux-pm, x86, linux-kernel, linux-sparse,
	linux-crypto, kernel-hardening, xen-devel

Add a new CONFIG_RANDOMIZE_BASE_LARGE option to benefit from PIE
support. It increases the KASLR range from 1GB to 3GB. The new range
starts at 0xffffffff00000000 just above the EFI memory region. This
option is off by default.

The boot code is adapted to create the appropriate page table spanning
three PUD pages.

The relocation table uses 64-bit integers generated with the updated
relocation tool with the large-reloc option.
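
As a rough sketch of the page-table sizing (illustrative constants only; the
real values come from the page table headers), the 3GB image size maps onto
whole 1GB PUD-sized entries like this:

  /*
   * Simplified arithmetic, not kernel code: how many level-3 (PUD) entries
   * are needed to cover the enlarged kernel image.
   */
  #define EXAMPLE_PUD_SIZE	(1UL << 30)			/* 1GB per entry */
  #define EXAMPLE_IMAGE_SIZE	(3UL * 1024 * 1024 * 1024)	/* 3GB with the option */

  /* Round up: 3GB / 1GB = 3 PUD entries, matching the three PUD pages. */
  #define EXAMPLE_KERNEL_PUDS \
	((EXAMPLE_IMAGE_SIZE + EXAMPLE_PUD_SIZE - 1) / EXAMPLE_PUD_SIZE)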

Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
 arch/x86/Kconfig                     | 21 +++++++++++++++++++++
 arch/x86/boot/compressed/Makefile    |  5 +++++
 arch/x86/boot/compressed/misc.c      | 10 +++++++++-
 arch/x86/include/asm/page_64_types.h |  9 +++++++++
 arch/x86/kernel/head64.c             | 18 ++++++++++++++----
 arch/x86/kernel/head_64.S            | 11 ++++++++++-
 6 files changed, 68 insertions(+), 6 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 60d161391d5a..8054eef76dfc 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2096,6 +2096,27 @@ config X86_MODULE_PLTS
 	select X86_MODULE_MODEL_LARGE
 	select HAVE_MOD_ARCH_SPECIFIC
 
+config RANDOMIZE_BASE_LARGE
+	bool "Increase the randomization range of the kernel image"
+	depends on X86_64 && RANDOMIZE_BASE
+	select X86_PIE
+	select X86_MODULE_PLTS if MODULES
+	default n
+	---help---
+	  Build the kernel as a Position Independent Executable (PIE) and
+	  increase the available randomization range from 1GB to 3GB.
+
+	  This option can impact performance on kernel CPU-intensive workloads
+	  by up to 10% due to PIE-generated code. The impact on user-mode
+	  processes and typical usage is significantly less (around 0.50% when
+	  building the kernel).
+
+	  The kernel and modules will generate slightly larger code (a 1% to 2%
+	  increase in the .text sections). The vmlinux binary will be
+	  significantly smaller due to fewer relocations.
+
+	  If unsure, say N.
+
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
 	depends on SMP
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 2c860ad4fe06..8f4317864e98 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -111,7 +111,12 @@ $(obj)/vmlinux.bin: vmlinux FORCE
 
 targets += $(patsubst $(obj)/%,%,$(vmlinux-objs-y)) vmlinux.bin.all vmlinux.relocs
 
+# Large randomization requires a bigger relocation table
+ifeq ($(CONFIG_RANDOMIZE_BASE_LARGE),y)
+CMD_RELOCS = arch/x86/tools/relocs --large-reloc
+else
 CMD_RELOCS = arch/x86/tools/relocs
+endif
 quiet_cmd_relocs = RELOCS  $@
       cmd_relocs = $(CMD_RELOCS) $< > $@;$(CMD_RELOCS) --abs-relocs $<
 $(obj)/vmlinux.relocs: vmlinux FORCE
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index a0838ab929f2..0a0c80ab1842 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -170,10 +170,18 @@ void __puthex(unsigned long value)
 }
 
 #if CONFIG_X86_NEED_RELOCS
+
+/* Large randomization can go lower than -2G and uses a large relocation table */
+#ifdef CONFIG_RANDOMIZE_BASE_LARGE
+typedef long rel_t;
+#else
+typedef int rel_t;
+#endif
+
 static void handle_relocations(void *output, unsigned long output_len,
 			       unsigned long virt_addr)
 {
-	int *reloc;
+	rel_t *reloc;
 	unsigned long delta, map, ptr;
 	unsigned long min_addr = (unsigned long)output;
 	unsigned long max_addr = min_addr + (VO___bss_start - VO__text);
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 3f5f08b010d0..6b65f846dd64 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -48,7 +48,11 @@
 #define __PAGE_OFFSET           __PAGE_OFFSET_BASE
 #endif /* CONFIG_RANDOMIZE_MEMORY */
 
+#ifdef CONFIG_RANDOMIZE_BASE_LARGE
+#define __START_KERNEL_map	_AC(0xffffffff00000000, UL)
+#else
 #define __START_KERNEL_map	_AC(0xffffffff80000000, UL)
+#endif /* CONFIG_RANDOMIZE_BASE_LARGE */
 
 /* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */
 #ifdef CONFIG_X86_5LEVEL
@@ -65,9 +69,14 @@
  * 512MiB by default, leaving 1.5GiB for modules once the page tables
  * are fully set up. If kernel ASLR is configured, it can extend the
  * kernel page table mapping, reducing the size of the modules area.
+ * On PIE, we relocate the binary 2G lower so add this extra space.
  */
 #if defined(CONFIG_RANDOMIZE_BASE)
+#ifdef CONFIG_RANDOMIZE_BASE_LARGE
+#define KERNEL_IMAGE_SIZE	(_AC(3, UL) * 1024 * 1024 * 1024)
+#else
 #define KERNEL_IMAGE_SIZE	(1024 * 1024 * 1024)
+#endif
 #else
 #define KERNEL_IMAGE_SIZE	(512 * 1024 * 1024)
 #endif
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 4103e90ff128..235c3f7b46c7 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -39,6 +39,7 @@ static unsigned int __initdata next_early_pgt;
 pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
 
 #define __head	__section(.head.text)
+#define pud_count(x)   (((x + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> PUD_SHIFT)
 
 static void __head *fixup_pointer(void *ptr, unsigned long physaddr)
 {
@@ -54,6 +55,8 @@ unsigned long _text_offset = (unsigned long)(_text - __START_KERNEL_map);
 void __head notrace __startup_64(unsigned long physaddr)
 {
 	unsigned long load_delta, *p;
+	unsigned long level3_kernel_start, level3_kernel_count;
+	unsigned long level3_fixmap_start;
 	pgdval_t *pgd;
 	p4dval_t *p4d;
 	pudval_t *pud;
@@ -74,6 +77,11 @@ void __head notrace __startup_64(unsigned long physaddr)
 	if (load_delta & ~PMD_PAGE_MASK)
 		for (;;);
 
+	/* Look at the randomization spread to adapt page table used */
+	level3_kernel_start = pud_index(__START_KERNEL_map);
+	level3_kernel_count = pud_count(KERNEL_IMAGE_SIZE);
+	level3_fixmap_start = level3_kernel_start + level3_kernel_count;
+
 	/* Fixup the physical addresses in the page table */
 
 	pgd = fixup_pointer(&early_top_pgt, physaddr);
@@ -85,8 +93,9 @@ void __head notrace __startup_64(unsigned long physaddr)
 	}
 
 	pud = fixup_pointer(&level3_kernel_pgt, physaddr);
-	pud[510] += load_delta;
-	pud[511] += load_delta;
+	for (i = 0; i < level3_kernel_count; i++)
+		pud[level3_kernel_start + i] += load_delta;
+	pud[level3_fixmap_start] += load_delta;
 
 	pmd = fixup_pointer(level2_fixmap_pgt, physaddr);
 	pmd[506] += load_delta;
@@ -137,7 +146,7 @@ void __head notrace __startup_64(unsigned long physaddr)
 	 */
 
 	pmd = fixup_pointer(level2_kernel_pgt, physaddr);
-	for (i = 0; i < PTRS_PER_PMD; i++) {
+	for (i = 0; i < PTRS_PER_PMD * level3_kernel_count; i++) {
 		if (pmd[i] & _PAGE_PRESENT)
 			pmd[i] += load_delta;
 	}
@@ -268,7 +277,8 @@ asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data)
 	 */
 	BUILD_BUG_ON(MODULES_VADDR < __START_KERNEL_map);
 	BUILD_BUG_ON(MODULES_VADDR - __START_KERNEL_map < KERNEL_IMAGE_SIZE);
-	BUILD_BUG_ON(MODULES_LEN + KERNEL_IMAGE_SIZE > 2*PUD_SIZE);
+	BUILD_BUG_ON(!IS_ENABLED(CONFIG_RANDOMIZE_BASE_LARGE) &&
+		     MODULES_LEN + KERNEL_IMAGE_SIZE > 2*PUD_SIZE);
 	BUILD_BUG_ON((__START_KERNEL_map & ~PMD_MASK) != 0);
 	BUILD_BUG_ON((MODULES_VADDR & ~PMD_MASK) != 0);
 	BUILD_BUG_ON(!(MODULES_VADDR > __START_KERNEL));
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 4d0a7e68bfe8..e8b2d6706eca 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -39,11 +39,15 @@
 
 #define p4d_index(x)	(((x) >> P4D_SHIFT) & (PTRS_PER_P4D-1))
 #define pud_index(x)	(((x) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
+#define pud_count(x)   (((x + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> PUD_SHIFT)
 
 PGD_PAGE_OFFSET = pgd_index(__PAGE_OFFSET_BASE)
 PGD_START_KERNEL = pgd_index(__START_KERNEL_map)
 L3_START_KERNEL = pud_index(__START_KERNEL_map)
 
+/* Adapt page table L3 space based on range of randomization */
+L3_KERNEL_ENTRY_COUNT = pud_count(KERNEL_IMAGE_SIZE)
+
 	.text
 	__HEAD
 	.code64
@@ -396,7 +400,12 @@ NEXT_PAGE(level4_kernel_pgt)
 NEXT_PAGE(level3_kernel_pgt)
 	.fill	L3_START_KERNEL,8,0
 	/* (2^48-(2*1024*1024*1024)-((2^39)*511))/(2^30) = 510 */
-	.quad	level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE
+	i = 0
+	.rept	L3_KERNEL_ENTRY_COUNT
+	.quad	level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE \
+		+ PAGE_SIZE*i
+	i = i + 1
+	.endr
 	.quad	level2_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE
 
 NEXT_PAGE(level2_kernel_pgt)
-- 
2.13.2.932.g7449e964c-goog


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [RFC 21/22] x86/module: Add support for mcmodel large and PLTs
  2017-07-18 22:33 ` [RFC 21/22] x86/module: Add support for mcmodel large and PLTs Thomas Garnier
@ 2017-07-19  1:35   ` H. Peter Anvin
  2017-07-19  3:59     ` Brian Gerst
  0 siblings, 1 reply; 55+ messages in thread
From: H. Peter Anvin @ 2017-07-19  1:35 UTC (permalink / raw)
  To: Thomas Garnier, Herbert Xu, David S . Miller, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger
  Cc: x86, linux-crypto, linux-kernel, xen-devel, kvm, linux-pm,
	linux-arch, linux-sparse, kernel-hardening

On 07/18/17 15:33, Thomas Garnier wrote:
> With PIE support and KASLR extended range, the modules may be further
> away from the kernel than before breaking mcmodel=kernel expectations.
> 
> Add an option to build modules with mcmodel=large. The modules generated
> code will make no assumptions on placement in memory.
> 
> Despite this option, modules still expect kernel functions to be within
> 2G and generate relative calls. To solve this issue, the PLT arm64 code
> was adapted for x86_64. When a relative relocation go outside its range,
> a dynamic PLT entry is used to correctly jump to the destination.

Why large as opposed to medium or medium-PIC?

	-hpa

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 06/22] kvm: Adapt assembly for PIE support
  2017-07-18 22:33 ` [RFC 06/22] kvm: " Thomas Garnier
@ 2017-07-19  2:49   ` Brian Gerst
  2017-07-19 15:40     ` Thomas Garnier
  0 siblings, 1 reply; 55+ messages in thread
From: Brian Gerst @ 2017-07-19  2:49 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Tejun Heo

On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier <thgarnie@google.com> wrote:
> Change the assembly code to use only relative references of symbols for the
> kernel to be PIE compatible. The new __ASM_GET_PTR_PRE macro is used to
> get the address of a symbol on both 32 and 64-bit with PIE support.
>
> Position Independent Executable (PIE) support will allow extending the
> KASLR randomization range below the -2G memory limit.
>
> Signed-off-by: Thomas Garnier <thgarnie@google.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 6 ++++--
>  arch/x86/kernel/kvm.c           | 6 ++++--
>  arch/x86/kvm/svm.c              | 4 ++--
>  3 files changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 87ac4fba6d8e..3041201a3aeb 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1352,9 +1352,11 @@ asmlinkage void kvm_spurious_fault(void);
>         ".pushsection .fixup, \"ax\" \n" \
>         "667: \n\t" \
>         cleanup_insn "\n\t"                   \
> -       "cmpb $0, kvm_rebooting \n\t"         \
> +       "cmpb $0, kvm_rebooting" __ASM_SEL(,(%%rip)) " \n\t" \
>         "jne 668b \n\t"                       \
> -       __ASM_SIZE(push) " $666b \n\t"        \
> +       __ASM_SIZE(push) "%%" _ASM_AX " \n\t"           \
> +       __ASM_GET_PTR_PRE(666b) "%%" _ASM_AX "\n\t"     \
> +       "xchg %%" _ASM_AX ", (%%" _ASM_SP ") \n\t"      \
>         "call kvm_spurious_fault \n\t"        \
>         ".popsection \n\t" \
>         _ASM_EXTABLE(666b, 667b)
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 71c17a5be983..53b8ad162589 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -618,8 +618,10 @@ asm(
>  ".global __raw_callee_save___kvm_vcpu_is_preempted;"
>  ".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
>  "__raw_callee_save___kvm_vcpu_is_preempted:"
> -"movq  __per_cpu_offset(,%rdi,8), %rax;"
> -"cmpb  $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
> +"leaq  __per_cpu_offset(%rip), %rax;"
> +"movq  (%rax,%rdi,8), %rax;"
> +"addq  " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rip), %rax;"

This doesn't look right.  It's accessing a per-cpu variable.  The
per-cpu section is an absolute, zero-based section and not subject to
relocation.

> +"cmpb  $0, (%rax);"
>  "setne %al;"
>  "ret;"
>  ".popsection");

--
Brian Gerst

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 16/22] x86/percpu: Adapt percpu for PIE support
  2017-07-18 22:33 ` [RFC 16/22] x86/percpu: Adapt percpu " Thomas Garnier
@ 2017-07-19  3:08   ` Brian Gerst
  2017-07-19 18:26     ` Thomas Garnier
  0 siblings, 1 reply; 55+ messages in thread
From: Brian Gerst @ 2017-07-19  3:08 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Tejun Heo

On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier <thgarnie@google.com> wrote:
> Percpu uses a clever design where the .percpu ELF section has a virtual
> address of zero and the relocation code avoids relocating specific
> symbols. This makes the code simple and easily adaptable with or without
> SMP support.
>
> This design is incompatible with PIE because generated code always tries to
> access the zero virtual address relative to the default mapping address.
> It becomes impossible when KASLR is configured to go below -2G. This
> patch solves this problem by removing the zero mapping and adapting the GS
> base to be relative to the expected address. These changes are done only
> when PIE is enabled. The original implementation is kept as-is
> by default.

The reason the per-cpu section is zero-based on x86-64 is to
work around GCC hardcoding the stack protector canary at %gs:40.  So
this patch is incompatible with CONFIG_STACK_PROTECTOR.
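
For reference, a minimal sketch of the layout that hardcoded offset implies
(simplified from the kernel's irq_stack_union; field names illustrative):

  /*
   * GCC emits "%gs:40" for every stack-protector check, so whatever %gs
   * points at must have the canary at byte offset 40. With a zero-based
   * per-cpu section, %gs holds the per-cpu base and offset 40 lands here.
   */
  struct percpu_gs_layout_sketch {
	char		gs_base[40];	/* pad up to the fixed offset */
	unsigned long	stack_canary;	/* accessed as %gs:40 by GCC */
  };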

--
Brian Gerst

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 21/22] x86/module: Add support for mcmodel large and PLTs
  2017-07-19  1:35   ` H. Peter Anvin
@ 2017-07-19  3:59     ` Brian Gerst
  2017-07-19 15:58       ` Thomas Garnier
  0 siblings, 1 reply; 55+ messages in thread
From: Brian Gerst @ 2017-07-19  3:59 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Garnier, Herbert Xu, David S . Miller, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Tejun Heo

On Tue, Jul 18, 2017 at 9:35 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 07/18/17 15:33, Thomas Garnier wrote:
>> With PIE support and KASLR extended range, the modules may be further
>> away from the kernel than before breaking mcmodel=kernel expectations.
>>
>> Add an option to build modules with mcmodel=large. The modules generated
>> code will make no assumptions on placement in memory.
>>
>> Despite this option, modules still expect kernel functions to be within
>> 2G and generate relative calls. To solve this issue, the PLT arm64 code
>> was adapted for x86_64. When a relative relocation go outside its range,
>> a dynamic PLT entry is used to correctly jump to the destination.
>
> Why large as opposed to medium or medium-PIC?

Or for that matter, why not small-PIC?  We aren't changing the size of
the kernel to be larger than 2G text or data.  Small-PIC would still
allow it to be placed anywhere in the address space, and would
generate far better code.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 22/22] x86/kaslr: Add option to extend KASLR range from 1GB to 3GB
  2017-07-18 22:33 ` [RFC 22/22] x86/kaslr: Add option to extend KASLR range from 1GB to 3GB Thomas Garnier
@ 2017-07-19 12:10   ` Baoquan He
  2017-07-19 13:49     ` Baoquan He
  0 siblings, 1 reply; 55+ messages in thread
From: Baoquan He @ 2017-07-19 12:10 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki, Len Brown,
	Pavel Machek

On 07/18/17 at 03:33pm, Thomas Garnier wrote:

>  quiet_cmd_relocs = RELOCS  $@
>        cmd_relocs = $(CMD_RELOCS) $< > $@;$(CMD_RELOCS) --abs-relocs $<
>  $(obj)/vmlinux.relocs: vmlinux FORCE
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index a0838ab929f2..0a0c80ab1842 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -170,10 +170,18 @@ void __puthex(unsigned long value)
>  }
>  
>  #if CONFIG_X86_NEED_RELOCS
> +
> +/* Large randomization go lower than -2G and use large relocation table */
> +#ifdef CONFIG_RANDOMIZE_BASE_LARGE
> +typedef long rel_t;
> +#else
> +typedef int rel_t;
> +#endif
> +
>  static void handle_relocations(void *output, unsigned long output_len,
>  			       unsigned long virt_addr)
>  {
> -	int *reloc;
> +	rel_t *reloc;
>  	unsigned long delta, map, ptr;
>  	unsigned long min_addr = (unsigned long)output;
>  	unsigned long max_addr = min_addr + (VO___bss_start - VO__text);
> diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
> index 3f5f08b010d0..6b65f846dd64 100644
> --- a/arch/x86/include/asm/page_64_types.h
> +++ b/arch/x86/include/asm/page_64_types.h
> @@ -48,7 +48,11 @@
>  #define __PAGE_OFFSET           __PAGE_OFFSET_BASE
>  #endif /* CONFIG_RANDOMIZE_MEMORY */
>  
> +#ifdef CONFIG_RANDOMIZE_BASE_LARGE
> +#define __START_KERNEL_map	_AC(0xffffffff00000000, UL)
> +#else
>  #define __START_KERNEL_map	_AC(0xffffffff80000000, UL)
> +#endif /* CONFIG_RANDOMIZE_BASE_LARGE */
>  
>  /* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */
>  #ifdef CONFIG_X86_5LEVEL
> @@ -65,9 +69,14 @@
>   * 512MiB by default, leaving 1.5GiB for modules once the page tables
>   * are fully set up. If kernel ASLR is configured, it can extend the
>   * kernel page table mapping, reducing the size of the modules area.
> + * On PIE, we relocate the binary 2G lower so add this extra space.
>   */
>  #if defined(CONFIG_RANDOMIZE_BASE)
> +#ifdef CONFIG_RANDOMIZE_BASE_LARGE
> +#define KERNEL_IMAGE_SIZE	(_AC(3, UL) * 1024 * 1024 * 1024)
> +#else
>  #define KERNEL_IMAGE_SIZE	(1024 * 1024 * 1024)
> +#endif
>  #else
>  #define KERNEL_IMAGE_SIZE	(512 * 1024 * 1024)
>  #endif
> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> index 4103e90ff128..235c3f7b46c7 100644
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -39,6 +39,7 @@ static unsigned int __initdata next_early_pgt;
>  pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
>  
>  #define __head	__section(.head.text)
> +#define pud_count(x)   (((x + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> PUD_SHIFT)
>  
>  static void __head *fixup_pointer(void *ptr, unsigned long physaddr)
>  {
> @@ -54,6 +55,8 @@ unsigned long _text_offset = (unsigned long)(_text - __START_KERNEL_map);
>  void __head notrace __startup_64(unsigned long physaddr)
>  {
>  	unsigned long load_delta, *p;
> +	unsigned long level3_kernel_start, level3_kernel_count;
> +	unsigned long level3_fixmap_start;
>  	pgdval_t *pgd;
>  	p4dval_t *p4d;
>  	pudval_t *pud;
> @@ -74,6 +77,11 @@ void __head notrace __startup_64(unsigned long physaddr)
>  	if (load_delta & ~PMD_PAGE_MASK)
>  		for (;;);
>  
> +	/* Look at the randomization spread to adapt page table used */
> +	level3_kernel_start = pud_index(__START_KERNEL_map);
> +	level3_kernel_count = pud_count(KERNEL_IMAGE_SIZE);
> +	level3_fixmap_start = level3_kernel_start + level3_kernel_count;
> +
>  	/* Fixup the physical addresses in the page table */
>  
>  	pgd = fixup_pointer(&early_top_pgt, physaddr);
> @@ -85,8 +93,9 @@ void __head notrace __startup_64(unsigned long physaddr)
>  	}
>  
>  	pud = fixup_pointer(&level3_kernel_pgt, physaddr);
> -	pud[510] += load_delta;
> -	pud[511] += load_delta;
> +	for (i = 0; i < level3_kernel_count; i++)
> +		pud[level3_kernel_start + i] += load_delta;
> +	pud[level3_fixmap_start] += load_delta;
>  
>  	pmd = fixup_pointer(level2_fixmap_pgt, physaddr);
>  	pmd[506] += load_delta;
> @@ -137,7 +146,7 @@ void __head notrace __startup_64(unsigned long physaddr)
>  	 */
>  
>  	pmd = fixup_pointer(level2_kernel_pgt, physaddr);
> -	for (i = 0; i < PTRS_PER_PMD; i++) {
> +	for (i = 0; i < PTRS_PER_PMD * level3_kernel_count; i++) {
>  		if (pmd[i] & _PAGE_PRESENT)
>  			pmd[i] += load_delta;

Wow, this is dangerous. Three pud entries of level3_kernel_pgt all point
to level2_kernel_pgt, which goes out of bounds of level2_kernel_pgt and
overwrites the following data.

And if only one page is used for level2_kernel_pgt, and the kernel is
randomized to cross a pud entry boundary between -4G and -1G, it won't
work well.

>  	}
> @@ -268,7 +277,8 @@ asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data)
>  	 */
>  	BUILD_BUG_ON(MODULES_VADDR < __START_KERNEL_map);
>  	BUILD_BUG_ON(MODULES_VADDR - __START_KERNEL_map < KERNEL_IMAGE_SIZE);
> -	BUILD_BUG_ON(MODULES_LEN + KERNEL_IMAGE_SIZE > 2*PUD_SIZE);
> +	BUILD_BUG_ON(!IS_ENABLED(CONFIG_RANDOMIZE_BASE_LARGE) &&
> +		     MODULES_LEN + KERNEL_IMAGE_SIZE > 2*PUD_SIZE);
>  	BUILD_BUG_ON((__START_KERNEL_map & ~PMD_MASK) != 0);
>  	BUILD_BUG_ON((MODULES_VADDR & ~PMD_MASK) != 0);
>  	BUILD_BUG_ON(!(MODULES_VADDR > __START_KERNEL));
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index 4d0a7e68bfe8..e8b2d6706eca 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -39,11 +39,15 @@
>  
>  #define p4d_index(x)	(((x) >> P4D_SHIFT) & (PTRS_PER_P4D-1))
>  #define pud_index(x)	(((x) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
> +#define pud_count(x)   (((x + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> PUD_SHIFT)
>  
>  PGD_PAGE_OFFSET = pgd_index(__PAGE_OFFSET_BASE)
>  PGD_START_KERNEL = pgd_index(__START_KERNEL_map)
>  L3_START_KERNEL = pud_index(__START_KERNEL_map)
>  
> +/* Adapt page table L3 space based on range of randomization */
> +L3_KERNEL_ENTRY_COUNT = pud_count(KERNEL_IMAGE_SIZE)
> +
>  	.text
>  	__HEAD
>  	.code64
> @@ -396,7 +400,12 @@ NEXT_PAGE(level4_kernel_pgt)
>  NEXT_PAGE(level3_kernel_pgt)
>  	.fill	L3_START_KERNEL,8,0
>  	/* (2^48-(2*1024*1024*1024)-((2^39)*511))/(2^30) = 510 */
> -	.quad	level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE
> +	i = 0
> +	.rept	L3_KERNEL_ENTRY_COUNT
> +	.quad	level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE \
> +		+ PAGE_SIZE*i
> +	i = i + 1
> +	.endr
>  	.quad	level2_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE
>  
>  NEXT_PAGE(level2_kernel_pgt)
> -- 
> 2.13.2.932.g7449e964c-goog
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 22/22] x86/kaslr: Add option to extend KASLR range from 1GB to 3GB
  2017-07-19 12:10   ` Baoquan He
@ 2017-07-19 13:49     ` Baoquan He
  0 siblings, 0 replies; 55+ messages in thread
From: Baoquan He @ 2017-07-19 13:49 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki, Len Brown,
	Pavel Machek

On 07/19/17 at 08:10pm, Baoquan He wrote:
> On 07/18/17 at 03:33pm, Thomas Garnier wrote:
> 
> >  quiet_cmd_relocs = RELOCS  $@
> >        cmd_relocs = $(CMD_RELOCS) $< > $@;$(CMD_RELOCS) --abs-relocs $<
> >  $(obj)/vmlinux.relocs: vmlinux FORCE
> > diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> > index a0838ab929f2..0a0c80ab1842 100644
> > --- a/arch/x86/boot/compressed/misc.c
> > +++ b/arch/x86/boot/compressed/misc.c
> > @@ -170,10 +170,18 @@ void __puthex(unsigned long value)
> >  }
> >  
> >  #if CONFIG_X86_NEED_RELOCS
> > +
> > +/* Large randomization go lower than -2G and use large relocation table */
> > +#ifdef CONFIG_RANDOMIZE_BASE_LARGE
> > +typedef long rel_t;
> > +#else
> > +typedef int rel_t;
> > +#endif
> > +
> >  static void handle_relocations(void *output, unsigned long output_len,
> >  			       unsigned long virt_addr)
> >  {
> > -	int *reloc;
> > +	rel_t *reloc;
> >  	unsigned long delta, map, ptr;
> >  	unsigned long min_addr = (unsigned long)output;
> >  	unsigned long max_addr = min_addr + (VO___bss_start - VO__text);
> > diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
> > index 3f5f08b010d0..6b65f846dd64 100644
> > --- a/arch/x86/include/asm/page_64_types.h
> > +++ b/arch/x86/include/asm/page_64_types.h
> > @@ -48,7 +48,11 @@
> >  #define __PAGE_OFFSET           __PAGE_OFFSET_BASE
> >  #endif /* CONFIG_RANDOMIZE_MEMORY */
> >  
> > +#ifdef CONFIG_RANDOMIZE_BASE_LARGE
> > +#define __START_KERNEL_map	_AC(0xffffffff00000000, UL)
> > +#else
> >  #define __START_KERNEL_map	_AC(0xffffffff80000000, UL)
> > +#endif /* CONFIG_RANDOMIZE_BASE_LARGE */
> >  
> >  /* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */
> >  #ifdef CONFIG_X86_5LEVEL
> > @@ -65,9 +69,14 @@
> >   * 512MiB by default, leaving 1.5GiB for modules once the page tables
> >   * are fully set up. If kernel ASLR is configured, it can extend the
> >   * kernel page table mapping, reducing the size of the modules area.
> > + * On PIE, we relocate the binary 2G lower so add this extra space.
> >   */
> >  #if defined(CONFIG_RANDOMIZE_BASE)
> > +#ifdef CONFIG_RANDOMIZE_BASE_LARGE
> > +#define KERNEL_IMAGE_SIZE	(_AC(3, UL) * 1024 * 1024 * 1024)
> > +#else
> >  #define KERNEL_IMAGE_SIZE	(1024 * 1024 * 1024)
> > +#endif
> >  #else
> >  #define KERNEL_IMAGE_SIZE	(512 * 1024 * 1024)
> >  #endif
> > diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> > index 4103e90ff128..235c3f7b46c7 100644
> > --- a/arch/x86/kernel/head64.c
> > +++ b/arch/x86/kernel/head64.c
> > @@ -39,6 +39,7 @@ static unsigned int __initdata next_early_pgt;
> >  pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
> >  
> >  #define __head	__section(.head.text)
> > +#define pud_count(x)   (((x + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> PUD_SHIFT)
> >  
> >  static void __head *fixup_pointer(void *ptr, unsigned long physaddr)
> >  {
> > @@ -54,6 +55,8 @@ unsigned long _text_offset = (unsigned long)(_text - __START_KERNEL_map);
> >  void __head notrace __startup_64(unsigned long physaddr)
> >  {
> >  	unsigned long load_delta, *p;
> > +	unsigned long level3_kernel_start, level3_kernel_count;
> > +	unsigned long level3_fixmap_start;
> >  	pgdval_t *pgd;
> >  	p4dval_t *p4d;
> >  	pudval_t *pud;
> > @@ -74,6 +77,11 @@ void __head notrace __startup_64(unsigned long physaddr)
> >  	if (load_delta & ~PMD_PAGE_MASK)
> >  		for (;;);
> >  
> > +	/* Look at the randomization spread to adapt page table used */
> > +	level3_kernel_start = pud_index(__START_KERNEL_map);
> > +	level3_kernel_count = pud_count(KERNEL_IMAGE_SIZE);
> > +	level3_fixmap_start = level3_kernel_start + level3_kernel_count;
> > +
> >  	/* Fixup the physical addresses in the page table */
> >  
> >  	pgd = fixup_pointer(&early_top_pgt, physaddr);
> > @@ -85,8 +93,9 @@ void __head notrace __startup_64(unsigned long physaddr)
> >  	}
> >  
> >  	pud = fixup_pointer(&level3_kernel_pgt, physaddr);
> > -	pud[510] += load_delta;
> > -	pud[511] += load_delta;
> > +	for (i = 0; i < level3_kernel_count; i++)
> > +		pud[level3_kernel_start + i] += load_delta;
> > +	pud[level3_fixmap_start] += load_delta;
> >  
> >  	pmd = fixup_pointer(level2_fixmap_pgt, physaddr);
> >  	pmd[506] += load_delta;
> > @@ -137,7 +146,7 @@ void __head notrace __startup_64(unsigned long physaddr)
> >  	 */
> >  
> >  	pmd = fixup_pointer(level2_kernel_pgt, physaddr);
> > -	for (i = 0; i < PTRS_PER_PMD; i++) {
> > +	for (i = 0; i < PTRS_PER_PMD * level3_kernel_count; i++) {
> >  		if (pmd[i] & _PAGE_PRESENT)
> >  			pmd[i] += load_delta;
> 
> Wow, this is dangerous. Three pud entries of level3_kernel_pgt all point
> to level2_kernel_pgt, it's out of bound of level2_kernel_pgt and
> overwrite the next data.
> 
> And if only use one page for level2_kernel_pgt, and kernel is randomized
> to cross the pud entry of -4G to -1G, it won't work well.

Sorry, I was wrong: the size of level2_kernel_pgt is decided by
KERNEL_IMAGE_SIZE. So it's not a problem; please ignore this comment.
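
To spell out the arithmetic (illustrative constants, not kernel code):
head_64.S fills level2_kernel_pgt with KERNEL_IMAGE_SIZE/PMD_SIZE entries,
so it grows together with the image size:

  /*
   * 3GB image / 2MB per PMD entry = 1536 entries = 3 pages of 512 entries,
   * matching the three level-3 (PUD) entries set up by the patch.
   */
  #define EXAMPLE_PMD_SIZE	(2UL * 1024 * 1024)
  #define EXAMPLE_IMAGE_SIZE	(3UL * 1024 * 1024 * 1024)
  #define EXAMPLE_LEVEL2_ENTRIES	(EXAMPLE_IMAGE_SIZE / EXAMPLE_PMD_SIZE)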

> 
> >  	}
> > @@ -268,7 +277,8 @@ asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data)
> >  	 */
> >  	BUILD_BUG_ON(MODULES_VADDR < __START_KERNEL_map);
> >  	BUILD_BUG_ON(MODULES_VADDR - __START_KERNEL_map < KERNEL_IMAGE_SIZE);
> > -	BUILD_BUG_ON(MODULES_LEN + KERNEL_IMAGE_SIZE > 2*PUD_SIZE);
> > +	BUILD_BUG_ON(!IS_ENABLED(CONFIG_RANDOMIZE_BASE_LARGE) &&
> > +		     MODULES_LEN + KERNEL_IMAGE_SIZE > 2*PUD_SIZE);
> >  	BUILD_BUG_ON((__START_KERNEL_map & ~PMD_MASK) != 0);
> >  	BUILD_BUG_ON((MODULES_VADDR & ~PMD_MASK) != 0);
> >  	BUILD_BUG_ON(!(MODULES_VADDR > __START_KERNEL));
> > diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> > index 4d0a7e68bfe8..e8b2d6706eca 100644
> > --- a/arch/x86/kernel/head_64.S
> > +++ b/arch/x86/kernel/head_64.S
> > @@ -39,11 +39,15 @@
> >  
> >  #define p4d_index(x)	(((x) >> P4D_SHIFT) & (PTRS_PER_P4D-1))
> >  #define pud_index(x)	(((x) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
> > +#define pud_count(x)   (((x + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> PUD_SHIFT)
> >  
> >  PGD_PAGE_OFFSET = pgd_index(__PAGE_OFFSET_BASE)
> >  PGD_START_KERNEL = pgd_index(__START_KERNEL_map)
> >  L3_START_KERNEL = pud_index(__START_KERNEL_map)
> >  
> > +/* Adapt page table L3 space based on range of randomization */
> > +L3_KERNEL_ENTRY_COUNT = pud_count(KERNEL_IMAGE_SIZE)
> > +
> >  	.text
> >  	__HEAD
> >  	.code64
> > @@ -396,7 +400,12 @@ NEXT_PAGE(level4_kernel_pgt)
> >  NEXT_PAGE(level3_kernel_pgt)
> >  	.fill	L3_START_KERNEL,8,0
> >  	/* (2^48-(2*1024*1024*1024)-((2^39)*511))/(2^30) = 510 */
> > -	.quad	level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE
> > +	i = 0
> > +	.rept	L3_KERNEL_ENTRY_COUNT
> > +	.quad	level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE \
> > +		+ PAGE_SIZE*i
> > +	i = i + 1
> > +	.endr
> >  	.quad	level2_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE
> >  
> >  NEXT_PAGE(level2_kernel_pgt)
> > -- 
> > 2.13.2.932.g7449e964c-goog
> > 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: x86: PIE support and option to extend KASLR randomization
  2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
                   ` (21 preceding siblings ...)
  2017-07-18 22:33 ` [RFC 22/22] x86/kaslr: Add option to extend KASLR range from 1GB to 3GB Thomas Garnier
@ 2017-07-19 14:08 ` Christopher Lameter
  2017-07-19 19:21   ` Kees Cook
  22 siblings, 1 reply; 55+ messages in thread
From: Christopher Lameter @ 2017-07-19 14:08 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki

On Tue, 18 Jul 2017, Thomas Garnier wrote:

> Performance/Size impact:
> Hackbench (50% and 1600% loads):
>  - PIE enabled: 7% to 8% on half load, 10% on heavy load.
> slab_test (average of 10 runs):
>  - PIE enabled: 3% to 4%
> Kernbench (average of 10 Half and Optimal runs):
>  - PIE enabled: 5% to 6%
>
> Size of vmlinux (Ubuntu configuration):
>  File size:
>  - PIE disabled: 472928672 bytes (-0.000169% from baseline)
>  - PIE enabled: 216878461 bytes (-54.14% from baseline)

Maybe we need something like CONFIG_PARANOIA so that we can determine at
build time how much performance we want to sacrifice for security?

It's going to be difficult to understand what all these hardening config
options do.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 06/22] kvm: Adapt assembly for PIE support
  2017-07-19  2:49   ` Brian Gerst
@ 2017-07-19 15:40     ` Thomas Garnier
  2017-07-19 22:27       ` H. Peter Anvin
  0 siblings, 1 reply; 55+ messages in thread
From: Thomas Garnier @ 2017-07-19 15:40 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Tejun Heo

On Tue, Jul 18, 2017 at 7:49 PM, Brian Gerst <brgerst@gmail.com> wrote:
> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier <thgarnie@google.com> wrote:
>> Change the assembly code to use only relative references of symbols for the
>> kernel to be PIE compatible. The new __ASM_GET_PTR_PRE macro is used to
>> get the address of a symbol on both 32 and 64-bit with PIE support.
>>
>> Position Independent Executable (PIE) support will allow to extended the
>> KASLR randomization range below the -2G memory limit.
>>
>> Signed-off-by: Thomas Garnier <thgarnie@google.com>
>> ---
>>  arch/x86/include/asm/kvm_host.h | 6 ++++--
>>  arch/x86/kernel/kvm.c           | 6 ++++--
>>  arch/x86/kvm/svm.c              | 4 ++--
>>  3 files changed, 10 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 87ac4fba6d8e..3041201a3aeb 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -1352,9 +1352,11 @@ asmlinkage void kvm_spurious_fault(void);
>>         ".pushsection .fixup, \"ax\" \n" \
>>         "667: \n\t" \
>>         cleanup_insn "\n\t"                   \
>> -       "cmpb $0, kvm_rebooting \n\t"         \
>> +       "cmpb $0, kvm_rebooting" __ASM_SEL(,(%%rip)) " \n\t" \
>>         "jne 668b \n\t"                       \
>> -       __ASM_SIZE(push) " $666b \n\t"        \
>> +       __ASM_SIZE(push) "%%" _ASM_AX " \n\t"           \
>> +       __ASM_GET_PTR_PRE(666b) "%%" _ASM_AX "\n\t"     \
>> +       "xchg %%" _ASM_AX ", (%%" _ASM_SP ") \n\t"      \
>>         "call kvm_spurious_fault \n\t"        \
>>         ".popsection \n\t" \
>>         _ASM_EXTABLE(666b, 667b)
>> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
>> index 71c17a5be983..53b8ad162589 100644
>> --- a/arch/x86/kernel/kvm.c
>> +++ b/arch/x86/kernel/kvm.c
>> @@ -618,8 +618,10 @@ asm(
>>  ".global __raw_callee_save___kvm_vcpu_is_preempted;"
>>  ".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
>>  "__raw_callee_save___kvm_vcpu_is_preempted:"
>> -"movq  __per_cpu_offset(,%rdi,8), %rax;"
>> -"cmpb  $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
>> +"leaq  __per_cpu_offset(%rip), %rax;"
>> +"movq  (%rax,%rdi,8), %rax;"
>> +"addq  " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rip), %rax;"
>
> This doesn't look right.  It's accessing a per-cpu variable.  The
> per-cpu section is an absolute, zero-based section and not subject to
> relocation.
>

PIE does not respect the zero-based section; it tries to make everything
relative. Patch 16/22 also adapts per-cpu to work with PIE (while keeping
the zero-based absolute design by default).
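
To illustrate what I mean, a small sketch (illustrative only, not code from
the series; 'some_var' is a made-up symbol):

  unsigned long some_var = 42;	/* hypothetical global */

  static inline unsigned long read_absolute(void)
  {
	unsigned long v;

	/* 32-bit sign-extended absolute address: needs mcmodel=kernel */
	asm ("movq some_var, %0" : "=r" (v));
	return v;
  }

  static inline unsigned long read_rip_relative(void)
  {
	unsigned long v;

	/* position-independent: works wherever the image is mapped */
	asm ("movq some_var(%%rip), %0" : "=r" (v));
	return v;
  }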

>> +"cmpb  $0, (%rax);
>>  "setne %al;"
>>  "ret;"
>>  ".popsection");
>
> --
> Brian Gerst



-- 
Thomas

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 21/22] x86/module: Add support for mcmodel large and PLTs
  2017-07-19  3:59     ` Brian Gerst
@ 2017-07-19 15:58       ` Thomas Garnier
  2017-07-19 17:34         ` Brian Gerst
  0 siblings, 1 reply; 55+ messages in thread
From: Thomas Garnier @ 2017-07-19 15:58 UTC (permalink / raw)
  To: Brian Gerst
  Cc: H. Peter Anvin, Herbert Xu, David S . Miller, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Tejun Heo

On Tue, Jul 18, 2017 at 8:59 PM, Brian Gerst <brgerst@gmail.com> wrote:
> On Tue, Jul 18, 2017 at 9:35 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>> On 07/18/17 15:33, Thomas Garnier wrote:
>>> With PIE support and KASLR extended range, the modules may be further
>>> away from the kernel than before breaking mcmodel=kernel expectations.
>>>
>>> Add an option to build modules with mcmodel=large. The modules generated
>>> code will make no assumptions on placement in memory.
>>>
>>> Despite this option, modules still expect kernel functions to be within
>>> 2G and generate relative calls. To solve this issue, the PLT arm64 code
>>> was adapted for x86_64. When a relative relocation go outside its range,
>>> a dynamic PLT entry is used to correctly jump to the destination.
>>
>> Why large as opposed to medium or medium-PIC?
>
> Or for that matter, why not small-PIC?  We aren't changing the size of
> the kernel to be larger than 2G text or data.  Small-PIC would still
> allow it to be placed anywhere in the address space, and would
> generate far better code.

My understanding was that small-PIC and medium-PIC assume that the
module code is in the lower 2G of memory. I will do additional testing
on the modules to confirm that.

>
> --
> Brian Gerst



-- 
Thomas

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 21/22] x86/module: Add support for mcmodel large and PLTs
  2017-07-19 15:58       ` Thomas Garnier
@ 2017-07-19 17:34         ` Brian Gerst
  2017-07-24 16:32           ` Thomas Garnier
  0 siblings, 1 reply; 55+ messages in thread
From: Brian Gerst @ 2017-07-19 17:34 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: H. Peter Anvin, Herbert Xu, David S . Miller, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Tejun Heo

On Wed, Jul 19, 2017 at 11:58 AM, Thomas Garnier <thgarnie@google.com> wrote:
> On Tue, Jul 18, 2017 at 8:59 PM, Brian Gerst <brgerst@gmail.com> wrote:
>> On Tue, Jul 18, 2017 at 9:35 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>>> On 07/18/17 15:33, Thomas Garnier wrote:
>>>> With PIE support and KASLR extended range, the modules may be further
>>>> away from the kernel than before breaking mcmodel=kernel expectations.
>>>>
>>>> Add an option to build modules with mcmodel=large. The modules generated
>>>> code will make no assumptions on placement in memory.
>>>>
>>>> Despite this option, modules still expect kernel functions to be within
>>>> 2G and generate relative calls. To solve this issue, the PLT arm64 code
>>>> was adapted for x86_64. When a relative relocation go outside its range,
>>>> a dynamic PLT entry is used to correctly jump to the destination.
>>>
>>> Why large as opposed to medium or medium-PIC?
>>
>> Or for that matter, why not small-PIC?  We aren't changing the size of
>> the kernel to be larger than 2G text or data.  Small-PIC would still
>> allow it to be placed anywhere in the address space, and would
>> generate far better code.
>
> My understanding was that small=PIC and medium=PIC assume that the
> module code is in the lower 2G of memory. I will do additional testing
> on the modules to confirm that.

That is only for small/medium absolute (non-PIC) code.  Think about
userspace shared libraries.  They are not limited to being mapped in
the lower 2G of the address space.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 16/22] x86/percpu: Adapt percpu for PIE support
  2017-07-19  3:08   ` Brian Gerst
@ 2017-07-19 18:26     ` Thomas Garnier
  2017-07-19 23:33       ` H. Peter Anvin
  0 siblings, 1 reply; 55+ messages in thread
From: Thomas Garnier @ 2017-07-19 18:26 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Tejun Heo

On Tue, Jul 18, 2017 at 8:08 PM, Brian Gerst <brgerst@gmail.com> wrote:
> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier <thgarnie@google.com> wrote:
>> Perpcu uses a clever design where the .percu ELF section has a virtual
>> address of zero and the relocation code avoid relocating specific
>> symbols. It makes the code simple and easily adaptable with or without
>> SMP support.
>>
>> This design is incompatible with PIE because generated code always try to
>> access the zero virtual address relative to the default mapping address.
>> It becomes impossible when KASLR is configured to go below -2G. This
>> patch solves this problem by removing the zero mapping and adapting the GS
>> base to be relative to the expected address. These changes are done only
>> when PIE is enabled. The original implementation is kept as-is
>> by default.
>
> The reason the per-cpu section is zero-based on x86-64 is to
> workaround GCC hardcoding the stack protector canary at %gs:40.  So
> this patch is incompatible with CONFIG_STACK_PROTECTOR.

Ok, that makes sense. I don't want this feature to break
CONFIG_CC_STACKPROTECTOR*. One way to fix that would be adding a GDT
entry for gs so that gs:40 points to the correct memory address and
gs:[rip+XX] works correctly through the MSR. Given the separate
discussion on mcmodel, I am first going to check whether we can move
from PIE to PIC with mcmodel=small or medium, which would remove the
percpu change requirement. I tried that before without success, but I
understand percpu and the other components better now, so maybe I can
make it work.

Thanks a lot for the feedback.

>
> --
> Brian Gerst



-- 
Thomas

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 13/22] x86/power/64: Adapt assembly for PIE support
  2017-07-18 22:33 ` [RFC 13/22] x86/power/64: " Thomas Garnier
@ 2017-07-19 18:41   ` Pavel Machek
  0 siblings, 0 replies; 55+ messages in thread
From: Pavel Machek @ 2017-07-19 18:41 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki, Len Brown, Tejun Heo

On Tue 2017-07-18 15:33:24, Thomas Garnier wrote:
> Change the assembly code to use only relative references to symbols, for the
> kernel to be PIE compatible.
>
> Position Independent Executable (PIE) support will allow extending the
> KASLR randomization range below the -2G memory limit.
> 
> Signed-off-by: Thomas Garnier <thgarnie@google.com>

Acked-by: Pavel Machek <pavel@ucw.cz>

(But not tested; testing it would be nice).
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: Re: x86: PIE support and option to extend KASLR randomization
  2017-07-19 14:08 ` x86: PIE support and option to extend KASLR randomization Christopher Lameter
@ 2017-07-19 19:21   ` Kees Cook
  0 siblings, 0 replies; 55+ messages in thread
From: Kees Cook @ 2017-07-19 19:21 UTC (permalink / raw)
  To: Christopher Lameter
  Cc: Thomas Garnier, Herbert Xu, David S . Miller, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki, Len Brown

On Wed, Jul 19, 2017 at 7:08 AM, Christopher Lameter <cl@linux.com> wrote:
> On Tue, 18 Jul 2017, Thomas Garnier wrote:
>
>> Performance/Size impact:
>> Hackbench (50% and 1600% loads):
>>  - PIE enabled: 7% to 8% on half load, 10% on heavy load.
>> slab_test (average of 10 runs):
>>  - PIE enabled: 3% to 4%
>> Kernbench (average of 10 Half and Optimal runs):
>>  - PIE enabled: 5% to 6%
>>
>> Size of vmlinux (Ubuntu configuration):
>>  File size:
>>  - PIE disabled: 472928672 bytes (-0.000169% from baseline)
>>  - PIE enabled: 216878461 bytes (-54.14% from baseline)
>
> Maybe we need something like CONFIG_PARANOIA so that we can determine at
> build time how much performance we want to sacrifice for security?
>
> It's going to be difficult to understand what all these hardening config
> options do.

This kind of thing got discussed recently, and like
CONFIG_EXPERIMENTAL, a global config doesn't really work. The best
thing to do is to document each config as well as possible so that
system builders can decide.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 06/22] kvm: Adapt assembly for PIE support
  2017-07-19 15:40     ` Thomas Garnier
@ 2017-07-19 22:27       ` H. Peter Anvin
  2017-07-19 22:44         ` Thomas Garnier
  2017-07-19 22:58         ` Ard Biesheuvel
  0 siblings, 2 replies; 55+ messages in thread
From: H. Peter Anvin @ 2017-07-19 22:27 UTC (permalink / raw)
  To: Thomas Garnier, Brian Gerst
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann, Matthias Kaehlcke,
	Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown

On 07/19/17 08:40, Thomas Garnier wrote:
>>
>> This doesn't look right.  It's accessing a per-cpu variable.  The
>> per-cpu section is an absolute, zero-based section and not subject to
>> relocation.
> 
> PIE does not respect the zero-based section; it tries to have
> everything relative. Patch 16/22 also adapts per-cpu to work with PIE
> (while keeping the zero absolute design by default).
> 

This is silly.  The right thing for PIE is to be explicitly absolute,
without (%rip).  The use of (%rip) memory references for percpu is just
an optimization.
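
To make the two forms concrete, a small sketch; my_percpu_var is a made-up
symbol that would live in the zero-based .data..percpu section (the snippet
compiles with gcc but needs such a symbol to link):

        static inline unsigned long read_absolute(void)
        {
                unsigned long v;
                /* absolute 32-bit displacement; this is what the zero-based
                 * per-cpu design relies on */
                asm("movq %%gs:my_percpu_var, %0" : "=r" (v));
                return v;
        }

        static inline unsigned long read_rip_relative(void)
        {
                unsigned long v;
                /* %rip-relative form; the point above is that this is only an
                 * optimization, not something the design requires */
                asm("movq %%gs:my_percpu_var(%%rip), %0" : "=r" (v));
                return v;
        }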

	-hpa

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations
  2017-07-18 22:33 ` [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations Thomas Garnier
@ 2017-07-19 22:33   ` H. Peter Anvin
  2017-07-19 22:47     ` Thomas Garnier
  0 siblings, 1 reply; 55+ messages in thread
From: H. Peter Anvin @ 2017-07-19 22:33 UTC (permalink / raw)
  To: Thomas Garnier, Herbert Xu, David S . Miller, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger
  Cc: x86, linux-crypto, linux-kernel, xen-devel, kvm, linux-pm,
	linux-arch, linux-sparse, kernel-hardening

On 07/18/17 15:33, Thomas Garnier wrote:
> The x86 relocation tool generates a list of 32-bit signed integers. There
> was no need to use 64-bit integers because all addresses were above the 2G
> top of the memory.
>
> This change adds a large-reloc option to generate 64-bit unsigned integers.
> It can be used when the kernel plans to go below the top 2G and 32-bit
> integers are not enough.

Why on Earth?  This would only be necessary if the *kernel itself* was
more than 2G, which isn't going to happen for the foreseeable future.

	-hpa

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 06/22] kvm: Adapt assembly for PIE support
  2017-07-19 22:27       ` H. Peter Anvin
@ 2017-07-19 22:44         ` Thomas Garnier
  2017-07-19 22:58         ` Ard Biesheuvel
  1 sibling, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-19 22:44 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Brian Gerst, Herbert Xu, David S . Miller, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Tejun Heo

On Wed, Jul 19, 2017 at 3:27 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 07/19/17 08:40, Thomas Garnier wrote:
>>>
>>> This doesn't look right.  It's accessing a per-cpu variable.  The
>>> per-cpu section is an absolute, zero-based section and not subject to
>>> relocation.
>>
>> PIE does not respect the zero-based section; it tries to have
>> everything relative. Patch 16/22 also adapts per-cpu to work with PIE
>> (while keeping the zero absolute design by default).
>>
>
> This is silly.  The right thing for PIE is to be explicitly absolute,
> without (%rip).  The use of (%rip) memory references for percpu is just
> an optimization.

I agree that it is odd but that's how the compiler generates code. I
will re-explore PIC options with mcmodel=small or medium, as mentioned
on other threads.

>
>         -hpa
>



-- 
Thomas

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations
  2017-07-19 22:33   ` H. Peter Anvin
@ 2017-07-19 22:47     ` Thomas Garnier
  2017-07-19 23:08       ` H. Peter Anvin
  0 siblings, 1 reply; 55+ messages in thread
From: Thomas Garnier @ 2017-07-19 22:47 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann, Matthias Kaehlcke,
	Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki, Len Brown,
	Pavel Machek, Tejun Heo

On Wed, Jul 19, 2017 at 3:33 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 07/18/17 15:33, Thomas Garnier wrote:
>> The x86 relocation tool generates a list of 32-bit signed integers. There
>> was no need to use 64-bit integers because all addresses were above the 2G
>> top of the memory.
>>
>> This change adds a large-reloc option to generate 64-bit unsigned integers.
>> It can be used when the kernel plans to go below the top 2G and 32-bit
>> integers are not enough.
>
> Why on Earth?  This would only be necessary if the *kernel itself* was
> more than 2G, which isn't going to happen for the foreseeable future.

Because the relocation integer is an absolute address, not an offset
in the binary. Next iteration, I can try using a 32-bit offset for
everyone.
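
Rough illustration of the two encodings being discussed (helper names are
made up; the real tool is arch/x86/tools/relocs.c):

        #include <stdint.h>

        /* Today each table entry is the 32-bit link-time virtual address of
         * the relocation site, which only works while the kernel is linked
         * in the top 2G (0xffffffff80000000 and up). */
        static uint32_t emit_abs32(uint64_t vaddr)
        {
                return (uint32_t)vaddr;         /* low 32 bits of the address */
        }

        /* The alternative mentioned above: store the offset from the start of
         * the image instead, so 32 bits are enough wherever the kernel is
         * linked. */
        static uint32_t emit_offset32(uint64_t vaddr, uint64_t text_base)
        {
                return (uint32_t)(vaddr - text_base);
        }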
>
>         -hpa
>



-- 
Thomas

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 06/22] kvm: Adapt assembly for PIE support
  2017-07-19 22:27       ` H. Peter Anvin
  2017-07-19 22:44         ` Thomas Garnier
@ 2017-07-19 22:58         ` Ard Biesheuvel
  2017-07-19 23:47           ` H. Peter Anvin
  1 sibling, 1 reply; 55+ messages in thread
From: Ard Biesheuvel @ 2017-07-19 22:58 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Garnier, Brian Gerst, Herbert Xu, David S . Miller,
	Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Josh Poimboeuf,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown, Pavel Machek

On 19 July 2017 at 23:27, H. Peter Anvin <hpa@zytor.com> wrote:
> On 07/19/17 08:40, Thomas Garnier wrote:
>>>
>>> This doesn't look right.  It's accessing a per-cpu variable.  The
>>> per-cpu section is an absolute, zero-based section and not subject to
>>> relocation.
>>
>> PIE does not respect the zero-based section; it tries to have
>> everything relative. Patch 16/22 also adapts per-cpu to work with PIE
>> (while keeping the zero absolute design by default).
>>
>
> This is silly.  The right thing for PIE is to be explicitly absolute,
> without (%rip).  The use of (%rip) memory references for percpu is just
> an optimization.
>

Sadly, there is an issue in binutils that may prevent us from doing
this as cleanly as we would want.

For historical reasons, bfd.ld emits special symbols like
__GLOBAL_OFFSET_TABLE__ as absolute symbols with a section index of
SHN_ABS, even though it is quite obvious that they are relative like
any other symbol that points into the image. Unfortunately, this means
that binutils needs to emit R_X86_64_RELATIVE relocations even for
SHN_ABS symbols, which means we lose the ability to use both absolute
and relocatable symbols in the same PIE image (unless the reloc tool
can filter them out)

More info here:
https://sourceware.org/bugzilla/show_bug.cgi?id=19818

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 07/22] x86: relocate_kernel - Adapt assembly for PIE support
  2017-07-18 22:33 ` [RFC 07/22] x86: relocate_kernel - " Thomas Garnier
@ 2017-07-19 22:58   ` H. Peter Anvin
  2017-07-19 23:23     ` Thomas Garnier
  0 siblings, 1 reply; 55+ messages in thread
From: H. Peter Anvin @ 2017-07-19 22:58 UTC (permalink / raw)
  To: Thomas Garnier, Herbert Xu, David S . Miller, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger
  Cc: x86, linux-crypto, linux-kernel, xen-devel, kvm, linux-pm,
	linux-arch, linux-sparse, kernel-hardening

On 07/18/17 15:33, Thomas Garnier wrote:
> Change the assembly code to use only relative references to symbols, for the
> kernel to be PIE compatible.
>
> Position Independent Executable (PIE) support will allow extending the
> KASLR randomization range below the -2G memory limit.
> 
> Signed-off-by: Thomas Garnier <thgarnie@google.com>
> ---
>  arch/x86/kernel/relocate_kernel_64.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
> index 98111b38ebfd..da817d1628ac 100644
> --- a/arch/x86/kernel/relocate_kernel_64.S
> +++ b/arch/x86/kernel/relocate_kernel_64.S
> @@ -186,7 +186,7 @@ identity_mapped:
>  	movq	%rax, %cr3
>  	lea	PAGE_SIZE(%r8), %rsp
>  	call	swap_pages
> -	movq	$virtual_mapped, %rax
> +	leaq	virtual_mapped(%rip), %rax
>  	pushq	%rax
>  	ret
>  

This is completely wrong.  The whole point is that %rip here is on an
identity-mapped page, which means that its offset to the actual symbol
is ill-defined.

The use of pushq/ret to do an indirect jump is bizarre, though, instead of:

	pushq %r8
	ret

one ought to simply do

	jmpq *%r8

I think the author of this code was confused by the fact that we have to
use this construct to do a *far* jump.

There are some other very bizarre constructs in this file, which I can
only assume come from clumsy porting from 32 bits, for example:

	call 1f
1:
	popq %r8
	subq $(1b - relocate_kernel), %r8

... instead of the much simpler ...

	leaq relocate_kernel(%rip), %r8

With this value in %r8 anyway, you can simply do:

	leaq (virtual_mapped - relocate_kernel)(%r8), %rax
	jmpq *%rax

This patchset scares me.  There seem to be a lot of places where you
have not been very aware of what is actually happening in the code, nor
have you done the research on how the ABIs actually work and affect things.

Sorry.

	-hpa

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations
  2017-07-19 22:47     ` Thomas Garnier
@ 2017-07-19 23:08       ` H. Peter Anvin
  2017-07-19 23:25         ` Thomas Garnier
  0 siblings, 1 reply; 55+ messages in thread
From: H. Peter Anvin @ 2017-07-19 23:08 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann, Matthias Kaehlcke,
	Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki

On 07/19/17 15:47, Thomas Garnier wrote:
> On Wed, Jul 19, 2017 at 3:33 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>> On 07/18/17 15:33, Thomas Garnier wrote:
>>> The x86 relocation tool generates a list of 32-bit signed integers. There
>>> was no need to use 64-bit integers because all addresses were above the 2G
>>> top of the memory.
>>>
>>> This change adds a large-reloc option to generate 64-bit unsigned integers.
>>> It can be used when the kernel plans to go below the top 2G and 32-bit
>>> integers are not enough.
>>
>> Why on Earth?  This would only be necessary if the *kernel itself* was
>> more than 2G, which isn't going to happen for the foreseeable future.
> 
> Because the relocation integer is an absolute address, not an offset
> in the binary. Next iteration, I can try using a 32-bit offset for
> everyone.

It is an absolute address *as the kernel was originally linked*, for
obvious reasons.

	-hpa

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 07/22] x86: relocate_kernel - Adapt assembly for PIE support
  2017-07-19 22:58   ` H. Peter Anvin
@ 2017-07-19 23:23     ` Thomas Garnier
  0 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-19 23:23 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann, Matthias Kaehlcke,
	Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki, Len Brown,
	Pavel Machek, Tejun Heo

On Wed, Jul 19, 2017 at 3:58 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 07/18/17 15:33, Thomas Garnier wrote:
>> Change the assembly code to use only relative references to symbols, for the
>> kernel to be PIE compatible.
>>
>> Position Independent Executable (PIE) support will allow extending the
>> KASLR randomization range below the -2G memory limit.
>>
>> Signed-off-by: Thomas Garnier <thgarnie@google.com>
>> ---
>>  arch/x86/kernel/relocate_kernel_64.S | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
>> index 98111b38ebfd..da817d1628ac 100644
>> --- a/arch/x86/kernel/relocate_kernel_64.S
>> +++ b/arch/x86/kernel/relocate_kernel_64.S
>> @@ -186,7 +186,7 @@ identity_mapped:
>>       movq    %rax, %cr3
>>       lea     PAGE_SIZE(%r8), %rsp
>>       call    swap_pages
>> -     movq    $virtual_mapped, %rax
>> +     leaq    virtual_mapped(%rip), %rax
>>       pushq   %rax
>>       ret
>>
>
> This is completely wrong.  The whole point is that %rip here is on an
> identity-mapped page, which means that its offset to the actual symbol
> is ill-defined.
>
> The use of pushq/ret to do an indirect jump is bizarre, though, instead of:
>
>         pushq %r8
>         ret
>
> one ought to simply do
>
>         jmpq *%r8
>
> I think the author of this code was confused by the fact that we have to
> use this construct to do a *far* jump.
>
> There are some other very bizarre constructs in this file, that I can
> only assume comes from clumsy porting from 32 bits, for example:
>
>         call 1f
> 1:
>         popq %r8
>         subq $(1b - relocate_kernel), %r8
>
> ... instead of the much simpler ...
>
>         leaq relocate_kernel(%rip), %r8
>
> With this value in %r8 anyway, you can simply do:
>
>         leaq (virtual_mapped - relocate_kernel)(%r8), %rax
>         jmpq *%rax
>

Thanks I will look into that.

> This patchset scares me.  There seems to be a lot of places where you
> have not been very aware of what is actually happening in the code, nor
> have done research about how the ABIs actually work and affect things.

There is a lot of assembly that needed to be changed. It was easier to
understand the parts that are directly exercised, like boot or percpu.
That's why I value people's feedback and will improve the patchset.

Thanks!

>
> Sorry.
>
>         -hpa



-- 
Thomas

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations
  2017-07-19 23:08       ` H. Peter Anvin
@ 2017-07-19 23:25         ` Thomas Garnier
  2017-07-19 23:45           ` H. Peter Anvin
  0 siblings, 1 reply; 55+ messages in thread
From: Thomas Garnier @ 2017-07-19 23:25 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann, Matthias Kaehlcke,
	Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger, Rafael J . Wysocki, Len Brown,
	Pavel Machek, Tejun Heo

On Wed, Jul 19, 2017 at 4:08 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 07/19/17 15:47, Thomas Garnier wrote:
>> On Wed, Jul 19, 2017 at 3:33 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>>> On 07/18/17 15:33, Thomas Garnier wrote:
>>>> The x86 relocation tool generates a list of 32-bit signed integers. There
>>>> was no need to use 64-bit integers because all addresses were above the 2G
>>>> top of the memory.
>>>>
>>>> This change adds a large-reloc option to generate 64-bit unsigned integers.
>>>> It can be used when the kernel plans to go below the top 2G and 32-bit
>>>> integers are not enough.
>>>
>>> Why on Earth?  This would only be necessary if the *kernel itself* was
>>> more than 2G, which isn't going to happen for the foreseeable future.
>>
>> Because the relocation integer is an absolute address, not an offset
>> in the binary. Next iteration, I can try using a 32-bit offset for
>> everyone.
>
> It is an absolute address *as the kernel was originally linked*, for
> obvious reasons.

Sure, that works when the kernel is just above 0xffffffff80000000, but it
doesn't when it goes down to 0xffffffff00000000. That's why using an
offset might make more sense in general.

>
>         -hpa
>



-- 
Thomas.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 16/22] x86/percpu: Adapt percpu for PIE support
  2017-07-19 18:26     ` Thomas Garnier
@ 2017-07-19 23:33       ` H. Peter Anvin
  2017-07-20  2:21         ` H. Peter Anvin
  2017-07-20 14:26         ` Thomas Garnier
  0 siblings, 2 replies; 55+ messages in thread
From: H. Peter Anvin @ 2017-07-19 23:33 UTC (permalink / raw)
  To: Thomas Garnier, Brian Gerst
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann, Matthias Kaehlcke,
	Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown

On 07/19/17 11:26, Thomas Garnier wrote:
> On Tue, Jul 18, 2017 at 8:08 PM, Brian Gerst <brgerst@gmail.com> wrote:
>> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier <thgarnie@google.com> wrote:
>>> Percpu uses a clever design where the .percpu ELF section has a virtual
>>> address of zero and the relocation code avoids relocating specific
>>> symbols. It makes the code simple and easily adaptable with or without
>>> SMP support.
>>>
>>> This design is incompatible with PIE because generated code always tries to
>>> access the zero virtual address relative to the default mapping address.
>>> It becomes impossible when KASLR is configured to go below -2G. This
>>> patch solves this problem by removing the zero mapping and adapting the GS
>>> base to be relative to the expected address. These changes are done only
>>> when PIE is enabled. The original implementation is kept as-is
>>> by default.
>>
>> The reason the per-cpu section is zero-based on x86-64 is to
>> workaround GCC hardcoding the stack protector canary at %gs:40.  So
>> this patch is incompatible with CONFIG_STACK_PROTECTOR.
> 
>> Ok, that makes sense. I don't want this feature to not work with
> CONFIG_CC_STACKPROTECTOR*. One way to fix that would be adding a GDT
> entry for gs so gs:40 points to the correct memory address and
> gs:[rip+XX] works correctly through the MSR.

What are you talking about?  A GDT entry and the MSR do the same thing,
except that a GDT entry is limited to an offset of 0-0xffffffff (which
doesn't work for us, obviously.)

> Given the separate
> discussion on mcmodel, I am going first to check if we can move from
> PIE to PIC with a mcmodel=small or medium that would remove the percpu
> change requirement. I tried before without success but I understand
> better percpu and other components so maybe I can make it work.

>> This is silly.  The right thing is for PIE is to be explicitly absolute,
>> without (%rip).  The use of (%rip) memory references for percpu is just
>> an optimization.
> 
> I agree that it is odd but that's how the compiler generates code. I
> will re-explore PIC options with mcmodel=small or medium, as mentioned
> on other threads.

Why should the way the compiler generates code affect the way we do things
in assembly?

That being said, the compiler now has support for generating this kind
of code explicitly via the __seg_gs pointer modifier.  That should let
us drop the __percpu_prefix and just use variables directly.  I suspect
we want to declare percpu variables as "volatile __seg_gs" to account
for the possibility of CPU switches.

Older compilers won't be able to work with this, of course, but I think
that it is acceptable for those older compilers to not be able to
support PIE.
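
A minimal sketch of what that could look like with gcc's named address
spaces (gcc >= 6); illustrative only, not what the kernel does today:

        #ifdef __SEG_GS         /* defined by gcc when __seg_gs is available */
        /* Read an unsigned long that lives at offset `off` from the %gs base.
         * The compiler emits the %gs: override itself, so no asm wrapper or
         * __percpu_prefix-style macro is needed; volatile is the hedge
         * against caching the value across a CPU switch. */
        static inline unsigned long gs_read_ulong(unsigned long off)
        {
                return *(volatile __seg_gs unsigned long *)off;
        }
        #endif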

	-hpa

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations
  2017-07-19 23:25         ` Thomas Garnier
@ 2017-07-19 23:45           ` H. Peter Anvin
  0 siblings, 0 replies; 55+ messages in thread
From: H. Peter Anvin @ 2017-07-19 23:45 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann, Matthias Kaehlcke,
	Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Brian Gerst, Borislav Petkov,
	Christian Borntraeger

On July 19, 2017 4:25:56 PM PDT, Thomas Garnier <thgarnie@google.com> wrote:
>On Wed, Jul 19, 2017 at 4:08 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>> On 07/19/17 15:47, Thomas Garnier wrote:
>>> On Wed, Jul 19, 2017 at 3:33 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>>>> On 07/18/17 15:33, Thomas Garnier wrote:
>>>>> The x86 relocation tool generates a list of 32-bit signed integers.
>>>>> There was no need to use 64-bit integers because all addresses were
>>>>> above the 2G top of the memory.
>>>>>
>>>>> This change adds a large-reloc option to generate 64-bit unsigned
>>>>> integers. It can be used when the kernel plans to go below the top 2G
>>>>> and 32-bit integers are not enough.
>>>>
>>>> Why on Earth?  This would only be necessary if the *kernel itself* was
>>>> more than 2G, which isn't going to happen for the foreseeable future.
>>>
>>> Because the relocation integer is an absolute address, not an offset
>>> in the binary. Next iteration, I can try using a 32-bit offset for
>>> everyone.
>>
>> It is an absolute address *as the kernel was originally linked*, for
>> obvious reasons.
>
>Sure, that works when the kernel is just above 0xffffffff80000000, but it
>doesn't when it goes down to 0xffffffff00000000. That's why using an
>offset might make more sense in general.
>
>>
>>         -hpa
>>

What is the motivation for changing the pre-linked address at all?
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 06/22] kvm: Adapt assembly for PIE support
  2017-07-19 22:58         ` Ard Biesheuvel
@ 2017-07-19 23:47           ` H. Peter Anvin
  0 siblings, 0 replies; 55+ messages in thread
From: H. Peter Anvin @ 2017-07-19 23:47 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Thomas Garnier, Brian Gerst, Herbert Xu, David S . Miller,
	Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Josh Poimboeuf,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov

On July 19, 2017 3:58:07 PM PDT, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>On 19 July 2017 at 23:27, H. Peter Anvin <hpa@zytor.com> wrote:
>> On 07/19/17 08:40, Thomas Garnier wrote:
>>>>
>>>> This doesn't look right.  It's accessing a per-cpu variable.  The
>>>> per-cpu section is an absolute, zero-based section and not subject to
>>>> relocation.
>>>
>>> PIE does not respect the zero-based section; it tries to have
>>> everything relative. Patch 16/22 also adapts per-cpu to work with PIE
>>> (while keeping the zero absolute design by default).
>>>
>>
>> This is silly.  The right thing for PIE is to be explicitly absolute,
>> without (%rip).  The use of (%rip) memory references for percpu is just
>> an optimization.
>>
>
>Sadly, there is an issue in binutils that may prevent us from doing
>this as cleanly as we would want.
>
>For historical reasons, bfd.ld emits special symbols like
>__GLOBAL_OFFSET_TABLE__ as absolute symbols with a section index of
>SHN_ABS, even though it is quite obvious that they are relative like
>any other symbol that points into the image. Unfortunately, this means
>that binutils needs to emit R_X86_64_RELATIVE relocations even for
>SHN_ABS symbols, which means we lose the ability to use both absolute
>and relocatable symbols in the same PIE image (unless the reloc tool
>can filter them out).
>
>More info here:
>https://sourceware.org/bugzilla/show_bug.cgi?id=19818

The reloc tool already has the ability to filter symbols.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 16/22] x86/percpu: Adapt percpu for PIE support
  2017-07-19 23:33       ` H. Peter Anvin
@ 2017-07-20  2:21         ` H. Peter Anvin
  2017-07-20  3:03           ` H. Peter Anvin
  2017-07-20 14:26         ` Thomas Garnier
  1 sibling, 1 reply; 55+ messages in thread
From: H. Peter Anvin @ 2017-07-20  2:21 UTC (permalink / raw)
  To: Thomas Garnier, Brian Gerst
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann, Matthias Kaehlcke,
	Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown

On 07/19/17 16:33, H. Peter Anvin wrote:
>>
>> I agree that it is odd but that's how the compiler generates code. I
>> will re-explore PIC options with mcmodel=small or medium, as mentioned
>> on other threads.
> 
> Why should the way compiler generates code affect the way we do things
> in assembly?
> 
> That being said, the compiler now has support for generating this kind
> of code explicitly via the __seg_gs pointer modifier.  That should let
> us drop the __percpu_prefix and just use variables directly.  I suspect
> we want to declare percpu variables as "volatile __seg_gs" to account
> for the possibility of CPU switches.
> 
> Older compilers won't be able to work with this, of course, but I think
> that it is acceptable for those older compilers to not be able to
> support PIE.
> 

Grump.  It turns out that the compiler doesn't do the right thing for
symbols marked with the __seg_[fg]s markers.  __thread does the right
thing, but __thread a) has %fs: hard-coded, still, and b) I believe can
still cache %seg:0 arbitrarily long.

	-hpa

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 16/22] x86/percpu: Adapt percpu for PIE support
  2017-07-20  2:21         ` H. Peter Anvin
@ 2017-07-20  3:03           ` H. Peter Anvin
  0 siblings, 0 replies; 55+ messages in thread
From: H. Peter Anvin @ 2017-07-20  3:03 UTC (permalink / raw)
  To: Thomas Garnier, Brian Gerst
  Cc: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann, Matthias Kaehlcke,
	Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown

On 07/19/17 19:21, H. Peter Anvin wrote:
> On 07/19/17 16:33, H. Peter Anvin wrote:
>>>
>>> I agree that it is odd but that's how the compiler generates code. I
>>> will re-explore PIC options with mcmodel=small or medium, as mentioned
>>> on other threads.
>>
>> Why should the way compiler generates code affect the way we do things
>> in assembly?
>>
>> That being said, the compiler now has support for generating this kind
>> of code explicitly via the __seg_gs pointer modifier.  That should let
>> us drop the __percpu_prefix and just use variables directly.  I suspect
>> we want to declare percpu variables as "volatile __seg_gs" to account
>> for the possibility of CPU switches.
>>
>> Older compilers won't be able to work with this, of course, but I think
>> that it is acceptable for those older compilers to not be able to
>> support PIE.
>>
> 
> Grump.  It turns out that the compiler doesn't do the right thing for
> symbols marked with the __seg_[fg]s markers.  __thread does the right
> thing, but __thread a) has %fs: hard-coded, still, and b) I believe can
> still cache %seg:0 arbitrarily long.

I filed this bug report for gcc:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81490

It might still be possible to work around this by playing really ugly
games with __thread, but I haven't yet figured out how best to do that.

	-hpa

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 16/22] x86/percpu: Adapt percpu for PIE support
  2017-07-19 23:33       ` H. Peter Anvin
  2017-07-20  2:21         ` H. Peter Anvin
@ 2017-07-20 14:26         ` Thomas Garnier
  2017-08-02 16:42           ` Thomas Garnier
  1 sibling, 1 reply; 55+ messages in thread
From: Thomas Garnier @ 2017-07-20 14:26 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Brian Gerst, Herbert Xu, David S . Miller, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Tejun Heo

On Wed, Jul 19, 2017 at 4:33 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 07/19/17 11:26, Thomas Garnier wrote:
>> On Tue, Jul 18, 2017 at 8:08 PM, Brian Gerst <brgerst@gmail.com> wrote:
>>> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier <thgarnie@google.com> wrote:
>>>> Percpu uses a clever design where the .percpu ELF section has a virtual
>>>> address of zero and the relocation code avoids relocating specific
>>>> symbols. It makes the code simple and easily adaptable with or without
>>>> SMP support.
>>>>
>>>> This design is incompatible with PIE because generated code always tries to
>>>> access the zero virtual address relative to the default mapping address.
>>>> It becomes impossible when KASLR is configured to go below -2G. This
>>>> patch solves this problem by removing the zero mapping and adapting the GS
>>>> base to be relative to the expected address. These changes are done only
>>>> when PIE is enabled. The original implementation is kept as-is
>>>> by default.
>>>
>>> The reason the per-cpu section is zero-based on x86-64 is to
>>> workaround GCC hardcoding the stack protector canary at %gs:40.  So
>>> this patch is incompatible with CONFIG_STACK_PROTECTOR.
>>
>>> Ok, that makes sense. I don't want this feature to not work with
>> CONFIG_CC_STACKPROTECTOR*. One way to fix that would be adding a GDT
>> entry for gs so gs:40 points to the correct memory address and
>> gs:[rip+XX] works correctly through the MSR.
>
> What are you talking about?  A GDT entry and the MSR do the same thing,
> except that a GDT entry is limited to an offset of 0-0xffffffff (which
> doesn't work for us, obviously.)
>

A GDT entry would allow gs:0x40 to be valid while all gs:[rip+XX]
addresses use the MSR.

I didn't test it, but that approach was used in the RFG mitigation [1]. The fs
segment register was used for both thread storage and the shadow stack.

[1] http://xlab.tencent.com/en/2016/11/02/return-flow-guard/

>> Given the separate
>> discussion on mcmodel, I am going first to check if we can move from
>> PIE to PIC with a mcmodel=small or medium that would remove the percpu
>> change requirement. I tried before without success but I understand
>> better percpu and other components so maybe I can make it work.
>
>>> This is silly.  The right thing is for PIE is to be explicitly absolute,
>>> without (%rip).  The use of (%rip) memory references for percpu is just
>>> an optimization.
>>
>> I agree that it is odd but that's how the compiler generates code. I
>> will re-explore PIC options with mcmodel=small or medium, as mentioned
>> on other threads.
>
> Why should the way compiler generates code affect the way we do things
> in assembly?
>
> That being said, the compiler now has support for generating this kind
> of code explicitly via the __seg_gs pointer modifier.  That should let
> us drop the __percpu_prefix and just use variables directly.  I suspect
> we want to declare percpu variables as "volatile __seg_gs" to account
> for the possibility of CPU switches.
>
> Older compilers won't be able to work with this, of course, but I think
> that it is acceptable for those older compilers to not be able to
> support PIE.
>
>         -hpa
>



-- 
Thomas

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 21/22] x86/module: Add support for mcmodel large and PLTs
  2017-07-19 17:34         ` Brian Gerst
@ 2017-07-24 16:32           ` Thomas Garnier
  0 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-07-24 16:32 UTC (permalink / raw)
  To: Brian Gerst
  Cc: H. Peter Anvin, Herbert Xu, David S . Miller, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Tejun Heo

On Wed, Jul 19, 2017 at 10:34 AM, Brian Gerst <brgerst@gmail.com> wrote:
> On Wed, Jul 19, 2017 at 11:58 AM, Thomas Garnier <thgarnie@google.com> wrote:
>> On Tue, Jul 18, 2017 at 8:59 PM, Brian Gerst <brgerst@gmail.com> wrote:
>>> On Tue, Jul 18, 2017 at 9:35 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>>>> On 07/18/17 15:33, Thomas Garnier wrote:
>>>>> With PIE support and KASLR extended range, the modules may be further
>>>>> away from the kernel than before, breaking mcmodel=kernel expectations.
>>>>>
>>>>> Add an option to build modules with mcmodel=large. The generated module
>>>>> code will make no assumptions about placement in memory.
>>>>>
>>>>> Despite this option, modules still expect kernel functions to be within
>>>>> 2G and generate relative calls. To solve this issue, the arm64 PLT code
>>>>> was adapted for x86_64. When a relative relocation goes outside its range,
>>>>> a dynamic PLT entry is used to correctly jump to the destination.
>>>>
>>>> Why large as opposed to medium or medium-PIC?
>>>
>>> Or for that matter, why not small-PIC?  We aren't changing the size of
>>> the kernel to be larger than 2G text or data.  Small-PIC would still
>>> allow it to be placed anywhere in the address space, and would
>>> generate far better code.
>>
>> My understanding was that small-PIC and medium-PIC assume that the
>> module code is in the lower 2G of memory. I will do additional testing
>> on the modules to confirm that.
>
> That is only for small/medium absolute (non-PIC) code.  Think about
> userspace shared libraries.  They are not limited to being mapped in
> the lower 2G of the address space.

I built lkdtm with mcmodel=(kernel, small, medium & large).

Comparing the same instruction and its relocation in lkdtm
(lkdtm_register_cpoint).

On mcmodel=kernel:

     1b8:       48 c7 c7 00 00 00 00    mov    $0x0,%rdi
                        1bb: R_X86_64_32S       .rodata.str1.8+0x50

On mcmodel=small and mcmodel=medium:

     1b8:       bf 00 00 00 00          mov    $0x0,%edi
     1b9: R_X86_64_32        .rodata.str1.8+0x50

On mcmodel=large:

     235:       48 bf 00 00 00 00 00    movabs $0x0,%rdi
     23c:       00 00 00
                        237: R_X86_64_64        .rodata.str1.8+0x50

The kernel mcmodel sign-extends the address. It assumes you are
in the top 2G of the address space. So the relocated pointer
0x8XXXXXXX becomes 0xFFFFFFFF8XXXXXXX.

The small and medium mcmodels assume the pointer is within the lower
part of the address space. The generated pointer has its high 32 bits
set to zero. You can only map the module between 0 and 0xFFFFFFFF.

The large mcmodel can handle a full 64-bit pointer.

That's why I use the large mcmodel on modules. I cannot use PIE due to
how the modules are linked.
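
For the out-of-range relative calls, a hypothetical x86-64 module PLT entry
mirroring the arm64 scheme could look like the following; the layout and
names are illustrative, not the patch's actual implementation:

        #include <stdint.h>

        /* One trampoline per out-of-range target: movabs the 64-bit
         * destination into a scratch register, then jump through it. */
        struct x86_plt_entry {
                uint8_t  movabs[2];     /* 49 bb : movabs $imm64, %r11 */
                uint64_t target;        /* destination, patched at module load */
                uint8_t  jmp[3];        /* 41 ff e3 : jmp *%r11 */
        } __attribute__((packed));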

>
> --
> Brian Gerst



-- 
Thomas

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 16/22] x86/percpu: Adapt percpu for PIE support
  2017-07-20 14:26         ` Thomas Garnier
@ 2017-08-02 16:42           ` Thomas Garnier
  2017-08-02 16:56             ` Kees Cook
  0 siblings, 1 reply; 55+ messages in thread
From: Thomas Garnier @ 2017-08-02 16:42 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Brian Gerst, Herbert Xu, David S . Miller, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
	Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
	Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown, Pavel Machek, Tejun Heo

On Thu, Jul 20, 2017 at 7:26 AM, Thomas Garnier <thgarnie@google.com> wrote:
> On Wed, Jul 19, 2017 at 4:33 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>> On 07/19/17 11:26, Thomas Garnier wrote:
>>> On Tue, Jul 18, 2017 at 8:08 PM, Brian Gerst <brgerst@gmail.com> wrote:
>>>> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier <thgarnie@google.com> wrote:
>>>>> Percpu uses a clever design where the .percpu ELF section has a virtual
>>>>> address of zero and the relocation code avoids relocating specific
>>>>> symbols. It makes the code simple and easily adaptable with or without
>>>>> SMP support.
>>>>>
>>>>> This design is incompatible with PIE because generated code always tries to
>>>>> access the zero virtual address relative to the default mapping address.
>>>>> It becomes impossible when KASLR is configured to go below -2G. This
>>>>> patch solves this problem by removing the zero mapping and adapting the GS
>>>>> base to be relative to the expected address. These changes are done only
>>>>> when PIE is enabled. The original implementation is kept as-is
>>>>> by default.
>>>>
>>>> The reason the per-cpu section is zero-based on x86-64 is to
>>>> workaround GCC hardcoding the stack protector canary at %gs:40.  So
>>>> this patch is incompatible with CONFIG_STACK_PROTECTOR.
>>>
>>> Ok, that makes sense. I don't want this feature to not work with
>>> CONFIG_CC_STACKPROTECTOR*. One way to fix that would be adding a GDT
>>> entry for gs so gs:40 points to the correct memory address and
>>> gs:[rip+XX] works correctly through the MSR.
>>
>> What are you talking about?  A GDT entry and the MSR do the same thing,
>> except that a GDT entry is limited to an offset of 0-0xffffffff (which
>> doesn't work for us, obviously.)
>>
>
> A GDT entry would allow gs:0x40 to be valid while all gs:[rip+XX]
> addresses uses the MSR.
>
> I didn't tested it but that was used on the RFG mitigation [1]. The fs
> segment register was used for both thread storage and shadow stack.
>
> [1] http://xlab.tencent.com/en/2016/11/02/return-flow-guard/
>

Small update on that.

I noticed that gs:0x40 not being accessible is not the only problem: the
compiler will also default to the fs register if mcmodel=kernel is not
set.

On the next patch set, I am going to add support for
-mstack-protector-guard=global so a global variable can be used
instead of the segment register, similar to the approach on ARM/ARM64.
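
For illustration, the global-guard arrangement boils down to something like
this (it is how arm/arm64 handle it today; a single shared canary is of
course weaker than a per-task or per-cpu one):

        /* With -mstack-protector-guard=global the compiler references this
         * one symbol instead of %gs:40; the kernel defines it and seeds it
         * once at boot (illustrative helper below). */
        unsigned long __stack_chk_guard;

        void seed_stack_guard(unsigned long random_value)
        {
                __stack_chk_guard = random_value;
        }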

Following this patch, I will work with gcc and llvm to add
-mstack-protector-reg=<segment register> support, similar to PowerPC.
This way we can have gs used even without mcmodel=kernel. Once that's
an option, I can set up the GDT as described in the previous email
(similar to RFG).

Let me know what you think about this approach.

>>> Given the separate
>>> discussion on mcmodel, I am going first to check if we can move from
>>> PIE to PIC with a mcmodel=small or medium that would remove the percpu
>>> change requirement. I tried before without success but I understand
>>> better percpu and other components so maybe I can make it work.
>>
>>>> This is silly.  The right thing is for PIE is to be explicitly absolute,
>>>> without (%rip).  The use of (%rip) memory references for percpu is just
>>>> an optimization.
>>>
>>> I agree that it is odd but that's how the compiler generates code. I
>>> will re-explore PIC options with mcmodel=small or medium, as mentioned
>>> on other threads.
>>
>> Why should the way compiler generates code affect the way we do things
>> in assembly?
>>
>> That being said, the compiler now has support for generating this kind
>> of code explicitly via the __seg_gs pointer modifier.  That should let
>> us drop the __percpu_prefix and just use variables directly.  I suspect
>> we want to declare percpu variables as "volatile __seg_gs" to account
>> for the possibility of CPU switches.
>>
>> Older compilers won't be able to work with this, of course, but I think
>> that it is acceptable for those older compilers to not be able to
>> support PIE.
>>
>>         -hpa
>>
>
>
>
> --
> Thomas



-- 
Thomas

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 16/22] x86/percpu: Adapt percpu for PIE support
  2017-08-02 16:42           ` Thomas Garnier
@ 2017-08-02 16:56             ` Kees Cook
  2017-08-02 18:05               ` Thomas Garnier
  0 siblings, 1 reply; 55+ messages in thread
From: Kees Cook @ 2017-08-02 16:56 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: H. Peter Anvin, Brian Gerst, Herbert Xu, David S . Miller,
	Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Josh Poimboeuf,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown, Pavel Machek

On Wed, Aug 2, 2017 at 9:42 AM, Thomas Garnier <thgarnie@google.com> wrote:
> I noticed that not only we have the problem of gs:0x40 not being
> accessible. The compiler will default to the fs register if
> mcmodel=kernel is not set.
>
> On the next patch set, I am going to add support for
> -mstack-protector-guard=global so a global variable can be used
> instead of the segment register. Similar approach than ARM/ARM64.

While this is probably understood, I have to point out that this would
be a major regression for the stack protection on x86.

> Following this patch, I will work with gcc and llvm to add
> -mstack-protector-reg=<segment register> support similar to PowerPC.
> This way we can have gs used even without mcmodel=kernel. Once that's
> an option, I can setup the GDT as described in the previous email
> (similar to RFG).

It would be much nicer if we could teach gcc about the percpu area
instead. This would let us solve the global stack protector problem on
the other architectures:
http://www.openwall.com/lists/kernel-hardening/2017/06/27/6

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC 16/22] x86/percpu: Adapt percpu for PIE support
  2017-08-02 16:56             ` Kees Cook
@ 2017-08-02 18:05               ` Thomas Garnier
  0 siblings, 0 replies; 55+ messages in thread
From: Thomas Garnier @ 2017-08-02 18:05 UTC (permalink / raw)
  To: Kees Cook
  Cc: H. Peter Anvin, Brian Gerst, Herbert Xu, David S . Miller,
	Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Josh Poimboeuf,
	Arnd Bergmann, Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross,
	Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Andy Lutomirski, Borislav Petkov,
	Kirill A . Shutemov, Borislav Petkov, Christian Borntraeger,
	Rafael J . Wysocki, Len Brown, Pavel Machek

On Wed, Aug 2, 2017 at 9:56 AM, Kees Cook <keescook@chromium.org> wrote:
> On Wed, Aug 2, 2017 at 9:42 AM, Thomas Garnier <thgarnie@google.com> wrote:
>> I noticed that not only we have the problem of gs:0x40 not being
>> accessible. The compiler will default to the fs register if
>> mcmodel=kernel is not set.
>>
>> On the next patch set, I am going to add support for
>> -mstack-protector-guard=global so a global variable can be used
>> instead of the segment register. Similar approach than ARM/ARM64.
>
> While this is probably understood, I have to point out that this would
> be a major regression for the stack protection on x86.

I agree; the optimal solution will be to use an updated gcc/clang.

>
>> Following this patch, I will work with gcc and llvm to add
>> -mstack-protector-reg=<segment register> support similar to PowerPC.
>> This way we can have gs used even without mcmodel=kernel. Once that's
>> an option, I can setup the GDT as described in the previous email
>> (similar to RFG).
>
> It would be much nicer if we could teach gcc about the percpu area
> instead. This would let us solve the global stack protector problem on
> the other architectures:
> http://www.openwall.com/lists/kernel-hardening/2017/06/27/6

Yes, while I am looking at gcc I will take a look at the other
architectures to see if I can help there too.

>
> -Kees
>
> --
> Kees Cook
> Pixel Security



-- 
Thomas

^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, other threads:[~2017-08-02 18:05 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-18 22:33 x86: PIE support and option to extend KASLR randomization Thomas Garnier
2017-07-18 22:33 ` [RFC 01/22] x86/crypto: Adapt assembly for PIE support Thomas Garnier
2017-07-18 22:33 ` [RFC 02/22] x86: Use symbol name on bug table " Thomas Garnier
2017-07-18 22:33 ` [RFC 03/22] x86: Use symbol name in jump " Thomas Garnier
2017-07-18 22:33 ` [RFC 04/22] x86: Add macro to get symbol address " Thomas Garnier
2017-07-18 22:33 ` [RFC 05/22] xen: Adapt assembly " Thomas Garnier
2017-07-18 22:33 ` [RFC 06/22] kvm: " Thomas Garnier
2017-07-19  2:49   ` Brian Gerst
2017-07-19 15:40     ` Thomas Garnier
2017-07-19 22:27       ` H. Peter Anvin
2017-07-19 22:44         ` Thomas Garnier
2017-07-19 22:58         ` Ard Biesheuvel
2017-07-19 23:47           ` H. Peter Anvin
2017-07-18 22:33 ` [RFC 07/22] x86: relocate_kernel - " Thomas Garnier
2017-07-19 22:58   ` H. Peter Anvin
2017-07-19 23:23     ` Thomas Garnier
2017-07-18 22:33 ` [RFC 08/22] x86/entry/64: " Thomas Garnier
2017-07-18 22:33 ` [RFC 09/22] x86: pm-trace - " Thomas Garnier
2017-07-18 22:33 ` [RFC 10/22] x86/CPU: " Thomas Garnier
2017-07-18 22:33 ` [RFC 11/22] x86/acpi: " Thomas Garnier
2017-07-18 22:33 ` [RFC 12/22] x86/boot/64: " Thomas Garnier
2017-07-18 22:33 ` [RFC 13/22] x86/power/64: " Thomas Garnier
2017-07-19 18:41   ` Pavel Machek
2017-07-18 22:33 ` [RFC 14/22] x86/paravirt: " Thomas Garnier
2017-07-18 22:33 ` [RFC 15/22] x86/boot/64: Use _text in a global " Thomas Garnier
2017-07-18 22:33 ` [RFC 16/22] x86/percpu: Adapt percpu " Thomas Garnier
2017-07-19  3:08   ` Brian Gerst
2017-07-19 18:26     ` Thomas Garnier
2017-07-19 23:33       ` H. Peter Anvin
2017-07-20  2:21         ` H. Peter Anvin
2017-07-20  3:03           ` H. Peter Anvin
2017-07-20 14:26         ` Thomas Garnier
2017-08-02 16:42           ` Thomas Garnier
2017-08-02 16:56             ` Kees Cook
2017-08-02 18:05               ` Thomas Garnier
2017-07-18 22:33 ` [RFC 17/22] compiler: Option to default to hidden symbols Thomas Garnier
2017-07-18 22:33 ` [RFC 18/22] x86/relocs: Handle DYN relocations for PIE support Thomas Garnier
2017-07-18 22:33 ` [RFC 19/22] x86/pie: Add option to build the kernel as PIE for x86_64 Thomas Garnier
2017-07-18 22:33 ` [RFC 20/22] x86/relocs: Add option to generate 64-bit relocations Thomas Garnier
2017-07-19 22:33   ` H. Peter Anvin
2017-07-19 22:47     ` Thomas Garnier
2017-07-19 23:08       ` H. Peter Anvin
2017-07-19 23:25         ` Thomas Garnier
2017-07-19 23:45           ` H. Peter Anvin
2017-07-18 22:33 ` [RFC 21/22] x86/module: Add support for mcmodel large and PLTs Thomas Garnier
2017-07-19  1:35   ` H. Peter Anvin
2017-07-19  3:59     ` Brian Gerst
2017-07-19 15:58       ` Thomas Garnier
2017-07-19 17:34         ` Brian Gerst
2017-07-24 16:32           ` Thomas Garnier
2017-07-18 22:33 ` [RFC 22/22] x86/kaslr: Add option to extend KASLR range from 1GB to 3GB Thomas Garnier
2017-07-19 12:10   ` Baoquan He
2017-07-19 13:49     ` Baoquan He
2017-07-19 14:08 ` x86: PIE support and option to extend KASLR randomization Christopher Lameter
2017-07-19 19:21   ` Kees Cook
