Kernel-hardening archive on lore.kernel.org
* [PATCH v9 00/11] x86: PIE support to extend KASLR randomization
@ 2019-07-30 19:12 Thomas Garnier
  2019-07-30 19:12 ` [PATCH v9 01/11] x86/crypto: Adapt assembly for PIE support Thomas Garnier
                   ` (11 more replies)
  0 siblings, 12 replies; 34+ messages in thread
From: Thomas Garnier @ 2019-07-30 19:12 UTC (permalink / raw)
  To: kernel-hardening
  Cc: kristen, keescook, Herbert Xu, David S. Miller, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86,
	Andy Lutomirski, Juergen Gross, Thomas Hellstrom, VMware, Inc.,
	Rafael J. Wysocki, Len Brown, Pavel Machek, Peter Zijlstra,
	Nadav Amit, Thomas Garnier, Jann Horn, Feng Tang, Maran Wilson,
	Enrico Weigelt, Allison Randal, Alexios Zavras, linux-crypto,
	linux-kernel, virtualization, linux-pm

Minor changes based on feedback and rebase from v8.

Splitting the previous series in two. This part contains the assembly code
changes required for PIE, but without any direct dependencies on the rest of
the patchset.

Changes:
 - patch v9 (assembly):
   - Moved to a relative reference for sync_core based on feedback.
   - x86/crypto had multiple algorithms deleted, removed PIE changes to them.
   - Fix typo on a comment end line.
 - patch v8 (assembly):
   - Fix issues in crypto changes (thanks to Eric Biggers).
   - Remove unnecessary jump table change.
   - Change author and signoff to chromium email address.
 - patch v7 (assembly):
   - Split patchset and reorder changes.
 - patch v6:
   - Rebase on latest changes in jump tables and crypto.
   - Fix wording on a couple of commits.
   - Revisit checkpatch warnings.
   - Moving to @chromium.org.
 - patch v5:
   - Adapt new crypto modules for PIE.
   - Improve per-cpu commit message.
   - Fix xen 32-bit build error with .quad.
   - Remove extra code for ftrace.
 - patch v4:
   - Simplify early boot by removing global variables.
   - Modify the mcount location script for __mcount_loc instead of the address
     read in the ftrace implementation.
   - Edit commit descriptions to better explain where the kernel can be located.
   - Streamline the testing done on each patch proposal. Always test
     hibernation, suspend, ftrace and kprobe to ensure no regressions.
 - patch v3:
   - Update commit messages to describe the longer-term PIE goal.
   - Minor change on ftrace if condition.
   - Changed code using xchgq.
 - patch v2:
   - Adapt patch to work post KPTI and compiler changes.
   - Redo all performance testing with latest configs and compilers.
   - Simplify mov macro on PIE (MOVABS now).
   - Reduce GOT footprint.
 - patch v1:
   - Simplify ftrace implementation.
   - Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
 - rfc v3:
   - Use --emit-relocs instead of -pie to reduce dynamic relocation space on
     mapped memory. It also simplifies the relocation process.
   - Move the start of the module section next to the kernel. Remove the need
     for -mcmodel=large on modules. Extend module space from 1 to 2G maximum.
   - Support for XEN PVH as 32-bit relocations can be ignored with
     --emit-relocs.
   - Support for GOT relocations previously done automatically with -pie.
   - Remove need for dynamic PLT in modules.
   - Support dynamic GOT for modules.
 - rfc v2:
   - Add support for a global stack cookie while the compiler defaults to fs
     without mcmodel=kernel.
   - Change patch 7 to correctly jump out of the identity mapping on kexec load
     preserve.

These patches make some of the changes necessary to build the kernel as a
Position Independent Executable (PIE) on x86_64. Another patchset will add
the PIE option and the larger architecture changes.

The patches:
 - 1, 3-11: Changes in assembly code to be PIE compliant.
 - 2: Add a new _ASM_MOVABS macro to fetch a symbol address generically.
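
As a rough illustration of the pattern (an editorial sketch, not taken from
any specific patch; symbol names are placeholders): 32-bit sign-extended
absolute references, which assume the kernel is linked in the top 2GB of the
address space, become either a RIP-relative reference or a full 64-bit
absolute load.

	movq	$example_sym, %rax		# before: 32-bit absolute immediate
	movq	example_table(, %rcx, 8), %rax	# before: absolute table base

	movabsq	$example_sym, %rax		# after: 64-bit absolute (_ASM_MOVABS)
	leaq	example_table(%rip), %rdx	# after: RIP-relative base ...
	movq	(%rdx, %rcx, 8), %rax		# ... indexed through a scratch register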

diffstat:
 crypto/aegis128-aesni-asm.S         |    6 +-
 crypto/aesni-intel_asm.S            |    8 +--
 crypto/aesni-intel_avx-x86_64.S     |    3 -
 crypto/camellia-aesni-avx-asm_64.S  |   42 +++++++--------
 crypto/camellia-aesni-avx2-asm_64.S |   44 ++++++++--------
 crypto/camellia-x86_64-asm_64.S     |    8 +--
 crypto/cast5-avx-x86_64-asm_64.S    |   50 ++++++++++--------
 crypto/cast6-avx-x86_64-asm_64.S    |   44 +++++++++-------
 crypto/des3_ede-asm_64.S            |   96 ++++++++++++++++++++++++------------
 crypto/ghash-clmulni-intel_asm.S    |    4 -
 crypto/glue_helper-asm-avx.S        |    4 -
 crypto/glue_helper-asm-avx2.S       |    6 +-
 crypto/sha256-avx2-asm.S            |   18 ++++--
 entry/entry_64.S                    |   16 ++++--
 include/asm/alternative.h           |    6 +-
 include/asm/asm.h                   |    1 
 include/asm/paravirt_types.h        |   25 +++++++--
 include/asm/pm-trace.h              |    2 
 include/asm/processor.h             |    6 +-
 kernel/acpi/wakeup_64.S             |   31 ++++++-----
 kernel/head_64.S                    |   16 +++---
 kernel/relocate_kernel_64.S         |    2 
 power/hibernate_asm_64.S            |    4 -
 23 files changed, 261 insertions(+), 181 deletions(-)

Patchset is based on next-20190729.




* [PATCH v9 01/11] x86/crypto: Adapt assembly for PIE support
  2019-07-30 19:12 [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Thomas Garnier
@ 2019-07-30 19:12 ` Thomas Garnier
  2019-08-05 16:32   ` Borislav Petkov
  2019-07-30 19:12 ` [PATCH v9 02/11] x86: Add macro to get symbol address " Thomas Garnier
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 34+ messages in thread
From: Thomas Garnier @ 2019-07-30 19:12 UTC (permalink / raw)
  To: kernel-hardening
  Cc: kristen, keescook, Thomas Garnier, Herbert Xu, David S. Miller,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	x86, linux-crypto, linux-kernel

Change the assembly code to use only relative references to symbols so that
the kernel can be PIE compatible.

Position Independent Executable (PIE) support will allow extending the KASLR
randomization range below 0xffffffff80000000.
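
A minimal sketch of the pattern used throughout this patch (placeholder
labels, not actual hunks): constant loads move to RIP-relative addressing,
and table lookups that need an index register first load the table base into
a scratch register.

	movdqa	.Lsome_mask, %xmm0		# before: absolute reference
	movdqa	.Lsome_mask(%rip), %xmm0	# after: RIP-relative

	xorq	some_table(, %rcx, 8), %rax	# before: absolute base + index
	leaq	some_table(%rip), %rdx		# after: load the base relative to RIP
	xorq	(%rdx, %rcx, 8), %rax		# ... then index through a register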

Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
---
 arch/x86/crypto/aegis128-aesni-asm.S         |  6 +-
 arch/x86/crypto/aesni-intel_asm.S            |  8 +-
 arch/x86/crypto/aesni-intel_avx-x86_64.S     |  3 +-
 arch/x86/crypto/camellia-aesni-avx-asm_64.S  | 42 ++++-----
 arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 44 ++++-----
 arch/x86/crypto/camellia-x86_64-asm_64.S     |  8 +-
 arch/x86/crypto/cast5-avx-x86_64-asm_64.S    | 50 +++++-----
 arch/x86/crypto/cast6-avx-x86_64-asm_64.S    | 44 +++++----
 arch/x86/crypto/des3_ede-asm_64.S            | 96 +++++++++++++-------
 arch/x86/crypto/ghash-clmulni-intel_asm.S    |  4 +-
 arch/x86/crypto/glue_helper-asm-avx.S        |  4 +-
 arch/x86/crypto/glue_helper-asm-avx2.S       |  6 +-
 arch/x86/crypto/sha256-avx2-asm.S            | 18 ++--
 13 files changed, 191 insertions(+), 142 deletions(-)

diff --git a/arch/x86/crypto/aegis128-aesni-asm.S b/arch/x86/crypto/aegis128-aesni-asm.S
index 4434607e366d..00aff3321c16 100644
--- a/arch/x86/crypto/aegis128-aesni-asm.S
+++ b/arch/x86/crypto/aegis128-aesni-asm.S
@@ -200,8 +200,8 @@ ENTRY(crypto_aegis128_aesni_init)
 	movdqa KEY, STATE4
 
 	/* load the constants: */
-	movdqa .Laegis128_const_0, STATE2
-	movdqa .Laegis128_const_1, STATE1
+	movdqa .Laegis128_const_0(%rip), STATE2
+	movdqa .Laegis128_const_1(%rip), STATE1
 	pxor STATE2, STATE3
 	pxor STATE1, STATE4
 
@@ -681,7 +681,7 @@ ENTRY(crypto_aegis128_aesni_dec_tail)
 	punpcklbw T0, T0
 	punpcklbw T0, T0
 	punpcklbw T0, T0
-	movdqa .Laegis128_counter, T1
+	movdqa .Laegis128_counter(%rip), T1
 	pcmpgtb T1, T0
 	pand T0, MSG
 
diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index e40bdf024ba7..36e2cff7fb19 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -2606,7 +2606,7 @@ ENDPROC(aesni_cbc_dec)
  */
 .align 4
 _aesni_inc_init:
-	movaps .Lbswap_mask, BSWAP_MASK
+	movaps .Lbswap_mask(%rip), BSWAP_MASK
 	movaps IV, CTR
 	PSHUFB_XMM BSWAP_MASK CTR
 	mov $1, TCTR_LOW
@@ -2734,12 +2734,12 @@ ENTRY(aesni_xts_crypt8)
 	cmpb $0, %cl
 	movl $0, %ecx
 	movl $240, %r10d
-	leaq _aesni_enc4, %r11
-	leaq _aesni_dec4, %rax
+	leaq _aesni_enc4(%rip), %r11
+	leaq _aesni_dec4(%rip), %rax
 	cmovel %r10d, %ecx
 	cmoveq %rax, %r11
 
-	movdqa .Lgf128mul_x_ble_mask, GF128MUL_MASK
+	movdqa .Lgf128mul_x_ble_mask(%rip), GF128MUL_MASK
 	movups (IVP), IV
 
 	mov 480(KEYP), KLEN
diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index 91c039ab5699..210ac0e61eaf 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -660,7 +660,8 @@ _get_AAD_rest0\@:
 	vpshufb and an array of shuffle masks */
 	movq    %r12, %r11
 	salq    $4, %r11
-	vmovdqu  aad_shift_arr(%r11), \T1
+	leaq    aad_shift_arr(%rip), %rax
+	vmovdqu  (%rax,%r11,), \T1
 	vpshufb \T1, \T7, \T7
 _get_AAD_rest_final\@:
 	vpshufb SHUF_MASK(%rip), \T7, \T7
diff --git a/arch/x86/crypto/camellia-aesni-avx-asm_64.S b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
index a14af6eb09cb..f94ec9a5552b 100644
--- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
@@ -53,10 +53,10 @@
 	/* \
 	 * S-function with AES subbytes \
 	 */ \
-	vmovdqa .Linv_shift_row, t4; \
-	vbroadcastss .L0f0f0f0f, t7; \
-	vmovdqa .Lpre_tf_lo_s1, t0; \
-	vmovdqa .Lpre_tf_hi_s1, t1; \
+	vmovdqa .Linv_shift_row(%rip), t4; \
+	vbroadcastss .L0f0f0f0f(%rip), t7; \
+	vmovdqa .Lpre_tf_lo_s1(%rip), t0; \
+	vmovdqa .Lpre_tf_hi_s1(%rip), t1; \
 	\
 	/* AES inverse shift rows */ \
 	vpshufb t4, x0, x0; \
@@ -69,8 +69,8 @@
 	vpshufb t4, x6, x6; \
 	\
 	/* prefilter sboxes 1, 2 and 3 */ \
-	vmovdqa .Lpre_tf_lo_s4, t2; \
-	vmovdqa .Lpre_tf_hi_s4, t3; \
+	vmovdqa .Lpre_tf_lo_s4(%rip), t2; \
+	vmovdqa .Lpre_tf_hi_s4(%rip), t3; \
 	filter_8bit(x0, t0, t1, t7, t6); \
 	filter_8bit(x7, t0, t1, t7, t6); \
 	filter_8bit(x1, t0, t1, t7, t6); \
@@ -84,8 +84,8 @@
 	filter_8bit(x6, t2, t3, t7, t6); \
 	\
 	/* AES subbytes + AES shift rows */ \
-	vmovdqa .Lpost_tf_lo_s1, t0; \
-	vmovdqa .Lpost_tf_hi_s1, t1; \
+	vmovdqa .Lpost_tf_lo_s1(%rip), t0; \
+	vmovdqa .Lpost_tf_hi_s1(%rip), t1; \
 	vaesenclast t4, x0, x0; \
 	vaesenclast t4, x7, x7; \
 	vaesenclast t4, x1, x1; \
@@ -96,16 +96,16 @@
 	vaesenclast t4, x6, x6; \
 	\
 	/* postfilter sboxes 1 and 4 */ \
-	vmovdqa .Lpost_tf_lo_s3, t2; \
-	vmovdqa .Lpost_tf_hi_s3, t3; \
+	vmovdqa .Lpost_tf_lo_s3(%rip), t2; \
+	vmovdqa .Lpost_tf_hi_s3(%rip), t3; \
 	filter_8bit(x0, t0, t1, t7, t6); \
 	filter_8bit(x7, t0, t1, t7, t6); \
 	filter_8bit(x3, t0, t1, t7, t6); \
 	filter_8bit(x6, t0, t1, t7, t6); \
 	\
 	/* postfilter sbox 3 */ \
-	vmovdqa .Lpost_tf_lo_s2, t4; \
-	vmovdqa .Lpost_tf_hi_s2, t5; \
+	vmovdqa .Lpost_tf_lo_s2(%rip), t4; \
+	vmovdqa .Lpost_tf_hi_s2(%rip), t5; \
 	filter_8bit(x2, t2, t3, t7, t6); \
 	filter_8bit(x5, t2, t3, t7, t6); \
 	\
@@ -444,7 +444,7 @@ ENDPROC(roundsm16_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
 	transpose_4x4(c0, c1, c2, c3, a0, a1); \
 	transpose_4x4(d0, d1, d2, d3, a0, a1); \
 	\
-	vmovdqu .Lshufb_16x16b, a0; \
+	vmovdqu .Lshufb_16x16b(%rip), a0; \
 	vmovdqu st1, a1; \
 	vpshufb a0, a2, a2; \
 	vpshufb a0, a3, a3; \
@@ -483,7 +483,7 @@ ENDPROC(roundsm16_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
 #define inpack16_pre(x0, x1, x2, x3, x4, x5, x6, x7, y0, y1, y2, y3, y4, y5, \
 		     y6, y7, rio, key) \
 	vmovq key, x0; \
-	vpshufb .Lpack_bswap, x0, x0; \
+	vpshufb .Lpack_bswap(%rip), x0, x0; \
 	\
 	vpxor 0 * 16(rio), x0, y7; \
 	vpxor 1 * 16(rio), x0, y6; \
@@ -534,7 +534,7 @@ ENDPROC(roundsm16_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
 	vmovdqu x0, stack_tmp0; \
 	\
 	vmovq key, x0; \
-	vpshufb .Lpack_bswap, x0, x0; \
+	vpshufb .Lpack_bswap(%rip), x0, x0; \
 	\
 	vpxor x0, y7, y7; \
 	vpxor x0, y6, y6; \
@@ -1017,7 +1017,7 @@ ENTRY(camellia_ctr_16way)
 	subq $(16 * 16), %rsp;
 	movq %rsp, %rax;
 
-	vmovdqa .Lbswap128_mask, %xmm14;
+	vmovdqa .Lbswap128_mask(%rip), %xmm14;
 
 	/* load IV and byteswap */
 	vmovdqu (%rcx), %xmm0;
@@ -1066,7 +1066,7 @@ ENTRY(camellia_ctr_16way)
 
 	/* inpack16_pre: */
 	vmovq (key_table)(CTX), %xmm15;
-	vpshufb .Lpack_bswap, %xmm15, %xmm15;
+	vpshufb .Lpack_bswap(%rip), %xmm15, %xmm15;
 	vpxor %xmm0, %xmm15, %xmm0;
 	vpxor %xmm1, %xmm15, %xmm1;
 	vpxor %xmm2, %xmm15, %xmm2;
@@ -1134,7 +1134,7 @@ camellia_xts_crypt_16way:
 	subq $(16 * 16), %rsp;
 	movq %rsp, %rax;
 
-	vmovdqa .Lxts_gf128mul_and_shl1_mask, %xmm14;
+	vmovdqa .Lxts_gf128mul_and_shl1_mask(%rip), %xmm14;
 
 	/* load IV */
 	vmovdqu (%rcx), %xmm0;
@@ -1210,7 +1210,7 @@ camellia_xts_crypt_16way:
 
 	/* inpack16_pre: */
 	vmovq (key_table)(CTX, %r8, 8), %xmm15;
-	vpshufb .Lpack_bswap, %xmm15, %xmm15;
+	vpshufb .Lpack_bswap(%rip), %xmm15, %xmm15;
 	vpxor 0 * 16(%rax), %xmm15, %xmm0;
 	vpxor %xmm1, %xmm15, %xmm1;
 	vpxor %xmm2, %xmm15, %xmm2;
@@ -1265,7 +1265,7 @@ ENTRY(camellia_xts_enc_16way)
 	 */
 	xorl %r8d, %r8d; /* input whitening key, 0 for enc */
 
-	leaq __camellia_enc_blk16, %r9;
+	leaq __camellia_enc_blk16(%rip), %r9;
 
 	jmp camellia_xts_crypt_16way;
 ENDPROC(camellia_xts_enc_16way)
@@ -1283,7 +1283,7 @@ ENTRY(camellia_xts_dec_16way)
 	movl $24, %eax;
 	cmovel %eax, %r8d;  /* input whitening key, last for dec */
 
-	leaq __camellia_dec_blk16, %r9;
+	leaq __camellia_dec_blk16(%rip), %r9;
 
 	jmp camellia_xts_crypt_16way;
 ENDPROC(camellia_xts_dec_16way)
diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
index 4be4c7c3ba27..545ff16a196b 100644
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
@@ -65,12 +65,12 @@
 	/* \
 	 * S-function with AES subbytes \
 	 */ \
-	vbroadcasti128 .Linv_shift_row, t4; \
-	vpbroadcastd .L0f0f0f0f, t7; \
-	vbroadcasti128 .Lpre_tf_lo_s1, t5; \
-	vbroadcasti128 .Lpre_tf_hi_s1, t6; \
-	vbroadcasti128 .Lpre_tf_lo_s4, t2; \
-	vbroadcasti128 .Lpre_tf_hi_s4, t3; \
+	vbroadcasti128 .Linv_shift_row(%rip), t4; \
+	vpbroadcastd .L0f0f0f0f(%rip), t7; \
+	vbroadcasti128 .Lpre_tf_lo_s1(%rip), t5; \
+	vbroadcasti128 .Lpre_tf_hi_s1(%rip), t6; \
+	vbroadcasti128 .Lpre_tf_lo_s4(%rip), t2; \
+	vbroadcasti128 .Lpre_tf_hi_s4(%rip), t3; \
 	\
 	/* AES inverse shift rows */ \
 	vpshufb t4, x0, x0; \
@@ -116,8 +116,8 @@
 	vinserti128 $1, t2##_x, x6, x6; \
 	vextracti128 $1, x1, t3##_x; \
 	vextracti128 $1, x4, t2##_x; \
-	vbroadcasti128 .Lpost_tf_lo_s1, t0; \
-	vbroadcasti128 .Lpost_tf_hi_s1, t1; \
+	vbroadcasti128 .Lpost_tf_lo_s1(%rip), t0; \
+	vbroadcasti128 .Lpost_tf_hi_s1(%rip), t1; \
 	vaesenclast t4##_x, x2##_x, x2##_x; \
 	vaesenclast t4##_x, t6##_x, t6##_x; \
 	vinserti128 $1, t6##_x, x2, x2; \
@@ -132,16 +132,16 @@
 	vinserti128 $1, t2##_x, x4, x4; \
 	\
 	/* postfilter sboxes 1 and 4 */ \
-	vbroadcasti128 .Lpost_tf_lo_s3, t2; \
-	vbroadcasti128 .Lpost_tf_hi_s3, t3; \
+	vbroadcasti128 .Lpost_tf_lo_s3(%rip), t2; \
+	vbroadcasti128 .Lpost_tf_hi_s3(%rip), t3; \
 	filter_8bit(x0, t0, t1, t7, t6); \
 	filter_8bit(x7, t0, t1, t7, t6); \
 	filter_8bit(x3, t0, t1, t7, t6); \
 	filter_8bit(x6, t0, t1, t7, t6); \
 	\
 	/* postfilter sbox 3 */ \
-	vbroadcasti128 .Lpost_tf_lo_s2, t4; \
-	vbroadcasti128 .Lpost_tf_hi_s2, t5; \
+	vbroadcasti128 .Lpost_tf_lo_s2(%rip), t4; \
+	vbroadcasti128 .Lpost_tf_hi_s2(%rip), t5; \
 	filter_8bit(x2, t2, t3, t7, t6); \
 	filter_8bit(x5, t2, t3, t7, t6); \
 	\
@@ -478,7 +478,7 @@ ENDPROC(roundsm32_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
 	transpose_4x4(c0, c1, c2, c3, a0, a1); \
 	transpose_4x4(d0, d1, d2, d3, a0, a1); \
 	\
-	vbroadcasti128 .Lshufb_16x16b, a0; \
+	vbroadcasti128 .Lshufb_16x16b(%rip), a0; \
 	vmovdqu st1, a1; \
 	vpshufb a0, a2, a2; \
 	vpshufb a0, a3, a3; \
@@ -517,7 +517,7 @@ ENDPROC(roundsm32_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
 #define inpack32_pre(x0, x1, x2, x3, x4, x5, x6, x7, y0, y1, y2, y3, y4, y5, \
 		     y6, y7, rio, key) \
 	vpbroadcastq key, x0; \
-	vpshufb .Lpack_bswap, x0, x0; \
+	vpshufb .Lpack_bswap(%rip), x0, x0; \
 	\
 	vpxor 0 * 32(rio), x0, y7; \
 	vpxor 1 * 32(rio), x0, y6; \
@@ -568,7 +568,7 @@ ENDPROC(roundsm32_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
 	vmovdqu x0, stack_tmp0; \
 	\
 	vpbroadcastq key, x0; \
-	vpshufb .Lpack_bswap, x0, x0; \
+	vpshufb .Lpack_bswap(%rip), x0, x0; \
 	\
 	vpxor x0, y7, y7; \
 	vpxor x0, y6, y6; \
@@ -1108,7 +1108,7 @@ ENTRY(camellia_ctr_32way)
 	vmovdqu (%rcx), %xmm0;
 	vmovdqa %xmm0, %xmm1;
 	inc_le128(%xmm0, %xmm15, %xmm14);
-	vbroadcasti128 .Lbswap128_mask, %ymm14;
+	vbroadcasti128 .Lbswap128_mask(%rip), %ymm14;
 	vinserti128 $1, %xmm0, %ymm1, %ymm0;
 	vpshufb %ymm14, %ymm0, %ymm13;
 	vmovdqu %ymm13, 15 * 32(%rax);
@@ -1154,7 +1154,7 @@ ENTRY(camellia_ctr_32way)
 
 	/* inpack32_pre: */
 	vpbroadcastq (key_table)(CTX), %ymm15;
-	vpshufb .Lpack_bswap, %ymm15, %ymm15;
+	vpshufb .Lpack_bswap(%rip), %ymm15, %ymm15;
 	vpxor %ymm0, %ymm15, %ymm0;
 	vpxor %ymm1, %ymm15, %ymm1;
 	vpxor %ymm2, %ymm15, %ymm2;
@@ -1238,13 +1238,13 @@ camellia_xts_crypt_32way:
 	subq $(16 * 32), %rsp;
 	movq %rsp, %rax;
 
-	vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_0, %ymm12;
+	vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_0(%rip), %ymm12;
 
 	/* load IV and construct second IV */
 	vmovdqu (%rcx), %xmm0;
 	vmovdqa %xmm0, %xmm15;
 	gf128mul_x_ble(%xmm0, %xmm12, %xmm13);
-	vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_1, %ymm13;
+	vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_1(%rip), %ymm13;
 	vinserti128 $1, %xmm0, %ymm15, %ymm0;
 	vpxor 0 * 32(%rdx), %ymm0, %ymm15;
 	vmovdqu %ymm15, 15 * 32(%rax);
@@ -1321,7 +1321,7 @@ camellia_xts_crypt_32way:
 
 	/* inpack32_pre: */
 	vpbroadcastq (key_table)(CTX, %r8, 8), %ymm15;
-	vpshufb .Lpack_bswap, %ymm15, %ymm15;
+	vpshufb .Lpack_bswap(%rip), %ymm15, %ymm15;
 	vpxor 0 * 32(%rax), %ymm15, %ymm0;
 	vpxor %ymm1, %ymm15, %ymm1;
 	vpxor %ymm2, %ymm15, %ymm2;
@@ -1379,7 +1379,7 @@ ENTRY(camellia_xts_enc_32way)
 
 	xorl %r8d, %r8d; /* input whitening key, 0 for enc */
 
-	leaq __camellia_enc_blk32, %r9;
+	leaq __camellia_enc_blk32(%rip), %r9;
 
 	jmp camellia_xts_crypt_32way;
 ENDPROC(camellia_xts_enc_32way)
@@ -1397,7 +1397,7 @@ ENTRY(camellia_xts_dec_32way)
 	movl $24, %eax;
 	cmovel %eax, %r8d;  /* input whitening key, last for dec */
 
-	leaq __camellia_dec_blk32, %r9;
+	leaq __camellia_dec_blk32(%rip), %r9;
 
 	jmp camellia_xts_crypt_32way;
 ENDPROC(camellia_xts_dec_32way)
diff --git a/arch/x86/crypto/camellia-x86_64-asm_64.S b/arch/x86/crypto/camellia-x86_64-asm_64.S
index 23528bc18fc6..021b0f0090f4 100644
--- a/arch/x86/crypto/camellia-x86_64-asm_64.S
+++ b/arch/x86/crypto/camellia-x86_64-asm_64.S
@@ -77,11 +77,13 @@
 #define RXORbl %r9b
 
 #define xor2ror16(T0, T1, tmp1, tmp2, ab, dst) \
+	leaq T0(%rip), 			tmp1; \
 	movzbl ab ## bl,		tmp2 ## d; \
+	xorq (tmp1, tmp2, 8),		dst; \
+	leaq T1(%rip), 			tmp2; \
 	movzbl ab ## bh,		tmp1 ## d; \
-	rorq $16,			ab; \
-	xorq T0(, tmp2, 8),		dst; \
-	xorq T1(, tmp1, 8),		dst;
+	xorq (tmp2, tmp1, 8),		dst; \
+	rorq $16,			ab;
 
 /**********************************************************************
   1-way camellia
diff --git a/arch/x86/crypto/cast5-avx-x86_64-asm_64.S b/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
index dc55c3332fcc..213b5d8a9d08 100644
--- a/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
@@ -83,16 +83,20 @@
 
 
 #define lookup_32bit(src, dst, op1, op2, op3, interleave_op, il_reg) \
-	movzbl		src ## bh,     RID1d;    \
-	movzbl		src ## bl,     RID2d;    \
-	shrq $16,	src;                     \
-	movl		s1(, RID1, 4), dst ## d; \
-	op1		s2(, RID2, 4), dst ## d; \
-	movzbl		src ## bh,     RID1d;    \
-	movzbl		src ## bl,     RID2d;    \
-	interleave_op(il_reg);			 \
-	op2		s3(, RID1, 4), dst ## d; \
-	op3		s4(, RID2, 4), dst ## d;
+	movzbl		src ## bh,       RID1d;    \
+	leaq		s1(%rip),        RID2;     \
+	movl		(RID2, RID1, 4), dst ## d; \
+	movzbl		src ## bl,       RID2d;    \
+	leaq		s2(%rip),        RID1;     \
+	op1		(RID1, RID2, 4), dst ## d; \
+	shrq $16,	src;                       \
+	movzbl		src ## bh,     RID1d;      \
+	leaq		s3(%rip),        RID2;     \
+	op2		(RID2, RID1, 4), dst ## d; \
+	movzbl		src ## bl,     RID2d;      \
+	leaq		s4(%rip),        RID1;     \
+	op3		(RID1, RID2, 4), dst ## d; \
+	interleave_op(il_reg);
 
 #define dummy(d) /* do nothing */
 
@@ -151,15 +155,15 @@
 	subround(l ## 3, r ## 3, l ## 4, r ## 4, f);
 
 #define enc_preload_rkr() \
-	vbroadcastss	.L16_mask,                RKR;      \
+	vbroadcastss	.L16_mask(%rip),          RKR;      \
 	/* add 16-bit rotation to key rotations (mod 32) */ \
 	vpxor		kr(CTX),                  RKR, RKR;
 
 #define dec_preload_rkr() \
-	vbroadcastss	.L16_mask,                RKR;      \
+	vbroadcastss	.L16_mask(%rip),          RKR;      \
 	/* add 16-bit rotation to key rotations (mod 32) */ \
 	vpxor		kr(CTX),                  RKR, RKR; \
-	vpshufb		.Lbswap128_mask,          RKR, RKR;
+	vpshufb		.Lbswap128_mask(%rip),    RKR, RKR;
 
 #define transpose_2x4(x0, x1, t0, t1) \
 	vpunpckldq		x1, x0, t0; \
@@ -236,9 +240,9 @@ __cast5_enc_blk16:
 
 	movq %rdi, CTX;
 
-	vmovdqa .Lbswap_mask, RKM;
-	vmovd .Lfirst_mask, R1ST;
-	vmovd .L32_mask, R32;
+	vmovdqa .Lbswap_mask(%rip), RKM;
+	vmovd .Lfirst_mask(%rip), R1ST;
+	vmovd .L32_mask(%rip), R32;
 	enc_preload_rkr();
 
 	inpack_blocks(RL1, RR1, RTMP, RX, RKM);
@@ -272,7 +276,7 @@ __cast5_enc_blk16:
 	popq %rbx;
 	popq %r15;
 
-	vmovdqa .Lbswap_mask, RKM;
+	vmovdqa .Lbswap_mask(%rip), RKM;
 
 	outunpack_blocks(RR1, RL1, RTMP, RX, RKM);
 	outunpack_blocks(RR2, RL2, RTMP, RX, RKM);
@@ -310,9 +314,9 @@ __cast5_dec_blk16:
 
 	movq %rdi, CTX;
 
-	vmovdqa .Lbswap_mask, RKM;
-	vmovd .Lfirst_mask, R1ST;
-	vmovd .L32_mask, R32;
+	vmovdqa .Lbswap_mask(%rip), RKM;
+	vmovd .Lfirst_mask(%rip), R1ST;
+	vmovd .L32_mask(%rip), R32;
 	dec_preload_rkr();
 
 	inpack_blocks(RL1, RR1, RTMP, RX, RKM);
@@ -343,7 +347,7 @@ __cast5_dec_blk16:
 	round(RL, RR, 1, 2);
 	round(RR, RL, 0, 1);
 
-	vmovdqa .Lbswap_mask, RKM;
+	vmovdqa .Lbswap_mask(%rip), RKM;
 	popq %rbx;
 	popq %r15;
 
@@ -506,8 +510,8 @@ ENTRY(cast5_ctr_16way)
 
 	vpcmpeqd RKR, RKR, RKR;
 	vpaddq RKR, RKR, RKR; /* low: -2, high: -2 */
-	vmovdqa .Lbswap_iv_mask, R1ST;
-	vmovdqa .Lbswap128_mask, RKM;
+	vmovdqa .Lbswap_iv_mask(%rip), R1ST;
+	vmovdqa .Lbswap128_mask(%rip), RKM;
 
 	/* load IV and byteswap */
 	vmovq (%rcx), RX;
diff --git a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
index 4f0a7cdb94d9..9879a12c243a 100644
--- a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
@@ -83,16 +83,20 @@
 
 
 #define lookup_32bit(src, dst, op1, op2, op3, interleave_op, il_reg) \
-	movzbl		src ## bh,     RID1d;    \
-	movzbl		src ## bl,     RID2d;    \
-	shrq $16,	src;                     \
-	movl		s1(, RID1, 4), dst ## d; \
-	op1		s2(, RID2, 4), dst ## d; \
-	movzbl		src ## bh,     RID1d;    \
-	movzbl		src ## bl,     RID2d;    \
-	interleave_op(il_reg);			 \
-	op2		s3(, RID1, 4), dst ## d; \
-	op3		s4(, RID2, 4), dst ## d;
+	movzbl		src ## bh,       RID1d;    \
+	leaq		s1(%rip),        RID2;     \
+	movl		(RID2, RID1, 4), dst ## d; \
+	movzbl		src ## bl,       RID2d;    \
+	leaq		s2(%rip),        RID1;     \
+	op1		(RID1, RID2, 4), dst ## d; \
+	shrq $16,	src;                       \
+	movzbl		src ## bh,     RID1d;      \
+	leaq		s3(%rip),        RID2;     \
+	op2		(RID2, RID1, 4), dst ## d; \
+	movzbl		src ## bl,     RID2d;      \
+	leaq		s4(%rip),        RID1;     \
+	op3		(RID1, RID2, 4), dst ## d; \
+	interleave_op(il_reg);
 
 #define dummy(d) /* do nothing */
 
@@ -175,10 +179,10 @@
 	qop(RD, RC, 1);
 
 #define shuffle(mask) \
-	vpshufb		mask,            RKR, RKR;
+	vpshufb		mask(%rip),            RKR, RKR;
 
 #define preload_rkr(n, do_mask, mask) \
-	vbroadcastss	.L16_mask,                RKR;      \
+	vbroadcastss	.L16_mask(%rip),          RKR;      \
 	/* add 16-bit rotation to key rotations (mod 32) */ \
 	vpxor		(kr+n*16)(CTX),           RKR, RKR; \
 	do_mask(mask);
@@ -260,9 +264,9 @@ __cast6_enc_blk8:
 
 	movq %rdi, CTX;
 
-	vmovdqa .Lbswap_mask, RKM;
-	vmovd .Lfirst_mask, R1ST;
-	vmovd .L32_mask, R32;
+	vmovdqa .Lbswap_mask(%rip), RKM;
+	vmovd .Lfirst_mask(%rip), R1ST;
+	vmovd .L32_mask(%rip), R32;
 
 	inpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
 	inpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
@@ -286,7 +290,7 @@ __cast6_enc_blk8:
 	popq %rbx;
 	popq %r15;
 
-	vmovdqa .Lbswap_mask, RKM;
+	vmovdqa .Lbswap_mask(%rip), RKM;
 
 	outunpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
 	outunpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
@@ -308,9 +312,9 @@ __cast6_dec_blk8:
 
 	movq %rdi, CTX;
 
-	vmovdqa .Lbswap_mask, RKM;
-	vmovd .Lfirst_mask, R1ST;
-	vmovd .L32_mask, R32;
+	vmovdqa .Lbswap_mask(%rip), RKM;
+	vmovd .Lfirst_mask(%rip), R1ST;
+	vmovd .L32_mask(%rip), R32;
 
 	inpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
 	inpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
@@ -334,7 +338,7 @@ __cast6_dec_blk8:
 	popq %rbx;
 	popq %r15;
 
-	vmovdqa .Lbswap_mask, RKM;
+	vmovdqa .Lbswap_mask(%rip), RKM;
 	outunpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
 	outunpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
 
diff --git a/arch/x86/crypto/des3_ede-asm_64.S b/arch/x86/crypto/des3_ede-asm_64.S
index 7fca43099a5f..e51dcf8c7eb7 100644
--- a/arch/x86/crypto/des3_ede-asm_64.S
+++ b/arch/x86/crypto/des3_ede-asm_64.S
@@ -129,21 +129,29 @@
 	movzbl RW0bl, RT2d; \
 	movzbl RW0bh, RT3d; \
 	shrq $16, RW0; \
-	movq s8(, RT0, 8), RT0; \
-	xorq s6(, RT1, 8), to; \
+	leaq s8(%rip), RW1; \
+	movq (RW1, RT0, 8), RT0; \
+	leaq s6(%rip), RW1; \
+	xorq (RW1, RT1, 8), to; \
 	movzbl RW0bl, RL1d; \
 	movzbl RW0bh, RT1d; \
 	shrl $16, RW0d; \
-	xorq s4(, RT2, 8), RT0; \
-	xorq s2(, RT3, 8), to; \
+	leaq s4(%rip), RW1; \
+	xorq (RW1, RT2, 8), RT0; \
+	leaq s2(%rip), RW1; \
+	xorq (RW1, RT3, 8), to; \
 	movzbl RW0bl, RT2d; \
 	movzbl RW0bh, RT3d; \
-	xorq s7(, RL1, 8), RT0; \
-	xorq s5(, RT1, 8), to; \
-	xorq s3(, RT2, 8), RT0; \
+	leaq s7(%rip), RW1; \
+	xorq (RW1, RL1, 8), RT0; \
+	leaq s5(%rip), RW1; \
+	xorq (RW1, RT1, 8), to; \
+	leaq s3(%rip), RW1; \
+	xorq (RW1, RT2, 8), RT0; \
 	load_next_key(n, RW0); \
 	xorq RT0, to; \
-	xorq s1(, RT3, 8), to; \
+	leaq s1(%rip), RW1; \
+	xorq (RW1, RT3, 8), to; \
 
 #define load_next_key(n, RWx) \
 	movq (((n) + 1) * 8)(CTX), RWx;
@@ -355,65 +363,89 @@ ENDPROC(des3_ede_x86_64_crypt_blk)
 	movzbl RW0bl, RT3d; \
 	movzbl RW0bh, RT1d; \
 	shrq $16, RW0; \
-	xorq s8(, RT3, 8), to##0; \
-	xorq s6(, RT1, 8), to##0; \
+	leaq s8(%rip), RT2; \
+	xorq (RT2, RT3, 8), to##0; \
+	leaq s6(%rip), RT2; \
+	xorq (RT2, RT1, 8), to##0; \
 	movzbl RW0bl, RT3d; \
 	movzbl RW0bh, RT1d; \
 	shrq $16, RW0; \
-	xorq s4(, RT3, 8), to##0; \
-	xorq s2(, RT1, 8), to##0; \
+	leaq s4(%rip), RT2; \
+	xorq (RT2, RT3, 8), to##0; \
+	leaq s2(%rip), RT2; \
+	xorq (RT2, RT1, 8), to##0; \
 	movzbl RW0bl, RT3d; \
 	movzbl RW0bh, RT1d; \
 	shrl $16, RW0d; \
-	xorq s7(, RT3, 8), to##0; \
-	xorq s5(, RT1, 8), to##0; \
+	leaq s7(%rip), RT2; \
+	xorq (RT2, RT3, 8), to##0; \
+	leaq s5(%rip), RT2; \
+	xorq (RT2, RT1, 8), to##0; \
 	movzbl RW0bl, RT3d; \
 	movzbl RW0bh, RT1d; \
 	load_next_key(n, RW0); \
-	xorq s3(, RT3, 8), to##0; \
-	xorq s1(, RT1, 8), to##0; \
+	leaq s3(%rip), RT2; \
+	xorq (RT2, RT3, 8), to##0; \
+	leaq s1(%rip), RT2; \
+	xorq (RT2, RT1, 8), to##0; \
 		xorq from##1, RW1; \
 		movzbl RW1bl, RT3d; \
 		movzbl RW1bh, RT1d; \
 		shrq $16, RW1; \
-		xorq s8(, RT3, 8), to##1; \
-		xorq s6(, RT1, 8), to##1; \
+		leaq s8(%rip), RT2; \
+		xorq (RT2, RT3, 8), to##1; \
+		leaq s6(%rip), RT2; \
+		xorq (RT2, RT1, 8), to##1; \
 		movzbl RW1bl, RT3d; \
 		movzbl RW1bh, RT1d; \
 		shrq $16, RW1; \
-		xorq s4(, RT3, 8), to##1; \
-		xorq s2(, RT1, 8), to##1; \
+		leaq s4(%rip), RT2; \
+		xorq (RT2, RT3, 8), to##1; \
+		leaq s2(%rip), RT2; \
+		xorq (RT2, RT1, 8), to##1; \
 		movzbl RW1bl, RT3d; \
 		movzbl RW1bh, RT1d; \
 		shrl $16, RW1d; \
-		xorq s7(, RT3, 8), to##1; \
-		xorq s5(, RT1, 8), to##1; \
+		leaq s7(%rip), RT2; \
+		xorq (RT2, RT3, 8), to##1; \
+		leaq s5(%rip), RT2; \
+		xorq (RT2, RT1, 8), to##1; \
 		movzbl RW1bl, RT3d; \
 		movzbl RW1bh, RT1d; \
 		do_movq(RW0, RW1); \
-		xorq s3(, RT3, 8), to##1; \
-		xorq s1(, RT1, 8), to##1; \
+		leaq s3(%rip), RT2; \
+		xorq (RT2, RT3, 8), to##1; \
+		leaq s1(%rip), RT2; \
+		xorq (RT2, RT1, 8), to##1; \
 			xorq from##2, RW2; \
 			movzbl RW2bl, RT3d; \
 			movzbl RW2bh, RT1d; \
 			shrq $16, RW2; \
-			xorq s8(, RT3, 8), to##2; \
-			xorq s6(, RT1, 8), to##2; \
+			leaq s8(%rip), RT2; \
+			xorq (RT2, RT3, 8), to##2; \
+			leaq s6(%rip), RT2; \
+			xorq (RT2, RT1, 8), to##2; \
 			movzbl RW2bl, RT3d; \
 			movzbl RW2bh, RT1d; \
 			shrq $16, RW2; \
-			xorq s4(, RT3, 8), to##2; \
-			xorq s2(, RT1, 8), to##2; \
+			leaq s4(%rip), RT2; \
+			xorq (RT2, RT3, 8), to##2; \
+			leaq s2(%rip), RT2; \
+			xorq (RT2, RT1, 8), to##2; \
 			movzbl RW2bl, RT3d; \
 			movzbl RW2bh, RT1d; \
 			shrl $16, RW2d; \
-			xorq s7(, RT3, 8), to##2; \
-			xorq s5(, RT1, 8), to##2; \
+			leaq s7(%rip), RT2; \
+			xorq (RT2, RT3, 8), to##2; \
+			leaq s5(%rip), RT2; \
+			xorq (RT2, RT1, 8), to##2; \
 			movzbl RW2bl, RT3d; \
 			movzbl RW2bh, RT1d; \
 			do_movq(RW0, RW2); \
-			xorq s3(, RT3, 8), to##2; \
-			xorq s1(, RT1, 8), to##2;
+			leaq s3(%rip), RT2; \
+			xorq (RT2, RT3, 8), to##2; \
+			leaq s1(%rip), RT2; \
+			xorq (RT2, RT1, 8), to##2;
 
 #define __movq(src, dst) \
 	movq src, dst;
diff --git a/arch/x86/crypto/ghash-clmulni-intel_asm.S b/arch/x86/crypto/ghash-clmulni-intel_asm.S
index 5d53effe8abe..f8029074a99e 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_asm.S
+++ b/arch/x86/crypto/ghash-clmulni-intel_asm.S
@@ -94,7 +94,7 @@ ENTRY(clmul_ghash_mul)
 	FRAME_BEGIN
 	movups (%rdi), DATA
 	movups (%rsi), SHASH
-	movaps .Lbswap_mask, BSWAP
+	movaps .Lbswap_mask(%rip), BSWAP
 	PSHUFB_XMM BSWAP DATA
 	call __clmul_gf128mul_ble
 	PSHUFB_XMM BSWAP DATA
@@ -111,7 +111,7 @@ ENTRY(clmul_ghash_update)
 	FRAME_BEGIN
 	cmp $16, %rdx
 	jb .Lupdate_just_ret	# check length
-	movaps .Lbswap_mask, BSWAP
+	movaps .Lbswap_mask(%rip), BSWAP
 	movups (%rdi), DATA
 	movups (%rcx), SHASH
 	PSHUFB_XMM BSWAP DATA
diff --git a/arch/x86/crypto/glue_helper-asm-avx.S b/arch/x86/crypto/glue_helper-asm-avx.S
index d08fc575ef7f..a9736f85fef0 100644
--- a/arch/x86/crypto/glue_helper-asm-avx.S
+++ b/arch/x86/crypto/glue_helper-asm-avx.S
@@ -44,7 +44,7 @@
 #define load_ctr_8way(iv, bswap, x0, x1, x2, x3, x4, x5, x6, x7, t0, t1, t2) \
 	vpcmpeqd t0, t0, t0; \
 	vpsrldq $8, t0, t0; /* low: -1, high: 0 */ \
-	vmovdqa bswap, t1; \
+	vmovdqa bswap(%rip), t1; \
 	\
 	/* load IV and byteswap */ \
 	vmovdqu (iv), x7; \
@@ -89,7 +89,7 @@
 
 #define load_xts_8way(iv, src, dst, x0, x1, x2, x3, x4, x5, x6, x7, tiv, t0, \
 		      t1, xts_gf128mul_and_shl1_mask) \
-	vmovdqa xts_gf128mul_and_shl1_mask, t0; \
+	vmovdqa xts_gf128mul_and_shl1_mask(%rip), t0; \
 	\
 	/* load IV */ \
 	vmovdqu (iv), tiv; \
diff --git a/arch/x86/crypto/glue_helper-asm-avx2.S b/arch/x86/crypto/glue_helper-asm-avx2.S
index d84508c85c13..efbf4953707e 100644
--- a/arch/x86/crypto/glue_helper-asm-avx2.S
+++ b/arch/x86/crypto/glue_helper-asm-avx2.S
@@ -62,7 +62,7 @@
 	vmovdqu (iv), t2x; \
 	vmovdqa t2x, t3x; \
 	inc_le128(t2x, t0x, t1x); \
-	vbroadcasti128 bswap, t1; \
+	vbroadcasti128 bswap(%rip), t1; \
 	vinserti128 $1, t2x, t3, t2; /* ab: le0 ; cd: le1 */ \
 	vpshufb t1, t2, x0; \
 	\
@@ -119,13 +119,13 @@
 		       tivx, t0, t0x, t1, t1x, t2, t2x, t3, \
 		       xts_gf128mul_and_shl1_mask_0, \
 		       xts_gf128mul_and_shl1_mask_1) \
-	vbroadcasti128 xts_gf128mul_and_shl1_mask_0, t1; \
+	vbroadcasti128 xts_gf128mul_and_shl1_mask_0(%rip), t1; \
 	\
 	/* load IV and construct second IV */ \
 	vmovdqu (iv), tivx; \
 	vmovdqa tivx, t0x; \
 	gf128mul_x_ble(tivx, t1x, t2x); \
-	vbroadcasti128 xts_gf128mul_and_shl1_mask_1, t2; \
+	vbroadcasti128 xts_gf128mul_and_shl1_mask_1(%rip), t2; \
 	vinserti128 $1, tivx, t0, tiv; \
 	vpxor (0*32)(src), tiv, x0; \
 	vmovdqu tiv, (0*32)(dst); \
diff --git a/arch/x86/crypto/sha256-avx2-asm.S b/arch/x86/crypto/sha256-avx2-asm.S
index 1420db15dcdd..e7730d93cceb 100644
--- a/arch/x86/crypto/sha256-avx2-asm.S
+++ b/arch/x86/crypto/sha256-avx2-asm.S
@@ -592,19 +592,23 @@ last_block_enter:
 
 .align 16
 loop1:
-	vpaddd	K256+0*32(SRND), X0, XFER
+	leaq	K256(%rip), INP
+	vpaddd	0*32(INP, SRND), X0, XFER
 	vmovdqa XFER, 0*32+_XFER(%rsp, SRND)
 	FOUR_ROUNDS_AND_SCHED	_XFER + 0*32
 
-	vpaddd	K256+1*32(SRND), X0, XFER
+	leaq	K256(%rip), INP
+	vpaddd	1*32(INP, SRND), X0, XFER
 	vmovdqa XFER, 1*32+_XFER(%rsp, SRND)
 	FOUR_ROUNDS_AND_SCHED	_XFER + 1*32
 
-	vpaddd	K256+2*32(SRND), X0, XFER
+	leaq	K256(%rip), INP
+	vpaddd	2*32(INP, SRND), X0, XFER
 	vmovdqa XFER, 2*32+_XFER(%rsp, SRND)
 	FOUR_ROUNDS_AND_SCHED	_XFER + 2*32
 
-	vpaddd	K256+3*32(SRND), X0, XFER
+	leaq	K256(%rip), INP
+	vpaddd	3*32(INP, SRND), X0, XFER
 	vmovdqa XFER, 3*32+_XFER(%rsp, SRND)
 	FOUR_ROUNDS_AND_SCHED	_XFER + 3*32
 
@@ -614,11 +618,13 @@ loop1:
 
 loop2:
 	## Do last 16 rounds with no scheduling
-	vpaddd	K256+0*32(SRND), X0, XFER
+	leaq	K256(%rip), INP
+	vpaddd	0*32(INP, SRND), X0, XFER
 	vmovdqa XFER, 0*32+_XFER(%rsp, SRND)
 	DO_4ROUNDS	_XFER + 0*32
 
-	vpaddd	K256+1*32(SRND), X1, XFER
+	leaq	K256(%rip), INP
+	vpaddd	1*32(INP, SRND), X1, XFER
 	vmovdqa XFER, 1*32+_XFER(%rsp, SRND)
 	DO_4ROUNDS	_XFER + 1*32
 	add	$2*32, SRND
-- 
2.22.0.770.g0f2c4a37fd-goog



* [PATCH v9 02/11] x86: Add macro to get symbol address for PIE support
  2019-07-30 19:12 [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Thomas Garnier
  2019-07-30 19:12 ` [PATCH v9 01/11] x86/crypto: Adapt assembly for PIE support Thomas Garnier
@ 2019-07-30 19:12 ` " Thomas Garnier
  2019-07-30 19:12 ` [PATCH v9 03/11] x86: relocate_kernel - Adapt assembly " Thomas Garnier
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 34+ messages in thread
From: Thomas Garnier @ 2019-07-30 19:12 UTC (permalink / raw)
  To: kernel-hardening
  Cc: kristen, keescook, Thomas Garnier, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86, Peter Zijlstra, Nadav Amit,
	Jann Horn, linux-kernel

Add a new _ASM_MOVABS macro to fetch a symbol address. It will be used
to replace "_ASM_MOV $<symbol>, %dst" code constructs that are not
compatible with PIE.
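
A minimal usage sketch (hypothetical, not part of this patch; the symbol name
is a placeholder): a site that loads a symbol address as an immediate would
use the new macro, which expands to movabsq on x86_64 and movl on 32-bit.

	_ASM_MOV	$some_symbol, %rax	/* 32-bit absolute immediate, breaks under PIE */
	_ASM_MOVABS	$some_symbol, %rax	/* full 64-bit absolute, PIE compatible */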

Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
---
 arch/x86/include/asm/asm.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index 3ff577c0b102..3a686057e882 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -30,6 +30,7 @@
 #define _ASM_ALIGN	__ASM_SEL(.balign 4, .balign 8)
 
 #define _ASM_MOV	__ASM_SIZE(mov)
+#define _ASM_MOVABS	__ASM_SEL(movl, movabsq)
 #define _ASM_INC	__ASM_SIZE(inc)
 #define _ASM_DEC	__ASM_SIZE(dec)
 #define _ASM_ADD	__ASM_SIZE(add)
-- 
2.22.0.770.g0f2c4a37fd-goog



* [PATCH v9 03/11] x86: relocate_kernel - Adapt assembly for PIE support
  2019-07-30 19:12 [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Thomas Garnier
  2019-07-30 19:12 ` [PATCH v9 01/11] x86/crypto: Adapt assembly for PIE support Thomas Garnier
  2019-07-30 19:12 ` [PATCH v9 02/11] x86: Add macro to get symbol address " Thomas Garnier
@ 2019-07-30 19:12 ` " Thomas Garnier
  2019-07-30 19:12 ` [PATCH v9 04/11] x86/entry/64: " Thomas Garnier
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 34+ messages in thread
From: Thomas Garnier @ 2019-07-30 19:12 UTC (permalink / raw)
  To: kernel-hardening
  Cc: kristen, keescook, Thomas Garnier, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86, Allison Randal,
	Alexios Zavras, linux-kernel

Change the assembly code to use only absolute references to symbols so that
the kernel can be PIE compatible.

Position Independent Executable (PIE) support will allow extending the KASLR
randomization range below 0xffffffff80000000.

Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/kernel/relocate_kernel_64.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index c51ccff5cd01..c72889b09840 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -206,7 +206,7 @@ identity_mapped:
 	movq	%rax, %cr3
 	lea	PAGE_SIZE(%r8), %rsp
 	call	swap_pages
-	movq	$virtual_mapped, %rax
+	movabsq	$virtual_mapped, %rax
 	pushq	%rax
 	ret
 
-- 
2.22.0.770.g0f2c4a37fd-goog



* [PATCH v9 04/11] x86/entry/64: Adapt assembly for PIE support
  2019-07-30 19:12 [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Thomas Garnier
                   ` (2 preceding siblings ...)
  2019-07-30 19:12 ` [PATCH v9 03/11] x86: relocate_kernel - Adapt assembly " Thomas Garnier
@ 2019-07-30 19:12 ` " Thomas Garnier
  2019-08-05 17:28   ` Borislav Petkov
  2019-07-30 19:12 ` [PATCH v9 05/11] x86: pm-trace - " Thomas Garnier
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 34+ messages in thread
From: Thomas Garnier @ 2019-07-30 19:12 UTC (permalink / raw)
  To: kernel-hardening
  Cc: kristen, keescook, Thomas Garnier, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	x86, linux-kernel

Change the assembly code to use only relative references to symbols so that
the kernel can be PIE compatible.

Position Independent Executable (PIE) support will allow extending the KASLR
randomization range below 0xffffffff80000000.
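
The generic replacement pattern in this patch (an illustration, not a
specific hunk; the label name is a placeholder): pushing or comparing a
symbol address can no longer use a 32-bit sign-extended immediate, so the
address is formed RIP-relative in a scratch register first.

	pushq	$some_label		# before: 32-bit immediate, not PIE safe
	leaq	some_label(%rip), %rdx	# after: form the address relative to RIP
	pushq	%rdx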

Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/entry/entry_64.S | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3f5a978a02a7..4b588a902009 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1317,7 +1317,8 @@ ENTRY(error_entry)
 	movl	%ecx, %eax			/* zero extend */
 	cmpq	%rax, RIP+8(%rsp)
 	je	.Lbstep_iret
-	cmpq	$.Lgs_change, RIP+8(%rsp)
+	leaq	.Lgs_change(%rip), %rcx
+	cmpq	%rcx, RIP+8(%rsp)
 	jne	.Lerror_entry_done
 
 	/*
@@ -1514,10 +1515,10 @@ ENTRY(nmi)
 	 * resume the outer NMI.
 	 */
 
-	movq	$repeat_nmi, %rdx
+	leaq	repeat_nmi(%rip), %rdx
 	cmpq	8(%rsp), %rdx
 	ja	1f
-	movq	$end_repeat_nmi, %rdx
+	leaq	end_repeat_nmi(%rip), %rdx
 	cmpq	8(%rsp), %rdx
 	ja	nested_nmi_out
 1:
@@ -1571,7 +1572,8 @@ nested_nmi:
 	pushq	%rdx
 	pushfq
 	pushq	$__KERNEL_CS
-	pushq	$repeat_nmi
+	leaq	repeat_nmi(%rip), %rdx
+	pushq	%rdx
 
 	/* Put stack back */
 	addq	$(6*8), %rsp
@@ -1610,7 +1612,11 @@ first_nmi:
 	addq	$8, (%rsp)	/* Fix up RSP */
 	pushfq			/* RFLAGS */
 	pushq	$__KERNEL_CS	/* CS */
-	pushq	$1f		/* RIP */
+	pushq	$0		/* Future return address */
+	pushq	%rax		/* Save RAX */
+	leaq	1f(%rip), %rax	/* RIP */
+	movq    %rax, 8(%rsp)   /* Put 1f on return address */
+	popq	%rax		/* Restore RAX */
 	iretq			/* continues at repeat_nmi below */
 	UNWIND_HINT_IRET_REGS
 1:
-- 
2.22.0.770.g0f2c4a37fd-goog



* [PATCH v9 05/11] x86: pm-trace - Adapt assembly for PIE support
  2019-07-30 19:12 [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Thomas Garnier
                   ` (3 preceding siblings ...)
  2019-07-30 19:12 ` [PATCH v9 04/11] x86/entry/64: " Thomas Garnier
@ 2019-07-30 19:12 ` " Thomas Garnier
  2019-07-30 19:12 ` [PATCH v9 06/11] x86/CPU: " Thomas Garnier
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 34+ messages in thread
From: Thomas Garnier @ 2019-07-30 19:12 UTC (permalink / raw)
  To: kernel-hardening
  Cc: kristen, keescook, Thomas Garnier, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86, linux-kernel

Change the assembly to use the new _ASM_MOVABS macro instead of _ASM_MOV so
that the assembly is PIE compatible.

Position Independent Executable (PIE) support will allow extending the KASLR
randomization range below 0xffffffff80000000.

Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/include/asm/pm-trace.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pm-trace.h b/arch/x86/include/asm/pm-trace.h
index bfa32aa428e5..972070806ce9 100644
--- a/arch/x86/include/asm/pm-trace.h
+++ b/arch/x86/include/asm/pm-trace.h
@@ -8,7 +8,7 @@
 do {								\
 	if (pm_trace_enabled) {					\
 		const void *tracedata;				\
-		asm volatile(_ASM_MOV " $1f,%0\n"		\
+		asm volatile(_ASM_MOVABS " $1f,%0\n"		\
 			     ".section .tracedata,\"a\"\n"	\
 			     "1:\t.word %c1\n\t"		\
 			     _ASM_PTR " %c2\n"			\
-- 
2.22.0.770.g0f2c4a37fd-goog



* [PATCH v9 06/11] x86/CPU: Adapt assembly for PIE support
  2019-07-30 19:12 [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Thomas Garnier
                   ` (4 preceding siblings ...)
  2019-07-30 19:12 ` [PATCH v9 05/11] x86: pm-trace - " Thomas Garnier
@ 2019-07-30 19:12 ` " Thomas Garnier
  2019-07-30 19:12 ` [PATCH v9 07/11] x86/acpi: " Thomas Garnier
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 34+ messages in thread
From: Thomas Garnier @ 2019-07-30 19:12 UTC (permalink / raw)
  To: kernel-hardening
  Cc: kristen, keescook, Thomas Garnier, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86, Peter Zijlstra (Intel),
	Andrew Morton, Len Brown, Andy Lutomirski, linux-kernel

Change the assembly code to use only relative references to symbols so that
the kernel can be PIE compatible.

Position Independent Executable (PIE) support will allow extending the KASLR
randomization range below 0xffffffff80000000.

Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
---
 arch/x86/include/asm/processor.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 6e0a3b43d027..bf333d62889e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -713,11 +713,13 @@ static inline void sync_core(void)
 		"pushfq\n\t"
 		"mov %%cs, %0\n\t"
 		"pushq %q0\n\t"
-		"pushq $1f\n\t"
+		"leaq 1f(%%rip), %q0\n\t"
+		"pushq %q0\n\t"
 		"iretq\n\t"
 		UNWIND_HINT_RESTORE
 		"1:"
-		: "=&r" (tmp), ASM_CALL_CONSTRAINT : : "cc", "memory");
+		: "=&r" (tmp), ASM_CALL_CONSTRAINT
+		: : "cc", "memory");
 #endif
 }
 
-- 
2.22.0.770.g0f2c4a37fd-goog



* [PATCH v9 07/11] x86/acpi: Adapt assembly for PIE support
  2019-07-30 19:12 [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Thomas Garnier
                   ` (5 preceding siblings ...)
  2019-07-30 19:12 ` [PATCH v9 06/11] x86/CPU: " Thomas Garnier
@ 2019-07-30 19:12 ` " Thomas Garnier
  2019-07-30 19:12 ` [PATCH v9 08/11] x86/boot/64: " Thomas Garnier
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 34+ messages in thread
From: Thomas Garnier @ 2019-07-30 19:12 UTC (permalink / raw)
  To: kernel-hardening
  Cc: kristen, keescook, Thomas Garnier, Pavel Machek,
	Rafael J . Wysocki, Rafael J. Wysocki, Len Brown,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	x86, linux-pm, linux-kernel

Change the assembly code to use only relative references to symbols so that
the kernel can be PIE compatible.

Position Independent Executable (PIE) support will allow extending the KASLR
randomization range below 0xffffffff80000000.

Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/kernel/acpi/wakeup_64.S | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S
index b0715c3ac18d..3ec6c1b74ad4 100644
--- a/arch/x86/kernel/acpi/wakeup_64.S
+++ b/arch/x86/kernel/acpi/wakeup_64.S
@@ -15,7 +15,7 @@
 	 * Hooray, we are in Long 64-bit mode (but still running in low memory)
 	 */
 ENTRY(wakeup_long64)
-	movq	saved_magic, %rax
+	movq	saved_magic(%rip), %rax
 	movq	$0x123456789abcdef0, %rdx
 	cmpq	%rdx, %rax
 	jne	bogus_64_magic
@@ -26,14 +26,14 @@ ENTRY(wakeup_long64)
 	movw	%ax, %es
 	movw	%ax, %fs
 	movw	%ax, %gs
-	movq	saved_rsp, %rsp
+	movq	saved_rsp(%rip), %rsp
 
-	movq	saved_rbx, %rbx
-	movq	saved_rdi, %rdi
-	movq	saved_rsi, %rsi
-	movq	saved_rbp, %rbp
+	movq	saved_rbx(%rip), %rbx
+	movq	saved_rdi(%rip), %rdi
+	movq	saved_rsi(%rip), %rsi
+	movq	saved_rbp(%rip), %rbp
 
-	movq	saved_rip, %rax
+	movq	saved_rip(%rip), %rax
 	jmp	*%rax
 ENDPROC(wakeup_long64)
 
@@ -46,7 +46,7 @@ ENTRY(do_suspend_lowlevel)
 	xorl	%eax, %eax
 	call	save_processor_state
 
-	movq	$saved_context, %rax
+	leaq	saved_context(%rip), %rax
 	movq	%rsp, pt_regs_sp(%rax)
 	movq	%rbp, pt_regs_bp(%rax)
 	movq	%rsi, pt_regs_si(%rax)
@@ -65,13 +65,14 @@ ENTRY(do_suspend_lowlevel)
 	pushfq
 	popq	pt_regs_flags(%rax)
 
-	movq	$.Lresume_point, saved_rip(%rip)
+	leaq	.Lresume_point(%rip), %rax
+	movq	%rax, saved_rip(%rip)
 
-	movq	%rsp, saved_rsp
-	movq	%rbp, saved_rbp
-	movq	%rbx, saved_rbx
-	movq	%rdi, saved_rdi
-	movq	%rsi, saved_rsi
+	movq	%rsp, saved_rsp(%rip)
+	movq	%rbp, saved_rbp(%rip)
+	movq	%rbx, saved_rbx(%rip)
+	movq	%rdi, saved_rdi(%rip)
+	movq	%rsi, saved_rsi(%rip)
 
 	addq	$8, %rsp
 	movl	$3, %edi
@@ -83,7 +84,7 @@ ENTRY(do_suspend_lowlevel)
 	.align 4
 .Lresume_point:
 	/* We don't restore %rax, it must be 0 anyway */
-	movq	$saved_context, %rax
+	leaq	saved_context(%rip), %rax
 	movq	saved_context_cr4(%rax), %rbx
 	movq	%rbx, %cr4
 	movq	saved_context_cr3(%rax), %rbx
-- 
2.22.0.770.g0f2c4a37fd-goog



* [PATCH v9 08/11] x86/boot/64: Adapt assembly for PIE support
  2019-07-30 19:12 [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Thomas Garnier
                   ` (6 preceding siblings ...)
  2019-07-30 19:12 ` [PATCH v9 07/11] x86/acpi: " Thomas Garnier
@ 2019-07-30 19:12 ` " Thomas Garnier
  2019-08-09 17:30   ` Borislav Petkov
  2019-07-30 19:12 ` [PATCH v9 09/11] x86/power/64: " Thomas Garnier
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 34+ messages in thread
From: Thomas Garnier @ 2019-07-30 19:12 UTC (permalink / raw)
  To: kernel-hardening
  Cc: kristen, keescook, Thomas Garnier, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86, Juergen Gross,
	Peter Zijlstra, Boris Ostrovsky, Josh Poimboeuf, Maran Wilson,
	Feng Tang, linux-kernel

Change the assembly code to use only relative references to symbols so that
the kernel can be PIE compatible.

Early in boot, the kernel is mapped at a temporary address while preparing
the page table. To know the changes needed for the page table with KASLR,
the boot code calculates the difference between the expected address of the
kernel and the one chosen by KASLR. This does not work with PIE because all
symbol references in the code are relative: instead of the future relocated
virtual address, the code would get the current temporary mapping. The
affected instructions were changed to use absolute 64-bit references.
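
A hedged illustration of the problem (not the actual patch hunks; the label
name is a placeholder): at this point the code runs from the temporary
identity mapping, so a RIP-relative reference yields the current address
rather than the link-time virtual address the page-table math expects.

	leaq	target_label(%rip), %rax	# current (temporary) mapping address
	movabs	$target_label, %rax		# link-time virtual address, relocated at boot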

Position Independent Executable (PIE) support will allow extending the KASLR
randomization range below 0xffffffff80000000.

Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/kernel/head_64.S | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index f3d3e9646a99..9a3f96566eb2 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -88,8 +88,10 @@ startup_64:
 	popq	%rsi
 
 	/* Form the CR3 value being sure to include the CR3 modifier */
-	addq	$(early_top_pgt - __START_KERNEL_map), %rax
+	movabs  $(early_top_pgt - __START_KERNEL_map), %rcx
+	addq    %rcx, %rax
 	jmp 1f
+
 ENTRY(secondary_startup_64)
 	UNWIND_HINT_EMPTY
 	/*
@@ -118,7 +120,8 @@ ENTRY(secondary_startup_64)
 	popq	%rsi
 
 	/* Form the CR3 value being sure to include the CR3 modifier */
-	addq	$(init_top_pgt - __START_KERNEL_map), %rax
+	movabs	$(init_top_pgt - __START_KERNEL_map), %rcx
+	addq    %rcx, %rax
 1:
 
 	/* Enable PAE mode, PGE and LA57 */
@@ -136,7 +139,7 @@ ENTRY(secondary_startup_64)
 	movq	%rax, %cr3
 
 	/* Ensure I am executing from virtual addresses */
-	movq	$1f, %rax
+	movabs  $1f, %rax
 	ANNOTATE_RETPOLINE_SAFE
 	jmp	*%rax
 1:
@@ -233,11 +236,12 @@ ENTRY(secondary_startup_64)
 	 *	REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
 	 *		address given in m16:64.
 	 */
-	pushq	$.Lafter_lret	# put return address on stack for unwinder
+	movabs  $.Lafter_lret, %rax
+	pushq	%rax		# put return address on stack for unwinder
 	xorl	%ebp, %ebp	# clear frame pointer
-	movq	initial_code(%rip), %rax
+	leaq	initial_code(%rip), %rax
 	pushq	$__KERNEL_CS	# set correct cs
-	pushq	%rax		# target address in negative space
+	pushq	(%rax)		# target address in negative space
 	lretq
 .Lafter_lret:
 END(secondary_startup_64)
-- 
2.22.0.770.g0f2c4a37fd-goog



* [PATCH v9 09/11] x86/power/64: Adapt assembly for PIE support
  2019-07-30 19:12 [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Thomas Garnier
                   ` (7 preceding siblings ...)
  2019-07-30 19:12 ` [PATCH v9 08/11] x86/boot/64: " Thomas Garnier
@ 2019-07-30 19:12 ` " Thomas Garnier
  2019-07-30 19:12 ` [PATCH v9 10/11] x86/paravirt: " Thomas Garnier
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 34+ messages in thread
From: Thomas Garnier @ 2019-07-30 19:12 UTC (permalink / raw)
  To: kernel-hardening
  Cc: kristen, keescook, Thomas Garnier, Pavel Machek,
	Rafael J . Wysocki, Rafael J. Wysocki, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86, linux-pm,
	linux-kernel

Change the assembly code to use only relative references to symbols so that
the kernel can be PIE compatible.

Position Independent Executable (PIE) support will allow extending the KASLR
randomization range below 0xffffffff80000000.

Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/power/hibernate_asm_64.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S
index a4d5eb0a7ece..796cd19d575b 100644
--- a/arch/x86/power/hibernate_asm_64.S
+++ b/arch/x86/power/hibernate_asm_64.S
@@ -23,7 +23,7 @@
 #include <asm/frame.h>
 
 ENTRY(swsusp_arch_suspend)
-	movq	$saved_context, %rax
+	leaq	saved_context(%rip), %rax
 	movq	%rsp, pt_regs_sp(%rax)
 	movq	%rbp, pt_regs_bp(%rax)
 	movq	%rsi, pt_regs_si(%rax)
@@ -114,7 +114,7 @@ ENTRY(restore_registers)
 	movq	%rax, %cr4;  # turn PGE back on
 
 	/* We don't restore %rax, it must be 0 anyway */
-	movq	$saved_context, %rax
+	leaq	saved_context(%rip), %rax
 	movq	pt_regs_sp(%rax), %rsp
 	movq	pt_regs_bp(%rax), %rbp
 	movq	pt_regs_si(%rax), %rsi
-- 
2.22.0.770.g0f2c4a37fd-goog



* [PATCH v9 10/11] x86/paravirt: Adapt assembly for PIE support
  2019-07-30 19:12 [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Thomas Garnier
                   ` (8 preceding siblings ...)
  2019-07-30 19:12 ` [PATCH v9 09/11] x86/power/64: " Thomas Garnier
@ 2019-07-30 19:12 ` " Thomas Garnier
  2019-07-31 12:53   ` Peter Zijlstra
  2019-07-30 19:12 ` [PATCH v9 11/11] x86/alternatives: " Thomas Garnier
  2019-08-06 15:43 ` [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Borislav Petkov
  11 siblings, 1 reply; 34+ messages in thread
From: Thomas Garnier @ 2019-07-30 19:12 UTC (permalink / raw)
  To: kernel-hardening
  Cc: kristen, keescook, Thomas Garnier, Juergen Gross,
	Thomas Hellstrom, VMware, Inc.,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	x86, virtualization, linux-kernel

If PIE is enabled, switch the paravirt assembly constraints to be
compatible. The %c/i constraints generate smaller code, so they are kept by
default.

Position Independent Executable (PIE) support will allow extending the KASLR
randomization range below 0xffffffff80000000.
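
For context, a sketch of the encodings involved (an editorial illustration,
not part of the patch; the pv_ops offset is made up): the default operand
emits an indirect call through an absolute address, 7 bytes, while the
PIE-compatible operand emits a RIP-relative form, 6 bytes, hence the extra
nop to keep the patch site large enough.

	call	*pv_ops+24		# ff 14 25 <disp32>  (7 bytes, non-PIE)
	call	*pv_ops+24(%rip)	# ff 15 <disp32>     (6 bytes, PIE)
	nop				# pads the PIE call site back to 7 bytes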

Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
Acked-by: Juergen Gross <jgross@suse.com>
---
 arch/x86/include/asm/paravirt_types.h | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 70b654f3ffe5..fd7dc37d0010 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -338,9 +338,25 @@ extern struct paravirt_patch_template pv_ops;
 #define PARAVIRT_PATCH(x)					\
 	(offsetof(struct paravirt_patch_template, x) / sizeof(void *))
 
+#ifdef CONFIG_X86_PIE
+#define paravirt_opptr_call "a"
+#define paravirt_opptr_type "p"
+
+/*
+ * Alternative patching requires a maximum of 7 bytes but the relative call is
+ * only 6 bytes. If PIE is enabled, add an additional nop to the call
+ * instruction to ensure patching is possible.
+ */
+#define PARAVIRT_CALL_POST  "nop;"
+#else
+#define paravirt_opptr_call "c"
+#define paravirt_opptr_type "i"
+#define PARAVIRT_CALL_POST  ""
+#endif
+
 #define paravirt_type(op)				\
 	[paravirt_typenum] "i" (PARAVIRT_PATCH(op)),	\
-	[paravirt_opptr] "i" (&(pv_ops.op))
+	[paravirt_opptr] paravirt_opptr_type (&(pv_ops.op))
 #define paravirt_clobber(clobber)		\
 	[paravirt_clobber] "i" (clobber)
 
@@ -379,9 +395,10 @@ int paravirt_disable_iospace(void);
  * offset into the paravirt_patch_template structure, and can therefore be
  * freely converted back into a structure offset.
  */
-#define PARAVIRT_CALL					\
-	ANNOTATE_RETPOLINE_SAFE				\
-	"call *%c[paravirt_opptr];"
+#define PARAVIRT_CALL						\
+	ANNOTATE_RETPOLINE_SAFE					\
+	"call *%" paravirt_opptr_call "[paravirt_opptr];"	\
+	PARAVIRT_CALL_POST
 
 /*
  * These macros are intended to wrap calls through one of the paravirt
-- 
2.22.0.770.g0f2c4a37fd-goog



* [PATCH v9 11/11] x86/alternatives: Adapt assembly for PIE support
  2019-07-30 19:12 [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Thomas Garnier
                   ` (9 preceding siblings ...)
  2019-07-30 19:12 ` [PATCH v9 10/11] x86/paravirt: " Thomas Garnier
@ 2019-07-30 19:12 ` " Thomas Garnier
  2019-08-12 13:57   ` Borislav Petkov
  2019-08-06 15:43 ` [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Borislav Petkov
  11 siblings, 1 reply; 34+ messages in thread
From: Thomas Garnier @ 2019-07-30 19:12 UTC (permalink / raw)
  To: kernel-hardening
  Cc: kristen, keescook, Thomas Garnier, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86, Peter Zijlstra, Nadav Amit,
	linux-kernel

Change the inline assembly constraints to work with pointers instead of
integers.

Position Independent Executable (PIE) support will allow extending the KASLR
randomization range below 0xffffffff80000000.

Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
---
 arch/x86/include/asm/alternative.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 094fbc9c0b1c..28a838106e5f 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -243,7 +243,7 @@ static inline int alternatives_text_reserved(void *start, void *end)
 /* Like alternative_io, but for replacing a direct call with another one. */
 #define alternative_call(oldfunc, newfunc, feature, output, input...)	\
 	asm volatile (ALTERNATIVE("call %P[old]", "call %P[new]", feature) \
-		: output : [old] "i" (oldfunc), [new] "i" (newfunc), ## input)
+		: output : [old] "X" (oldfunc), [new] "X" (newfunc), ## input)
 
 /*
  * Like alternative_call, but there are two features and respective functions.
@@ -256,8 +256,8 @@ static inline int alternatives_text_reserved(void *start, void *end)
 	asm volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1,\
 		"call %P[new2]", feature2)				      \
 		: output, ASM_CALL_CONSTRAINT				      \
-		: [old] "i" (oldfunc), [new1] "i" (newfunc1),		      \
-		  [new2] "i" (newfunc2), ## input)
+		: [old] "X" (oldfunc), [new1] "X" (newfunc1),		      \
+		  [new2] "X" (newfunc2), ## input)
 
 /*
  * use this macro(s) if you need more than one output parameter
-- 
2.22.0.770.g0f2c4a37fd-goog
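
The reason "i" has to go can be seen in a user-space sketch (myfunc is a
made-up symbol, not part of the series; the compile commands assume a gcc
that honors -fno-pie/-fpie):

/* alt_sketch.c - try:  gcc -O2 -c -fno-pie alt_sketch.c
 *                and:  gcc -O2 -c -fpie    alt_sketch.c
 */
extern void myfunc(void);

void alt_call_sketch(void)
{
	/* %P prints the symbol without the '$' immediate prefix, so
	 * this assembles to "call myfunc" in both modes.  With "i"
	 * instead of "X", the -fpie build is typically rejected,
	 * because a symbol address is no longer an integer constant
	 * under PIE. */
	asm volatile("call %P[f]" : : [f] "X" (myfunc) : "memory");
}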


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 10/11] x86/paravirt: Adapt assembly for PIE support
  2019-07-30 19:12 ` [PATCH v9 10/11] x86/paravirt: " Thomas Garnier
@ 2019-07-31 12:53   ` Peter Zijlstra
  2019-08-12 12:55     ` Borislav Petkov
  0 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2019-07-31 12:53 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: kernel-hardening, kristen, keescook, Juergen Gross,
	Thomas Hellstrom, VMware, Inc.,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	x86, virtualization, linux-kernel

On Tue, Jul 30, 2019 at 12:12:54PM -0700, Thomas Garnier wrote:
> If PIE is enabled, switch the paravirt assembly constraints to be
> compatible. The %c/i constraints generate smaller code, so they are kept by
> default.
> 
> Position Independent Executable (PIE) support will allow extending the
> KASLR randomization range below 0xffffffff80000000.
> 
> Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
> Acked-by: Juergen Gross <jgross@suse.com>
> ---
>  arch/x86/include/asm/paravirt_types.h | 25 +++++++++++++++++++++----
>  1 file changed, 21 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
> index 70b654f3ffe5..fd7dc37d0010 100644
> --- a/arch/x86/include/asm/paravirt_types.h
> +++ b/arch/x86/include/asm/paravirt_types.h
> @@ -338,9 +338,25 @@ extern struct paravirt_patch_template pv_ops;
>  #define PARAVIRT_PATCH(x)					\
>  	(offsetof(struct paravirt_patch_template, x) / sizeof(void *))
>  
> +#ifdef CONFIG_X86_PIE
> +#define paravirt_opptr_call "a"
> +#define paravirt_opptr_type "p"
> +
> +/*
> + * Alternative patching requires a maximum of 7 bytes but the relative call is
> + * only 6 bytes. If PIE is enabled, add an additional nop to the call
> + * instruction to ensure patching is possible.
> + */
> +#define PARAVIRT_CALL_POST  "nop;"

I'm confused; where does the 7 come from? The relative call is 6 bytes,
a normal call is 5 bytes (which is what we normally replace them with),
and the longest 'native' sequence we seem to have is also 6 bytes
(.cpu_usergs_sysret64).

> +#else
> +#define paravirt_opptr_call "c"
> +#define paravirt_opptr_type "i"
> +#define PARAVIRT_CALL_POST  ""
> +#endif
> +
>  #define paravirt_type(op)				\
>  	[paravirt_typenum] "i" (PARAVIRT_PATCH(op)),	\
> -	[paravirt_opptr] "i" (&(pv_ops.op))
> +	[paravirt_opptr] paravirt_opptr_type (&(pv_ops.op))
>  #define paravirt_clobber(clobber)		\
>  	[paravirt_clobber] "i" (clobber)
>  
> @@ -379,9 +395,10 @@ int paravirt_disable_iospace(void);
>   * offset into the paravirt_patch_template structure, and can therefore be
>   * freely converted back into a structure offset.
>   */
> -#define PARAVIRT_CALL					\
> -	ANNOTATE_RETPOLINE_SAFE				\
> -	"call *%c[paravirt_opptr];"
> +#define PARAVIRT_CALL						\
> +	ANNOTATE_RETPOLINE_SAFE					\
> +	"call *%" paravirt_opptr_call "[paravirt_opptr];"	\
> +	PARAVIRT_CALL_POST

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 01/11] x86/crypto: Adapt assembly for PIE support
  2019-07-30 19:12 ` [PATCH v9 01/11] x86/crypto: Adapt assembly for PIE support Thomas Garnier
@ 2019-08-05 16:32   ` Borislav Petkov
  2019-08-05 16:54     ` Kees Cook
  0 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2019-08-05 16:32 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: kernel-hardening, kristen, keescook, Herbert Xu, David S. Miller,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-crypto,
	linux-kernel

On Tue, Jul 30, 2019 at 12:12:45PM -0700, Thomas Garnier wrote:
> Change the assembly code to use only relative references to symbols for the
> kernel to be PIE compatible.
> 
> Position Independent Executable (PIE) support will allow extending the
> KASLR randomization range below 0xffffffff80000000.

I believe in previous reviews I asked about why this sentence is being
replicated in every commit message and now it is still in every commit
message except in 2/11.

Why do you need it everywhere and not once in the 0th mail?

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 01/11] x86/crypto: Adapt assembly for PIE support
  2019-08-05 16:32   ` Borislav Petkov
@ 2019-08-05 16:54     ` Kees Cook
  2019-08-05 17:27       ` Borislav Petkov
  0 siblings, 1 reply; 34+ messages in thread
From: Kees Cook @ 2019-08-05 16:54 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Garnier, kernel-hardening, kristen, Herbert Xu,
	David S. Miller, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, linux-crypto, linux-kernel

On Mon, Aug 05, 2019 at 06:32:02PM +0200, Borislav Petkov wrote:
> On Tue, Jul 30, 2019 at 12:12:45PM -0700, Thomas Garnier wrote:
> > Change the assembly code to use only relative references to symbols for the
> > kernel to be PIE compatible.
> > 
> > Position Independent Executable (PIE) support will allow extending the
> > KASLR randomization range below 0xffffffff80000000.
> 
> I believe in previous reviews I asked about why this sentence is being
> replicated in every commit message and now it is still in every commit
> message except in 2/11.
> 
> Why do you need it everywhere and not once in the 0th mail?

I think there was some long-ago feedback from someone (Ingo?) about
giving context for the patch so looking at one individually would let
someone know that it was part of a larger series. This is a distant
memory, though. Do you think it should just be dropped in each patch?

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 01/11] x86/crypto: Adapt assembly for PIE support
  2019-08-05 16:54     ` Kees Cook
@ 2019-08-05 17:27       ` Borislav Petkov
  2019-08-05 17:53         ` Thomas Garnier
  0 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2019-08-05 17:27 UTC (permalink / raw)
  To: Kees Cook
  Cc: Thomas Garnier, kernel-hardening, kristen, Herbert Xu,
	David S. Miller, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, linux-crypto, linux-kernel

On Mon, Aug 05, 2019 at 09:54:44AM -0700, Kees Cook wrote:
> I think there was some long-ago feedback from someone (Ingo?) about
> giving context for the patch so looking at one individually would let
> someone know that it was part of a larger series.

Strange. But then we'd have to "mark" all patches which belong to a
larger series this way, no? And we don't do that...

> Do you think it should just be dropped in each patch?

I think reading it once is enough. If it is not clear from the change
alone in some commit message why it is being done - to support PIE - then
sure, by all means. But slapping it everywhere...

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 04/11] x86/entry/64: Adapt assembly for PIE support
  2019-07-30 19:12 ` [PATCH v9 04/11] x86/entry/64: " Thomas Garnier
@ 2019-08-05 17:28   ` Borislav Petkov
  2019-08-05 17:50     ` Thomas Garnier
  2019-08-06 13:59     ` Steven Rostedt
  0 siblings, 2 replies; 34+ messages in thread
From: Borislav Petkov @ 2019-08-05 17:28 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: kernel-hardening, kristen, keescook, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel

On Tue, Jul 30, 2019 at 12:12:48PM -0700, Thomas Garnier wrote:
> Change the assembly code to use only relative references to symbols for the
> kernel to be PIE compatible.
> 
> Position Independent Executable (PIE) support will allow extending the
> KASLR randomization range below 0xffffffff80000000.
> 
> Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
> Reviewed-by: Kees Cook <keescook@chromium.org>
> ---
>  arch/x86/entry/entry_64.S | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 3f5a978a02a7..4b588a902009 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -1317,7 +1317,8 @@ ENTRY(error_entry)
>  	movl	%ecx, %eax			/* zero extend */
>  	cmpq	%rax, RIP+8(%rsp)
>  	je	.Lbstep_iret
> -	cmpq	$.Lgs_change, RIP+8(%rsp)
> +	leaq	.Lgs_change(%rip), %rcx
> +	cmpq	%rcx, RIP+8(%rsp)
>  	jne	.Lerror_entry_done
>  
>  	/*
> @@ -1514,10 +1515,10 @@ ENTRY(nmi)
>  	 * resume the outer NMI.
>  	 */
>  
> -	movq	$repeat_nmi, %rdx
> +	leaq	repeat_nmi(%rip), %rdx
>  	cmpq	8(%rsp), %rdx
>  	ja	1f
> -	movq	$end_repeat_nmi, %rdx
> +	leaq	end_repeat_nmi(%rip), %rdx
>  	cmpq	8(%rsp), %rdx
>  	ja	nested_nmi_out
>  1:
> @@ -1571,7 +1572,8 @@ nested_nmi:
>  	pushq	%rdx
>  	pushfq
>  	pushq	$__KERNEL_CS
> -	pushq	$repeat_nmi
> +	leaq	repeat_nmi(%rip), %rdx
> +	pushq	%rdx
>  
>  	/* Put stack back */
>  	addq	$(6*8), %rsp
> @@ -1610,7 +1612,11 @@ first_nmi:
>  	addq	$8, (%rsp)	/* Fix up RSP */
>  	pushfq			/* RFLAGS */
>  	pushq	$__KERNEL_CS	/* CS */
> -	pushq	$1f		/* RIP */
> +	pushq	$0		/* Future return address */
> +	pushq	%rax		/* Save RAX */
> +	leaq	1f(%rip), %rax	/* RIP */
> +	movq    %rax, 8(%rsp)   /* Put 1f on return address */
> +	popq	%rax		/* Restore RAX */

Can't you just use a callee-clobbered reg here instead of preserving
%rax?

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
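
The transformation these hunks apply can be reproduced in user space; a
sketch, with marker standing in for the kernel symbols:

/* reloc_sketch.c
 *     gcc -O2 -no-pie reloc_sketch.c        -> links fine
 *     gcc -O2 -fpie -pie reloc_sketch.c     -> absolute_ref() fails to
 *                                              link (R_X86_64_32S)
 */
long marker;

void *absolute_ref(void)
{
	void *p;
	/* 32-bit absolute address embedded in the insn, not PIE-safe */
	asm("movq $marker, %0" : "=r" (p));
	return p;
}

void *relative_ref(void)
{
	void *p;
	/* 32-bit RIP-relative displacement, position independent */
	asm("leaq marker(%%rip), %0" : "=r" (p));
	return p;
}

int main(void)
{
	/* both forms compute the same address when linking succeeds */
	return absolute_ref() == relative_ref() ? 0 : 1;
}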

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 04/11] x86/entry/64: Adapt assembly for PIE support
  2019-08-05 17:28   ` Borislav Petkov
@ 2019-08-05 17:50     ` Thomas Garnier
  2019-08-06  5:08       ` Borislav Petkov
  2019-08-06 13:59     ` Steven Rostedt
  1 sibling, 1 reply; 34+ messages in thread
From: Thomas Garnier @ 2019-08-05 17:50 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Kernel Hardening, Kristen Carlson Accardi, Kees Cook,
	Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, LKML

On Mon, Aug 5, 2019 at 10:28 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Jul 30, 2019 at 12:12:48PM -0700, Thomas Garnier wrote:
> > Change the assembly code to use only relative references to symbols for the
> > kernel to be PIE compatible.
> >
> > Position Independent Executable (PIE) support will allow extending the
> > KASLR randomization range below 0xffffffff80000000.
> >
> > Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
> > Reviewed-by: Kees Cook <keescook@chromium.org>
> > ---
> >  arch/x86/entry/entry_64.S | 16 +++++++++++-----
> >  1 file changed, 11 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> > index 3f5a978a02a7..4b588a902009 100644
> > --- a/arch/x86/entry/entry_64.S
> > +++ b/arch/x86/entry/entry_64.S
> > @@ -1317,7 +1317,8 @@ ENTRY(error_entry)
> >       movl    %ecx, %eax                      /* zero extend */
> >       cmpq    %rax, RIP+8(%rsp)
> >       je      .Lbstep_iret
> > -     cmpq    $.Lgs_change, RIP+8(%rsp)
> > +     leaq    .Lgs_change(%rip), %rcx
> > +     cmpq    %rcx, RIP+8(%rsp)
> >       jne     .Lerror_entry_done
> >
> >       /*
> > @@ -1514,10 +1515,10 @@ ENTRY(nmi)
> >        * resume the outer NMI.
> >        */
> >
> > -     movq    $repeat_nmi, %rdx
> > +     leaq    repeat_nmi(%rip), %rdx
> >       cmpq    8(%rsp), %rdx
> >       ja      1f
> > -     movq    $end_repeat_nmi, %rdx
> > +     leaq    end_repeat_nmi(%rip), %rdx
> >       cmpq    8(%rsp), %rdx
> >       ja      nested_nmi_out
> >  1:
> > @@ -1571,7 +1572,8 @@ nested_nmi:
> >       pushq   %rdx
> >       pushfq
> >       pushq   $__KERNEL_CS
> > -     pushq   $repeat_nmi
> > +     leaq    repeat_nmi(%rip), %rdx
> > +     pushq   %rdx
> >
> >       /* Put stack back */
> >       addq    $(6*8), %rsp
> > @@ -1610,7 +1612,11 @@ first_nmi:
> >       addq    $8, (%rsp)      /* Fix up RSP */
> >       pushfq                  /* RFLAGS */
> >       pushq   $__KERNEL_CS    /* CS */
> > -     pushq   $1f             /* RIP */
> > +     pushq   $0              /* Future return address */
> > +     pushq   %rax            /* Save RAX */
> > +     leaq    1f(%rip), %rax  /* RIP */
> > +     movq    %rax, 8(%rsp)   /* Put 1f on return address */
> > +     popq    %rax            /* Restore RAX */
>
> Can't you just use a callee-clobbered reg here instead of preserving
> %rax?

I saw that %rdx was used for temporary usage and restored before the
end so I assumed that it was not an option.

>
> --
> Regards/Gruss,
>     Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 01/11] x86/crypto: Adapt assembly for PIE support
  2019-08-05 17:27       ` Borislav Petkov
@ 2019-08-05 17:53         ` Thomas Garnier
  0 siblings, 0 replies; 34+ messages in thread
From: Thomas Garnier @ 2019-08-05 17:53 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Kees Cook, Kernel Hardening, Kristen Carlson Accardi, Herbert Xu,
	David S. Miller, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, Linux Crypto Mailing List, LKML

On Mon, Aug 5, 2019 at 10:27 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Aug 05, 2019 at 09:54:44AM -0700, Kees Cook wrote:
> > I think there was some long-ago feedback from someone (Ingo?) about
> > giving context for the patch so looking at one individually would let
> > someone know that it was part of a larger series.

That's correct.

>
> Strange. But then we'd have to "mark" all patches which belong to a
> larger series this way, no? And we don't do that...
>
> > Do you think it should just be dropped in each patch?
>
> I think reading it once is enough. If it is not clear from the change
> alone in some commit message why it is being done - to support PIE - then
> sure, by all means. But slapping it everywhere...

I assume the last sentence could be removed in most cases.

>
> --
> Regards/Gruss,
>     Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 04/11] x86/entry/64: Adapt assembly for PIE support
  2019-08-05 17:50     ` Thomas Garnier
@ 2019-08-06  5:08       ` Borislav Petkov
  2019-08-06  8:30         ` Peter Zijlstra
  0 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2019-08-06  5:08 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Kernel Hardening, Kristen Carlson Accardi, Kees Cook,
	Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, LKML

On Mon, Aug 05, 2019 at 10:50:30AM -0700, Thomas Garnier wrote:
> I saw that %rdx was used for temporary usage and restored before the
> end so I assumed that it was not an option.

PUSH_AND_CLEAR_REGS saves all regs earlier so I think you should be
able to use others. Like SAVE_AND_SWITCH_TO_KERNEL_CR3/RESTORE_CR3, for
example, uses %r15 and %r14.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 04/11] x86/entry/64: Adapt assembly for PIE support
  2019-08-06  5:08       ` Borislav Petkov
@ 2019-08-06  8:30         ` Peter Zijlstra
  2019-08-06 12:35           ` Borislav Petkov
  0 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2019-08-06  8:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Garnier, Kernel Hardening, Kristen Carlson Accardi,
	Kees Cook, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, LKML

On Tue, Aug 06, 2019 at 07:08:51AM +0200, Borislav Petkov wrote:
> On Mon, Aug 05, 2019 at 10:50:30AM -0700, Thomas Garnier wrote:
> > I saw that %rdx was used for temporary usage and restored before the
> > end so I assumed that it was not an option.
> 
> PUSH_AND_CLEAR_REGS saves all regs earlier so I think you should be
> able to use others. Like SAVE_AND_SWITCH_TO_KERNEL_CR3/RESTORE_CR3, for
> example, uses %r15 and %r14.

AFAICT the CONFIG_DEBUG_ENTRY thing he's changing is before we set up
pt_regs.

Also consider the UNWIND hint that's in there, it states we only have
the IRET frame on stack, not a full regs set.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 04/11] x86/entry/64: Adapt assembly for PIE support
  2019-08-06  8:30         ` Peter Zijlstra
@ 2019-08-06 12:35           ` Borislav Petkov
  0 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2019-08-06 12:35 UTC (permalink / raw)
  To: Peter Zijlstra, Steven Rostedt
  Cc: Thomas Garnier, Kernel Hardening, Kristen Carlson Accardi,
	Kees Cook, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, LKML

+ rostedt.

Steve, pls have a look at the patch at the beginning of this thread as
it touches the reentrant NMI magic. :)

Thx.

On Tue, Aug 06, 2019 at 10:30:32AM +0200, Peter Zijlstra wrote:
> On Tue, Aug 06, 2019 at 07:08:51AM +0200, Borislav Petkov wrote:
> > On Mon, Aug 05, 2019 at 10:50:30AM -0700, Thomas Garnier wrote:
> > > I saw that %rdx was used for temporary usage and restored before the
> > > end so I assumed that it was not an option.
> > 
> > PUSH_AND_CLEAR_REGS saves all regs earlier so I think you should be
> > able to use others. Like SAVE_AND_SWITCH_TO_KERNEL_CR3/RESTORE_CR3, for
> > example, uses %r15 and %r14.
> 
> AFAICT the CONFIG_DEBUG_ENTRY thing he's changing is before we set up
> pt_regs.
> 
> Also consider the UNWIND hint that's in there, it states we only have
> the IRET frame on stack, not a full regs set.

Ok, after discussing it on IRC, I guess let's leave it like that. I
guess a little ugly is better than a lot more ugly if we're wanting to
attempt to free up some regs here to save us the rax preservation.
Probably not worth the effort by a long shot.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 04/11] x86/entry/64: Adapt assembly for PIE support
  2019-08-05 17:28   ` Borislav Petkov
  2019-08-05 17:50     ` Thomas Garnier
@ 2019-08-06 13:59     ` Steven Rostedt
  2019-08-06 15:35       ` Borislav Petkov
  1 sibling, 1 reply; 34+ messages in thread
From: Steven Rostedt @ 2019-08-06 13:59 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Garnier, kernel-hardening, kristen, keescook,
	Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, linux-kernel

On Mon, Aug 05, 2019 at 07:28:54PM +0200, Borislav Petkov wrote:
> >  1:
> > @@ -1571,7 +1572,8 @@ nested_nmi:
> >  	pushq	%rdx
> >  	pushfq
> >  	pushq	$__KERNEL_CS
> > -	pushq	$repeat_nmi
> > +	leaq	repeat_nmi(%rip), %rdx
> > +	pushq	%rdx
> >  
> >  	/* Put stack back */
> >  	addq	$(6*8), %rsp
> > @@ -1610,7 +1612,11 @@ first_nmi:
> >  	addq	$8, (%rsp)	/* Fix up RSP */
> >  	pushfq			/* RFLAGS */
> >  	pushq	$__KERNEL_CS	/* CS */
> > -	pushq	$1f		/* RIP */
> > +	pushq	$0		/* Future return address */
> > +	pushq	%rax		/* Save RAX */
> > +	leaq	1f(%rip), %rax	/* RIP */
> > +	movq    %rax, 8(%rsp)   /* Put 1f on return address */
> > +	popq	%rax		/* Restore RAX */
> 
> Can't you just use a callee-clobbered reg here instead of preserving
> %rax?

As Peter stated later in this thread, we only have the IRET stack frame saved
here, because we just took an NMI, and this is the logic to determine if it
was a nested NMI or not (where we have to be *very* careful about touching the
stack!)

That said, the code modified here is to test the NMI nesting logic (only
enabled with CONFIG_DEBUG_ENTRY), and what it is doing is re-enabling NMIs
before calling the first NMI handler, to help trigger nested NMIs without the
need for a breakpoint or page fault (iret enables NMIs again).

This code is in the path of the "first nmi" (we confirmed that this is not
nested), which means that it should be safe to push onto the stack.

Yes, we need to save and restore whatever reg we used. The only comment I
would make is to use %rdx instead of %rax as that has been our "scratch"
register used before saving pt_regs. Just to be consistent.

-- Steve
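
A rough user-space sketch of the resulting pattern, with %rdx as the
scratch register suggested above (an approximation only: the real hunk
builds a full iret frame with RFLAGS and CS, and returns with iretq
rather than ret):

/* nmi_push_sketch.c - push a PC-relative "return address" when
 * pushq $sym is no longer encodable under PIE.
 */
void push_rip_sketch(void);

asm(
"	.text\n"
"	.globl push_rip_sketch\n"
"push_rip_sketch:\n"
"	pushq	$0\n"			/* placeholder return address */
"	pushq	%rdx\n"			/* save scratch register */
"	leaq	1f(%rip), %rdx\n"	/* PIE-safe, unlike pushq $1f */
"	movq	%rdx, 8(%rsp)\n"	/* fill in the placeholder */
"	popq	%rdx\n"			/* restore scratch */
"	ret\n"				/* consumes the pushed address, lands at 1 */
"1:	ret\n"				/* back to the real caller */
);

int main(void)
{
	push_rip_sketch();
	return 0;
}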



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 04/11] x86/entry/64: Adapt assembly for PIE support
  2019-08-06 13:59     ` Steven Rostedt
@ 2019-08-06 15:35       ` Borislav Petkov
  0 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2019-08-06 15:35 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Thomas Garnier, kernel-hardening, kristen, keescook,
	Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, linux-kernel

On Tue, Aug 06, 2019 at 09:59:42AM -0400, Steven Rostedt wrote:
> As Peter stated later in this thread, we only have the IRQ stack frame saved
> here, because we just took an NMI, and this is the logic to determine if it
> was a nested NMI or not (where we have to be *very* careful about touching the
> stack!)
> 
> That said, the code modified here is to test the NMI nesting logic (only
> enabled with CONFIG_DEBUG_ENTRY), and what it is doing is re-enabling NMIs
> before calling the first NMI handler, to help trigger nested NMIs without the
> need of a break point or page fault (iret enables NMIs again).
> 
> This code is in the path of the "first nmi" (we confirmed that this is not
> nested), which means that it should be safe to push onto the stack.

Thanks for the explanation!

> Yes, we need to save and restore whatever reg we used. The only comment I
> would make is to use %rdx instead of %rax as that has been our "scratch"
> register used before saving pt_regs. Just to be consistent.

Yap, makes sense.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 00/11] x86: PIE support to extend KASLR randomization
  2019-07-30 19:12 [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Thomas Garnier
                   ` (10 preceding siblings ...)
  2019-07-30 19:12 ` [PATCH v9 11/11] x86/alternatives: " Thomas Garnier
@ 2019-08-06 15:43 ` Borislav Petkov
  2019-08-06 15:50   ` Peter Zijlstra
  11 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2019-08-06 15:43 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: kernel-hardening, kristen, keescook, Herbert Xu, David S. Miller,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Andy Lutomirski, Juergen Gross, Thomas Hellstrom, VMware, Inc.,
	Rafael J. Wysocki, Len Brown, Pavel Machek, Peter Zijlstra,
	Nadav Amit, Jann Horn, Feng Tang, Maran Wilson, Enrico Weigelt,
	Allison Randal, Alexios Zavras, linux-crypto, linux-kernel,
	virtualization, linux-pm

On Tue, Jul 30, 2019 at 12:12:44PM -0700, Thomas Garnier wrote:
> These patches make some of the changes necessary to build the kernel as
> Position Independent Executable (PIE) on x86_64. Another patchset will
> add the PIE option and larger architecture changes.

Yeah, about this: do we have a longer writeup about the actual benefits
of all this and why we should take this all? After all, after looking
at the first couple of asm patches, it is posing restrictions to how
we deal with virtual addresses in asm (only RIP-relative addressing in
64-bit mode, MOVs with 64-bit immediates, etc, for example) and I'm
willing to bet money that some future unrelated change will break PIE
sooner or later. And I'd like to have a better justification why we
should enforce those new "rules" unconditionally.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 00/11] x86: PIE support to extend KASLR randomization
  2019-08-06 15:43 ` [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Borislav Petkov
@ 2019-08-06 15:50   ` Peter Zijlstra
  2019-08-29 19:55     ` Thomas Garnier
  0 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2019-08-06 15:50 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Garnier, kernel-hardening, kristen, keescook, Herbert Xu,
	David S. Miller, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, Andy Lutomirski, Juergen Gross, Thomas Hellstrom, VMware,
	Inc.,
	Rafael J. Wysocki, Len Brown, Pavel Machek, Nadav Amit,
	Jann Horn, Feng Tang, Maran Wilson, Enrico Weigelt,
	Allison Randal, Alexios Zavras, linux-crypto, linux-kernel,
	virtualization, linux-pm

On Tue, Aug 06, 2019 at 05:43:47PM +0200, Borislav Petkov wrote:
> On Tue, Jul 30, 2019 at 12:12:44PM -0700, Thomas Garnier wrote:
> > These patches make some of the changes necessary to build the kernel as
> > Position Independent Executable (PIE) on x86_64. Another patchset will
> > add the PIE option and larger architecture changes.
> 
> Yeah, about this: do we have a longer writeup about the actual benefits
> of all this and why we should take this all? After all, after looking
> at the first couple of asm patches, it is posing restrictions to how
> we deal with virtual addresses in asm (only RIP-relative addressing in
> 64-bit mode, MOVs with 64-bit immediates, etc, for example) and I'm
> willing to bet money that some future unrelated change will break PIE
> sooner or later.

Possibly objtool can help here; it should be possible to teach it about
these rules, and then it will yell when violated. That should avoid
regressions.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 08/11] x86/boot/64: Adapt assembly for PIE support
  2019-07-30 19:12 ` [PATCH v9 08/11] x86/boot/64: " Thomas Garnier
@ 2019-08-09 17:30   ` Borislav Petkov
  2019-10-29 21:29     ` Thomas Garnier
  0 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2019-08-09 17:30 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: kernel-hardening, kristen, keescook, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, Juergen Gross, Peter Zijlstra,
	Boris Ostrovsky, Josh Poimboeuf, Maran Wilson, Feng Tang,
	linux-kernel

On Tue, Jul 30, 2019 at 12:12:52PM -0700, Thomas Garnier wrote:
> Change the assembly code to use only relative references to symbols for the
> kernel to be PIE compatible.
> 
> Early at boot, the kernel is mapped at a temporary address while preparing
> the page table. To know the changes needed for the page table with KASLR,

These manipulations need to be done regardless of whether KASLR is
enabled or not. You're basically accommodating them to PIE.

> the boot code calculate the difference between the expected address of the

calculates

> kernel and the one chosen by KASLR. It does not work with PIE because all
> symbols in code are relative. Instead of getting the future relocated
> virtual address, you will get the current temporary mapping.

Please avoid "you", "we" etc personal pronouns in commit messages.

> Instructions were changed to have absolute 64-bit references.

From Documentation/process/submitting-patches.rst:

 "Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
  instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
  to do frotz", as if you are giving orders to the codebase to change
  its behaviour."

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 10/11] x86/paravirt: Adapt assembly for PIE support
  2019-07-31 12:53   ` Peter Zijlstra
@ 2019-08-12 12:55     ` Borislav Petkov
  2019-10-29 21:30       ` Thomas Garnier
  0 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2019-08-12 12:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Garnier, kernel-hardening, kristen, keescook,
	Juergen Gross, Thomas Hellstrom, VMware, Inc.,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	virtualization, linux-kernel

On Wed, Jul 31, 2019 at 02:53:06PM +0200, Peter Zijlstra wrote:
> On Tue, Jul 30, 2019 at 12:12:54PM -0700, Thomas Garnier wrote:
> > If PIE is enabled, switch the paravirt assembly constraints to be
> > compatible. The %c/i constraints generate smaller code, so they are kept by
> > default.
> > 
> > Position Independent Executable (PIE) support will allow extending the
> > KASLR randomization range below 0xffffffff80000000.
> > 
> > Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
> > Acked-by: Juergen Gross <jgross@suse.com>
> > ---
> >  arch/x86/include/asm/paravirt_types.h | 25 +++++++++++++++++++++----
> >  1 file changed, 21 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
> > index 70b654f3ffe5..fd7dc37d0010 100644
> > --- a/arch/x86/include/asm/paravirt_types.h
> > +++ b/arch/x86/include/asm/paravirt_types.h
> > @@ -338,9 +338,25 @@ extern struct paravirt_patch_template pv_ops;
> >  #define PARAVIRT_PATCH(x)					\
> >  	(offsetof(struct paravirt_patch_template, x) / sizeof(void *))
> >  
> > +#ifdef CONFIG_X86_PIE
> > +#define paravirt_opptr_call "a"
> > +#define paravirt_opptr_type "p"
> > +
> > +/*
> > + * Alternative patching requires a maximum of 7 bytes but the relative call is
> > + * only 6 bytes. If PIE is enabled, add an additional nop to the call
> > + * instruction to ensure patching is possible.
> > + */
> > +#define PARAVIRT_CALL_POST  "nop;"
> 
> I'm confused; where does the 7 come from? The relative call is 6 bytes,

Well, before the change, the CALL is a CALL reg/mem64, i.e. the target
is mem64. For example:


ffffffff81025c45:       ff 14 25 68 37 02 82    callq  *0xffffffff82023768

That address there is practically pv_ops + offset.

Now, in the opcode bytes you have 0xff opcode, ModRM byte 0x14 and SIB
byte 0x25, and 4 bytes imm32 offset. And this is 7 bytes.

What it becomes is:

ffffffff81025cd0:       ff 15 fa d9 ff 00       callq  *0xffd9fa(%rip)        # ffffffff820236d0 <pv_ops+0x30>
ffffffff81025cd6:       90                      nop

which is a RIP-relative, i.e., opcode 0xff, ModRM byte 0x15 and imm32.
And this is 6 bytes.

And since the paravirt patching doesn't do NOP padding like the
alternatives patching does, you need to pad with a byte.
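
For concreteness, the two lengths can be checked from user space (a
sketch with dummy operands, not kernel code):

/* len_sketch.c - measures the two call encodings discussed above. */
#include <stdio.h>

void (*dummy)(void);

asm(
"	.text\n"
"	.globl abs_call_start\n"
"abs_call_start:\n"
"	call	*0x1234567\n"		/* ff 14 25 imm32 */
"	.globl abs_call_end\n"
"abs_call_end:\n"
"	.globl rel_call_start\n"
"rel_call_start:\n"
"	call	*dummy(%rip)\n"		/* ff 15 rel32 */
"	nop\n"
"	.globl rel_call_end\n"
"rel_call_end:\n"
);

extern const char abs_call_start[], abs_call_end[];
extern const char rel_call_start[], rel_call_end[];

int main(void)
{
	/* prints 7 and 7: the absolute form, and the 6-byte
	 * RIP-relative form padded back to 7 with a nop */
	printf("absolute:           %td bytes\n", abs_call_end - abs_call_start);
	printf("rip-relative + nop: %td bytes\n", rel_call_end - rel_call_start);
	return 0;
}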

Thomas, please add the gist of this to the comments because this
incomprehensible machinery had better be documented in as much detail as possible.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 11/11] x86/alternatives: Adapt assembly for PIE support
  2019-07-30 19:12 ` [PATCH v9 11/11] x86/alternatives: " Thomas Garnier
@ 2019-08-12 13:57   ` Borislav Petkov
  2019-10-29 21:31     ` Thomas Garnier
  0 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2019-08-12 13:57 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: kernel-hardening, kristen, keescook, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, Peter Zijlstra, Nadav Amit,
	linux-kernel

On Tue, Jul 30, 2019 at 12:12:55PM -0700, Thomas Garnier wrote:
> Change the assembly options to work with pointers instead of integers.

This commit message is too vague. A before/after example would make it a
lot more clear why the change is needed.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 00/11] x86: PIE support to extend KASLR randomization
  2019-08-06 15:50   ` Peter Zijlstra
@ 2019-08-29 19:55     ` Thomas Garnier
  2019-09-06 23:22       ` Thomas Garnier
  0 siblings, 1 reply; 34+ messages in thread
From: Thomas Garnier @ 2019-08-29 19:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Borislav Petkov, Kernel Hardening, Kristen Carlson Accardi,
	Kees Cook, Herbert Xu, David S. Miller, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers,
	Andy Lutomirski, Juergen Gross, Thomas Hellstrom, VMware, Inc.,
	Rafael J. Wysocki, Len Brown, Pavel Machek, Nadav Amit,
	Jann Horn, Feng Tang, Maran Wilson, Enrico Weigelt,
	Allison Randal, Alexios Zavras, Linux Crypto Mailing List, LKML,
	virtualization, Linux PM list

On Tue, Aug 6, 2019 at 8:51 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Aug 06, 2019 at 05:43:47PM +0200, Borislav Petkov wrote:
> > On Tue, Jul 30, 2019 at 12:12:44PM -0700, Thomas Garnier wrote:
> > > These patches make some of the changes necessary to build the kernel as
> > > Position Independent Executable (PIE) on x86_64. Another patchset will
> > > add the PIE option and larger architecture changes.
> >
> > Yeah, about this: do we have a longer writeup about the actual benefits
> > of all this and why we should take this all? After all, after looking
> > at the first couple of asm patches, it is posing restrictions to how
> > we deal with virtual addresses in asm (only RIP-relative addressing in
> > 64-bit mode, MOVs with 64-bit immediates, etc, for example) and I'm
> > willing to bet money that some future unrelated change will break PIE
> > sooner or later.

The goal is being able to extend the range of addresses where the
kernel can be placed with KASLR. I will look at clarifying that in the
future.

>
> Possibly objtool can help here; it should be possible to teach it about
> these rules, and then it will yell when violated. That should avoid
> regressions.
>

I will look into that as well.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 00/11] x86: PIE support to extend KASLR randomization
  2019-08-29 19:55     ` Thomas Garnier
@ 2019-09-06 23:22       ` Thomas Garnier
  0 siblings, 0 replies; 34+ messages in thread
From: Thomas Garnier @ 2019-09-06 23:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Borislav Petkov, Kernel Hardening, Kristen Carlson Accardi,
	Kees Cook, Herbert Xu, David S. Miller, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers,
	Andy Lutomirski, Juergen Gross, Thomas Hellstrom, VMware, Inc.,
	Rafael J. Wysocki, Len Brown, Pavel Machek, Nadav Amit,
	Jann Horn, Feng Tang, Maran Wilson, Enrico Weigelt,
	Allison Randal, Alexios Zavras, Linux Crypto Mailing List, LKML,
	virtualization, Linux PM list

On Thu, Aug 29, 2019 at 12:55 PM Thomas Garnier <thgarnie@chromium.org> wrote:
>
> On Tue, Aug 6, 2019 at 8:51 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Tue, Aug 06, 2019 at 05:43:47PM +0200, Borislav Petkov wrote:
> > > On Tue, Jul 30, 2019 at 12:12:44PM -0700, Thomas Garnier wrote:
> > > > These patches make some of the changes necessary to build the kernel as
> > > > Position Independent Executable (PIE) on x86_64. Another patchset will
> > > > add the PIE option and larger architecture changes.
> > >
> > > Yeah, about this: do we have a longer writeup about the actual benefits
> > > of all this and why we should take this all? After all, after looking
> > > at the first couple of asm patches, it is posing restrictions to how
> > > we deal with virtual addresses in asm (only RIP-relative addressing in
> > > 64-bit mode, MOVs with 64-bit immediates, etc, for example) and I'm
> > > willing to bet money that some future unrelated change will break PIE
> > > sooner or later.
>
> The goal is being able to extend the range of addresses where the
> kernel can be placed with KASLR. I will look at clarifying that in the
> future.
>
> >
> > Possibly objtool can help here; it should be possible to teach it about
> > these rules, and then it will yell when violated. That should avoid
> > regressions.
> >
>
> I will look into that as well.

Following a discussion with Kees, I will explore objtool in the
follow-up patchset, as we still have more elaborate PIE changes in the
second set. I like the idea overall and I think it would be great if
it works.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 08/11] x86/boot/64: Adapt assembly for PIE support
  2019-08-09 17:30   ` Borislav Petkov
@ 2019-10-29 21:29     ` Thomas Garnier
  0 siblings, 0 replies; 34+ messages in thread
From: Thomas Garnier @ 2019-10-29 21:29 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Kernel Hardening, Kristen Carlson Accardi, Kees Cook,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, Juergen Gross, Peter Zijlstra,
	Boris Ostrovsky, Josh Poimboeuf, Maran Wilson, Feng Tang, LKML

On Fri, Aug 9, 2019 at 10:29 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Jul 30, 2019 at 12:12:52PM -0700, Thomas Garnier wrote:
> > Change the assembly code to use only relative references to symbols for the
> > kernel to be PIE compatible.
> >
> > Early at boot, the kernel is mapped at a temporary address while preparing
> > the page table. To know the changes needed for the page table with KASLR,
>
> These manipulations need to be done regardless of whether KASLR is
> enabled or not. You're basically accommodating them to PIE.
>
> > the boot code calculate the difference between the expected address of the
>
> calculates
>
> > kernel and the one chosen by KASLR. It does not work with PIE because all
> > symbols in code are relative. Instead of getting the future relocated
> > virtual address, you will get the current temporary mapping.
>
> Please avoid "you", "we" etc personal pronouns in commit messages.
>
> > Instructions were changed to have absolute 64-bit references.
>
> From Documentation/process/submitting-patches.rst:
>
>  "Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
>   instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
>   to do frotz", as if you are giving orders to the codebase to change
>   its behaviour."

Sorry for the late reply, busy couple months.

I will integrate your feedback in v10. Thanks.

>
> Thx.
>
> --
> Regards/Gruss,
>     Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 10/11] x86/paravirt: Adapt assembly for PIE support
  2019-08-12 12:55     ` Borislav Petkov
@ 2019-10-29 21:30       ` Thomas Garnier
  0 siblings, 0 replies; 34+ messages in thread
From: Thomas Garnier @ 2019-10-29 21:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Peter Zijlstra, Kernel Hardening, Kristen Carlson Accardi,
	Kees Cook, Juergen Gross, Thomas Hellstrom, VMware, Inc.,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, virtualization, LKML

On Mon, Aug 12, 2019 at 5:54 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Jul 31, 2019 at 02:53:06PM +0200, Peter Zijlstra wrote:
> > On Tue, Jul 30, 2019 at 12:12:54PM -0700, Thomas Garnier wrote:
> > > If PIE is enabled, switch the paravirt assembly constraints to be
> > > compatible. The %c/i constraints generate smaller code, so they are kept by
> > > default.
> > >
> > > Position Independent Executable (PIE) support will allow extending the
> > > KASLR randomization range below 0xffffffff80000000.
> > >
> > > Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
> > > Acked-by: Juergen Gross <jgross@suse.com>
> > > ---
> > >  arch/x86/include/asm/paravirt_types.h | 25 +++++++++++++++++++++----
> > >  1 file changed, 21 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
> > > index 70b654f3ffe5..fd7dc37d0010 100644
> > > --- a/arch/x86/include/asm/paravirt_types.h
> > > +++ b/arch/x86/include/asm/paravirt_types.h
> > > @@ -338,9 +338,25 @@ extern struct paravirt_patch_template pv_ops;
> > >  #define PARAVIRT_PATCH(x)                                  \
> > >     (offsetof(struct paravirt_patch_template, x) / sizeof(void *))
> > >
> > > +#ifdef CONFIG_X86_PIE
> > > +#define paravirt_opptr_call "a"
> > > +#define paravirt_opptr_type "p"
> > > +
> > > +/*
> > > + * Alternative patching requires a maximum of 7 bytes but the relative call is
> > > + * only 6 bytes. If PIE is enabled, add an additional nop to the call
> > > + * instruction to ensure patching is possible.
> > > + */
> > > +#define PARAVIRT_CALL_POST  "nop;"
> >
> > I'm confused; where does the 7 come from? The relative call is 6 bytes,
>
> Well, before the change, the CALL is a CALL reg/mem64, i.e. the target
> is mem64. For example:
>
>
> ffffffff81025c45:       ff 14 25 68 37 02 82    callq  *0xffffffff82023768
>
> That address there is practically pv_ops + offset.
>
> Now, in the opcode bytes you have 0xff opcode, ModRM byte 0x14 and SIB
> byte 0x25, and 4 bytes imm32 offset. And this is 7 bytes.
>
> What it becomes is:
>
> ffffffff81025cd0:       ff 15 fa d9 ff 00       callq  *0xffd9fa(%rip)        # ffffffff820236d0 <pv_ops+0x30>
> ffffffff81025cd6:       90                      nop
>
> which is a RIP-relative, i.e., opcode 0xff, ModRM byte 0x15 and imm32.
> And this is 6 bytes.
>
> And since the paravirt patching doesn't do NOP padding like the
> alternatives patching does, you need to pad with a byte.
>
> Thomas, please add the gist of this to the comments because this
> incomprehensible machinery had better be documented in as much detail as possible.

Sorry for the late reply, busy couple months. Will add it.

>
> Thx.
>
> --
> Regards/Gruss,
>     Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v9 11/11] x86/alternatives: Adapt assembly for PIE support
  2019-08-12 13:57   ` Borislav Petkov
@ 2019-10-29 21:31     ` Thomas Garnier
  0 siblings, 0 replies; 34+ messages in thread
From: Thomas Garnier @ 2019-10-29 21:31 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Kernel Hardening, Kristen Carlson Accardi, Kees Cook,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, Peter Zijlstra, Nadav Amit, LKML

On Mon, Aug 12, 2019 at 6:56 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Jul 30, 2019 at 12:12:55PM -0700, Thomas Garnier wrote:
> > Change the assembly options to work with pointers instead of integers.
>
> This commit message is too vague. A before/after example would make it a
> lot more clear why the change is needed.

Sorry for the late reply, busy couple months.

I will do my best to explain it better in the next iteration.

>
> Thx.
>
> --
> Regards/Gruss,
>     Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, back to index

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-30 19:12 [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Thomas Garnier
2019-07-30 19:12 ` [PATCH v9 01/11] x86/crypto: Adapt assembly for PIE support Thomas Garnier
2019-08-05 16:32   ` Borislav Petkov
2019-08-05 16:54     ` Kees Cook
2019-08-05 17:27       ` Borislav Petkov
2019-08-05 17:53         ` Thomas Garnier
2019-07-30 19:12 ` [PATCH v9 02/11] x86: Add macro to get symbol address " Thomas Garnier
2019-07-30 19:12 ` [PATCH v9 03/11] x86: relocate_kernel - Adapt assembly " Thomas Garnier
2019-07-30 19:12 ` [PATCH v9 04/11] x86/entry/64: " Thomas Garnier
2019-08-05 17:28   ` Borislav Petkov
2019-08-05 17:50     ` Thomas Garnier
2019-08-06  5:08       ` Borislav Petkov
2019-08-06  8:30         ` Peter Zijlstra
2019-08-06 12:35           ` Borislav Petkov
2019-08-06 13:59     ` Steven Rostedt
2019-08-06 15:35       ` Borislav Petkov
2019-07-30 19:12 ` [PATCH v9 05/11] x86: pm-trace - " Thomas Garnier
2019-07-30 19:12 ` [PATCH v9 06/11] x86/CPU: " Thomas Garnier
2019-07-30 19:12 ` [PATCH v9 07/11] x86/acpi: " Thomas Garnier
2019-07-30 19:12 ` [PATCH v9 08/11] x86/boot/64: " Thomas Garnier
2019-08-09 17:30   ` Borislav Petkov
2019-10-29 21:29     ` Thomas Garnier
2019-07-30 19:12 ` [PATCH v9 09/11] x86/power/64: " Thomas Garnier
2019-07-30 19:12 ` [PATCH v9 10/11] x86/paravirt: " Thomas Garnier
2019-07-31 12:53   ` Peter Zijlstra
2019-08-12 12:55     ` Borislav Petkov
2019-10-29 21:30       ` Thomas Garnier
2019-07-30 19:12 ` [PATCH v9 11/11] x86/alternatives: " Thomas Garnier
2019-08-12 13:57   ` Borislav Petkov
2019-10-29 21:31     ` Thomas Garnier
2019-08-06 15:43 ` [PATCH v9 00/11] x86: PIE support to extend KASLR randomization Borislav Petkov
2019-08-06 15:50   ` Peter Zijlstra
2019-08-29 19:55     ` Thomas Garnier
2019-09-06 23:22       ` Thomas Garnier

Kernel-hardening archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kernel-hardening/0 kernel-hardening/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kernel-hardening kernel-hardening/ https://lore.kernel.org/kernel-hardening \
		kernel-hardening@lists.openwall.com
	public-inbox-index kernel-hardening

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/com.openwall.lists.kernel-hardening


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git