All of lore.kernel.org
 help / color / mirror / Atom feed
From: tip-bot for Thomas Garnier <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: boris.ostrovsky@oracle.com, fweisbec@gmail.com,
	vkuznets@redhat.com, corbet@lwn.net, aryabinin@virtuozzo.com,
	matt@codeblueprint.co.uk, peterz@infradead.org,
	lstoakes@gmail.com, len.brown@intel.com, keescook@chromium.org,
	sgruszka@redhat.com, rusty@rustcorp.com.au, tglx@linutronix.de,
	zijun_hu@htc.com, pbonzini@redhat.com, rjw@rjwysocki.net,
	mcgrof@kernel.org, ard.biesheuvel@linaro.org, dvyukov@google.com,
	joro@8bytes.org, borntraeger@de.ibm.com, jgross@suse.com,
	thgarnie@google.com, rkrcmar@redhat.com, jikos@kernel.org,
	akpm@linux-foundation.org, tim.c.chen@linux.intel.com,
	hpa@zytor.com, jpoimboe@redhat.com, linux-kernel@vger.kernel.org,
	mhocko@suse.com, chris@chris-wilson.co.uk, glider@google.com,
	pavel@ucw.cz, paul.gortmaker@windriver.com,
	torvalds@linux-foundation.org, mingo@kernel.org, luto@kernel.org,
	bp@suse.de
Subject: [tip:x86/mm] x86: Make the GDT remapping read-only on 64-bit
Date: Thu, 16 Mar 2017 04:11:30 -0700	[thread overview]
Message-ID: <tip-45fc8757d1d2128e342b4e7ef39adedf7752faac@git.kernel.org> (raw)
In-Reply-To: <20170314170508.100882-3-thgarnie@google.com>

Commit-ID:  45fc8757d1d2128e342b4e7ef39adedf7752faac
Gitweb:     http://git.kernel.org/tip/45fc8757d1d2128e342b4e7ef39adedf7752faac
Author:     Thomas Garnier <thgarnie@google.com>
AuthorDate: Tue, 14 Mar 2017 10:05:08 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 16 Mar 2017 09:06:35 +0100

x86: Make the GDT remapping read-only on 64-bit

This patch makes the GDT remapped pages read-only, to prevent accidental
(or intentional) corruption of this key data structure.

This change is done only on 64-bit, because 32-bit needs it to be writable
for TSS switches.

The native_load_tr_desc function was adapted to correctly handle a
read-only GDT. The LTR instruction always writes to the GDT TSS entry.
This generates a page fault if the GDT is read-only. This change checks
if the current GDT is a remap and swap GDTs as needed. This function was
tested by booting multiple machines and checking hibernation works
properly.

KVM SVM and VMX were adapted to use the writeable GDT. On VMX, the
per-cpu variable was removed for functions to fetch the original GDT.
Instead of reloading the previous GDT, VMX will reload the fixmap GDT as
expected. For testing, VMs were started and restored on multiple
configurations.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Luis R . Rodriguez <mcgrof@kernel.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rafael J . Wysocki <rjw@rjwysocki.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: kasan-dev@googlegroups.com
Cc: kernel-hardening@lists.openwall.com
Cc: kvm@vger.kernel.org
Cc: lguest@lists.ozlabs.org
Cc: linux-doc@vger.kernel.org
Cc: linux-efi@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-pm@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Cc: zijun_hu <zijun_hu@htc.com>
Link: http://lkml.kernel.org/r/20170314170508.100882-3-thgarnie@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/desc.h      | 106 +++++++++++++++++++++++++--------------
 arch/x86/include/asm/processor.h |   1 +
 arch/x86/kernel/cpu/common.c     |  28 ++++++++---
 arch/x86/kvm/svm.c               |   4 +-
 arch/x86/kvm/vmx.c               |  12 ++---
 5 files changed, 96 insertions(+), 55 deletions(-)

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index 4b5ef0c..ec05f9c 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -248,9 +248,77 @@ static inline void native_set_ldt(const void *addr, unsigned int entries)
 	}
 }
 
+static inline void native_load_gdt(const struct desc_ptr *dtr)
+{
+	asm volatile("lgdt %0"::"m" (*dtr));
+}
+
+static inline void native_load_idt(const struct desc_ptr *dtr)
+{
+	asm volatile("lidt %0"::"m" (*dtr));
+}
+
+static inline void native_store_gdt(struct desc_ptr *dtr)
+{
+	asm volatile("sgdt %0":"=m" (*dtr));
+}
+
+static inline void native_store_idt(struct desc_ptr *dtr)
+{
+	asm volatile("sidt %0":"=m" (*dtr));
+}
+
+/*
+ * The LTR instruction marks the TSS GDT entry as busy. On 64-bit, the GDT is
+ * a read-only remapping. To prevent a page fault, the GDT is switched to the
+ * original writeable version when needed.
+ */
+#ifdef CONFIG_X86_64
 static inline void native_load_tr_desc(void)
 {
+	struct desc_ptr gdt;
+	int cpu = raw_smp_processor_id();
+	bool restore = 0;
+	struct desc_struct *fixmap_gdt;
+
+	native_store_gdt(&gdt);
+	fixmap_gdt = get_cpu_gdt_ro(cpu);
+
+	/*
+	 * If the current GDT is the read-only fixmap, swap to the original
+	 * writeable version. Swap back at the end.
+	 */
+	if (gdt.address == (unsigned long)fixmap_gdt) {
+		load_direct_gdt(cpu);
+		restore = 1;
+	}
 	asm volatile("ltr %w0"::"q" (GDT_ENTRY_TSS*8));
+	if (restore)
+		load_fixmap_gdt(cpu);
+}
+#else
+static inline void native_load_tr_desc(void)
+{
+	asm volatile("ltr %w0"::"q" (GDT_ENTRY_TSS*8));
+}
+#endif
+
+static inline unsigned long native_store_tr(void)
+{
+	unsigned long tr;
+
+	asm volatile("str %0":"=r" (tr));
+
+	return tr;
+}
+
+static inline void native_load_tls(struct thread_struct *t, unsigned int cpu)
+{
+	struct desc_struct *gdt = get_cpu_gdt_rw(cpu);
+	unsigned int i;
+
+	for (i = 0; i < GDT_ENTRY_TLS_ENTRIES; i++)
+		gdt[GDT_ENTRY_TLS_MIN + i] = t->tls_array[i];
 }
 
 DECLARE_PER_CPU(bool, __tss_limit_invalid);
@@ -305,44 +373,6 @@ static inline void invalidate_tss_limit(void)
 		this_cpu_write(__tss_limit_invalid, true);
 }
 
-static inline void native_load_gdt(const struct desc_ptr *dtr)
-{
-	asm volatile("lgdt %0"::"m" (*dtr));
-}
-
-static inline void native_load_idt(const struct desc_ptr *dtr)
-{
-	asm volatile("lidt %0"::"m" (*dtr));
-}
-
-static inline void native_store_gdt(struct desc_ptr *dtr)
-{
-	asm volatile("sgdt %0":"=m" (*dtr));
-}
-
-static inline void native_store_idt(struct desc_ptr *dtr)
-{
-	asm volatile("sidt %0":"=m" (*dtr));
-}
-
-static inline unsigned long native_store_tr(void)
-{
-	unsigned long tr;
-
-	asm volatile("str %0":"=r" (tr));
-
-	return tr;
-}
-
-static inline void native_load_tls(struct thread_struct *t, unsigned int cpu)
-{
-	struct desc_struct *gdt = get_cpu_gdt_rw(cpu);
-	unsigned int i;
-
-	for (i = 0; i < GDT_ENTRY_TLS_ENTRIES; i++)
-		gdt[GDT_ENTRY_TLS_MIN + i] = t->tls_array[i];
-}
-
 /* This intentionally ignores lm, since 32-bit apps don't have that field. */
 #define LDT_empty(info)					\
 	((info)->base_addr		== 0	&&	\
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 1150e1b..edf42c4 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -716,6 +716,7 @@ extern struct desc_ptr		early_gdt_descr;
 
 extern void cpu_set_gdt(int);
 extern void switch_to_new_gdt(int);
+extern void load_direct_gdt(int);
 extern void load_fixmap_gdt(int);
 extern void load_percpu_segment(int);
 extern void cpu_init(void);
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 3cf1590..f8e22db 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -448,8 +448,15 @@ void load_percpu_segment(int cpu)
 	load_stack_canary_segment();
 }
 
-/* Used by XEN to force the GDT read-only when required */
+/*
+ * On 64-bit the GDT remapping is read-only.
+ * A global is used for Xen to change the default when required.
+ */
+#ifdef CONFIG_X86_64
+pgprot_t pg_fixmap_gdt_flags = PAGE_KERNEL_RO;
+#else
 pgprot_t pg_fixmap_gdt_flags = PAGE_KERNEL;
+#endif
 
 /* Setup the fixmap mapping only once per-processor */
 static inline void setup_fixmap_gdt(int cpu)
@@ -458,6 +465,17 @@ static inline void setup_fixmap_gdt(int cpu)
 		     __pa(get_cpu_gdt_rw(cpu)), pg_fixmap_gdt_flags);
 }
 
+/* Load the original GDT from the per-cpu structure */
+void load_direct_gdt(int cpu)
+{
+	struct desc_ptr gdt_descr;
+
+	gdt_descr.address = (long)get_cpu_gdt_rw(cpu);
+	gdt_descr.size = GDT_SIZE - 1;
+	load_gdt(&gdt_descr);
+}
+EXPORT_SYMBOL_GPL(load_direct_gdt);
+
 /* Load a fixmap remapping of the per-cpu GDT */
 void load_fixmap_gdt(int cpu)
 {
@@ -467,6 +485,7 @@ void load_fixmap_gdt(int cpu)
 	gdt_descr.size = GDT_SIZE - 1;
 	load_gdt(&gdt_descr);
 }
+EXPORT_SYMBOL_GPL(load_fixmap_gdt);
 
 /*
  * Current gdt points %fs at the "master" per-cpu area: after this,
@@ -474,11 +493,8 @@ void load_fixmap_gdt(int cpu)
  */
 void switch_to_new_gdt(int cpu)
 {
-	struct desc_ptr gdt_descr;
-
-	gdt_descr.address = (long)get_cpu_gdt_rw(cpu);
-	gdt_descr.size = GDT_SIZE - 1;
-	load_gdt(&gdt_descr);
+	/* Load the original GDT */
+	load_direct_gdt(cpu);
 	/* Reload the per-cpu base */
 	load_percpu_segment(cpu);
 }
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d1efe2c..c02b9af 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -741,7 +741,6 @@ static int svm_hardware_enable(void)
 
 	struct svm_cpu_data *sd;
 	uint64_t efer;
-	struct desc_ptr gdt_descr;
 	struct desc_struct *gdt;
 	int me = raw_smp_processor_id();
 
@@ -763,8 +762,7 @@ static int svm_hardware_enable(void)
 	sd->max_asid = cpuid_ebx(SVM_CPUID_FUNC) - 1;
 	sd->next_asid = sd->max_asid + 1;
 
-	native_store_gdt(&gdt_descr);
-	gdt = (struct desc_struct *)gdt_descr.address;
+	gdt = get_current_gdt_rw();
 	sd->tss_desc = (struct kvm_ldttss_desc *)(gdt + GDT_ENTRY_TSS);
 
 	wrmsrl(MSR_EFER, efer | EFER_SVME);
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 98e82ee..596a76d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -935,7 +935,6 @@ static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
  * when a CPU is brought down, and we need to VMCLEAR all VMCSs loaded on it.
  */
 static DEFINE_PER_CPU(struct list_head, loaded_vmcss_on_cpu);
-static DEFINE_PER_CPU(struct desc_ptr, host_gdt);
 
 /*
  * We maintian a per-CPU linked-list of vCPU, so in wakeup_handler() we
@@ -2052,14 +2051,13 @@ static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
  */
 static unsigned long segment_base(u16 selector)
 {
-	struct desc_ptr *gdt = this_cpu_ptr(&host_gdt);
 	struct desc_struct *table;
 	unsigned long v;
 
 	if (!(selector & ~SEGMENT_RPL_MASK))
 		return 0;
 
-	table = (struct desc_struct *)gdt->address;
+	table = get_current_gdt_ro();
 
 	if ((selector & SEGMENT_TI_MASK) == SEGMENT_LDT) {
 		u16 ldt_selector = kvm_read_ldt();
@@ -2164,7 +2162,7 @@ static void __vmx_load_host_state(struct vcpu_vmx *vmx)
 #endif
 	if (vmx->host_state.msr_host_bndcfgs)
 		wrmsrl(MSR_IA32_BNDCFGS, vmx->host_state.msr_host_bndcfgs);
-	load_gdt(this_cpu_ptr(&host_gdt));
+	load_fixmap_gdt(raw_smp_processor_id());
 }
 
 static void vmx_load_host_state(struct vcpu_vmx *vmx)
@@ -2266,7 +2264,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	}
 
 	if (!already_loaded) {
-		struct desc_ptr *gdt = this_cpu_ptr(&host_gdt);
+		unsigned long gdt = get_current_gdt_ro_vaddr();
 		unsigned long sysenter_esp;
 
 		kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
@@ -2277,7 +2275,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 		 */
 		vmcs_writel(HOST_TR_BASE,
 			    (unsigned long)this_cpu_ptr(&cpu_tss));
-		vmcs_writel(HOST_GDTR_BASE, gdt->address);
+		vmcs_writel(HOST_GDTR_BASE, gdt);   /* 22.2.4 */
 
 		/*
 		 * VM exits change the host TR limit to 0x67 after a VM
@@ -3465,8 +3463,6 @@ static int hardware_enable(void)
 		ept_sync_global();
 	}
 
-	native_store_gdt(this_cpu_ptr(&host_gdt));
-
 	return 0;
 }
 

  parent reply	other threads:[~2017-03-16 11:15 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-14 17:05 [PATCH v7 1/3] x86/mm: Adapt MODULES_END based on Fixmap section size Thomas Garnier
2017-03-14 17:05 ` [kernel-hardening] " Thomas Garnier
2017-03-14 17:05 ` Thomas Garnier
2017-03-14 17:05 ` Thomas Garnier
2017-03-14 17:05 ` Thomas Garnier
2017-03-14 17:05 ` [PATCH v7 2/3] x86: Remap GDT tables in the Fixmap section Thomas Garnier
2017-03-14 17:05   ` [kernel-hardening] " Thomas Garnier
2017-03-14 17:05   ` Thomas Garnier
2017-03-14 17:05   ` Thomas Garnier
2017-03-14 17:05   ` Thomas Garnier
2017-03-16 11:10   ` [tip:x86/mm] x86: Remap GDT tables in the fixmap section tip-bot for Thomas Garnier
2017-03-14 17:05 ` [PATCH v7 2/3] x86: Remap GDT tables in the Fixmap section Thomas Garnier
2017-03-14 17:05 ` [PATCH v7 3/3] x86: Make the GDT remapping read-only on 64-bit Thomas Garnier
2017-03-14 17:05   ` [kernel-hardening] " Thomas Garnier
2017-03-14 17:05   ` Thomas Garnier
2017-03-14 17:05   ` Thomas Garnier
2017-03-14 17:05   ` Thomas Garnier
2017-03-14 21:04   ` Pavel Machek
2017-03-14 21:04     ` [kernel-hardening] " Pavel Machek
2017-03-14 21:04     ` Pavel Machek
2017-03-14 21:04     ` Pavel Machek
2017-03-14 21:20     ` Thomas Garnier
2017-03-14 21:20       ` [kernel-hardening] " Thomas Garnier
2017-03-14 21:20       ` Thomas Garnier
2017-03-14 21:20       ` Thomas Garnier
2017-03-14 22:43       ` H. Peter Anvin
2017-03-14 22:43       ` H. Peter Anvin
2017-03-14 22:43         ` H. Peter Anvin
2017-03-14 21:20     ` Thomas Garnier
2017-03-14 21:04   ` Pavel Machek
2017-03-16 11:11   ` tip-bot for Thomas Garnier [this message]
2017-03-14 17:05 ` Thomas Garnier
2017-03-15 13:52 ` [PATCH v7 1/3] x86/mm: Adapt MODULES_END based on Fixmap section size Boris Ostrovsky
2017-03-15 13:52   ` [kernel-hardening] " Boris Ostrovsky
2017-03-15 13:52   ` Boris Ostrovsky
2017-03-15 13:52   ` Boris Ostrovsky
2017-03-15 13:52   ` Boris Ostrovsky
2017-03-15 13:52 ` Boris Ostrovsky
2017-03-16  8:10 ` Ingo Molnar
2017-03-16  8:10 ` Ingo Molnar
2017-03-16  8:10   ` [kernel-hardening] " Ingo Molnar
2017-03-16  8:10   ` Ingo Molnar
2017-03-16  8:10   ` Ingo Molnar
2017-03-16  8:10   ` Ingo Molnar
2017-03-16 15:33   ` Thomas Garnier
2017-03-16 15:33     ` [kernel-hardening] " Thomas Garnier
2017-03-16 15:33     ` Thomas Garnier
2017-03-16 15:33     ` Thomas Garnier
2017-03-17  7:34     ` Ingo Molnar
2017-03-17  7:34       ` [kernel-hardening] " Ingo Molnar
2017-03-17  7:34       ` Ingo Molnar
2017-03-17  7:34       ` Ingo Molnar
2017-03-17  7:34     ` Ingo Molnar
2017-03-16 15:33   ` Thomas Garnier
2017-03-16 11:10 ` [tip:x86/mm] x86/mm: Adapt MODULES_END based on fixmap " tip-bot for Thomas Garnier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=tip-45fc8757d1d2128e342b4e7ef39adedf7752faac@git.kernel.org \
    --to=tipbot@zytor.com \
    --cc=akpm@linux-foundation.org \
    --cc=ard.biesheuvel@linaro.org \
    --cc=aryabinin@virtuozzo.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=borntraeger@de.ibm.com \
    --cc=bp@suse.de \
    --cc=chris@chris-wilson.co.uk \
    --cc=corbet@lwn.net \
    --cc=dvyukov@google.com \
    --cc=fweisbec@gmail.com \
    --cc=glider@google.com \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=jikos@kernel.org \
    --cc=joro@8bytes.org \
    --cc=jpoimboe@redhat.com \
    --cc=keescook@chromium.org \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=lstoakes@gmail.com \
    --cc=luto@kernel.org \
    --cc=matt@codeblueprint.co.uk \
    --cc=mcgrof@kernel.org \
    --cc=mhocko@suse.com \
    --cc=mingo@kernel.org \
    --cc=paul.gortmaker@windriver.com \
    --cc=pavel@ucw.cz \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rjw@rjwysocki.net \
    --cc=rkrcmar@redhat.com \
    --cc=rusty@rustcorp.com.au \
    --cc=sgruszka@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=thgarnie@google.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=vkuznets@redhat.com \
    --cc=zijun_hu@htc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.