linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
@ 2015-09-16 23:33 Andy Lutomirski
  2015-09-16 23:33 ` [PATCH 1/3] x86/paravirt: Add _safe to the read_msr and write_msr PV hooks Andy Lutomirski
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: Andy Lutomirski @ 2015-09-16 23:33 UTC (permalink / raw)
  To: x86
  Cc: Paolo Bonzini, Peter Zijlstra, KVM list, Arjan van de Ven,
	xen-devel, linux-kernel, Andy Lutomirski

Setting CONFIG_PARAVIRT=y has an unintended side effect: it silently
turns all rdmsr and wrmsr operations into the safe variants without
any checks that the operations actually succeed.

This is IMO awful: it papers over bugs.  In particular, KVM gueests
might be unwittingly depending on this behavior because
CONFIG_KVM_GUEST currently depends on CONFIG_PARAVIRT.  I'm not
aware of any such problems, but applying this series would be a good
way to shake them out.

Fix it so that the MSR operations work the same on CONFIG_PARAVIRT=n
and CONFIG_PARAVIRT=y as long as Xen isn't being used.  The Xen
maintainers are welcome to make a similar change on top of this.

Since there's plenty of time before the next merge window, I think
we should apply and fix anything that breaks.

Doing this is probably a prerequisite to sanely decoupling
CONFIG_KVM_GUEST and CONFIG_PARAVIRT, which would probably make
Arjan and the rest of the Clear Containers people happy :)

Andy Lutomirski (3):
  x86/paravirt: Add _safe to the read_msr and write_msr PV hooks
  x86/paravirt: Add paravirt_{read,write}_msr
  x86/paravirt: Make "unsafe" MSR accesses unsafe even if PARAVIRT=y

 arch/x86/include/asm/paravirt.h       | 45 +++++++++++++++++++++--------------
 arch/x86/include/asm/paravirt_types.h | 12 +++++++---
 arch/x86/kernel/paravirt.c            |  6 +++--
 arch/x86/xen/enlighten.c              | 27 +++++++++++++++++++--
 4 files changed, 65 insertions(+), 25 deletions(-)

-- 
2.4.3


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/3] x86/paravirt: Add _safe to the read_msr and write_msr PV hooks
  2015-09-16 23:33 [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops Andy Lutomirski
@ 2015-09-16 23:33 ` Andy Lutomirski
  2015-09-16 23:33 ` [PATCH 2/3] x86/paravirt: Add paravirt_{read,write}_msr Andy Lutomirski
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Andy Lutomirski @ 2015-09-16 23:33 UTC (permalink / raw)
  To: x86
  Cc: Paolo Bonzini, Peter Zijlstra, KVM list, Arjan van de Ven,
	xen-devel, linux-kernel, Andy Lutomirski

These hooks match the _safe variants, so name them accordingly.
This will make room for unsafe PV hooks.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/paravirt.h       | 33 +++++++++++++++++----------------
 arch/x86/include/asm/paravirt_types.h |  8 ++++----
 arch/x86/kernel/paravirt.c            |  4 ++--
 arch/x86/xen/enlighten.c              |  4 ++--
 4 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 10d0596433f8..9cf6e5232b0d 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -123,34 +123,35 @@ static inline void wbinvd(void)
 
 #define get_kernel_rpl()  (pv_info.kernel_rpl)
 
-static inline u64 paravirt_read_msr(unsigned msr, int *err)
+static inline u64 paravirt_read_msr_safe(unsigned msr, int *err)
 {
-	return PVOP_CALL2(u64, pv_cpu_ops.read_msr, msr, err);
+	return PVOP_CALL2(u64, pv_cpu_ops.read_msr_safe, msr, err);
 }
 
-static inline int paravirt_write_msr(unsigned msr, unsigned low, unsigned high)
+static inline int paravirt_write_msr_safe(unsigned msr,
+					  unsigned low, unsigned high)
 {
-	return PVOP_CALL3(int, pv_cpu_ops.write_msr, msr, low, high);
+	return PVOP_CALL3(int, pv_cpu_ops.write_msr_safe, msr, low, high);
 }
 
 /* These should all do BUG_ON(_err), but our headers are too tangled. */
 #define rdmsr(msr, val1, val2)			\
 do {						\
 	int _err;				\
-	u64 _l = paravirt_read_msr(msr, &_err);	\
+	u64 _l = paravirt_read_msr_safe(msr, &_err);	\
 	val1 = (u32)_l;				\
 	val2 = _l >> 32;			\
 } while (0)
 
 #define wrmsr(msr, val1, val2)			\
 do {						\
-	paravirt_write_msr(msr, val1, val2);	\
+	paravirt_write_msr_safe(msr, val1, val2);	\
 } while (0)
 
 #define rdmsrl(msr, val)			\
 do {						\
 	int _err;				\
-	val = paravirt_read_msr(msr, &_err);	\
+	val = paravirt_read_msr_safe(msr, &_err);	\
 } while (0)
 
 static inline void wrmsrl(unsigned msr, u64 val)
@@ -158,23 +159,23 @@ static inline void wrmsrl(unsigned msr, u64 val)
 	wrmsr(msr, (u32)val, (u32)(val>>32));
 }
 
-#define wrmsr_safe(msr, a, b)	paravirt_write_msr(msr, a, b)
+#define wrmsr_safe(msr, a, b)	paravirt_write_msr_safe(msr, a, b)
 
 /* rdmsr with exception handling */
-#define rdmsr_safe(msr, a, b)			\
-({						\
-	int _err;				\
-	u64 _l = paravirt_read_msr(msr, &_err);	\
-	(*a) = (u32)_l;				\
-	(*b) = _l >> 32;			\
-	_err;					\
+#define rdmsr_safe(msr, a, b)				\
+({							\
+	int _err;					\
+	u64 _l = paravirt_read_msr_safe(msr, &_err);	\
+	(*a) = (u32)_l;					\
+	(*b) = _l >> 32;				\
+	_err;						\
 })
 
 static inline int rdmsrl_safe(unsigned msr, unsigned long long *p)
 {
 	int err;
 
-	*p = paravirt_read_msr(msr, &err);
+	*p = paravirt_read_msr_safe(msr, &err);
 	return err;
 }
 
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index ce029e4fa7c6..810e8ccaa42a 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -151,10 +151,10 @@ struct pv_cpu_ops {
 	void (*cpuid)(unsigned int *eax, unsigned int *ebx,
 		      unsigned int *ecx, unsigned int *edx);
 
-	/* MSR, PMC and TSR operations.
-	   err = 0/-EFAULT.  wrmsr returns 0/-EFAULT. */
-	u64 (*read_msr)(unsigned int msr, int *err);
-	int (*write_msr)(unsigned int msr, unsigned low, unsigned high);
+	/* MSR operations.
+	   err = 0/-EIO.  wrmsr returns 0/-EIO. */
+	u64 (*read_msr_safe)(unsigned int msr, int *err);
+	int (*write_msr_safe)(unsigned int msr, unsigned low, unsigned high);
 
 	u64 (*read_pmc)(int counter);
 
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index f68e48f5f6c2..fe8cd519a796 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -349,8 +349,8 @@ __visible struct pv_cpu_ops pv_cpu_ops = {
 	.write_cr8 = native_write_cr8,
 #endif
 	.wbinvd = native_wbinvd,
-	.read_msr = native_read_msr_safe,
-	.write_msr = native_write_msr_safe,
+	.read_msr_safe = native_read_msr_safe,
+	.write_msr_safe = native_write_msr_safe,
 	.read_pmc = native_read_pmc,
 	.load_tr_desc = native_load_tr_desc,
 	.set_ldt = native_set_ldt,
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 30d12afe52ed..8523d42d163e 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1216,8 +1216,8 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
 
 	.wbinvd = native_wbinvd,
 
-	.read_msr = xen_read_msr_safe,
-	.write_msr = xen_write_msr_safe,
+	.read_msr_safe = xen_read_msr_safe,
+	.write_msr_safe = xen_write_msr_safe,
 
 	.read_pmc = xen_read_pmc,
 
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 2/3] x86/paravirt: Add paravirt_{read,write}_msr
  2015-09-16 23:33 [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops Andy Lutomirski
  2015-09-16 23:33 ` [PATCH 1/3] x86/paravirt: Add _safe to the read_msr and write_msr PV hooks Andy Lutomirski
@ 2015-09-16 23:33 ` Andy Lutomirski
  2015-09-16 23:33 ` [PATCH 3/3] x86/paravirt: Make "unsafe" MSR accesses unsafe even if PARAVIRT=y Andy Lutomirski
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Andy Lutomirski @ 2015-09-16 23:33 UTC (permalink / raw)
  To: x86
  Cc: Paolo Bonzini, Peter Zijlstra, KVM list, Arjan van de Ven,
	xen-devel, linux-kernel, Andy Lutomirski

This adds paravirt hooks for unsafe MSR access.  On native, they
call native_{read,write}_msr.  On Xen, they use
xen_{read,write}_msr_safe.

Nothing uses them yet for ease of bisection.  The next patch will
use them in rdmsrl, wrmsrl, etc.

I intentionally didn't make them OOPS on #GP on Xen.  I think that
should be done separately by the Xen maintainers.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/paravirt.h       | 11 +++++++++++
 arch/x86/include/asm/paravirt_types.h | 10 ++++++++--
 arch/x86/kernel/paravirt.c            |  2 ++
 arch/x86/xen/enlighten.c              | 23 +++++++++++++++++++++++
 4 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 9cf6e5232b0d..e6569a3b0a37 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -123,6 +123,17 @@ static inline void wbinvd(void)
 
 #define get_kernel_rpl()  (pv_info.kernel_rpl)
 
+static inline u64 paravirt_read_msr(unsigned msr)
+{
+	return PVOP_CALL1(u64, pv_cpu_ops.read_msr, msr);
+}
+
+static inline void paravirt_write_msr(unsigned msr,
+				      unsigned low, unsigned high)
+{
+	return PVOP_VCALL3(pv_cpu_ops.write_msr, msr, low, high);
+}
+
 static inline u64 paravirt_read_msr_safe(unsigned msr, int *err)
 {
 	return PVOP_CALL2(u64, pv_cpu_ops.read_msr_safe, msr, err);
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 810e8ccaa42a..cb1e7af9fe42 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -151,8 +151,14 @@ struct pv_cpu_ops {
 	void (*cpuid)(unsigned int *eax, unsigned int *ebx,
 		      unsigned int *ecx, unsigned int *edx);
 
-	/* MSR operations.
-	   err = 0/-EIO.  wrmsr returns 0/-EIO. */
+	/* Unsafe MSR operations.  These either succeed or crash. */
+	u64 (*read_msr)(unsigned int msr);
+	int (*write_msr)(unsigned int msr, unsigned low, unsigned high);
+
+	/*
+	 * Safe MSR operations.
+	 * err = 0/-EIO.  wrmsr returns 0/-EIO.
+	 */
 	u64 (*read_msr_safe)(unsigned int msr, int *err);
 	int (*write_msr_safe)(unsigned int msr, unsigned low, unsigned high);
 
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index fe8cd519a796..21fc4686f760 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -349,6 +349,8 @@ __visible struct pv_cpu_ops pv_cpu_ops = {
 	.write_cr8 = native_write_cr8,
 #endif
 	.wbinvd = native_wbinvd,
+	.read_msr = native_read_msr,
+	.write_msr = native_write_msr,
 	.read_msr_safe = native_read_msr_safe,
 	.write_msr_safe = native_write_msr_safe,
 	.read_pmc = native_read_pmc,
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 8523d42d163e..d2bc6afeaf33 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1086,6 +1086,26 @@ static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
 	return ret;
 }
 
+static u64 xen_read_msr(unsigned int msr)
+{
+	/*
+	 * This will silently swallow a #GP from RDMSR.  It may be worth
+	 * changing that.
+	 */
+	int err;
+
+	return xen_read_msr_safe(msr, &err);
+}
+
+static void xen_write_msr(unsigned int msr, unsigned low, unsigned high)
+{
+	/*
+	 * This will silently swallow a #GP from WRMSR.  It may be worth
+	 * changing that.
+	 */
+	xen_write_msr_safe(msr, low, high);
+}
+
 void xen_setup_shared_info(void)
 {
 	if (!xen_feature(XENFEAT_auto_translated_physmap)) {
@@ -1216,6 +1236,9 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
 
 	.wbinvd = native_wbinvd,
 
+	.read_msr = xen_read_msr,
+	.write_msr = xen_write_msr,
+
 	.read_msr_safe = xen_read_msr_safe,
 	.write_msr_safe = xen_write_msr_safe,
 
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 3/3] x86/paravirt: Make "unsafe" MSR accesses unsafe even if PARAVIRT=y
  2015-09-16 23:33 [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops Andy Lutomirski
  2015-09-16 23:33 ` [PATCH 1/3] x86/paravirt: Add _safe to the read_msr and write_msr PV hooks Andy Lutomirski
  2015-09-16 23:33 ` [PATCH 2/3] x86/paravirt: Add paravirt_{read,write}_msr Andy Lutomirski
@ 2015-09-16 23:33 ` Andy Lutomirski
  2015-09-17  7:19 ` [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops Ingo Molnar
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Andy Lutomirski @ 2015-09-16 23:33 UTC (permalink / raw)
  To: x86
  Cc: Paolo Bonzini, Peter Zijlstra, KVM list, Arjan van de Ven,
	xen-devel, linux-kernel, Andy Lutomirski

Enabling CONFIG_PARAVIRT had an unintended side effect: rdmsr turned
into rdmsr_safe and wrmsr turned into wrmsr_safe, even on bare
metal.  Undo that by using the new unsafe paravirt MSR hooks.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/paravirt.h | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index e6569a3b0a37..f61975a3ccfd 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -145,24 +145,21 @@ static inline int paravirt_write_msr_safe(unsigned msr,
 	return PVOP_CALL3(int, pv_cpu_ops.write_msr_safe, msr, low, high);
 }
 
-/* These should all do BUG_ON(_err), but our headers are too tangled. */
 #define rdmsr(msr, val1, val2)			\
 do {						\
-	int _err;				\
-	u64 _l = paravirt_read_msr_safe(msr, &_err);	\
+	u64 _l = paravirt_read_msr(msr);	\
 	val1 = (u32)_l;				\
 	val2 = _l >> 32;			\
 } while (0)
 
 #define wrmsr(msr, val1, val2)			\
 do {						\
-	paravirt_write_msr_safe(msr, val1, val2);	\
+	paravirt_write_msr(msr, val1, val2);	\
 } while (0)
 
 #define rdmsrl(msr, val)			\
 do {						\
-	int _err;				\
-	val = paravirt_read_msr_safe(msr, &_err);	\
+	val = paravirt_read_msr(msr);		\
 } while (0)
 
 static inline void wrmsrl(unsigned msr, u64 val)
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-16 23:33 [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops Andy Lutomirski
                   ` (2 preceding siblings ...)
  2015-09-16 23:33 ` [PATCH 3/3] x86/paravirt: Make "unsafe" MSR accesses unsafe even if PARAVIRT=y Andy Lutomirski
@ 2015-09-17  7:19 ` Ingo Molnar
  2015-09-17  9:31   ` Borislav Petkov
  2015-09-17 15:23   ` Andy Lutomirski
  2015-09-17  8:58 ` Peter Zijlstra
  2015-09-17  9:10 ` [Xen-devel] " Andrew Cooper
  5 siblings, 2 replies; 27+ messages in thread
From: Ingo Molnar @ 2015-09-17  7:19 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: x86, Paolo Bonzini, Peter Zijlstra, KVM list, Arjan van de Ven,
	xen-devel, linux-kernel, Linus Torvalds, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, H. Peter Anvin


* Andy Lutomirski <luto@kernel.org> wrote:

> Setting CONFIG_PARAVIRT=y has an unintended side effect: it silently
> turns all rdmsr and wrmsr operations into the safe variants without
> any checks that the operations actually succeed.
> 
> This is IMO awful: it papers over bugs.  In particular, KVM gueests
> might be unwittingly depending on this behavior because
> CONFIG_KVM_GUEST currently depends on CONFIG_PARAVIRT.  I'm not
> aware of any such problems, but applying this series would be a good
> way to shake them out.
> 
> Fix it so that the MSR operations work the same on CONFIG_PARAVIRT=n
> and CONFIG_PARAVIRT=y as long as Xen isn't being used.  The Xen
> maintainers are welcome to make a similar change on top of this.
> 
> Since there's plenty of time before the next merge window, I think
> we should apply and fix anything that breaks.

No, I think we should at most generate a warning instead, and not crash the kernel 
via rdmsr()!

Most big distro kernels on bare metal have CONFIG_PARAVIRT=y (I checked Ubuntu and 
Fedora), so we are potentially exposing a lot of users to problems.

Crashing the bootup on an unknown MSR is bad. Many MSR reads and writes are 
non-critical and returning the 'safe' result is much better than crashing or 
hanging the bootup.

( We should double check that rdmsr()/wrmsr() results are never left 
  uninitialized, but are set to zero or so, for cases where the return code is not 
  checked. )

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-16 23:33 [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops Andy Lutomirski
                   ` (3 preceding siblings ...)
  2015-09-17  7:19 ` [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops Ingo Molnar
@ 2015-09-17  8:58 ` Peter Zijlstra
  2015-09-17 11:40   ` Paolo Bonzini
  2015-09-17  9:10 ` [Xen-devel] " Andrew Cooper
  5 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2015-09-17  8:58 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: x86, Paolo Bonzini, KVM list, Arjan van de Ven, xen-devel, linux-kernel

On Wed, Sep 16, 2015 at 04:33:11PM -0700, Andy Lutomirski wrote:
> Setting CONFIG_PARAVIRT=y has an unintended side effect: it silently
> turns all rdmsr and wrmsr operations into the safe variants without
> any checks that the operations actually succeed.
> 
> This is IMO awful: it papers over bugs.  In particular, KVM gueests
> might be unwittingly depending on this behavior because
> CONFIG_KVM_GUEST currently depends on CONFIG_PARAVIRT.  I'm not
> aware of any such problems, but applying this series would be a good
> way to shake them out.
> 
> Fix it so that the MSR operations work the same on CONFIG_PARAVIRT=n
> and CONFIG_PARAVIRT=y as long as Xen isn't being used.  The Xen
> maintainers are welcome to make a similar change on top of this.
> 
> Since there's plenty of time before the next merge window, I think
> we should apply and fix anything that breaks.
> 
> Doing this is probably a prerequisite to sanely decoupling
> CONFIG_KVM_GUEST and CONFIG_PARAVIRT, which would probably make
> Arjan and the rest of the Clear Containers people happy :)

So I actually like this, although by Ingo's argument, its a tad risky.

But the far greater problem I have with the whole virt thing is that
you cannot use rdmsr_safe() to probe if an MSR exists at all because, as
you told me, these virt thingies return 0 for all 'unknown' MSRs instead
of faulting.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-16 23:33 [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops Andy Lutomirski
                   ` (4 preceding siblings ...)
  2015-09-17  8:58 ` Peter Zijlstra
@ 2015-09-17  9:10 ` Andrew Cooper
  2015-09-17 15:11   ` Boris Ostrovsky
  5 siblings, 1 reply; 27+ messages in thread
From: Andrew Cooper @ 2015-09-17  9:10 UTC (permalink / raw)
  To: Andy Lutomirski, x86
  Cc: KVM list, Peter Zijlstra, linux-kernel, xen-devel, Paolo Bonzini,
	Arjan van de Ven

On 17/09/15 00:33, Andy Lutomirski wrote:
> Setting CONFIG_PARAVIRT=y has an unintended side effect: it silently
> turns all rdmsr and wrmsr operations into the safe variants without
> any checks that the operations actually succeed.
>
> This is IMO awful: it papers over bugs.  In particular, KVM gueests
> might be unwittingly depending on this behavior because
> CONFIG_KVM_GUEST currently depends on CONFIG_PARAVIRT.  I'm not
> aware of any such problems, but applying this series would be a good
> way to shake them out.
>
> Fix it so that the MSR operations work the same on CONFIG_PARAVIRT=n
> and CONFIG_PARAVIRT=y as long as Xen isn't being used.  The Xen
> maintainers are welcome to make a similar change on top of this.

The Xen side of things need some further modification before this would
be a safe operation to perform.

On the wrmsr side of things alone, this is the list of things Xen
currently objects to and injects #GP faults for.

(XEN) traps.c:2692:d0v0 Domain attempted WRMSR 00000000c0000081 from
0xe023e00800000000 to 0x0023001000000000.
(XEN) traps.c:2692:d0v0 Domain attempted WRMSR 00000000c0000082 from
0xffff82d0bffff000 to 0xffffffff81560060.
(XEN) traps.c:2692:d0v0 Domain attempted WRMSR 00000000c0000083 from
0xffff82d0bffff020 to 0xffffffff81558100.
(XEN) traps.c:2692:d0v0 Domain attempted WRMSR 0000000000000174 from
0x000000000000e008 to 0x0000000000000010.
(XEN) traps.c:2692:d0v0 Domain attempted WRMSR 0000000000000175 from
0xffff8300ac0f7fc0 to 0x0000000000000000.
(XEN) traps.c:2692:d0v0 Domain attempted WRMSR 0000000000000176 from
0xffff82d08023fd50 to 0xffffffff815616d0.
(XEN) traps.c:2692:d0v0 Domain attempted WRMSR 00000000c0000083 from
0xffff82d0bffff020 to 0xffffffff81561910.
(XEN) traps.c:2692:d0v0 Domain attempted WRMSR 00000000c0000084 from
0x0000000000074700 to 0x0000000000047700.

However, it would be certainly be worth teaching PVops not to play with
MSRs it doesn't own.

~Andrew

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17  7:19 ` [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops Ingo Molnar
@ 2015-09-17  9:31   ` Borislav Petkov
  2015-09-17 11:22     ` H. Peter Anvin
  2015-09-17 11:39     ` Paolo Bonzini
  2015-09-17 15:23   ` Andy Lutomirski
  1 sibling, 2 replies; 27+ messages in thread
From: Borislav Petkov @ 2015-09-17  9:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andy Lutomirski, x86, Paolo Bonzini, Peter Zijlstra, KVM list,
	Arjan van de Ven, xen-devel, linux-kernel, Linus Torvalds,
	Thomas Gleixner, Peter Zijlstra, Andrew Morton, H. Peter Anvin

On Thu, Sep 17, 2015 at 09:19:20AM +0200, Ingo Molnar wrote:
> Most big distro kernels on bare metal have CONFIG_PARAVIRT=y (I checked Ubuntu and 
> Fedora), so we are potentially exposing a lot of users to problems.

+ SUSE.

> Crashing the bootup on an unknown MSR is bad. Many MSR reads and writes are 
> non-critical and returning the 'safe' result is much better than crashing or 
> hanging the bootup.

... and prepending all MSR accesses with feature/CPUID checks is probably almost
impossible.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17  9:31   ` Borislav Petkov
@ 2015-09-17 11:22     ` H. Peter Anvin
  2015-09-17 11:39     ` Paolo Bonzini
  1 sibling, 0 replies; 27+ messages in thread
From: H. Peter Anvin @ 2015-09-17 11:22 UTC (permalink / raw)
  To: Borislav Petkov, Ingo Molnar
  Cc: Andy Lutomirski, x86, Paolo Bonzini, Peter Zijlstra, KVM list,
	Arjan van de Ven, xen-devel, linux-kernel, Linus Torvalds,
	Thomas Gleixner, Peter Zijlstra, Andrew Morton

However, the difference between one CONFIG and another is quite frankly crazy.  We should explicitly use the safe versions where this is appropriate, and then yes, we should do this.

Yet another reason the paravirt code is batshit crazy.

On September 17, 2015 2:31:34 AM PDT, Borislav Petkov <bp@alien8.de> wrote:
>On Thu, Sep 17, 2015 at 09:19:20AM +0200, Ingo Molnar wrote:
>> Most big distro kernels on bare metal have CONFIG_PARAVIRT=y (I
>checked Ubuntu and 
>> Fedora), so we are potentially exposing a lot of users to problems.
>
>+ SUSE.
>
>> Crashing the bootup on an unknown MSR is bad. Many MSR reads and
>writes are 
>> non-critical and returning the 'safe' result is much better than
>crashing or 
>> hanging the bootup.
>
>... and prepending all MSR accesses with feature/CPUID checks is
>probably almost
>impossible.

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17  9:31   ` Borislav Petkov
  2015-09-17 11:22     ` H. Peter Anvin
@ 2015-09-17 11:39     ` Paolo Bonzini
  2015-09-17 15:27       ` Borislav Petkov
  1 sibling, 1 reply; 27+ messages in thread
From: Paolo Bonzini @ 2015-09-17 11:39 UTC (permalink / raw)
  To: Borislav Petkov, Ingo Molnar
  Cc: Andy Lutomirski, x86, Peter Zijlstra, KVM list, Arjan van de Ven,
	xen-devel, linux-kernel, Linus Torvalds, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, H. Peter Anvin



On 17/09/2015 11:31, Borislav Petkov wrote:
> 
>> > Crashing the bootup on an unknown MSR is bad. Many MSR reads and writes are 
>> > non-critical and returning the 'safe' result is much better than crashing or 
>> > hanging the bootup.
> ... and prepending all MSR accesses with feature/CPUID checks is probably almost
> impossible.

That's not a big deal, that's what *_safe is for.  The problem is that
there are definitely some cases where the *_safe version is not being used.

I agree with Ingo that we should start with a WARN.  For example:

- give the read_msr and write_msr hooks the same prototype as the safe
variants

- make the virt platforms always return "no error" for the unsafe
variants (I understand if your first reaction is "ouch", but this
effectively is already the current behavior)

- change rdmsr/wrmsr/rdmsrl/wrmsrl to WARN if the read_msr and write_msr
hooks return an error

Paolo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17  8:58 ` Peter Zijlstra
@ 2015-09-17 11:40   ` Paolo Bonzini
  2015-09-17 12:27     ` Peter Zijlstra
  0 siblings, 1 reply; 27+ messages in thread
From: Paolo Bonzini @ 2015-09-17 11:40 UTC (permalink / raw)
  To: Peter Zijlstra, Andy Lutomirski
  Cc: x86, KVM list, Arjan van de Ven, xen-devel, linux-kernel



On 17/09/2015 10:58, Peter Zijlstra wrote:
> But the far greater problem I have with the whole virt thing is that
> you cannot use rdmsr_safe() to probe if an MSR exists at all because, as
> you told me, these virt thingies return 0 for all 'unknown' MSRs instead
> of faulting.

At least for KVM, that behavior is opt-in (the ignore_msrs parameter)
and no distro that I know enables it by default.

Paolo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17 11:40   ` Paolo Bonzini
@ 2015-09-17 12:27     ` Peter Zijlstra
  2015-09-17 15:17       ` Andy Lutomirski
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2015-09-17 12:27 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Andy Lutomirski, x86, KVM list, Arjan van de Ven, xen-devel,
	linux-kernel

On Thu, Sep 17, 2015 at 01:40:30PM +0200, Paolo Bonzini wrote:
> 
> 
> On 17/09/2015 10:58, Peter Zijlstra wrote:
> > But the far greater problem I have with the whole virt thing is that
> > you cannot use rdmsr_safe() to probe if an MSR exists at all because, as
> > you told me, these virt thingies return 0 for all 'unknown' MSRs instead
> > of faulting.
> 
> At least for KVM, that behavior is opt-in (the ignore_msrs parameter)
> and no distro that I know enables it by default.

Ah, that would be good news. Andy earlier argued I could not rely on
rdmsr_safe() faulting on unknown MSRs. If practically we can there's
some code I can simplify :-)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17  9:10 ` [Xen-devel] " Andrew Cooper
@ 2015-09-17 15:11   ` Boris Ostrovsky
  0 siblings, 0 replies; 27+ messages in thread
From: Boris Ostrovsky @ 2015-09-17 15:11 UTC (permalink / raw)
  To: Andrew Cooper, Andy Lutomirski
  Cc: x86, KVM list, Peter Zijlstra, linux-kernel, xen-devel,
	Paolo Bonzini, Arjan van de Ven

On 09/17/2015 05:10 AM, Andrew Cooper wrote:
> On 17/09/15 00:33, Andy Lutomirski wrote:
>> Setting CONFIG_PARAVIRT=y has an unintended side effect: it silently
>> turns all rdmsr and wrmsr operations into the safe variants without
>> any checks that the operations actually succeed.
>>
>> This is IMO awful: it papers over bugs.  In particular, KVM gueests
>> might be unwittingly depending on this behavior because
>> CONFIG_KVM_GUEST currently depends on CONFIG_PARAVIRT.  I'm not
>> aware of any such problems, but applying this series would be a good
>> way to shake them out.
>>
>> Fix it so that the MSR operations work the same on CONFIG_PARAVIRT=n
>> and CONFIG_PARAVIRT=y as long as Xen isn't being used.  The Xen
>> maintainers are welcome to make a similar change on top of this.
> The Xen side of things need some further modification before this would
> be a safe operation to perform.
>
> On the wrmsr side of things alone, this is the list of things Xen
> currently objects to and injects #GP faults for.
>
> (XEN) traps.c:2692:d0v0 Domain attempted WRMSR 00000000c0000081 from
> 0xe023e00800000000 to 0x0023001000000000.
> (XEN) traps.c:2692:d0v0 Domain attempted WRMSR 00000000c0000082 from
> 0xffff82d0bffff000 to 0xffffffff81560060.
> (XEN) traps.c:2692:d0v0 Domain attempted WRMSR 00000000c0000083 from
> 0xffff82d0bffff020 to 0xffffffff81558100.
> (XEN) traps.c:2692:d0v0 Domain attempted WRMSR 0000000000000174 from
> 0x000000000000e008 to 0x0000000000000010.
> (XEN) traps.c:2692:d0v0 Domain attempted WRMSR 0000000000000175 from
> 0xffff8300ac0f7fc0 to 0x0000000000000000.
> (XEN) traps.c:2692:d0v0 Domain attempted WRMSR 0000000000000176 from
> 0xffff82d08023fd50 to 0xffffffff815616d0.
> (XEN) traps.c:2692:d0v0 Domain attempted WRMSR 00000000c0000083 from
> 0xffff82d0bffff020 to 0xffffffff81561910.
> (XEN) traps.c:2692:d0v0 Domain attempted WRMSR 00000000c0000084 from
> 0x0000000000074700 to 0x0000000000047700.
>
> However, it would be certainly be worth teaching PVops not to play with
> MSRs it doesn't own.

PVops already knows about those. There is even has a comment about how 
we shouldn't touch those MSRs. And yet three lines later we still write 
them.

-boris


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17 12:27     ` Peter Zijlstra
@ 2015-09-17 15:17       ` Andy Lutomirski
  2015-09-17 15:17         ` Peter Zijlstra
  0 siblings, 1 reply; 27+ messages in thread
From: Andy Lutomirski @ 2015-09-17 15:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Paolo Bonzini, xen-devel, Arjan van de Ven, linux-kernel, X86 ML,
	KVM list

On Sep 17, 2015 5:33 AM, "Peter Zijlstra" <peterz@infradead.org> wrote:
>
> On Thu, Sep 17, 2015 at 01:40:30PM +0200, Paolo Bonzini wrote:
> >
> >
> > On 17/09/2015 10:58, Peter Zijlstra wrote:
> > > But the far greater problem I have with the whole virt thing is that
> > > you cannot use rdmsr_safe() to probe if an MSR exists at all because, as
> > > you told me, these virt thingies return 0 for all 'unknown' MSRs instead
> > > of faulting.
> >
> > At least for KVM, that behavior is opt-in (the ignore_msrs parameter)
> > and no distro that I know enables it by default.
>
> Ah, that would be good news. Andy earlier argued I could not rely on
> rdmsr_safe() faulting on unknown MSRs. If practically we can there's
> some code I can simplify :-)

I was taking about QEMU TCG, not KVM.


--Andy

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17 15:17       ` Andy Lutomirski
@ 2015-09-17 15:17         ` Peter Zijlstra
  2015-09-17 15:26           ` Andy Lutomirski
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2015-09-17 15:17 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Paolo Bonzini, xen-devel, Arjan van de Ven, linux-kernel, X86 ML,
	KVM list

On Thu, Sep 17, 2015 at 08:17:18AM -0700, Andy Lutomirski wrote:

> > Ah, that would be good news. Andy earlier argued I could not rely on
> > rdmsr_safe() faulting on unknown MSRs. If practically we can there's
> > some code I can simplify :-)
> 
> I was taking about QEMU TCG, not KVM.

Just for my education, TCG is the thing without _any_ hardware assist?
The thing you fear to use because it cannot boot a kernel this side of
tomorrow etc.. ?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17  7:19 ` [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops Ingo Molnar
  2015-09-17  9:31   ` Borislav Petkov
@ 2015-09-17 15:23   ` Andy Lutomirski
  2015-09-17 15:27     ` Arjan van de Ven
  2015-09-17 17:30     ` Ingo Molnar
  1 sibling, 2 replies; 27+ messages in thread
From: Andy Lutomirski @ 2015-09-17 15:23 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andy Lutomirski, X86 ML, Paolo Bonzini, Peter Zijlstra, KVM list,
	Arjan van de Ven, xen-devel, linux-kernel, Linus Torvalds,
	Thomas Gleixner, Peter Zijlstra, Andrew Morton, H. Peter Anvin

On Thu, Sep 17, 2015 at 12:19 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Andy Lutomirski <luto@kernel.org> wrote:
>
>> Setting CONFIG_PARAVIRT=y has an unintended side effect: it silently
>> turns all rdmsr and wrmsr operations into the safe variants without
>> any checks that the operations actually succeed.
>>
>> This is IMO awful: it papers over bugs.  In particular, KVM gueests
>> might be unwittingly depending on this behavior because
>> CONFIG_KVM_GUEST currently depends on CONFIG_PARAVIRT.  I'm not
>> aware of any such problems, but applying this series would be a good
>> way to shake them out.
>>
>> Fix it so that the MSR operations work the same on CONFIG_PARAVIRT=n
>> and CONFIG_PARAVIRT=y as long as Xen isn't being used.  The Xen
>> maintainers are welcome to make a similar change on top of this.
>>
>> Since there's plenty of time before the next merge window, I think
>> we should apply and fix anything that breaks.
>
> No, I think we should at most generate a warning instead, and not crash the kernel
> via rdmsr()!
>
> Most big distro kernels on bare metal have CONFIG_PARAVIRT=y (I checked Ubuntu and
> Fedora), so we are potentially exposing a lot of users to problems.
>
> Crashing the bootup on an unknown MSR is bad. Many MSR reads and writes are
> non-critical and returning the 'safe' result is much better than crashing or
> hanging the bootup.
>

Should we do that for CONFIG_PARAVIRT=n, too?

It would be straightforward to rig this up (temporarily?) on top of
these patches.  To keep bloat down, we might want to implement it in
do_general_protection rather than sticking it in native_read_msr.

wrmsr is a different beast, since we can fail due to writing the wrong
value to an otherwise valid MSR.  Given that MSR screwups can very
easily be security holes, I'm not sure that warning and blindly
continuing on an unchecked failed wrmsr is a good idea.

In any event, I think it's nuts that CONFIG_PARAVIRT changes this
behavior.  We should pick something sane and stick with it.
CONFIG_PARAVIRT=n appears to work just fine on bare metal in lots of
deployments, especially pre-KVM.  CONFIG_PARAVIRT=n also appears to
work fine under KVM, and I use it frequently.

> ( We should double check that rdmsr()/wrmsr() results are never left
>   uninitialized, but are set to zero or so, for cases where the return code is not
>   checked. )

It sure looks like native_read_msr_safe doesn't clear the output if
the rdmsr fails.

--Andy

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17 15:17         ` Peter Zijlstra
@ 2015-09-17 15:26           ` Andy Lutomirski
  2015-09-17 15:29             ` Paolo Bonzini
  0 siblings, 1 reply; 27+ messages in thread
From: Andy Lutomirski @ 2015-09-17 15:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Paolo Bonzini, xen-devel, Arjan van de Ven, linux-kernel, X86 ML,
	KVM list

On Thu, Sep 17, 2015 at 8:17 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, Sep 17, 2015 at 08:17:18AM -0700, Andy Lutomirski wrote:
>
>> > Ah, that would be good news. Andy earlier argued I could not rely on
>> > rdmsr_safe() faulting on unknown MSRs. If practically we can there's
>> > some code I can simplify :-)
>>
>> I was taking about QEMU TCG, not KVM.
>
> Just for my education, TCG is the thing without _any_ hardware assist?

Yes.

> The thing you fear to use because it cannot boot a kernel this side of
> tomorrow etc.. ?

I actually use it on a semi-regular basis.  It appears to boot a
normal kernel correctly and surprisingly quickly.  It's important for
a silly reason.  Asking KVM on an Intel host to emulate an AMD CPU or
vice versa results in a chimera: I get an Opteron that's
"GenuineIntel", which causes Linux to treat it as Intel, which means
it gets the Intel quirks (SYSENTER in long mode, no
X86_BUG_SYSRET_SS_ATTRS, etc) instead of the AMD quirks (SYSCALL in
compat mode, etc).  So, if I want to test compat SYSCALL, I have to
use TCG, which will actually do a decent job of emulating an AMD CPU
for me.

Maybe Paolo can fix QEMU to fail bad MSR accesses for real...

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17 11:39     ` Paolo Bonzini
@ 2015-09-17 15:27       ` Borislav Petkov
  2015-09-17 15:32         ` [Xen-devel] " Andrew Cooper
  0 siblings, 1 reply; 27+ messages in thread
From: Borislav Petkov @ 2015-09-17 15:27 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Ingo Molnar, Andy Lutomirski, x86, Peter Zijlstra, KVM list,
	Arjan van de Ven, xen-devel, linux-kernel, Linus Torvalds,
	Thomas Gleixner, Peter Zijlstra, Andrew Morton, H. Peter Anvin

On Thu, Sep 17, 2015 at 01:39:26PM +0200, Paolo Bonzini wrote:
> That's not a big deal, that's what *_safe is for.  The problem is that
> there are definitely some cases where the *_safe version is not being used.

I mean to do feature checks which assure you that those MSRs are
there so you don't need the safe variants. And that is not always
easy/possible.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17 15:23   ` Andy Lutomirski
@ 2015-09-17 15:27     ` Arjan van de Ven
  2015-09-17 15:29       ` Paolo Bonzini
  2015-09-17 17:30     ` Ingo Molnar
  1 sibling, 1 reply; 27+ messages in thread
From: Arjan van de Ven @ 2015-09-17 15:27 UTC (permalink / raw)
  To: Andy Lutomirski, Ingo Molnar
  Cc: Andy Lutomirski, X86 ML, Paolo Bonzini, Peter Zijlstra, KVM list,
	xen-devel, linux-kernel, Linus Torvalds, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, H. Peter Anvin

>
>> ( We should double check that rdmsr()/wrmsr() results are never left
>>    uninitialized, but are set to zero or so, for cases where the return code is not
>>    checked. )
>
> It sure looks like native_read_msr_safe doesn't clear the output if
> the rdmsr fails.

I'd suggest to return some poison not just 0...
less likely to get interesting surprises that are insane hard to debug/diagnose




^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17 15:27     ` Arjan van de Ven
@ 2015-09-17 15:29       ` Paolo Bonzini
  2015-09-17 15:31         ` Arjan van de Ven
  0 siblings, 1 reply; 27+ messages in thread
From: Paolo Bonzini @ 2015-09-17 15:29 UTC (permalink / raw)
  To: Arjan van de Ven, Andy Lutomirski, Ingo Molnar
  Cc: Andy Lutomirski, X86 ML, Peter Zijlstra, KVM list, xen-devel,
	linux-kernel, Linus Torvalds, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, H. Peter Anvin



On 17/09/2015 17:27, Arjan van de Ven wrote:
>>
>>> ( We should double check that rdmsr()/wrmsr() results are never left
>>>    uninitialized, but are set to zero or so, for cases where the
>>> return code is not
>>>    checked. )
>>
>> It sure looks like native_read_msr_safe doesn't clear the output if
>> the rdmsr fails.
> 
> I'd suggest to return some poison not just 0...

What about 0 + WARN?

Paolo

> less likely to get interesting surprises that are insane hard to
> debug/diagnose

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17 15:26           ` Andy Lutomirski
@ 2015-09-17 15:29             ` Paolo Bonzini
  0 siblings, 0 replies; 27+ messages in thread
From: Paolo Bonzini @ 2015-09-17 15:29 UTC (permalink / raw)
  To: Andy Lutomirski, Peter Zijlstra
  Cc: xen-devel, Arjan van de Ven, linux-kernel, X86 ML, KVM list



On 17/09/2015 17:26, Andy Lutomirski wrote:
> Maybe Paolo can fix QEMU to fail bad MSR accesses for real...

I was afraid of someone proposing exactly that. :)

I can do it since the list of MSRs can be lifted from KVM.  Let's first
see the direction these patches go...

Paolo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17 15:29       ` Paolo Bonzini
@ 2015-09-17 15:31         ` Arjan van de Ven
  2015-09-17 15:33           ` Paolo Bonzini
  0 siblings, 1 reply; 27+ messages in thread
From: Arjan van de Ven @ 2015-09-17 15:31 UTC (permalink / raw)
  To: Paolo Bonzini, Andy Lutomirski, Ingo Molnar
  Cc: Andy Lutomirski, X86 ML, Peter Zijlstra, KVM list, xen-devel,
	linux-kernel, Linus Torvalds, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, H. Peter Anvin

On 9/17/2015 8:29 AM, Paolo Bonzini wrote:
>
>
> On 17/09/2015 17:27, Arjan van de Ven wrote:
>>>
>>>> ( We should double check that rdmsr()/wrmsr() results are never left
>>>>     uninitialized, but are set to zero or so, for cases where the
>>>> return code is not
>>>>     checked. )
>>>
>>> It sure looks like native_read_msr_safe doesn't clear the output if
>>> the rdmsr fails.
>>
>> I'd suggest to return some poison not just 0...
>
> What about 0 + WARN?

why 0?

0xdeadbeef or any other pattern (even 0x3636363636) makes more sense (of course also WARN... but most folks don't read dmesg for WARNs)

(it's the same thing we do for list or slab poison stuff)


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17 15:27       ` Borislav Petkov
@ 2015-09-17 15:32         ` Andrew Cooper
  2015-09-17 15:37           ` Borislav Petkov
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Cooper @ 2015-09-17 15:32 UTC (permalink / raw)
  To: Borislav Petkov, Paolo Bonzini
  Cc: Peter Zijlstra, KVM list, Peter Zijlstra, Linus Torvalds, x86,
	linux-kernel, Andrew Morton, xen-devel, Andy Lutomirski,
	H. Peter Anvin, Thomas Gleixner, Arjan van de Ven, Ingo Molnar

On 17/09/15 16:27, Borislav Petkov wrote:
> On Thu, Sep 17, 2015 at 01:39:26PM +0200, Paolo Bonzini wrote:
>> That's not a big deal, that's what *_safe is for.  The problem is that
>> there are definitely some cases where the *_safe version is not being used.
> I mean to do feature checks which assure you that those MSRs are
> there so you don't need the safe variants. And that is not always
> easy/possible.
>

There are plenty of non-architectural MSRs in use which don't have
feature bits.

Xen used to have problems booting when using the masking MSRs when
booting virtualised.  Nowadays it uses a cpu vendor check and _safe()
probe to detect support.

~Andrew

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17 15:31         ` Arjan van de Ven
@ 2015-09-17 15:33           ` Paolo Bonzini
  0 siblings, 0 replies; 27+ messages in thread
From: Paolo Bonzini @ 2015-09-17 15:33 UTC (permalink / raw)
  To: Arjan van de Ven, Andy Lutomirski, Ingo Molnar
  Cc: Andy Lutomirski, X86 ML, Peter Zijlstra, KVM list, xen-devel,
	linux-kernel, Linus Torvalds, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, H. Peter Anvin



On 17/09/2015 17:31, Arjan van de Ven wrote:
>>
>> What about 0 + WARN?
> 
> why 0?
> 
> 0xdeadbeef or any other pattern (even 0x3636363636) makes more sense (of
> course also WARN... but most folks don't read dmesg for WARNs)
> 
> (it's the same thing we do for list or slab poison stuff)

Sorry, brain fart.  That makes sense for the safe variants.  It would
require some auditing though.

Paolo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17 15:32         ` [Xen-devel] " Andrew Cooper
@ 2015-09-17 15:37           ` Borislav Petkov
  0 siblings, 0 replies; 27+ messages in thread
From: Borislav Petkov @ 2015-09-17 15:37 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Paolo Bonzini, Peter Zijlstra, KVM list, Peter Zijlstra,
	Linus Torvalds, x86, linux-kernel, Andrew Morton, xen-devel,
	Andy Lutomirski, H. Peter Anvin, Thomas Gleixner,
	Arjan van de Ven, Ingo Molnar

On Thu, Sep 17, 2015 at 04:32:53PM +0100, Andrew Cooper wrote:
> There are plenty of non-architectural MSRs in use which don't have
> feature bits.

That's exactly what the "possible" adjective was supposed to represent.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17 15:23   ` Andy Lutomirski
  2015-09-17 15:27     ` Arjan van de Ven
@ 2015-09-17 17:30     ` Ingo Molnar
  2015-09-17 18:51       ` Andy Lutomirski
  1 sibling, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2015-09-17 17:30 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andy Lutomirski, X86 ML, Paolo Bonzini, Peter Zijlstra, KVM list,
	Arjan van de Ven, xen-devel, linux-kernel, Linus Torvalds,
	Thomas Gleixner, Peter Zijlstra, Andrew Morton, H. Peter Anvin


* Andy Lutomirski <luto@amacapital.net> wrote:

> On Thu, Sep 17, 2015 at 12:19 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > * Andy Lutomirski <luto@kernel.org> wrote:
> >
> >> Setting CONFIG_PARAVIRT=y has an unintended side effect: it silently
> >> turns all rdmsr and wrmsr operations into the safe variants without
> >> any checks that the operations actually succeed.
> >>
> >> This is IMO awful: it papers over bugs.  In particular, KVM gueests
> >> might be unwittingly depending on this behavior because
> >> CONFIG_KVM_GUEST currently depends on CONFIG_PARAVIRT.  I'm not
> >> aware of any such problems, but applying this series would be a good
> >> way to shake them out.
> >>
> >> Fix it so that the MSR operations work the same on CONFIG_PARAVIRT=n
> >> and CONFIG_PARAVIRT=y as long as Xen isn't being used.  The Xen
> >> maintainers are welcome to make a similar change on top of this.
> >>
> >> Since there's plenty of time before the next merge window, I think
> >> we should apply and fix anything that breaks.
> >
> > No, I think we should at most generate a warning instead, and not crash the kernel
> > via rdmsr()!
> >
> > Most big distro kernels on bare metal have CONFIG_PARAVIRT=y (I checked Ubuntu and
> > Fedora), so we are potentially exposing a lot of users to problems.
> >
> > Crashing the bootup on an unknown MSR is bad. Many MSR reads and writes are
> > non-critical and returning the 'safe' result is much better than crashing or
> > hanging the bootup.
> >
> 
> Should we do that for CONFIG_PARAVIRT=n, too?

Absolutely. PARAVIRT=n should not behave differently from PARAVIRT=y on bare 
metal.

> It would be straightforward to rig this up (temporarily?) on top of these 
> patches.  To keep bloat down, we might want to implement it in 
> do_general_protection rather than sticking it in native_read_msr.

Fair enough.

> wrmsr is a different beast, since we can fail due to writing the wrong value to 
> an otherwise valid MSR.  Given that MSR screwups can very easily be security 
> holes, I'm not sure that warning and blindly continuing on an unchecked failed 
> wrmsr is a good idea.

So the fact is that right now we are silently ignoring failures there, and have 
been doing that for some time. The right first step is to live with that and start 
generating low-key, once-per-bootup warnings at most, and see how frequent (and 
how serious) they are.

We could add a (default disabled) CONFIG_PANIC_ON_UNKNOWN_MSR=y option if that's 
really a serious concern.

> In any event, I think it's nuts that CONFIG_PARAVIRT changes this
> behavior.  We should pick something sane and stick with it.

Absolutely - and as it happens, the 'does not crash the kernel' PARAVIRT=y 
accidental behavior is actually quite close to what we wanted for a long time, so 
let's make it official - and add a warning to make sure we are aware of problems. 

But don't turn 'potential problems' into showstopper bugs such as a hard to debug 
early boot crash, which to most Linux users means a black screen on bootup!

> > ( We should double check that rdmsr()/wrmsr() results are never left
> >   uninitialized, but are set to zero or so, for cases where the return code is not
> >   checked. )
> 
> It sure looks like native_read_msr_safe doesn't clear the output if the rdmsr 
> fails.

Yeah, that should be fixed, to make such soft failures deterministic.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops
  2015-09-17 17:30     ` Ingo Molnar
@ 2015-09-17 18:51       ` Andy Lutomirski
  0 siblings, 0 replies; 27+ messages in thread
From: Andy Lutomirski @ 2015-09-17 18:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andy Lutomirski, X86 ML, Paolo Bonzini, Peter Zijlstra, KVM list,
	Arjan van de Ven, xen-devel, linux-kernel, Linus Torvalds,
	Thomas Gleixner, Peter Zijlstra, Andrew Morton, H. Peter Anvin

On Thu, Sep 17, 2015 at 10:30 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Andy Lutomirski <luto@amacapital.net> wrote:
>
>> On Thu, Sep 17, 2015 at 12:19 AM, Ingo Molnar <mingo@kernel.org> wrote:
>> >
>> > * Andy Lutomirski <luto@kernel.org> wrote:
>> >
>> >> Setting CONFIG_PARAVIRT=y has an unintended side effect: it silently
>> >> turns all rdmsr and wrmsr operations into the safe variants without
>> >> any checks that the operations actually succeed.
>> >>
>> >> This is IMO awful: it papers over bugs.  In particular, KVM gueests
>> >> might be unwittingly depending on this behavior because
>> >> CONFIG_KVM_GUEST currently depends on CONFIG_PARAVIRT.  I'm not
>> >> aware of any such problems, but applying this series would be a good
>> >> way to shake them out.
>> >>
>> >> Fix it so that the MSR operations work the same on CONFIG_PARAVIRT=n
>> >> and CONFIG_PARAVIRT=y as long as Xen isn't being used.  The Xen
>> >> maintainers are welcome to make a similar change on top of this.
>> >>
>> >> Since there's plenty of time before the next merge window, I think
>> >> we should apply and fix anything that breaks.
>> >
>> > No, I think we should at most generate a warning instead, and not crash the kernel
>> > via rdmsr()!
>> >
>> > Most big distro kernels on bare metal have CONFIG_PARAVIRT=y (I checked Ubuntu and
>> > Fedora), so we are potentially exposing a lot of users to problems.
>> >
>> > Crashing the bootup on an unknown MSR is bad. Many MSR reads and writes are
>> > non-critical and returning the 'safe' result is much better than crashing or
>> > hanging the bootup.
>> >
>>
>> Should we do that for CONFIG_PARAVIRT=n, too?
>
> Absolutely. PARAVIRT=n should not behave differently from PARAVIRT=y on bare
> metal.
>
>> It would be straightforward to rig this up (temporarily?) on top of these
>> patches.  To keep bloat down, we might want to implement it in
>> do_general_protection rather than sticking it in native_read_msr.
>
> Fair enough.
>
>> wrmsr is a different beast, since we can fail due to writing the wrong value to
>> an otherwise valid MSR.  Given that MSR screwups can very easily be security
>> holes, I'm not sure that warning and blindly continuing on an unchecked failed
>> wrmsr is a good idea.
>
> So the fact is that right now we are silently ignoring failures there, and have
> been doing that for some time. The right first step is to live with that and start
> generating low-key, once-per-bootup warnings at most, and see how frequent (and
> how serious) they are.
>
> We could add a (default disabled) CONFIG_PANIC_ON_UNKNOWN_MSR=y option if that's
> really a serious concern.

Could we abuse panic_on_oops for this purpose?

>
>> In any event, I think it's nuts that CONFIG_PARAVIRT changes this
>> behavior.  We should pick something sane and stick with it.
>
> Absolutely - and as it happens, the 'does not crash the kernel' PARAVIRT=y
> accidental behavior is actually quite close to what we wanted for a long time, so
> let's make it official - and add a warning to make sure we are aware of problems.
>
> But don't turn 'potential problems' into showstopper bugs such as a hard to debug
> early boot crash, which to most Linux users means a black screen on bootup!
>

Fair enough.

--Andy

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2015-09-17 18:51 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-16 23:33 [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops Andy Lutomirski
2015-09-16 23:33 ` [PATCH 1/3] x86/paravirt: Add _safe to the read_msr and write_msr PV hooks Andy Lutomirski
2015-09-16 23:33 ` [PATCH 2/3] x86/paravirt: Add paravirt_{read,write}_msr Andy Lutomirski
2015-09-16 23:33 ` [PATCH 3/3] x86/paravirt: Make "unsafe" MSR accesses unsafe even if PARAVIRT=y Andy Lutomirski
2015-09-17  7:19 ` [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops Ingo Molnar
2015-09-17  9:31   ` Borislav Petkov
2015-09-17 11:22     ` H. Peter Anvin
2015-09-17 11:39     ` Paolo Bonzini
2015-09-17 15:27       ` Borislav Petkov
2015-09-17 15:32         ` [Xen-devel] " Andrew Cooper
2015-09-17 15:37           ` Borislav Petkov
2015-09-17 15:23   ` Andy Lutomirski
2015-09-17 15:27     ` Arjan van de Ven
2015-09-17 15:29       ` Paolo Bonzini
2015-09-17 15:31         ` Arjan van de Ven
2015-09-17 15:33           ` Paolo Bonzini
2015-09-17 17:30     ` Ingo Molnar
2015-09-17 18:51       ` Andy Lutomirski
2015-09-17  8:58 ` Peter Zijlstra
2015-09-17 11:40   ` Paolo Bonzini
2015-09-17 12:27     ` Peter Zijlstra
2015-09-17 15:17       ` Andy Lutomirski
2015-09-17 15:17         ` Peter Zijlstra
2015-09-17 15:26           ` Andy Lutomirski
2015-09-17 15:29             ` Paolo Bonzini
2015-09-17  9:10 ` [Xen-devel] " Andrew Cooper
2015-09-17 15:11   ` Boris Ostrovsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).