linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] x86/mce: Enable additional error logging on certain Intel CPUs
       [not found]           ` <20201030190400.GA13797@agluck-desk2.amr.corp.intel.com>
@ 2020-10-30 19:08             ` Luck, Tony
  2020-11-02 11:12               ` Borislav Petkov
  2020-11-02 11:18               ` [tip: ras/core] " tip-bot2 for Tony Luck
  0 siblings, 2 replies; 21+ messages in thread
From: Luck, Tony @ 2020-10-30 19:08 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Thomas Gleixner, Philippe Conde, x86, linux-kernel

On Fri, Oct 30, 2020 at 12:04:03PM -0700, Luck, Tony wrote:

Bah, didn't notice this conversation didn't include LKML.

> The Xeon versions of Sandy Bridge, Ivy Bridge and Haswell support an
> optional additional error logging mode which is enabled by an MSR.
> 
> Previously this mode was enabled from the mcelog(8) tool via /dev/cpu,
> but the kernel is now very picky about which MSRs may be written. So
> move the enabling into the kernel.
> 
> Suggested-by: Boris Petkov <bp@alien8.de>
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
> 
> N.B. I don't have any of these old systems in my lab any more. So
> this is untested :-(
> 
>  arch/x86/include/asm/msr-index.h |  1 +
>  arch/x86/kernel/cpu/mce/intel.c  | 20 ++++++++++++++++++++
>  2 files changed, 21 insertions(+)
> 
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 972a34d93505..b2dd2648c0e2 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -139,6 +139,7 @@
>  #define MSR_IA32_MCG_CAP		0x00000179
>  #define MSR_IA32_MCG_STATUS		0x0000017a
>  #define MSR_IA32_MCG_CTL		0x0000017b
> +#define MSR_ERROR_CONTROL		0x0000017f
>  #define MSR_IA32_MCG_EXT_CTL		0x000004d0
>  
>  #define MSR_OFFCORE_RSP_0		0x000001a6
> diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c
> index abe9fe0fb851..b47883e364b4 100644
> --- a/arch/x86/kernel/cpu/mce/intel.c
> +++ b/arch/x86/kernel/cpu/mce/intel.c
> @@ -509,12 +509,32 @@ static void intel_ppin_init(struct cpuinfo_x86 *c)
>  	}
>  }
>  
> +/*
> + * Enable additional error logs from the integrated
> + * memory controller on processors that support this.
> + */
> +static void intel_imc_init(struct cpuinfo_x86 *c)
> +{
> +	u64 error_control;
> +
> +	switch (c->x86_model) {
> +	case INTEL_FAM6_SANDYBRIDGE_X:
> +	case INTEL_FAM6_IVYBRIDGE_X:
> +	case INTEL_FAM6_HASWELL_X:
> +		rdmsrl(MSR_ERROR_CONTROL, error_control);
> +		error_control |= 2;
> +		wrmsrl(MSR_ERROR_CONTROL, error_control);
> +		break;
> +	}
> +}
> +
>  void mce_intel_feature_init(struct cpuinfo_x86 *c)
>  {
>  	intel_init_thermal(c);
>  	intel_init_cmci();
>  	intel_init_lmce();
>  	intel_ppin_init(c);
> +	intel_imc_init(c);
>  }
>  
>  void mce_intel_feature_clear(struct cpuinfo_x86 *c)
> -- 
> 2.21.1
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] x86/mce: Enable additional error logging on certain Intel CPUs
  2020-10-30 19:08             ` [PATCH] x86/mce: Enable additional error logging on certain Intel CPUs Luck, Tony
@ 2020-11-02 11:12               ` Borislav Petkov
  2020-11-02 11:18               ` [tip: ras/core] " tip-bot2 for Tony Luck
  1 sibling, 0 replies; 21+ messages in thread
From: Borislav Petkov @ 2020-11-02 11:12 UTC (permalink / raw)
  To: Luck, Tony; +Cc: Thomas Gleixner, Philippe Conde, x86, linux-kernel

On Fri, Oct 30, 2020 at 12:08:07PM -0700, Luck, Tony wrote:
> On Fri, Oct 30, 2020 at 12:04:03PM -0700, Luck, Tony wrote:
> 
> Bah, didn't notice this conversation didn't include LKML.
> 
> > The Xeon versions of Sandy Bridge, Ivy Bridge and Haswell support an
> > optional additional error logging mode which is enabled by an MSR.
> > 
> > Previously this mode was enabled from the mcelog(8) tool via /dev/cpu,
> > but the kernel is now very picky about which MSRs may be written. So
> > move the enabling into the kernel.
> > 
> > Suggested-by: Boris Petkov <bp@alien8.de>
> > Signed-off-by: Tony Luck <tony.luck@intel.com>
> > ---
> > 
> > N.B. I don't have any of these old systems in my lab any more. So
> > this is untested :-(

I do:

# rdmsr 0x0000017f
2

Thx for doing this, queued.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs
  2020-10-30 19:08             ` [PATCH] x86/mce: Enable additional error logging on certain Intel CPUs Luck, Tony
  2020-11-02 11:12               ` Borislav Petkov
@ 2020-11-02 11:18               ` tip-bot2 for Tony Luck
  2020-11-09 21:55                 ` Qian Cai
  1 sibling, 1 reply; 21+ messages in thread
From: tip-bot2 for Tony Luck @ 2020-11-02 11:18 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Boris Petkov, Tony Luck, Borislav Petkov, x86, LKML

The following commit has been merged into the ras/core branch of tip:

Commit-ID:     68299a42f84288537ee3420c431ac0115ccb90b1
Gitweb:        https://git.kernel.org/tip/68299a42f84288537ee3420c431ac0115ccb90b1
Author:        Tony Luck <tony.luck@intel.com>
AuthorDate:    Fri, 30 Oct 2020 12:04:00 -07:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 02 Nov 2020 11:15:59 +01:00

x86/mce: Enable additional error logging on certain Intel CPUs

The Xeon versions of Sandy Bridge, Ivy Bridge and Haswell support an
optional additional error logging mode which is enabled by an MSR.

Previously, this mode was enabled from the mcelog(8) tool via /dev/cpu,
but userspace should not be poking at MSRs. So move the enabling into
the kernel.

 [ bp: Correct the explanation why this is done. ]

Suggested-by: Boris Petkov <bp@alien8.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20201030190807.GA13884@agluck-desk2.amr.corp.intel.com
---
 arch/x86/include/asm/msr-index.h |  1 +
 arch/x86/kernel/cpu/mce/intel.c  | 20 ++++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 972a34d..b2dd264 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -139,6 +139,7 @@
 #define MSR_IA32_MCG_CAP		0x00000179
 #define MSR_IA32_MCG_STATUS		0x0000017a
 #define MSR_IA32_MCG_CTL		0x0000017b
+#define MSR_ERROR_CONTROL		0x0000017f
 #define MSR_IA32_MCG_EXT_CTL		0x000004d0
 
 #define MSR_OFFCORE_RSP_0		0x000001a6
diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c
index abe9fe0..b47883e 100644
--- a/arch/x86/kernel/cpu/mce/intel.c
+++ b/arch/x86/kernel/cpu/mce/intel.c
@@ -509,12 +509,32 @@ static void intel_ppin_init(struct cpuinfo_x86 *c)
 	}
 }
 
+/*
+ * Enable additional error logs from the integrated
+ * memory controller on processors that support this.
+ */
+static void intel_imc_init(struct cpuinfo_x86 *c)
+{
+	u64 error_control;
+
+	switch (c->x86_model) {
+	case INTEL_FAM6_SANDYBRIDGE_X:
+	case INTEL_FAM6_IVYBRIDGE_X:
+	case INTEL_FAM6_HASWELL_X:
+		rdmsrl(MSR_ERROR_CONTROL, error_control);
+		error_control |= 2;
+		wrmsrl(MSR_ERROR_CONTROL, error_control);
+		break;
+	}
+}
+
 void mce_intel_feature_init(struct cpuinfo_x86 *c)
 {
 	intel_init_thermal(c);
 	intel_init_cmci();
 	intel_init_lmce();
 	intel_ppin_init(c);
+	intel_imc_init(c);
 }
 
 void mce_intel_feature_clear(struct cpuinfo_x86 *c)

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs
  2020-11-02 11:18               ` [tip: ras/core] " tip-bot2 for Tony Luck
@ 2020-11-09 21:55                 ` Qian Cai
  2020-11-09 22:09                   ` Luck, Tony
  0 siblings, 1 reply; 21+ messages in thread
From: Qian Cai @ 2020-11-09 21:55 UTC (permalink / raw)
  To: linux-kernel, linux-tip-commits, Tony Luck
  Cc: Boris Petkov, Borislav Petkov, x86, Paolo Bonzini, kvm

On Mon, 2020-11-02 at 11:18 +0000, tip-bot2 for Tony Luck wrote:
> The following commit has been merged into the ras/core branch of tip:
> 
> Commit-ID:     68299a42f84288537ee3420c431ac0115ccb90b1
> Gitweb:        
> https://git.kernel.org/tip/68299a42f84288537ee3420c431ac0115ccb90b1
> Author:        Tony Luck <tony.luck@intel.com>
> AuthorDate:    Fri, 30 Oct 2020 12:04:00 -07:00
> Committer:     Borislav Petkov <bp@suse.de>
> CommitterDate: Mon, 02 Nov 2020 11:15:59 +01:00
> 
> x86/mce: Enable additional error logging on certain Intel CPUs
> 
> The Xeon versions of Sandy Bridge, Ivy Bridge and Haswell support an
> optional additional error logging mode which is enabled by an MSR.
> 
> Previously, this mode was enabled from the mcelog(8) tool via /dev/cpu,
> but userspace should not be poking at MSRs. So move the enabling into
> the kernel.
> 
>  [ bp: Correct the explanation why this is done. ]
> 
> Suggested-by: Boris Petkov <bp@alien8.de>
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> Signed-off-by: Borislav Petkov <bp@suse.de>

Booting a simple KVM guest using today's linux-next is now generating those
errors below inside the guest due to this patch. Are those expected?

# qemu-kvm -name kata -cpu host -smp 48 -m 48g -hda rhel-8.3-x86_64-kvm.img.qcow2 -cdrom kata.iso -nic user,hostfwd=tcp::2222-:22 -nographic

guest .config (if ever matters): https://cailca.coding.net/public/linux/mm/git/files/master/x86.config

[    6.801741][    T0] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x1e3bca858ab, max_idle_ns: 440795282452 ns
[    6.804371][    T0] Calibrating delay loop (skipped), value calculated using timer frequency.. 4194.90 BogoMIPS (lpj=20974530)
[    6.806956][    T0] pid_max: default: 49152 minimum: 384
[    6.814328][    T0] Mount-cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
[    6.814328][    T0] Mountpoint-cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
[    6.814328][    T0] x86/cpu: User Mode Instruction Prevention (UMIP) activated
[    6.814328][    T0] unchecked MSR access error: RDMSR from 0x17f at rIP: 0xffffffff84483f16 (mce_intel_feature_init+0x156/0x270)
[    6.814328][    T0] Call Trace:
[    6.814328][    T0]  __mcheck_cpu_init_vendor+0x105/0x250
__rdmsr at arch/x86/include/asm/msr.h:93
(inlined by) native_read_msr at arch/x86/include/asm/msr.h:127
(inlined by) intel_imc_init at arch/x86/kernel/cpu/mce/intel.c:524
(inlined by) mce_intel_feature_init at arch/x86/kernel/cpu/mce/intel.c:537
[    6.814328][    T0]  mcheck_cpu_init+0x21f/0xb00
[    6.814328][    T0]  identify_cpu+0xfcb/0x1980
[    6.814328][    T0]  identify_boot_cpu+0xd/0xb5
[    6.814328][    T0]  check_bugs+0x6c/0x1606
[    6.814328][    T0]  ? _raw_spin_unlock+0x1a/0x30
[    6.814328][    T0]  ? poking_init+0x2b5/0x2ea
[    6.814328][    T0]  ? l1tf_cmdline+0x11a/0x11a
[    6.814328][    T0]  ? lockdep_init_map_waits+0x267/0x6f0
[    6.814328][    T0]  start_kernel+0x372/0x39f
[    6.814328][    T0]  secondary_startup_64_no_verify+0xc2/0xcb
[    6.814328][    T0] unchecked MSR access error: WRMSR to 0x17f (tried to write 0x0000000000000002) at rIP: 0xffffffff84483f3a (mce_intel_feature_init+0x17a/0x270)
[    6.814328][    T0] Call Trace:
[    6.814328][    T0]  __mcheck_cpu_init_vendor+0x105/0x250
[    6.814328][    T0]  mcheck_cpu_init+0x21f/0xb00
[    6.814328][    T0]  identify_cpu+0xfcb/0x1980
[    6.814328][    T0]  identify_boot_cpu+0xd/0xb5
[    6.814328][    T0]  check_bugs+0x6c/0x1606
[    6.814328][    T0]  ? _raw_spin_unlock+0x1a/0x30
[    6.814328][    T0]  ? poking_init+0x2b5/0x2ea
[    6.814328][    T0]  ? l1tf_cmdline+0x11a/0x11a
[    6.814328][    T0]  ? lockdep_init_map_waits+0x267/0x6f0
[    6.814328][    T0]  start_kernel+0x372/0x39f
[    6.814328][    T0]  secondary_startup_64_no_verify+0xc2/0xcb
[    6.814328][    T0] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
[    6.814328][    T0] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0

== host CPU ==
# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              48
On-line CPU(s) list: 0-47
Thread(s) per core:  1
Core(s) per socket:  12
Socket(s):           4
NUMA node(s):        4
Vendor ID:           GenuineIntel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU E5-4650 v3 @ 2.10GHz
Stepping:            2
CPU MHz:             1980.076
BogoMIPS:            4195.25
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            30720K
NUMA node0 CPU(s):   0-5,24-29
NUMA node1 CPU(s):   6-11,30-35
NUMA node2 CPU(s):   12-17,36-41
NUMA node3 CPU(s):   18-23,42-47
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2
ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt
tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb
invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority
ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm
xsaveopt cqm_llc cqm_occup_llc dtherm arat pln pts md_clear flush_l1d

> Link: https://lkml.kernel.org/r/20201030190807.GA13884@agluck-desk2.amr.corp.intel.com
> ---
>  arch/x86/include/asm/msr-index.h |  1 +
>  arch/x86/kernel/cpu/mce/intel.c  | 20 ++++++++++++++++++++
>  2 files changed, 21 insertions(+)
> 
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-
> index.h
> index 972a34d..b2dd264 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -139,6 +139,7 @@
>  #define MSR_IA32_MCG_CAP		0x00000179
>  #define MSR_IA32_MCG_STATUS		0x0000017a
>  #define MSR_IA32_MCG_CTL		0x0000017b
> +#define MSR_ERROR_CONTROL		0x0000017f
>  #define MSR_IA32_MCG_EXT_CTL		0x000004d0
>  
>  #define MSR_OFFCORE_RSP_0		0x000001a6
> diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c
> index abe9fe0..b47883e 100644
> --- a/arch/x86/kernel/cpu/mce/intel.c
> +++ b/arch/x86/kernel/cpu/mce/intel.c
> @@ -509,12 +509,32 @@ static void intel_ppin_init(struct cpuinfo_x86 *c)
>  	}
>  }
>  
> +/*
> + * Enable additional error logs from the integrated
> + * memory controller on processors that support this.
> + */
> +static void intel_imc_init(struct cpuinfo_x86 *c)
> +{
> +	u64 error_control;
> +
> +	switch (c->x86_model) {
> +	case INTEL_FAM6_SANDYBRIDGE_X:
> +	case INTEL_FAM6_IVYBRIDGE_X:
> +	case INTEL_FAM6_HASWELL_X:
> +		rdmsrl(MSR_ERROR_CONTROL, error_control);
> +		error_control |= 2;
> +		wrmsrl(MSR_ERROR_CONTROL, error_control);
> +		break;
> +	}
> +}
> +
>  void mce_intel_feature_init(struct cpuinfo_x86 *c)
>  {
>  	intel_init_thermal(c);
>  	intel_init_cmci();
>  	intel_init_lmce();
>  	intel_ppin_init(c);
> +	intel_imc_init(c);
>  }
>  
>  void mce_intel_feature_clear(struct cpuinfo_x86 *c)


^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs
  2020-11-09 21:55                 ` Qian Cai
@ 2020-11-09 22:09                   ` Luck, Tony
  2020-11-09 22:36                     ` Jim Mattson
  0 siblings, 1 reply; 21+ messages in thread
From: Luck, Tony @ 2020-11-09 22:09 UTC (permalink / raw)
  To: Qian Cai, linux-kernel, linux-tip-commits
  Cc: Boris Petkov, Borislav Petkov, x86, Paolo Bonzini, kvm

What does KVM do with model specific MSRs?

Looks like you let the guest believe it was running on one of Sandy Bridge, Ivy Bridge, Haswell (Xeon).

So, the core MCE code tried to enable extended error reporting.

If there is a mode to have KVM let the guest think that it read/wrote MSR 0x17F,
but actually, doesn't do it ... that would seem to be a reasonable thing to do here.

-Tony

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs
  2020-11-09 22:09                   ` Luck, Tony
@ 2020-11-09 22:36                     ` Jim Mattson
  2020-11-09 22:57                       ` Luck, Tony
  0 siblings, 1 reply; 21+ messages in thread
From: Jim Mattson @ 2020-11-09 22:36 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Qian Cai, linux-kernel, linux-tip-commits, Boris Petkov,
	Borislav Petkov, x86, Paolo Bonzini, kvm

On Mon, Nov 9, 2020 at 2:09 PM Luck, Tony <tony.luck@intel.com> wrote:
>
> What does KVM do with model specific MSRs?

"Model specific model-specific registers?" :-)

KVM only implements a small subset of MSRs. By default, any access to
the rest raises #GP.

> Looks like you let the guest believe it was running on one of Sandy Bridge, Ivy Bridge, Haswell (Xeon).
>
> So, the core MCE code tried to enable extended error reporting.
>
> If there is a mode to have KVM let the guest think that it read/wrote MSR 0x17F,
> but actually, doesn't do it ... that would seem to be a reasonable thing to do here.

There is an 'ignore_msrs' module parameter, to sink writes and return
zero on reads for unknown MSRs, but I don't think it's commonly used.

I thought Linux had long ago gone the route of turning rdmsr/wrmsr
into rdmsr_safe/wrmsr_safe, so that the guest would ignore the #GPs on
writes and return zero to the caller for #GPs on reads.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs
  2020-11-09 22:36                     ` Jim Mattson
@ 2020-11-09 22:57                       ` Luck, Tony
  2020-11-09 23:24                         ` [PATCH] x86/mce: Check for hypervisor before enabling additional error logging Luck, Tony
  2020-11-09 23:26                         ` [tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs Jim Mattson
  0 siblings, 2 replies; 21+ messages in thread
From: Luck, Tony @ 2020-11-09 22:57 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Qian Cai, linux-kernel, linux-tip-commits, Boris Petkov,
	Borislav Petkov, x86, Paolo Bonzini, kvm

> I thought Linux had long ago gone the route of turning rdmsr/wrmsr
> into rdmsr_safe/wrmsr_safe, so that the guest would ignore the #GPs on
> writes and return zero to the caller for #GPs on reads.

Linux just switched that around for the machine check banks ... if they #GP
fault, then something is seriously wrong.

Maybe that isn't a general change of direction though. Perhaps I
should either use rdmsrl_safe() in this code. Or (better?) add

	if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
		return;

to the start of intel_imc_init().

-Tony

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH] x86/mce: Check for hypervisor before enabling additional error logging
  2020-11-09 22:57                       ` Luck, Tony
@ 2020-11-09 23:24                         ` Luck, Tony
  2020-11-10  6:31                           ` Borislav Petkov
  2020-11-09 23:26                         ` [tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs Jim Mattson
  1 sibling, 1 reply; 21+ messages in thread
From: Luck, Tony @ 2020-11-09 23:24 UTC (permalink / raw)
  To: Borislav Petkov, Jim Mattson
  Cc: Qian Cai, linux-kernel, linux-tip-commits, Boris Petkov,
	Borislav Petkov, x86, Paolo Bonzini, kvm

Booting as a guest under KVM results in error messages about
unchecked MSR access:

[    6.814328][    T0] unchecked MSR access error: RDMSR from 0x17f at rIP: 0xffffffff84483f16 (mce_intel_feature_init+0x156/0x270)

because KVM doesn't provide emulation for random model specific registers.

Check for X86_FEATURE_HYPERVISOR and skip trying to enable the mode (a
guest shouldn't be concerned with corrected errors anyway).

Reported-by: Qian Cai <cai@redhat.com>
Fixes: 68299a42f842 ("x86/mce: Enable additional error logging on certain Intel CPUs")
Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/cpu/mce/intel.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c
index b47883e364b4..7f7d863400b7 100644
--- a/arch/x86/kernel/cpu/mce/intel.c
+++ b/arch/x86/kernel/cpu/mce/intel.c
@@ -517,6 +517,9 @@ static void intel_imc_init(struct cpuinfo_x86 *c)
 {
 	u64 error_control;
 
+	if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
+		return;
+
 	switch (c->x86_model) {
 	case INTEL_FAM6_SANDYBRIDGE_X:
 	case INTEL_FAM6_IVYBRIDGE_X:
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs
  2020-11-09 22:57                       ` Luck, Tony
  2020-11-09 23:24                         ` [PATCH] x86/mce: Check for hypervisor before enabling additional error logging Luck, Tony
@ 2020-11-09 23:26                         ` Jim Mattson
  2020-11-09 23:36                           ` Luck, Tony
  1 sibling, 1 reply; 21+ messages in thread
From: Jim Mattson @ 2020-11-09 23:26 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Qian Cai, linux-kernel, linux-tip-commits, Boris Petkov,
	Borislav Petkov, x86, Paolo Bonzini, kvm

On Mon, Nov 9, 2020 at 2:57 PM Luck, Tony <tony.luck@intel.com> wrote:
>
> > I thought Linux had long ago gone the route of turning rdmsr/wrmsr
> > into rdmsr_safe/wrmsr_safe, so that the guest would ignore the #GPs on
> > writes and return zero to the caller for #GPs on reads.
>
> Linux just switched that around for the machine check banks ... if they #GP
> fault, then something is seriously wrong.
>
> Maybe that isn't a general change of direction though. Perhaps I
> should either use rdmsrl_safe() in this code. Or (better?) add
>
>         if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
>                 return;
>
> to the start of intel_imc_init().

I wouldn't expect all hypervisors to necessarily set CPUID.01H:ECX[bit
31]. Architecturally, on Intel CPUs, that bit is simply defined as
"not used." There is no documented contract between Intel and
hypervisor vendors regarding the use of that bit. (AMD, on the other
hand, *does* document that bit as "reserved for use by hypervisor to
indicate guest status.")

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs
  2020-11-09 23:26                         ` [tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs Jim Mattson
@ 2020-11-09 23:36                           ` Luck, Tony
  2020-11-10  9:10                             ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Luck, Tony @ 2020-11-09 23:36 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Qian Cai, linux-kernel, linux-tip-commits, Boris Petkov,
	Borislav Petkov, x86, Paolo Bonzini, kvm

> I wouldn't expect all hypervisors to necessarily set CPUID.01H:ECX[bit
> 31]. Architecturally, on Intel CPUs, that bit is simply defined as
> "not used." There is no documented contract between Intel and
> hypervisor vendors regarding the use of that bit. (AMD, on the other
> hand, *does* document that bit as "reserved for use by hypervisor to
> indicate guest status.")

Maybe no contract ... but a bunch of places (many of them in Intel
specific code) that check for it.  Perhaps I should poke the owners
of the Intel SDM to accept the inevitable.

$ git grep "boot_cpu_has(X86_FEATURE_HYPERVISOR"
arch/x86/events/core.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
arch/x86/events/intel/core.c:   if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))
arch/x86/events/intel/core.c:           int assume = 3 * !boot_cpu_has(X86_FEATURE_HYPERVISOR);
arch/x86/events/intel/cstate.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
arch/x86/events/intel/uncore.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
arch/x86/kernel/apic/apic.c:    if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
arch/x86/kernel/cpu/bugs.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
arch/x86/kernel/cpu/bugs.c:     else if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
arch/x86/kernel/cpu/bugs.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
arch/x86/kernel/cpu/bugs.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
arch/x86/kernel/cpu/intel.c:    if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
arch/x86/kernel/cpu/mshyperv.c: if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))
arch/x86/kernel/cpu/vmware.c: * If !boot_cpu_has(X86_FEATURE_HYPERVISOR), vmware_hypercall_mode
arch/x86/kernel/cpu/vmware.c:   if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
arch/x86/kernel/jailhouse.c:        !boot_cpu_has(X86_FEATURE_HYPERVISOR))
arch/x86/kernel/kvm.c:  if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
arch/x86/kernel/paravirt.c:     if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))
arch/x86/kernel/tsc.c:  if (boot_cpu_has(X86_FEATURE_HYPERVISOR) ||
arch/x86/mm/init_64.c:  if (!boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
drivers/acpi/processor_idle.c:  if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h:       return boot_cpu_has(X86_FEATURE_HYPERVISOR);
drivers/gpu/drm/i915/i915_memcpy.c:         !boot_cpu_has(X86_FEATURE_HYPERVISOR))
drivers/gpu/drm/radeon/radeon_device.c: return boot_cpu_has(X86_FEATURE_HYPERVISOR);
drivers/visorbus/visorchipset.c:        if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {

-Tony

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] x86/mce: Check for hypervisor before enabling additional error logging
  2020-11-09 23:24                         ` [PATCH] x86/mce: Check for hypervisor before enabling additional error logging Luck, Tony
@ 2020-11-10  6:31                           ` Borislav Petkov
  2020-11-10  8:50                             ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Borislav Petkov @ 2020-11-10  6:31 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Jim Mattson, Qian Cai, linux-kernel, linux-tip-commits, x86,
	Paolo Bonzini, kvm

On Mon, Nov 09, 2020 at 03:24:02PM -0800, Luck, Tony wrote:
> Booting as a guest under KVM results in error messages about
> unchecked MSR access:
> 
> [    6.814328][    T0] unchecked MSR access error: RDMSR from 0x17f at rIP: 0xffffffff84483f16 (mce_intel_feature_init+0x156/0x270)
> 
> because KVM doesn't provide emulation for random model specific registers.
> 
> Check for X86_FEATURE_HYPERVISOR and skip trying to enable the mode (a
> guest shouldn't be concerned with corrected errors anyway).
> 
> Reported-by: Qian Cai <cai@redhat.com>
> Fixes: 68299a42f842 ("x86/mce: Enable additional error logging on certain Intel CPUs")
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
>  arch/x86/kernel/cpu/mce/intel.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c
> index b47883e364b4..7f7d863400b7 100644
> --- a/arch/x86/kernel/cpu/mce/intel.c
> +++ b/arch/x86/kernel/cpu/mce/intel.c
> @@ -517,6 +517,9 @@ static void intel_imc_init(struct cpuinfo_x86 *c)
>  {
>  	u64 error_control;
>  
> +	if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
> +		return;
> +

Frankly, I'm tired of wagging the dog because the tail can't. If
qemu/kvm can't emulate a CPU model fully then it should ignore those
unknown MSR accesses by default, i.e., that "ignore_msrs" functionality
should be on by default I'd say...

We certainly can't be sprinkling this check everytime the kernel tries
to do something as basic as read an MSR.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] x86/mce: Check for hypervisor before enabling additional error logging
  2020-11-10  6:31                           ` Borislav Petkov
@ 2020-11-10  8:50                             ` Paolo Bonzini
  2020-11-10  9:56                               ` Borislav Petkov
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2020-11-10  8:50 UTC (permalink / raw)
  To: Borislav Petkov, Luck, Tony
  Cc: Jim Mattson, Qian Cai, linux-kernel, linux-tip-commits, x86, kvm

On 10/11/20 07:31, Borislav Petkov wrote:
>>   
>> +	if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
>> +		return;
>> +
> Frankly, I'm tired of wagging the dog because the tail can't. If
> qemu/kvm can't emulate a CPU model fully then it should ignore those
> unknown MSR accesses by default, i.e., that "ignore_msrs" functionality
> should be on by default I'd say...
> 
> We certainly can't be sprinkling this check everytime the kernel tries
> to do something as basic as read an MSR.

You don't have to, also because it's wrong.  Fortunately it's much 
simpler than that:

1) ignore_msrs _cannot_ be on by default.  You cannot know in advance 
that for all non-architectural MSRs it's okay for them to read as zero 
and eat writes.  For some non-architectural MSR which never reads as 
zero on real hardware, who knows that there isn't some code using the 
contents of the MSR as a divisor, and causing a division by zero 
exception with ignore_msrs=1?

2) it's not just KVM.  _Any_ hypervisor is bound to have this issue for 
some non-architectural MSRs.  KVM just gets the flak because Linux CI 
environments (for obvious reasons) use it more than they use Hyper-V or 
ESXi or VirtualBox.

3) because of (1) and (2), the solution is very simple.  If the MSR is 
architectural, its absence is a KVM bug and we'll fix it in all stable 
versions.  If the MSR is not architectural (and 17Fh isn't; not only 
it's not mentioned in the SDM, even Google is failing me), never ever 
assume that the CPUID family/model/stepping implies a given MSR is 
there, and just use rdmsr_safe/wrmsr_safe.

So, for this patch,

Nacked-by: Paolo Bonzini <pbonzini@redhat.com>

Paolo


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs
  2020-11-09 23:36                           ` Luck, Tony
@ 2020-11-10  9:10                             ` Paolo Bonzini
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2020-11-10  9:10 UTC (permalink / raw)
  To: Luck, Tony, Jim Mattson
  Cc: Qian Cai, linux-kernel, linux-tip-commits, Boris Petkov,
	Borislav Petkov, x86, kvm

> Maybe no contract ... but a bunch of places (many of them in Intel
> specific code) that check for it

Interestingly, quite a few of them are actually checking for HYPERVISOR 
not because of missing hypervisor features, but rather because 
hypervisors don't have to work around certain errata. :)

Full analysis after my sig, but tl;dr: the only case of using HYPERVISOR 
before using MSRs are in arch/x86/events/intel/cstate.c and 
arch/x86/events/intel/uncore.c.  There are some workarounds in 
drivers/gpu that might fall into a similar bucket.  But as far as MSRs 
go, the way to go  overwhelmingly seems to be {rd,wr}msrl_safe.

Thanks,

Paolo

On 10/11/20 00:36, Luck, Tony wrote:
> arch/x86/events/core.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {

Print a slightly less frightening warning.

> arch/x86/events/intel/core.c:   if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))

Working around KVM's ignore_msrs=1 option (and quite ugly; shows that 
the option shouldn't be enabled by default).

> arch/x86/events/intel/core.c:           int assume = 3 * !boot_cpu_has(X86_FEATURE_HYPERVISOR);

Seems unnecessary.

> arch/x86/events/intel/cstate.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
> arch/x86/events/intel/uncore.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR))

Too complicated. :)

> arch/x86/kernel/apic/apic.c:    if (boot_cpu_has(X86_FEATURE_HYPERVISOR))

Hypervisors don't have errata.

> arch/x86/kernel/cpu/bugs.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
> arch/x86/kernel/cpu/bugs.c:     else if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
> arch/x86/kernel/cpu/bugs.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
> arch/x86/kernel/cpu/bugs.c:     if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
> arch/x86/kernel/cpu/intel.c:    if (boot_cpu_has(X86_FEATURE_HYPERVISOR))

Print different vulnerability status in sysfs.

> arch/x86/kernel/cpu/mshyperv.c: if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))
> arch/x86/kernel/cpu/vmware.c: * If !boot_cpu_has(X86_FEATURE_HYPERVISOR), vmware_hypercall_mode
> arch/x86/kernel/cpu/vmware.c:   if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
> arch/x86/kernel/jailhouse.c:        !boot_cpu_has(X86_FEATURE_HYPERVISOR))
> arch/x86/kernel/kvm.c:  if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
> arch/x86/kernel/paravirt.c:     if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))

Obviously needed before using paravirt features of the hypervisor.

> arch/x86/kernel/tsc.c:  if (boot_cpu_has(X86_FEATURE_HYPERVISOR) ||

Disables ART in VMs.  Probably the idea is that ART does not have an 
offset field similar to the TSC's, but it's not necessary.  Looking at 
the hypervisor-provided CPUID should be enough.

> arch/x86/mm/init_64.c:  if (!boot_cpu_has(X86_FEATURE_HYPERVISOR)) {

Tweaks hotplug heuristics, no MSRs involved.

> drivers/acpi/processor_idle.c:  if (boot_cpu_has(X86_FEATURE_HYPERVISOR))

Avoids pointless hypervisor exit on idle (i.e. just an optimization).

> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h:       return boot_cpu_has(X86_FEATURE_HYPERVISOR);

Work around SR-IOV bugs.

> drivers/gpu/drm/i915/i915_memcpy.c:         !boot_cpu_has(X86_FEATURE_HYPERVISOR))

Work around KVM deficiency.

> drivers/gpu/drm/radeon/radeon_device.c: return boot_cpu_has(X86_FEATURE_HYPERVISOR);

Work around SR-IOV bugs.

> drivers/visorbus/visorchipset.c:        if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {

Needed before using paravirt features of the hypervisor.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] x86/mce: Check for hypervisor before enabling additional error logging
  2020-11-10  8:50                             ` Paolo Bonzini
@ 2020-11-10  9:56                               ` Borislav Petkov
  2020-11-10 10:40                                 ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Borislav Petkov @ 2020-11-10  9:56 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Luck, Tony, Jim Mattson, Qian Cai, linux-kernel,
	linux-tip-commits, x86, kvm

On Tue, Nov 10, 2020 at 09:50:43AM +0100, Paolo Bonzini wrote:
> 1) ignore_msrs _cannot_ be on by default.  You cannot know in advance that
> for all non-architectural MSRs it's okay for them to read as zero and eat
> writes.  For some non-architectural MSR which never reads as zero on real
> hardware, who knows that there isn't some code using the contents of the MSR
> as a divisor, and causing a division by zero exception with ignore_msrs=1?

So if you're emulating a certain type of hardware - say a certain CPU
model - then what are you saying? That you're emulating it but not
really all of it, just some bits?

Because this is what happens - the kernel checks that it runs on a
certain CPU type and this tells it that those MSRs are there. But then
comes virt and throws all assumptions out.

So if it emulates a CPU model and the kernel tries to access those MSRs,
then the HV should ignore those MSR accesses if it doesn't know about
them. Why should the kernel change everytime some tool or virtualization
has shortcomings?

> 2) it's not just KVM.  _Any_ hypervisor is bound to have this issue for some
> non-architectural MSRs.  KVM just gets the flak because Linux CI
> environments (for obvious reasons) use it more than they use Hyper-V or ESXi
> or VirtualBox.

It's not flak - I'm trying to find a solution which is workable for
both. The kernel is not at fault here.

> 3) because of (1) and (2), the solution is very simple.  If the MSR is
> architectural, its absence is a KVM bug and we'll fix it in all stable
> versions.  If the MSR is not architectural (and 17Fh isn't; not only it's
> not mentioned in the SDM,

It is mentioned in the SDM.

> even Google is failing me), never ever assume that the CPUID
> family/model/stepping implies a given MSR is there, and just use
> rdmsr_safe/wrmsr_safe.

Yes, we don't have a choice, as always. ;-\

But maybe we should have a choice and maybe qemu/kvm should have a way
to ignore certain MSRs for certain CPU types, regardless of them being
architectural or not.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] x86/mce: Check for hypervisor before enabling additional error logging
  2020-11-10  9:56                               ` Borislav Petkov
@ 2020-11-10 10:40                                 ` Paolo Bonzini
  2020-11-10 15:50                                   ` Borislav Petkov
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2020-11-10 10:40 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Luck, Tony, Jim Mattson, Qian Cai, linux-kernel,
	linux-tip-commits, x86, kvm

On 10/11/20 10:56, Borislav Petkov wrote:
> On Tue, Nov 10, 2020 at 09:50:43AM +0100, Paolo Bonzini wrote:
>> 1) ignore_msrs _cannot_ be on by default.  You cannot know in advance that
>> for all non-architectural MSRs it's okay for them to read as zero and eat
>> writes.  For some non-architectural MSR which never reads as zero on real
>> hardware, who knows that there isn't some code using the contents of the MSR
>> as a divisor, and causing a division by zero exception with ignore_msrs=1?
> 
> So if you're emulating a certain type of hardware - say a certain CPU
> model - then what are you saying? That you're emulating it but not
> really all of it, just some bits?

We try to emulate all that is described in the SDM as architectural, as 
long as we expose the corresponding CPUID leaves.

However, f/m/s mean nothing when running virtualized.  First, trying to 
derive any non-architectural property from the f/m/s is going to fail. 
Second, even the host can be anything as long as it's newer than the 
f/m/s that the VM reports (i.e. you can get a Sandy Bridge model and 
model name even if running on Skylake).

Also, X86_FEATURE_HYPERVISOR might be clear even if running virtualized. 
  (Thank you nVidia for using it to play market segmentation games).

> Because this is what happens - the kernel checks that it runs on a
> certain CPU type and this tells it that those MSRs are there. But then
> comes virt and throws all assumptions out.
> 
> So if it emulates a CPU model and the kernel tries to access those MSRs,
> then the HV should ignore those MSR accesses if it doesn't know about
> them. Why should the kernel change everytime some tool or virtualization
> has shortcomings?

See above: how can the hypervisor know a safe value for all MSRs, 
possibly including the undocumented ones?

>> 3) because of (1) and (2), the solution is very simple.  If the MSR is
>> architectural, its absence is a KVM bug and we'll fix it in all stable
>> versions.  If the MSR is not architectural (and 17Fh isn't; not only it's
>> not mentioned in the SDM,
> 
> It is mentioned in the SDM.

Oh right they moved the MSRs to a separate manual; found it now.  Still, 
it's not architectural.

> But maybe we should have a choice and maybe qemu/kvm should have a way
> to ignore certain MSRs for certain CPU types, regardless of them being
> architectural or not.

If it makes sense to emulate certain non-architectural MSRs we can add 
them.  Supporting the error control MSR wouldn't even be hard, but I'm 
not sure it makes sense:

1) that MSR has not been there on current processors for several years 
(and therefore Intel has clearly no intention of making architectural). 
  For what we know, even current processors might not provide any of 
that extended information at all (and still the VM could present Sandy 
Bridge f/m/s).

2) it would only present extended error info if the host itself enables 
the bit, so one might question the wisdom of backporting that support 
this to stable kernels

3) It's unclear whether the guest would be able to use the extended 
error information at all (and in some cases the description in the 
manual is not even proper English: "allows the iMC to log first device 
error when corrected error is detected during normal read"?).

4) other hypervisors, including older distros, would likely have the 
same issue.

Paolo


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] x86/mce: Check for hypervisor before enabling additional error logging
  2020-11-10 10:40                                 ` Paolo Bonzini
@ 2020-11-10 15:50                                   ` Borislav Petkov
  2020-11-10 16:08                                     ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Borislav Petkov @ 2020-11-10 15:50 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Luck, Tony, Jim Mattson, Qian Cai, linux-kernel,
	linux-tip-commits, x86, kvm

On Tue, Nov 10, 2020 at 11:40:34AM +0100, Paolo Bonzini wrote:
> However, f/m/s mean nothing when running virtualized.  First, trying to
> derive any non-architectural property from the f/m/s is going to fail.
> Second, even the host can be anything as long as it's newer than the f/m/s
> that the VM reports (i.e. you can get a Sandy Bridge model and model name
> even if running on Skylake).

Yah, that's why I said "then comes virt and throws all assumptions out."

> See above: how can the hypervisor know a safe value for all MSRs, possibly
> including the undocumented ones?

Not all - just the ones which belong to a model. I was thinking of
having a mapping between f/m/s and a list of MSRs which those models
have - even non-architectural ones - but that's a waste of energy. Why?
Because using the *msr_safe() variants will give you the same thing
without doing any of the MSR lists in KVM and any of that jumping thru
hoops.

Which means, the kernel MSR accessors should be doing _safe(), i.e.,
exception handling, by default. And the non-safe ones - rdmsrl, wrmsrl,
etc, should all be used only in very seldom cases. Hmm, yeah, that would
probably solve this class of problems once and for all.

> If it makes sense to emulate certain non-architectural MSRs we can add them.

See above - probably not worth the effort.

I'll take a look at how ugly it would become to make the majority of MSR
accesses safe.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] x86/mce: Check for hypervisor before enabling additional error logging
  2020-11-10 15:50                                   ` Borislav Petkov
@ 2020-11-10 16:08                                     ` Paolo Bonzini
  2020-11-10 17:52                                       ` Luck, Tony
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2020-11-10 16:08 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Luck, Tony, Jim Mattson, Qian Cai, linux-kernel,
	linux-tip-commits, x86, kvm

On 10/11/20 16:50, Borislav Petkov wrote:
> I was thinking of
> having a mapping between f/m/s and a list of MSRs which those models
> have - even non-architectural ones - but that's a waste of energy. Why?
> Because using the *msr_safe() variants will give you the same thing

Yes, pretty much.

>> If it makes sense to emulate certain non-architectural MSRs we can add them.
> See above - probably not worth the effort.

When we do, certain Microsoft OSes are usually involved. :)

> I'll take a look at how ugly it would become to make the majority of MSR
> accesses safe.

I think most of them already are, especially the non-architectural ones, 
because I remember going through a similar discussion a few years ago 
and Andy said basically "screw it, let's just use *_safe anywhere" as 
well.  I don't see any need to do anything but add it to your review 
checklist if you have it (and do it now for MSR_ERROR_CONTROL).

Paolo


^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH] x86/mce: Check for hypervisor before enabling additional error logging
  2020-11-10 16:08                                     ` Paolo Bonzini
@ 2020-11-10 17:52                                       ` Luck, Tony
  2020-11-10 20:37                                         ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Luck, Tony @ 2020-11-10 17:52 UTC (permalink / raw)
  To: Paolo Bonzini, Borislav Petkov
  Cc: Jim Mattson, Qian Cai, linux-kernel, linux-tip-commits, x86, kvm

I still think there is a reasonable case to claim that this usage is right to check
whether it is running as a guest.

Look at what it is trying to do ... change the behavior of the platform w.r.t. logging
of memory errors.  How does that make any sense for a guest ... that doesn't even
know what memory is present on the platform. Or have guarantees that what it sees
as memory address 0x12345678 maps to the same set of cells in a DRAM from one
second to the next?

-Tony

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] x86/mce: Check for hypervisor before enabling additional error logging
  2020-11-10 17:52                                       ` Luck, Tony
@ 2020-11-10 20:37                                         ` Paolo Bonzini
  2020-11-11  0:39                                           ` [PATCH v2] x86/mce: Use "safe" MSR functions when " Luck, Tony
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2020-11-10 20:37 UTC (permalink / raw)
  To: Luck, Tony, Borislav Petkov
  Cc: Jim Mattson, Qian Cai, linux-kernel, linux-tip-commits, x86, kvm

On 10/11/20 18:52, Luck, Tony wrote:
> Look at what it is trying to do ... change the behavior of the platform w.r.t. logging
> of memory errors.  How does that make any sense for a guest ...

Logging of memory errors certainly makes sense for a guest, KVM already 
does MCE forwarding as you probably know.

The exact set of information that MSR_ERROR_CONTROL[1] adds may not make 
much sense in the case of KVM, but it may make sense for other 
hypervisors that do nothing but partition the host.  (Difficult for me 
to say since the relevant part of the SDM might as well be written in 
Klingon :)).

In any case, checking HYPERVISOR is not enough because having it clear 
is a valid configuration.  So you would still have to switch to 
{rd,wr}msrl_safe, and then checking HYPERVISOR is pointless.

Paolo

> that doesn't even
> know what memory is present on the platform. Or have guarantees that what it sees
> as memory address 0x12345678 maps to the same set of cells in a DRAM from one
> second to the next?


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2] x86/mce: Use "safe" MSR functions when enabling additional error logging
  2020-11-10 20:37                                         ` Paolo Bonzini
@ 2020-11-11  0:39                                           ` Luck, Tony
  2020-11-16 16:44                                             ` [tip: ras/core] " tip-bot2 for Tony Luck
  0 siblings, 1 reply; 21+ messages in thread
From: Luck, Tony @ 2020-11-11  0:39 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Borislav Petkov, Jim Mattson, Qian Cai, linux-kernel,
	linux-tip-commits, x86, kvm

Booting as a guest under KVM results in error messages about
unchecked MSR access:

[    6.814328][    T0] unchecked MSR access error: RDMSR from 0x17f at rIP: 0xffffffff84483f16 (mce_intel_feature_init+0x156/0x270)

because KVM doesn't provide emulation for random model specific registers.

Switch to using rdmsrl_safe()/wrmsrl_safe() to avoid the message.

Reported-by: Qian Cai <cai@redhat.com>
Fixes: 68299a42f842 ("x86/mce: Enable additional error logging on certain Intel CPUs")
Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/cpu/mce/intel.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c
index b47883e364b4..42e60ef16c3a 100644
--- a/arch/x86/kernel/cpu/mce/intel.c
+++ b/arch/x86/kernel/cpu/mce/intel.c
@@ -521,9 +521,10 @@ static void intel_imc_init(struct cpuinfo_x86 *c)
 	case INTEL_FAM6_SANDYBRIDGE_X:
 	case INTEL_FAM6_IVYBRIDGE_X:
 	case INTEL_FAM6_HASWELL_X:
-		rdmsrl(MSR_ERROR_CONTROL, error_control);
+		if (rdmsrl_safe(MSR_ERROR_CONTROL, &error_control))
+			return;
 		error_control |= 2;
-		wrmsrl(MSR_ERROR_CONTROL, error_control);
+		wrmsrl_safe(MSR_ERROR_CONTROL, error_control);
 		break;
 	}
 }
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [tip: ras/core] x86/mce: Use "safe" MSR functions when enabling additional error logging
  2020-11-11  0:39                                           ` [PATCH v2] x86/mce: Use "safe" MSR functions when " Luck, Tony
@ 2020-11-16 16:44                                             ` tip-bot2 for Tony Luck
  0 siblings, 0 replies; 21+ messages in thread
From: tip-bot2 for Tony Luck @ 2020-11-16 16:44 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Qian Cai, Tony Luck, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the ras/core branch of tip:

Commit-ID:     098416e6986127f7e4c8ce4fd6bbbd80e55b0386
Gitweb:        https://git.kernel.org/tip/098416e6986127f7e4c8ce4fd6bbbd80e55b0386
Author:        Tony Luck <tony.luck@intel.com>
AuthorDate:    Tue, 10 Nov 2020 16:39:54 -08:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 16 Nov 2020 17:34:08 +01:00

x86/mce: Use "safe" MSR functions when enabling additional error logging

Booting as a guest under KVM results in error messages about
unchecked MSR access:

  unchecked MSR access error: RDMSR from 0x17f at rIP: 0xffffffff84483f16 (mce_intel_feature_init+0x156/0x270)

because KVM doesn't provide emulation for random model specific
registers.

Switch to using rdmsrl_safe()/wrmsrl_safe() to avoid the message.

Fixes: 68299a42f842 ("x86/mce: Enable additional error logging on certain Intel CPUs")
Reported-by: Qian Cai <cai@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20201111003954.GA11878@agluck-desk2.amr.corp.intel.com
---
 arch/x86/kernel/cpu/mce/intel.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c
index b47883e..c2476fe 100644
--- a/arch/x86/kernel/cpu/mce/intel.c
+++ b/arch/x86/kernel/cpu/mce/intel.c
@@ -521,9 +521,10 @@ static void intel_imc_init(struct cpuinfo_x86 *c)
 	case INTEL_FAM6_SANDYBRIDGE_X:
 	case INTEL_FAM6_IVYBRIDGE_X:
 	case INTEL_FAM6_HASWELL_X:
-		rdmsrl(MSR_ERROR_CONTROL, error_control);
+		if (rdmsrl_safe(MSR_ERROR_CONTROL, &error_control))
+			return;
 		error_control |= 2;
-		wrmsrl(MSR_ERROR_CONTROL, error_control);
+		wrmsrl_safe(MSR_ERROR_CONTROL, error_control);
 		break;
 	}
 }

^ permalink raw reply related	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2020-11-16 16:45 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <fcb21490-84a1-8b99-b494-3a6ac2a0e16a@skynet.be>
     [not found] ` <20201029100655.GA31903@zn.tnic>
     [not found]   ` <20201029151518.GA23990@agluck-desk2.amr.corp.intel.com>
     [not found]     ` <20201029194118.GC31903@zn.tnic>
     [not found]       ` <87ft5wo8zn.fsf@nanos.tec.linutronix.de>
     [not found]         ` <20201030091056.GA6532@zn.tnic>
     [not found]           ` <20201030190400.GA13797@agluck-desk2.amr.corp.intel.com>
2020-10-30 19:08             ` [PATCH] x86/mce: Enable additional error logging on certain Intel CPUs Luck, Tony
2020-11-02 11:12               ` Borislav Petkov
2020-11-02 11:18               ` [tip: ras/core] " tip-bot2 for Tony Luck
2020-11-09 21:55                 ` Qian Cai
2020-11-09 22:09                   ` Luck, Tony
2020-11-09 22:36                     ` Jim Mattson
2020-11-09 22:57                       ` Luck, Tony
2020-11-09 23:24                         ` [PATCH] x86/mce: Check for hypervisor before enabling additional error logging Luck, Tony
2020-11-10  6:31                           ` Borislav Petkov
2020-11-10  8:50                             ` Paolo Bonzini
2020-11-10  9:56                               ` Borislav Petkov
2020-11-10 10:40                                 ` Paolo Bonzini
2020-11-10 15:50                                   ` Borislav Petkov
2020-11-10 16:08                                     ` Paolo Bonzini
2020-11-10 17:52                                       ` Luck, Tony
2020-11-10 20:37                                         ` Paolo Bonzini
2020-11-11  0:39                                           ` [PATCH v2] x86/mce: Use "safe" MSR functions when " Luck, Tony
2020-11-16 16:44                                             ` [tip: ras/core] " tip-bot2 for Tony Luck
2020-11-09 23:26                         ` [tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs Jim Mattson
2020-11-09 23:36                           ` Luck, Tony
2020-11-10  9:10                             ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).