linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] perf/x86/intel: Disable check_msr for real hw
@ 2019-06-14 11:28 Jiri Olsa
  2019-06-14 13:33 ` Peter Zijlstra
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Jiri Olsa @ 2019-06-14 11:28 UTC (permalink / raw)
  To: Peter Zijlstra, Kan Liang
  Cc: Jiri Olsa, David Carrillo-Cisneros, Arnaldo Carvalho de Melo,
	lkml, Ingo Molnar, Namhyung Kim, Alexander Shishkin, Andi Kleen,
	Tom Vaden

hi,
the HPE server can do POST tracing and have enabled LBR
tracing during the boot, which makes check_msr fail falsly.

It looks like check_msr code was added only to check on guests
MSR access, would it be then ok to disable check_msr for real
hardware? (as in patch below)

We could also check if LBR tracing is enabled and make
appropriate checks, but this change is simpler ;-)

ideas? thanks,
jirka


---
 arch/x86/events/intel/core.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 71001f005bfe..1194ae7e1992 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -20,6 +20,7 @@
 #include <asm/intel-family.h>
 #include <asm/apic.h>
 #include <asm/cpu_device_id.h>
+#include <asm/hypervisor.h>
 
 #include "../perf_event.h"
 
@@ -4050,6 +4051,13 @@ static bool check_msr(unsigned long msr, u64 mask)
 {
 	u64 val_old, val_new, val_tmp;
 
+	/*
+	 * Disable the check for real HW, so we don't
+	 * mess up with potentionaly enabled regs.
+	 */
+	if (hypervisor_is_type(X86_HYPER_NATIVE))
+		return true;
+
 	/*
 	 * Read the current value, change it and read it back to see if it
 	 * matches, this is needed to detect certain hardware emulators
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [RFC] perf/x86/intel: Disable check_msr for real hw
  2019-06-14 11:28 [RFC] perf/x86/intel: Disable check_msr for real hw Jiri Olsa
@ 2019-06-14 13:33 ` Peter Zijlstra
  2019-06-14 13:45 ` Liang, Kan
  2019-06-21 17:48 ` [RFC] perf/x86/intel: Disable check_msr for real hw Andi Kleen
  2 siblings, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2019-06-14 13:33 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Kan Liang, Jiri Olsa, David Carrillo-Cisneros,
	Arnaldo Carvalho de Melo, lkml, Ingo Molnar, Namhyung Kim,
	Alexander Shishkin, Andi Kleen, Tom Vaden

On Fri, Jun 14, 2019 at 01:28:53PM +0200, Jiri Olsa wrote:
> hi,
> the HPE server can do POST tracing and have enabled LBR
> tracing during the boot, which makes check_msr fail falsly.
> 
> It looks like check_msr code was added only to check on guests
> MSR access, would it be then ok to disable check_msr for real
> hardware? (as in patch below)
> 
> We could also check if LBR tracing is enabled and make
> appropriate checks, but this change is simpler ;-)
> 
> ideas? thanks,
> jirka
> 
> 
> ---
>  arch/x86/events/intel/core.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 71001f005bfe..1194ae7e1992 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -20,6 +20,7 @@
>  #include <asm/intel-family.h>
>  #include <asm/apic.h>
>  #include <asm/cpu_device_id.h>
> +#include <asm/hypervisor.h>
>  
>  #include "../perf_event.h"
>  
> @@ -4050,6 +4051,13 @@ static bool check_msr(unsigned long msr, u64 mask)
>  {
>  	u64 val_old, val_new, val_tmp;
>  
> +	/*
> +	 * Disable the check for real HW, so we don't
> +	 * mess up with potentionaly enabled regs.
> +	 */
> +	if (hypervisor_is_type(X86_HYPER_NATIVE))
> +		return true;

Yeah, I think that works, or !boot_cpu_has(X86_FEATURE_HYPERVISOR).

>  	/*
>  	 * Read the current value, change it and read it back to see if it
>  	 * matches, this is needed to detect certain hardware emulators
> -- 
> 2.21.0
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC] perf/x86/intel: Disable check_msr for real hw
  2019-06-14 11:28 [RFC] perf/x86/intel: Disable check_msr for real hw Jiri Olsa
  2019-06-14 13:33 ` Peter Zijlstra
@ 2019-06-14 13:45 ` Liang, Kan
  2019-06-16 14:13   ` [PATCH] " Jiri Olsa
  2019-06-21 17:48 ` [RFC] perf/x86/intel: Disable check_msr for real hw Andi Kleen
  2 siblings, 1 reply; 14+ messages in thread
From: Liang, Kan @ 2019-06-14 13:45 UTC (permalink / raw)
  To: Jiri Olsa, Peter Zijlstra, Kan Liang
  Cc: Jiri Olsa, David Carrillo-Cisneros, Arnaldo Carvalho de Melo,
	lkml, Ingo Molnar, Namhyung Kim, Alexander Shishkin, Andi Kleen,
	Tom Vaden



On 6/14/2019 7:28 AM, Jiri Olsa wrote:
> hi,
> the HPE server can do POST tracing and have enabled LBR
> tracing during the boot, which makes check_msr fail falsly.
> 
> It looks like check_msr code was added only to check on guests
> MSR access, would it be then ok to disable check_msr for real
> hardware? (as in patch below)

Yes, the check_msr patch was to fix a bug report in guest.
I didn't get similar bug report for real hardware.
I think it should be OK to disable it for real hardware.

Thanks,
Kan

> 
> We could also check if LBR tracing is enabled and make
> appropriate checks, but this change is simpler ;-)
> 
> ideas? thanks,
> jirka
> 
> 
> ---
>   arch/x86/events/intel/core.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 71001f005bfe..1194ae7e1992 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -20,6 +20,7 @@
>   #include <asm/intel-family.h>
>   #include <asm/apic.h>
>   #include <asm/cpu_device_id.h>
> +#include <asm/hypervisor.h>
>   
>   #include "../perf_event.h"
>   
> @@ -4050,6 +4051,13 @@ static bool check_msr(unsigned long msr, u64 mask)
>   {
>   	u64 val_old, val_new, val_tmp;
>   
> +	/*
> +	 * Disable the check for real HW, so we don't
> +	 * mess up with potentionaly enabled regs.
> +	 */
> +	if (hypervisor_is_type(X86_HYPER_NATIVE))
> +		return true;
> +
>   	/*
>   	 * Read the current value, change it and read it back to see if it
>   	 * matches, this is needed to detect certain hardware emulators
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] perf/x86/intel: Disable check_msr for real hw
  2019-06-14 13:45 ` Liang, Kan
@ 2019-06-16 14:13   ` Jiri Olsa
  2019-06-16 20:56     ` Vaden, Tom (HPE Server OS Architecture)
  2019-06-17 14:41     ` [tip:perf/core] perf/x86/intel: Disable check_msr for real HW tip-bot for Jiri Olsa
  0 siblings, 2 replies; 14+ messages in thread
From: Jiri Olsa @ 2019-06-16 14:13 UTC (permalink / raw)
  To: Liang, Kan, Peter Zijlstra
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Ingo Molnar,
	Namhyung Kim, Alexander Shishkin, Andi Kleen, Tom Vaden

On Fri, Jun 14, 2019 at 09:45:21AM -0400, Liang, Kan wrote:
> 
> 
> On 6/14/2019 7:28 AM, Jiri Olsa wrote:
> > hi,
> > the HPE server can do POST tracing and have enabled LBR
> > tracing during the boot, which makes check_msr fail falsly.
> > 
> > It looks like check_msr code was added only to check on guests
> > MSR access, would it be then ok to disable check_msr for real
> > hardware? (as in patch below)
> 
> Yes, the check_msr patch was to fix a bug report in guest.
> I didn't get similar bug report for real hardware.
> I think it should be OK to disable it for real hardware.
> 

thanks for confirmation, attaching the full patch

thanks,
jirka


---
Tom Vaden reported false failure of check_msr function, because
some servers can do POST tracing and enable LBR tracing during
the boot.

Kan confirmed that check_msr patch was to fix a bug report in
guest, so it's ok to disable it for real HW.

Cc: Kan Liang <kan.liang@intel.com>
Reported-by: Tom Vaden <tom.vaden@hpe.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 arch/x86/events/intel/core.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 71001f005bfe..1194ae7e1992 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -20,6 +20,7 @@
 #include <asm/intel-family.h>
 #include <asm/apic.h>
 #include <asm/cpu_device_id.h>
+#include <asm/hypervisor.h>
 
 #include "../perf_event.h"
 
@@ -4050,6 +4051,13 @@ static bool check_msr(unsigned long msr, u64 mask)
 {
 	u64 val_old, val_new, val_tmp;
 
+	/*
+	 * Disable the check for real HW, so we don't
+	 * mess up with potentionaly enabled regs.
+	 */
+	if (hypervisor_is_type(X86_HYPER_NATIVE))
+		return true;
+
 	/*
 	 * Read the current value, change it and read it back to see if it
 	 * matches, this is needed to detect certain hardware emulators
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] perf/x86/intel: Disable check_msr for real hw
  2019-06-16 14:13   ` [PATCH] " Jiri Olsa
@ 2019-06-16 20:56     ` Vaden, Tom (HPE Server OS Architecture)
  2019-06-17 14:41     ` [tip:perf/core] perf/x86/intel: Disable check_msr for real HW tip-bot for Jiri Olsa
  1 sibling, 0 replies; 14+ messages in thread
From: Vaden, Tom (HPE Server OS Architecture) @ 2019-06-16 20:56 UTC (permalink / raw)
  To: Jiri Olsa, Liang, Kan, Peter Zijlstra
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Ingo Molnar,
	Namhyung Kim, Alexander Shishkin, Andi Kleen



On 6/16/19 10:13 AM, Jiri Olsa wrote:
> On Fri, Jun 14, 2019 at 09:45:21AM -0400, Liang, Kan wrote:
>>
>>
>> On 6/14/2019 7:28 AM, Jiri Olsa wrote:
>>> hi,
>>> the HPE server can do POST tracing and have enabled LBR
>>> tracing during the boot, which makes check_msr fail falsly.
>>>
>>> It looks like check_msr code was added only to check on guests
>>> MSR access, would it be then ok to disable check_msr for real
>>> hardware? (as in patch below)
>>
>> Yes, the check_msr patch was to fix a bug report in guest.
>> I didn't get similar bug report for real hardware.
>> I think it should be OK to disable it for real hardware.
>>
> 
> thanks for confirmation, attaching the full patch
> 
> thanks,
> jirka
> 
> 
> ---
> Tom Vaden reported false failure of check_msr function, because
> some servers can do POST tracing and enable LBR tracing during
> the boot.
> 
> Kan confirmed that check_msr patch was to fix a bug report in
> guest, so it's ok to disable it for real HW.
> 
> Cc: Kan Liang <kan.liang@intel.com>
> Reported-by: Tom Vaden <tom.vaden@hpe.com>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>

Thanks for handling the patch.

Acked-by: Tom Vaden <tom.vaden@hpe.com>

> ---
>   arch/x86/events/intel/core.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 71001f005bfe..1194ae7e1992 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -20,6 +20,7 @@
>   #include <asm/intel-family.h>
>   #include <asm/apic.h>
>   #include <asm/cpu_device_id.h>
> +#include <asm/hypervisor.h>
>   
>   #include "../perf_event.h"
>   
> @@ -4050,6 +4051,13 @@ static bool check_msr(unsigned long msr, u64 mask)
>   {
>   	u64 val_old, val_new, val_tmp;
>   
> +	/*
> +	 * Disable the check for real HW, so we don't
> +	 * mess up with potentionaly enabled regs.
> +	 */
> +	if (hypervisor_is_type(X86_HYPER_NATIVE))
> +		return true;
> +
>   	/*
>   	 * Read the current value, change it and read it back to see if it
>   	 * matches, this is needed to detect certain hardware emulators
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [tip:perf/core] perf/x86/intel: Disable check_msr for real HW
  2019-06-16 14:13   ` [PATCH] " Jiri Olsa
  2019-06-16 20:56     ` Vaden, Tom (HPE Server OS Architecture)
@ 2019-06-17 14:41     ` tip-bot for Jiri Olsa
  1 sibling, 0 replies; 14+ messages in thread
From: tip-bot for Jiri Olsa @ 2019-06-17 14:41 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, torvalds, mingo, kan.liang, peterz, acme,
	alexander.shishkin, jolsa, jolsa, tom.vaden, hpa, namhyung, tglx

Commit-ID:  d0e1a507bdc761a14906f03399d933ea639a1756
Gitweb:     https://git.kernel.org/tip/d0e1a507bdc761a14906f03399d933ea639a1756
Author:     Jiri Olsa <jolsa@redhat.com>
AuthorDate: Sun, 16 Jun 2019 16:13:13 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 17 Jun 2019 12:36:24 +0200

perf/x86/intel: Disable check_msr for real HW

Tom Vaden reported false failure of the check_msr() function, because
some servers can do POST tracing and enable LBR tracing during
bootup.

Kan confirmed that check_msr patch was to fix a bug report in
guest, so it's ok to disable it for real HW.

Reported-by: Tom Vaden <tom.vaden@hpe.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Tom Vaden <tom.vaden@hpe.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Liang Kan <kan.liang@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190616141313.GD2500@krava
[ Readability edits. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/core.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 5e6ae481dee7..bda450ff51ee 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -20,6 +20,7 @@
 #include <asm/intel-family.h>
 #include <asm/apic.h>
 #include <asm/cpu_device_id.h>
+#include <asm/hypervisor.h>
 
 #include "../perf_event.h"
 
@@ -4050,6 +4051,13 @@ static bool check_msr(unsigned long msr, u64 mask)
 {
 	u64 val_old, val_new, val_tmp;
 
+	/*
+	 * Disable the check for real HW, so we don't
+	 * mess with potentionaly enabled registers:
+	 */
+	if (hypervisor_is_type(X86_HYPER_NATIVE))
+		return true;
+
 	/*
 	 * Read the current value, change it and read it back to see if it
 	 * matches, this is needed to detect certain hardware emulators

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [RFC] perf/x86/intel: Disable check_msr for real hw
  2019-06-14 11:28 [RFC] perf/x86/intel: Disable check_msr for real hw Jiri Olsa
  2019-06-14 13:33 ` Peter Zijlstra
  2019-06-14 13:45 ` Liang, Kan
@ 2019-06-21 17:48 ` Andi Kleen
  2019-06-23 22:40   ` Jiri Olsa
  2 siblings, 1 reply; 14+ messages in thread
From: Andi Kleen @ 2019-06-21 17:48 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Peter Zijlstra, Kan Liang, Jiri Olsa, David Carrillo-Cisneros,
	Arnaldo Carvalho de Melo, lkml, Ingo Molnar, Namhyung Kim,
	Alexander Shishkin, Tom Vaden

On Fri, Jun 14, 2019 at 01:28:53PM +0200, Jiri Olsa wrote:
> hi,
> the HPE server can do POST tracing and have enabled LBR
> tracing during the boot, which makes check_msr fail falsly.
> 
> It looks like check_msr code was added only to check on guests
> MSR access, would it be then ok to disable check_msr for real
> hardware? (as in patch below)
> 
> We could also check if LBR tracing is enabled and make
> appropriate checks, but this change is simpler ;-)
> 
> ideas? thanks,
> jirka

Sorry for the late comment. I see this patch has been merged now.

Unfortunately I don't think it's a good idea. The problem 
is that the hypervisor flags are only set for a few hypervisors
that Linux knows about. But in practice there are many more
Hypervisors around that will not cause these flags to be set.
But these are still likely to miss MSRs.

The other hypervisors are relatively obscure, but eventually
someone will hit problems.

-Andi



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC] perf/x86/intel: Disable check_msr for real hw
  2019-06-21 17:48 ` [RFC] perf/x86/intel: Disable check_msr for real hw Andi Kleen
@ 2019-06-23 22:40   ` Jiri Olsa
  2019-06-23 23:44     ` Vaden, Tom (HPE Server OS Architecture)
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Jiri Olsa @ 2019-06-23 22:40 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Peter Zijlstra, Kan Liang, Jiri Olsa, David Carrillo-Cisneros,
	Arnaldo Carvalho de Melo, lkml, Ingo Molnar, Namhyung Kim,
	Alexander Shishkin, Tom Vaden, Paolo Bonzini, Juergen Gross,
	Alok Kataria

On Fri, Jun 21, 2019 at 10:48:25AM -0700, Andi Kleen wrote:
> On Fri, Jun 14, 2019 at 01:28:53PM +0200, Jiri Olsa wrote:
> > hi,
> > the HPE server can do POST tracing and have enabled LBR
> > tracing during the boot, which makes check_msr fail falsly.
> > 
> > It looks like check_msr code was added only to check on guests
> > MSR access, would it be then ok to disable check_msr for real
> > hardware? (as in patch below)
> > 
> > We could also check if LBR tracing is enabled and make
> > appropriate checks, but this change is simpler ;-)
> > 
> > ideas? thanks,
> > jirka
> 
> Sorry for the late comment. I see this patch has been merged now.
> 
> Unfortunately I don't think it's a good idea. The problem 
> is that the hypervisor flags are only set for a few hypervisors
> that Linux knows about. But in practice there are many more
> Hypervisors around that will not cause these flags to be set.
> But these are still likely to miss MSRs.
> 
> The other hypervisors are relatively obscure, but eventually
> someone will hit problems.

any idea if there's any other flag/way we could use to detect those?

adding few virtualization folks to the loop
and attaching the original patch

thanks,
jirka


---
Tom Vaden reported false failure of check_msr function, because
some servers can do POST tracing and enable LBR tracing during
the boot.

Kan confirmed that check_msr patch was to fix a bug report in
guest, so it's ok to disable it for real HW.

Cc: Kan Liang <kan.liang@intel.com>
Reported-by: Tom Vaden <tom.vaden@hpe.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 arch/x86/events/intel/core.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 71001f005bfe..1194ae7e1992 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -20,6 +20,7 @@
 #include <asm/intel-family.h>
 #include <asm/apic.h>
 #include <asm/cpu_device_id.h>
+#include <asm/hypervisor.h>
 
 #include "../perf_event.h"
 
@@ -4050,6 +4051,13 @@ static bool check_msr(unsigned long msr, u64 mask)
 {
 	u64 val_old, val_new, val_tmp;
 
+	/*
+	 * Disable the check for real HW, so we don't
+	 * mess up with potentionaly enabled regs.
+	 */
+	if (hypervisor_is_type(X86_HYPER_NATIVE))
+		return true;
+
 	/*
 	 * Read the current value, change it and read it back to see if it
 	 * matches, this is needed to detect certain hardware emulators
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [RFC] perf/x86/intel: Disable check_msr for real hw
  2019-06-23 22:40   ` Jiri Olsa
@ 2019-06-23 23:44     ` Vaden, Tom (HPE Server OS Architecture)
  2019-06-24  8:06     ` Paolo Bonzini
  2019-06-24 16:46     ` Andi Kleen
  2 siblings, 0 replies; 14+ messages in thread
From: Vaden, Tom (HPE Server OS Architecture) @ 2019-06-23 23:44 UTC (permalink / raw)
  To: Jiri Olsa, Andi Kleen
  Cc: Peter Zijlstra, Kan Liang, Jiri Olsa, David Carrillo-Cisneros,
	Arnaldo Carvalho de Melo, lkml, Ingo Molnar, Namhyung Kim,
	Alexander Shishkin, Paolo Bonzini, Juergen Gross, Alok Kataria



On 6/23/19 6:40 PM, Jiri Olsa wrote:
> On Fri, Jun 21, 2019 at 10:48:25AM -0700, Andi Kleen wrote:
>> On Fri, Jun 14, 2019 at 01:28:53PM +0200, Jiri Olsa wrote:
>>> hi,
>>> the HPE server can do POST tracing and have enabled LBR
>>> tracing during the boot, which makes check_msr fail falsly.
>>>
>>> It looks like check_msr code was added only to check on guests
>>> MSR access, would it be then ok to disable check_msr for real
>>> hardware? (as in patch below)
>>>
>>> We could also check if LBR tracing is enabled and make
>>> appropriate checks, but this change is simpler ;-)
>>>
>>> ideas? thanks,
>>> jirka
>>
>> Sorry for the late comment. I see this patch has been merged now.
>>
>> Unfortunately I don't think it's a good idea. The problem
>> is that the hypervisor flags are only set for a few hypervisors
>> that Linux knows about. But in practice there are many more
>> Hypervisors around that will not cause these flags to be set.
>> But these are still likely to miss MSRs.
>>
>> The other hypervisors are relatively obscure, but eventually
>> someone will hit problems.
> 
> any idea if there's any other flag/way we could use to detect those?
> 
> adding few virtualization folks to the loop
> and attaching the original patch
> 
> thanks,
> jirka
> 
Would it be reasonable to change the sense of the original patch in 
commit 4550931 to only enable the check for the set of "certain hardware 
emulators" and leave the check otherwise disabled by default? I'm 
assuming that set is known (and small)?

> 
> ---
> Tom Vaden reported false failure of check_msr function, because
> some servers can do POST tracing and enable LBR tracing during
> the boot.
> 
> Kan confirmed that check_msr patch was to fix a bug report in
> guest, so it's ok to disable it for real HW.
> 
> Cc: Kan Liang <kan.liang@intel.com>
> Reported-by: Tom Vaden <tom.vaden@hpe.com>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>   arch/x86/events/intel/core.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 71001f005bfe..1194ae7e1992 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -20,6 +20,7 @@
>   #include <asm/intel-family.h>
>   #include <asm/apic.h>
>   #include <asm/cpu_device_id.h>
> +#include <asm/hypervisor.h>
>   
>   #include "../perf_event.h"
>   
> @@ -4050,6 +4051,13 @@ static bool check_msr(unsigned long msr, u64 mask)
>   {
>   	u64 val_old, val_new, val_tmp;
>   
> +	/*
> +	 * Disable the check for real HW, so we don't
> +	 * mess up with potentionaly enabled regs.
> +	 */
> +	if (hypervisor_is_type(X86_HYPER_NATIVE))
> +		return true;
> +
>   	/*
>   	 * Read the current value, change it and read it back to see if it
>   	 * matches, this is needed to detect certain hardware emulators
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC] perf/x86/intel: Disable check_msr for real hw
  2019-06-23 22:40   ` Jiri Olsa
  2019-06-23 23:44     ` Vaden, Tom (HPE Server OS Architecture)
@ 2019-06-24  8:06     ` Paolo Bonzini
  2019-06-24 16:46     ` Andi Kleen
  2 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2019-06-24  8:06 UTC (permalink / raw)
  To: Jiri Olsa, Andi Kleen
  Cc: Peter Zijlstra, Kan Liang, Jiri Olsa, David Carrillo-Cisneros,
	Arnaldo Carvalho de Melo, lkml, Ingo Molnar, Namhyung Kim,
	Alexander Shishkin, Tom Vaden, Juergen Gross, Alok Kataria

On 24/06/19 00:40, Jiri Olsa wrote:
> On Fri, Jun 21, 2019 at 10:48:25AM -0700, Andi Kleen wrote:
>> On Fri, Jun 14, 2019 at 01:28:53PM +0200, Jiri Olsa wrote:
>>> hi,
>>> the HPE server can do POST tracing and have enabled LBR
>>> tracing during the boot, which makes check_msr fail falsly.
>>>
>>> It looks like check_msr code was added only to check on guests
>>> MSR access, would it be then ok to disable check_msr for real
>>> hardware? (as in patch below)
>>>
>>> We could also check if LBR tracing is enabled and make
>>> appropriate checks, but this change is simpler ;-)
>>>
>>> ideas? thanks,
>>> jirka
>>
>> Sorry for the late comment. I see this patch has been merged now.
>>
>> Unfortunately I don't think it's a good idea. The problem 
>> is that the hypervisor flags are only set for a few hypervisors
>> that Linux knows about. But in practice there are many more
>> Hypervisors around that will not cause these flags to be set.
>> But these are still likely to miss MSRs.
>>
>> The other hypervisors are relatively obscure, but eventually
>> someone will hit problems.
> 
> any idea if there's any other flag/way we could use to detect those?

There's no silver bullet, the best is boot_cpu_has(X86_FEATURE_HYPERVISOR).

Paolo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC] perf/x86/intel: Disable check_msr for real hw
  2019-06-23 22:40   ` Jiri Olsa
  2019-06-23 23:44     ` Vaden, Tom (HPE Server OS Architecture)
  2019-06-24  8:06     ` Paolo Bonzini
@ 2019-06-24 16:46     ` Andi Kleen
  2019-06-24 18:06       ` Jiri Olsa
  2 siblings, 1 reply; 14+ messages in thread
From: Andi Kleen @ 2019-06-24 16:46 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Peter Zijlstra, Kan Liang, Jiri Olsa, David Carrillo-Cisneros,
	Arnaldo Carvalho de Melo, lkml, Ingo Molnar, Namhyung Kim,
	Alexander Shishkin, Tom Vaden, Paolo Bonzini, Juergen Gross,
	Alok Kataria

> > The other hypervisors are relatively obscure, but eventually
> > someone will hit problems.
> 
> any idea if there's any other flag/way we could use to detect those?

I'm not aware of a generic way to detect any hypervisor unfortunately.

There are hypervisor reserved cpuid ranges, in theory you could
probe the existence of those. But there might be always some which
don't have extra CPUIDs.

> 
> adding few virtualization folks to the loop
> and attaching the original patch
> 
> thanks,
> jirka
> 
> 
> ---
> Tom Vaden reported false failure of check_msr function, because
> some servers can do POST tracing and enable LBR tracing during
> the boot.

Just to understand the original problem, the LBR registers
get locked somehow? It would be reasonable to not use LBRs
in this case. We just need to make sure everything 
else is still probed.

-Andi

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC] perf/x86/intel: Disable check_msr for real hw
  2019-06-24 16:46     ` Andi Kleen
@ 2019-06-24 18:06       ` Jiri Olsa
  2019-06-24 18:38         ` Andi Kleen
  0 siblings, 1 reply; 14+ messages in thread
From: Jiri Olsa @ 2019-06-24 18:06 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Peter Zijlstra, Kan Liang, Jiri Olsa, David Carrillo-Cisneros,
	Arnaldo Carvalho de Melo, lkml, Ingo Molnar, Namhyung Kim,
	Alexander Shishkin, Tom Vaden, Paolo Bonzini, Juergen Gross,
	Alok Kataria

On Mon, Jun 24, 2019 at 09:46:17AM -0700, Andi Kleen wrote:
> > > The other hypervisors are relatively obscure, but eventually
> > > someone will hit problems.
> > 
> > any idea if there's any other flag/way we could use to detect those?
> 
> I'm not aware of a generic way to detect any hypervisor unfortunately.
> 
> There are hypervisor reserved cpuid ranges, in theory you could
> probe the existence of those. But there might be always some which
> don't have extra CPUIDs.
> 
> > 
> > adding few virtualization folks to the loop
> > and attaching the original patch
> > 
> > thanks,
> > jirka
> > 
> > 
> > ---
> > Tom Vaden reported false failure of check_msr function, because
> > some servers can do POST tracing and enable LBR tracing during
> > the boot.
> 
> Just to understand the original problem, the LBR registers
> get locked somehow? It would be reasonable to not use LBRs
> in this case. We just need to make sure everything 
> else is still probed.

Tom, plz correctme if I'm wrongm but AFAIK because the LBR tracing is
enabled during the boot the lbr_from/lbr_to registers will fail the
check_msr 'val_new != val_tmp' check

if there's no good way to detect this, maybe we add boot option
to disable the check_msr check

jirka

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC] perf/x86/intel: Disable check_msr for real hw
  2019-06-24 18:06       ` Jiri Olsa
@ 2019-06-24 18:38         ` Andi Kleen
  2019-06-24 18:49           ` Jiri Olsa
  0 siblings, 1 reply; 14+ messages in thread
From: Andi Kleen @ 2019-06-24 18:38 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Peter Zijlstra, Kan Liang, Jiri Olsa, David Carrillo-Cisneros,
	Arnaldo Carvalho de Melo, lkml, Ingo Molnar, Namhyung Kim,
	Alexander Shishkin, Tom Vaden, Paolo Bonzini, Juergen Gross,
	Alok Kataria

> Tom, plz correctme if I'm wrongm but AFAIK because the LBR tracing is
> enabled during the boot the lbr_from/lbr_to registers will fail the
> check_msr 'val_new != val_tmp' check

Ok this should be handleable. It should be enough to check
the ctrl register, if that working likely we don't need
to check the data registers which might be changing.

-Andi

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC] perf/x86/intel: Disable check_msr for real hw
  2019-06-24 18:38         ` Andi Kleen
@ 2019-06-24 18:49           ` Jiri Olsa
  0 siblings, 0 replies; 14+ messages in thread
From: Jiri Olsa @ 2019-06-24 18:49 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Peter Zijlstra, Kan Liang, Jiri Olsa, David Carrillo-Cisneros,
	Arnaldo Carvalho de Melo, lkml, Ingo Molnar, Namhyung Kim,
	Alexander Shishkin, Tom Vaden, Paolo Bonzini, Juergen Gross,
	Alok Kataria

On Mon, Jun 24, 2019 at 11:38:06AM -0700, Andi Kleen wrote:
> > Tom, plz correctme if I'm wrongm but AFAIK because the LBR tracing is
> > enabled during the boot the lbr_from/lbr_to registers will fail the
> > check_msr 'val_new != val_tmp' check
> 
> Ok this should be handleable. It should be enough to check
> the ctrl register, if that working likely we don't need
> to check the data registers which might be changing.

if the control register check is enough, we should be ok then,
I'll send new version

thanks,
jirka

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-06-24 18:49 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-14 11:28 [RFC] perf/x86/intel: Disable check_msr for real hw Jiri Olsa
2019-06-14 13:33 ` Peter Zijlstra
2019-06-14 13:45 ` Liang, Kan
2019-06-16 14:13   ` [PATCH] " Jiri Olsa
2019-06-16 20:56     ` Vaden, Tom (HPE Server OS Architecture)
2019-06-17 14:41     ` [tip:perf/core] perf/x86/intel: Disable check_msr for real HW tip-bot for Jiri Olsa
2019-06-21 17:48 ` [RFC] perf/x86/intel: Disable check_msr for real hw Andi Kleen
2019-06-23 22:40   ` Jiri Olsa
2019-06-23 23:44     ` Vaden, Tom (HPE Server OS Architecture)
2019-06-24  8:06     ` Paolo Bonzini
2019-06-24 16:46     ` Andi Kleen
2019-06-24 18:06       ` Jiri Olsa
2019-06-24 18:38         ` Andi Kleen
2019-06-24 18:49           ` Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).