linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86, perf: Tweak broken BIOS rules during check_hw_exists
@ 2015-05-18 19:16 Don Zickus
  2015-05-19  8:17 ` Peter Zijlstra
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Don Zickus @ 2015-05-18 19:16 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: george.dunlap, LKML, Ingo Molnar, konrad.wilk, Don Zickus

I stumbled upon an AMD box that had the BIOS using a hardware counter.  Instead
of printing out a warning and continuing, it failed and blocked further perf
counter usage.

Looking through the history, I found commit a5ebe0ba3dff had tweaked the rules
for a xen guest on an almost identical box and now changed the behaviour.

Unfortunately the rules were tweaked incorrectly and will always lead to msr
failures even though the msrs are completely fine.

What happens now is in arch/x86/kernel/cpu/perf_event.c::check_hw_exists:

<snip>
        for (i = 0; i < x86_pmu.num_counters; i++) {
                reg = x86_pmu_config_addr(i);
                ret = rdmsrl_safe(reg, &val);
                if (ret)
                        goto msr_fail;
                if (val & ARCH_PERFMON_EVENTSEL_ENABLE) {
                        bios_fail = 1;
                        val_fail = val;
                        reg_fail = reg;
                }
        }

<snip>
        /*
         * Read the current value, change it and read it back to see if it
         * matches, this is needed to detect certain hardware emulators
         * (qemu/kvm) that don't trap on the MSR access and always return 0s.
         */
        reg = x86_pmu_event_addr(0);
				^^^^

if the first perf counter is enabled, then this routine will always fail
because the counter is running. :-(

        if (rdmsrl_safe(reg, &val))
                goto msr_fail;
        val ^= 0xffffUL;
        ret = wrmsrl_safe(reg, val);
        ret |= rdmsrl_safe(reg, &val_new);
        if (ret || val != val_new)
                goto msr_fail;

The above bios_fail used to be a 'goto' which is why it worked in the past.

Further, most vendors have migrated to using fixed counters to hide their
evilness hence this problem rarely shows up now days except on a few old boxes.

I fixed my problem and kept the spirit of the original Xen fix, by recording a
safe non-enable register to be used safely for the reading/writing check.
Because it is not enabled, this passes on bare metal boxes (like metal), but
should continue to throw an msr_fail on Xen guests because the register isn't
emulated yet.

Now I get a proper bios_fail error message and Xen should still see their
msr_fail message (untested).

Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 arch/x86/kernel/cpu/perf_event.c |   16 +++++++++++++++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 1ee7b19..e4dc4e5 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -186,6 +186,7 @@ static bool check_hw_exists(void)
 	u64 val, val_fail, val_new= ~0;
 	int i, reg, reg_fail, ret = 0;
 	int bios_fail = 0;
+	int reg_safe = -1;
 
 	/*
 	 * Check to see if the BIOS enabled any of the counters, if so
@@ -200,6 +201,8 @@ static bool check_hw_exists(void)
 			bios_fail = 1;
 			val_fail = val;
 			reg_fail = reg;
+		} else {
+			reg_safe = i;
 		}
 	}
 
@@ -218,11 +221,22 @@ static bool check_hw_exists(void)
 	}
 
 	/*
+	 * If all the counters are enabled, the below test will always
+	 * fail.  The tools will also become useless in this scenario.
+	 * Just fail and disable the hardware counters.
+	 */
+
+	if (reg_safe == -1) {
+		reg = reg_safe;
+		goto msr_fail;
+	}
+
+	/*
 	 * Read the current value, change it and read it back to see if it
 	 * matches, this is needed to detect certain hardware emulators
 	 * (qemu/kvm) that don't trap on the MSR access and always return 0s.
 	 */
-	reg = x86_pmu_event_addr(0);
+	reg = x86_pmu_event_addr(reg_safe);
 	if (rdmsrl_safe(reg, &val))
 		goto msr_fail;
 	val ^= 0xffffUL;
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86, perf: Tweak broken BIOS rules during check_hw_exists
  2015-05-18 19:16 [PATCH] x86, perf: Tweak broken BIOS rules during check_hw_exists Don Zickus
@ 2015-05-19  8:17 ` Peter Zijlstra
  2015-05-21 17:57 ` George Dunlap
  2015-05-27 10:02 ` [tip:perf/core] perf/x86: Tweak broken BIOS rules during check_hw_exists() tip-bot for Don Zickus
  2 siblings, 0 replies; 5+ messages in thread
From: Peter Zijlstra @ 2015-05-19  8:17 UTC (permalink / raw)
  To: Don Zickus; +Cc: george.dunlap, LKML, Ingo Molnar, konrad.wilk

On Mon, May 18, 2015 at 03:16:48PM -0400, Don Zickus wrote:
> I stumbled upon an AMD box that had the BIOS using a hardware counter.  Instead
> of printing out a warning and continuing, it failed and blocked further perf
> counter usage.

Hehe, which was the original behaviour iirc.

> Looking through the history, I found commit a5ebe0ba3dff had tweaked the rules
> for a xen guest on an almost identical box and now changed the behaviour.
> 
> Unfortunately the rules were tweaked incorrectly and will always lead to msr
> failures even though the msrs are completely fine.
> 
> What happens now is in arch/x86/kernel/cpu/perf_event.c::check_hw_exists:
> 
> <snip>
>         for (i = 0; i < x86_pmu.num_counters; i++) {
>                 reg = x86_pmu_config_addr(i);
>                 ret = rdmsrl_safe(reg, &val);
>                 if (ret)
>                         goto msr_fail;
>                 if (val & ARCH_PERFMON_EVENTSEL_ENABLE) {
>                         bios_fail = 1;
>                         val_fail = val;
>                         reg_fail = reg;
>                 }
>         }
> 
> <snip>
>         /*
>          * Read the current value, change it and read it back to see if it
>          * matches, this is needed to detect certain hardware emulators
>          * (qemu/kvm) that don't trap on the MSR access and always return 0s.
>          */
>         reg = x86_pmu_event_addr(0);
> 				^^^^
> 
> if the first perf counter is enabled, then this routine will always fail
> because the counter is running. :-(
> 
>         if (rdmsrl_safe(reg, &val))
>                 goto msr_fail;
>         val ^= 0xffffUL;
>         ret = wrmsrl_safe(reg, val);
>         ret |= rdmsrl_safe(reg, &val_new);
>         if (ret || val != val_new)
>                 goto msr_fail;
> 
> The above bios_fail used to be a 'goto' which is why it worked in the past.
> 
> Further, most vendors have migrated to using fixed counters to hide their
> evilness hence this problem rarely shows up now days except on a few old boxes.
> 
> I fixed my problem and kept the spirit of the original Xen fix, by recording a
> safe non-enable register to be used safely for the reading/writing check.
> Because it is not enabled, this passes on bare metal boxes (like metal), but
> should continue to throw an msr_fail on Xen guests because the register isn't
> emulated yet.
> 
> Now I get a proper bios_fail error message and Xen should still see their
> msr_fail message (untested).

Thanks!

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86, perf: Tweak broken BIOS rules during check_hw_exists
  2015-05-18 19:16 [PATCH] x86, perf: Tweak broken BIOS rules during check_hw_exists Don Zickus
  2015-05-19  8:17 ` Peter Zijlstra
@ 2015-05-21 17:57 ` George Dunlap
  2015-06-02 15:15   ` George Dunlap
  2015-05-27 10:02 ` [tip:perf/core] perf/x86: Tweak broken BIOS rules during check_hw_exists() tip-bot for Don Zickus
  2 siblings, 1 reply; 5+ messages in thread
From: George Dunlap @ 2015-05-21 17:57 UTC (permalink / raw)
  To: Don Zickus, Peter Zijlstra; +Cc: LKML, Ingo Molnar, konrad.wilk

On 05/18/2015 08:16 PM, Don Zickus wrote:
> I stumbled upon an AMD box that had the BIOS using a hardware counter.  Instead
> of printing out a warning and continuing, it failed and blocked further perf
> counter usage.
> 
> Looking through the history, I found commit a5ebe0ba3dff had tweaked the rules
> for a xen guest on an almost identical box and now changed the behaviour.
> 
> Unfortunately the rules were tweaked incorrectly and will always lead to msr
> failures even though the msrs are completely fine.
> 
> What happens now is in arch/x86/kernel/cpu/perf_event.c::check_hw_exists:
> 
> <snip>
>         for (i = 0; i < x86_pmu.num_counters; i++) {
>                 reg = x86_pmu_config_addr(i);
>                 ret = rdmsrl_safe(reg, &val);
>                 if (ret)
>                         goto msr_fail;
>                 if (val & ARCH_PERFMON_EVENTSEL_ENABLE) {
>                         bios_fail = 1;
>                         val_fail = val;
>                         reg_fail = reg;
>                 }
>         }
> 
> <snip>
>         /*
>          * Read the current value, change it and read it back to see if it
>          * matches, this is needed to detect certain hardware emulators
>          * (qemu/kvm) that don't trap on the MSR access and always return 0s.
>          */
>         reg = x86_pmu_event_addr(0);
> 				^^^^
> 
> if the first perf counter is enabled, then this routine will always fail
> because the counter is running. :-(
> 
>         if (rdmsrl_safe(reg, &val))
>                 goto msr_fail;
>         val ^= 0xffffUL;
>         ret = wrmsrl_safe(reg, val);
>         ret |= rdmsrl_safe(reg, &val_new);
>         if (ret || val != val_new)
>                 goto msr_fail;
> 
> The above bios_fail used to be a 'goto' which is why it worked in the past.
> 
> Further, most vendors have migrated to using fixed counters to hide their
> evilness hence this problem rarely shows up now days except on a few old boxes.
> 
> I fixed my problem and kept the spirit of the original Xen fix, by recording a
> safe non-enable register to be used safely for the reading/writing check.
> Because it is not enabled, this passes on bare metal boxes (like metal), but
> should continue to throw an msr_fail on Xen guests because the register isn't
> emulated yet.
> 
> Now I get a proper bios_fail error message and Xen should still see their
> msr_fail message (untested).
> 
> Signed-off-by: Don Zickus <dzickus@redhat.com>

Right -- so what was actually broken was the "does this register work"
check, which needs a non-enabled register.

Would it make sense to add a comment somewhere in the code saying that
you need a disabled event counter for the MSR check to work properly?
It's sort of implied but it's not explicit.

Other than that, this looks good to me.  I'm not positive I have access
to the box I needed this for anymore -- I'll take a look for it next week.

In the mean time:

Acked-by: George Dunlap <george.dunlap@eu.citrix.com>

> ---
>  arch/x86/kernel/cpu/perf_event.c |   16 +++++++++++++++-
>  1 files changed, 15 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index 1ee7b19..e4dc4e5 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -186,6 +186,7 @@ static bool check_hw_exists(void)
>  	u64 val, val_fail, val_new= ~0;
>  	int i, reg, reg_fail, ret = 0;
>  	int bios_fail = 0;
> +	int reg_safe = -1;
>  
>  	/*
>  	 * Check to see if the BIOS enabled any of the counters, if so
> @@ -200,6 +201,8 @@ static bool check_hw_exists(void)
>  			bios_fail = 1;
>  			val_fail = val;
>  			reg_fail = reg;
> +		} else {
> +			reg_safe = i;
>  		}
>  	}
>  
> @@ -218,11 +221,22 @@ static bool check_hw_exists(void)
>  	}
>  
>  	/*
> +	 * If all the counters are enabled, the below test will always
> +	 * fail.  The tools will also become useless in this scenario.
> +	 * Just fail and disable the hardware counters.
> +	 */
> +
> +	if (reg_safe == -1) {
> +		reg = reg_safe;
> +		goto msr_fail;
> +	}
> +
> +	/*
>  	 * Read the current value, change it and read it back to see if it
>  	 * matches, this is needed to detect certain hardware emulators
>  	 * (qemu/kvm) that don't trap on the MSR access and always return 0s.
>  	 */
> -	reg = x86_pmu_event_addr(0);
> +	reg = x86_pmu_event_addr(reg_safe);
>  	if (rdmsrl_safe(reg, &val))
>  		goto msr_fail;
>  	val ^= 0xffffUL;
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [tip:perf/core] perf/x86: Tweak broken BIOS rules during check_hw_exists()
  2015-05-18 19:16 [PATCH] x86, perf: Tweak broken BIOS rules during check_hw_exists Don Zickus
  2015-05-19  8:17 ` Peter Zijlstra
  2015-05-21 17:57 ` George Dunlap
@ 2015-05-27 10:02 ` tip-bot for Don Zickus
  2 siblings, 0 replies; 5+ messages in thread
From: tip-bot for Don Zickus @ 2015-05-27 10:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, linux-kernel, torvalds, dzickus, mingo, tglx, hpa

Commit-ID:  68ab747604da98f0a0414f197f346ac22888fcee
Gitweb:     http://git.kernel.org/tip/68ab747604da98f0a0414f197f346ac22888fcee
Author:     Don Zickus <dzickus@redhat.com>
AuthorDate: Mon, 18 May 2015 15:16:48 -0400
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 27 May 2015 09:16:20 +0200

perf/x86: Tweak broken BIOS rules during check_hw_exists()

I stumbled upon an AMD box that had the BIOS using a hardware performance
counter. Instead of printing out a warning and continuing, it failed and
blocked further perf counter usage.

Looking through the history, I found this commit:

  a5ebe0ba3dff ("perf/x86: Check all MSRs before passing hw check")

which tweaked the rules for a Xen guest on an almost identical box and now
changed the behaviour.

Unfortunately the rules were tweaked incorrectly and will always lead to
MSR failures even though the MSRs are completely fine.

What happens now is in arch/x86/kernel/cpu/perf_event.c::check_hw_exists():

<snip>
        for (i = 0; i < x86_pmu.num_counters; i++) {
                reg = x86_pmu_config_addr(i);
                ret = rdmsrl_safe(reg, &val);
                if (ret)
                        goto msr_fail;
                if (val & ARCH_PERFMON_EVENTSEL_ENABLE) {
                        bios_fail = 1;
                        val_fail = val;
                        reg_fail = reg;
                }
        }

<snip>
        /*
         * Read the current value, change it and read it back to see if it
         * matches, this is needed to detect certain hardware emulators
         * (qemu/kvm) that don't trap on the MSR access and always return 0s.
         */
        reg = x86_pmu_event_addr(0);
				^^^^

if the first perf counter is enabled, then this routine will always fail
because the counter is running. :-(

        if (rdmsrl_safe(reg, &val))
                goto msr_fail;
        val ^= 0xffffUL;
        ret = wrmsrl_safe(reg, val);
        ret |= rdmsrl_safe(reg, &val_new);
        if (ret || val != val_new)
                goto msr_fail;

The above bios_fail used to be a 'goto' which is why it worked in the past.

Further, most vendors have migrated to using fixed counters to hide their
evilness hence this problem rarely shows up now days except on a few old boxes.

I fixed my problem and kept the spirit of the original Xen fix, by recording a
safe non-enable register to be used safely for the reading/writing check.
Because it is not enabled, this passes on bare metal boxes (like metal), but
should continue to throw an msr_fail on Xen guests because the register isn't
emulated yet.

Now I get a proper bios_fail error message and Xen should still see their
msr_fail message (untested).

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: george.dunlap@eu.citrix.com
Cc: konrad.wilk@oracle.com
Link: http://lkml.kernel.org/r/1431976608-56970-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/cpu/perf_event.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 2eca194..4f7001f 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -190,6 +190,7 @@ static bool check_hw_exists(void)
 	u64 val, val_fail, val_new= ~0;
 	int i, reg, reg_fail, ret = 0;
 	int bios_fail = 0;
+	int reg_safe = -1;
 
 	/*
 	 * Check to see if the BIOS enabled any of the counters, if so
@@ -204,6 +205,8 @@ static bool check_hw_exists(void)
 			bios_fail = 1;
 			val_fail = val;
 			reg_fail = reg;
+		} else {
+			reg_safe = i;
 		}
 	}
 
@@ -222,11 +225,22 @@ static bool check_hw_exists(void)
 	}
 
 	/*
+	 * If all the counters are enabled, the below test will always
+	 * fail.  The tools will also become useless in this scenario.
+	 * Just fail and disable the hardware counters.
+	 */
+
+	if (reg_safe == -1) {
+		reg = reg_safe;
+		goto msr_fail;
+	}
+
+	/*
 	 * Read the current value, change it and read it back to see if it
 	 * matches, this is needed to detect certain hardware emulators
 	 * (qemu/kvm) that don't trap on the MSR access and always return 0s.
 	 */
-	reg = x86_pmu_event_addr(0);
+	reg = x86_pmu_event_addr(reg_safe);
 	if (rdmsrl_safe(reg, &val))
 		goto msr_fail;
 	val ^= 0xffffUL;

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86, perf: Tweak broken BIOS rules during check_hw_exists
  2015-05-21 17:57 ` George Dunlap
@ 2015-06-02 15:15   ` George Dunlap
  0 siblings, 0 replies; 5+ messages in thread
From: George Dunlap @ 2015-06-02 15:15 UTC (permalink / raw)
  To: Don Zickus, Peter Zijlstra; +Cc: LKML, Ingo Molnar, konrad.wilk

On 05/21/2015 06:57 PM, George Dunlap wrote:
> On 05/18/2015 08:16 PM, Don Zickus wrote:
>> I stumbled upon an AMD box that had the BIOS using a hardware counter.  Instead
>> of printing out a warning and continuing, it failed and blocked further perf
>> counter usage.
>>
>> Looking through the history, I found commit a5ebe0ba3dff had tweaked the rules
>> for a xen guest on an almost identical box and now changed the behaviour.
>>
>> Unfortunately the rules were tweaked incorrectly and will always lead to msr
>> failures even though the msrs are completely fine.
>>
>> What happens now is in arch/x86/kernel/cpu/perf_event.c::check_hw_exists:
>>
>> <snip>
>>         for (i = 0; i < x86_pmu.num_counters; i++) {
>>                 reg = x86_pmu_config_addr(i);
>>                 ret = rdmsrl_safe(reg, &val);
>>                 if (ret)
>>                         goto msr_fail;
>>                 if (val & ARCH_PERFMON_EVENTSEL_ENABLE) {
>>                         bios_fail = 1;
>>                         val_fail = val;
>>                         reg_fail = reg;
>>                 }
>>         }
>>
>> <snip>
>>         /*
>>          * Read the current value, change it and read it back to see if it
>>          * matches, this is needed to detect certain hardware emulators
>>          * (qemu/kvm) that don't trap on the MSR access and always return 0s.
>>          */
>>         reg = x86_pmu_event_addr(0);
>> 				^^^^
>>
>> if the first perf counter is enabled, then this routine will always fail
>> because the counter is running. :-(
>>
>>         if (rdmsrl_safe(reg, &val))
>>                 goto msr_fail;
>>         val ^= 0xffffUL;
>>         ret = wrmsrl_safe(reg, val);
>>         ret |= rdmsrl_safe(reg, &val_new);
>>         if (ret || val != val_new)
>>                 goto msr_fail;
>>
>> The above bios_fail used to be a 'goto' which is why it worked in the past.
>>
>> Further, most vendors have migrated to using fixed counters to hide their
>> evilness hence this problem rarely shows up now days except on a few old boxes.
>>
>> I fixed my problem and kept the spirit of the original Xen fix, by recording a
>> safe non-enable register to be used safely for the reading/writing check.
>> Because it is not enabled, this passes on bare metal boxes (like metal), but
>> should continue to throw an msr_fail on Xen guests because the register isn't
>> emulated yet.
>>
>> Now I get a proper bios_fail error message and Xen should still see their
>> msr_fail message (untested).
>>
>> Signed-off-by: Don Zickus <dzickus@redhat.com>
> 
> Right -- so what was actually broken was the "does this register work"
> check, which needs a non-enabled register.
> 
> Would it make sense to add a comment somewhere in the code saying that
> you need a disabled event counter for the MSR check to work properly?
> It's sort of implied but it's not explicit.
> 
> Other than that, this looks good to me.  I'm not positive I have access
> to the box I needed this for anymore -- I'll take a look for it next week.
> 
> In the mean time:
> 
> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>

I managed to track down the machine that had the problem and verify that
things still work for me after this patch.  So now you can add:

Tested-by: George Dunlap <george.dunlap@eu.citrix.com>

Thanks,
 -George

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-06-02 15:18 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-18 19:16 [PATCH] x86, perf: Tweak broken BIOS rules during check_hw_exists Don Zickus
2015-05-19  8:17 ` Peter Zijlstra
2015-05-21 17:57 ` George Dunlap
2015-06-02 15:15   ` George Dunlap
2015-05-27 10:02 ` [tip:perf/core] perf/x86: Tweak broken BIOS rules during check_hw_exists() tip-bot for Don Zickus

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).