[PATCH] perf/x86: read the FREEZE_WHILE_SMM bit during boot

From: David Arcari @ 2018-06-03 18:23 UTC
  To: linux-kernel
  Cc: David Arcari, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Peter Zijlstra, Andi Kleen, Kan Liang, Jiri Olsa, Donald Zickus,
	Prarit Bhargava, Jerry Hoemann

On some systems, pressing the external NMI button now fails to inject
an NMI 5-10% of the time.  This confuses a user who expects the NMI to
dump the system.

Commit 6089327f5424 ("perf/x86: Add sysfs entry to freeze counters on SMI")
does not read the firmware setting of the FREEZE_WHILE_SMM bit and
always clears it when the PMU is initialized.  As a result, the performance
counters keep running even in SMM, which greatly widens the race window in
which an external NMI is not processed because a local NMI is already
being processed.

One option is to change default_do_nmi().  The code snippet below shows the
relevant portion of a patch that resolves the issue, but it is problematic
from a performance perspective and was dismissed.

-345,7 +345,17 @@ static void default_do_nmi(struct pt_regs *regs)
 		 */
 		if (handled > 1)
 			__this_cpu_write(swallow_nmi, true);
-		return;
+
+		/*
+		 * Unfortunately, there is a race condition which can
+		 * result in missing an external NMI.  Typically, an
+		 * external NMI is processed on cpu 0.  Therefore, on
+		 * cpu 0 check for an external NMI before returning.
+		 */
+		if (smp_processor_id() ||
+		    (x86_platform.get_nmi_reason() & NMI_REASON_MASK) == 0) {
+			return;
+		}
 	}

Ultimately, the issue can be resolved by reading the firmware's
FREEZE_WHILE_SMM setting during boot and using it as the initial value
of x86_pmu.attr_freeze_on_smi.

Fixes: 6089327f5424 ("perf/x86: Add sysfs entry to freeze counters on SMI")
Signed-off-by: David Arcari <darcari@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Donald Zickus <dzickus@redhat.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Jerry Hoemann <jerry.hoemann@hpe.com>
---
 arch/x86/events/intel/core.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
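
Note (not part of the commit): the bit test that read_smm_bit() performs
can be sketched in user space as below.  This is only an illustration.  It
assumes the constants match the kernel's msr-index.h definitions
(MSR_IA32_DEBUGCTLMSR at 0x1d9, FREEZE_IN_SMM at bit 14), and the helper
name debugctl_freeze_in_smm() is hypothetical.

```c
#include <stdint.h>

/* Assumed to mirror the kernel's msr-index.h definitions. */
#define MSR_IA32_DEBUGCTLMSR          0x000001d9
#define DEBUGCTLMSR_FREEZE_IN_SMM_BIT 14
#define DEBUGCTLMSR_FREEZE_IN_SMM     (1ULL << DEBUGCTLMSR_FREEZE_IN_SMM_BIT)

/*
 * Hypothetical helper: given a raw IA32_DEBUGCTL value, return 1 if the
 * firmware left FREEZE_WHILE_SMM set and 0 otherwise.  This is the same
 * decision read_smm_bit() makes after a successful rdmsrl_safe().
 */
static int debugctl_freeze_in_smm(uint64_t debugctl)
{
	return (debugctl & DEBUGCTLMSR_FREEZE_IN_SMM) ? 1 : 0;
}
```

With the patch applied, the value seeded at boot should be what the
freeze_on_smi sysfs attribute added by 6089327f5424 reports until it is
changed by the administrator.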

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 707b2a9..fce98df 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3325,6 +3325,18 @@ static void flip_smm_bit(void *data)
 	}
 }
 
+static int read_smm_bit(void)
+{
+	u64 val;
+
+	if (!rdmsrl_safe(MSR_IA32_DEBUGCTLMSR, &val)) {
+		if (val & DEBUGCTLMSR_FREEZE_IN_SMM)
+			return 1;
+	}
+
+	return 0;
+}
+
 static void intel_pmu_cpu_starting(int cpu)
 {
 	struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
@@ -4423,6 +4435,8 @@ __init int intel_pmu_init(void)
 		pr_cont("full-width counters, ");
 	}
 
+	x86_pmu.attr_freeze_on_smi = read_smm_bit();
+
 	kfree(to_free);
 	return 0;
 }
-- 
1.8.3.1
