All of lore.kernel.org
 help / color / mirror / Atom feed
From: tip-bot for Borislav Petkov <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: Yazen.Ghannam@amd.com, torvalds@linux-foundation.org,
	mingo@kernel.org, tglx@linutronix.de, bp@suse.de,
	linux-edac@vger.kernel.org, ashok.raj@intel.com, hpa@zytor.com,
	tony.luck@intel.com, peterz@infradead.org,
	linux-kernel@vger.kernel.org
Subject: [tip:ras/core] x86/ras/therm_throt: Do not log a fake MCE for thermal events
Date: Tue, 24 Jan 2017 00:47:20 -0800	[thread overview]
Message-ID: <tip-9b052ea4ced0fa1ad30a2eafe86984a16297e6f1@git.kernel.org> (raw)
In-Reply-To: <20170123183514.13356-3-bp@alien8.de>

Commit-ID:  9b052ea4ced0fa1ad30a2eafe86984a16297e6f1
Gitweb:     http://git.kernel.org/tip/9b052ea4ced0fa1ad30a2eafe86984a16297e6f1
Author:     Borislav Petkov <bp@suse.de>
AuthorDate: Mon, 23 Jan 2017 19:35:07 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 24 Jan 2017 09:14:53 +0100

x86/ras/therm_throt: Do not log a fake MCE for thermal events

We log a fake bank 128 MCE to note that we're handling a CPU thermal
event. However, this confuses people into thinking that their hardware
generates MCEs. Hijacking MCA for logging thermal events is a gross
misuse anyway and it shouldn't have been done in the first place. And
besides we have other means for dealing with thermal events which are
much more suitable.

So let's kill the MCE logging part.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Ashok Raj <ashok.raj@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170105213846.GA12024@gmail.com
Link: http://lkml.kernel.org/r/20170123183514.13356-3-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/mce.h               |  6 ------
 arch/x86/kernel/cpu/mcheck/mce.c         | 25 -------------------------
 arch/x86/kernel/cpu/mcheck/therm_throt.c | 30 +++++++++++-------------------
 3 files changed, 11 insertions(+), 50 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 5132f2a..a09ed05 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -97,10 +97,6 @@
 
 #define MCE_OVERFLOW 0		/* bit 0 in flags means overflow */
 
-/* Software defined banks */
-#define MCE_EXTENDED_BANK	128
-#define MCE_THERMAL_BANK	(MCE_EXTENDED_BANK + 0)
-
 #define MCE_LOG_LEN 32
 #define MCE_LOG_SIGNATURE	"MACHINECHECK"
 
@@ -306,8 +302,6 @@ extern void (*deferred_error_int_vector)(void);
 
 void intel_init_thermal(struct cpuinfo_x86 *c);
 
-void mce_log_therm_throt_event(__u64 status);
-
 /* Interrupt Handler for core thermal thresholds */
 extern int (*platform_thermal_notify)(__u64 msr_val);
 
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 00ef432..6eef6fd 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1331,31 +1331,6 @@ static void mce_process_work(struct work_struct *dummy)
 	mce_gen_pool_process();
 }
 
-#ifdef CONFIG_X86_MCE_INTEL
-/***
- * mce_log_therm_throt_event - Logs the thermal throttling event to mcelog
- * @cpu: The CPU on which the event occurred.
- * @status: Event status information
- *
- * This function should be called by the thermal interrupt after the
- * event has been processed and the decision was made to log the event
- * further.
- *
- * The status parameter will be saved to the 'status' field of 'struct mce'
- * and historically has been the register value of the
- * MSR_IA32_THERMAL_STATUS (Intel) msr.
- */
-void mce_log_therm_throt_event(__u64 status)
-{
-	struct mce m;
-
-	mce_setup(&m);
-	m.bank = MCE_THERMAL_BANK;
-	m.status = status;
-	mce_log(&m);
-}
-#endif /* CONFIG_X86_MCE_INTEL */
-
 /*
  * Periodic polling timer for "silent" machine check errors.  If the
  * poller finds an MCE, poll 2x faster.  When the poller finds no more
diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index 465aca8..85469f8 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -6,7 +6,7 @@
  *
  * Maintains a counter in /sys that keeps track of the number of thermal
  * events, such that the user knows how bad the thermal problem might be
- * (since the logging to syslog and mcelog is rate limited).
+ * (since the logging to syslog is rate limited).
  *
  * Author: Dmitriy Zavin (dmitriyz@google.com)
  *
@@ -141,13 +141,8 @@ static struct attribute_group thermal_attr_group = {
  * IRQ has been acknowledged.
  *
  * It will take care of rate limiting and printing messages to the syslog.
- *
- * Returns: 0 : Event should NOT be further logged, i.e. still in
- *              "timeout" from previous log message.
- *          1 : Event should be logged further, and a message has been
- *              printed to the syslog.
  */
-static int therm_throt_process(bool new_event, int event, int level)
+static void therm_throt_process(bool new_event, int event, int level)
 {
 	struct _thermal_state *state;
 	unsigned int this_cpu = smp_processor_id();
@@ -162,16 +157,16 @@ static int therm_throt_process(bool new_event, int event, int level)
 		else if (event == POWER_LIMIT_EVENT)
 			state = &pstate->core_power_limit;
 		else
-			 return 0;
+			return;
 	} else if (level == PACKAGE_LEVEL) {
 		if (event == THERMAL_THROTTLING_EVENT)
 			state = &pstate->package_throttle;
 		else if (event == POWER_LIMIT_EVENT)
 			state = &pstate->package_power_limit;
 		else
-			return 0;
+			return;
 	} else
-		return 0;
+		return;
 
 	old_event = state->new_event;
 	state->new_event = new_event;
@@ -181,7 +176,7 @@ static int therm_throt_process(bool new_event, int event, int level)
 
 	if (time_before64(now, state->next_check) &&
 			state->count != state->last_count)
-		return 0;
+		return;
 
 	state->next_check = now + CHECK_INTERVAL;
 	state->last_count = state->count;
@@ -193,16 +188,14 @@ static int therm_throt_process(bool new_event, int event, int level)
 				this_cpu,
 				level == CORE_LEVEL ? "Core" : "Package",
 				state->count);
-		return 1;
+		return;
 	}
 	if (old_event) {
 		if (event == THERMAL_THROTTLING_EVENT)
 			pr_info("CPU%d: %s temperature/speed normal\n", this_cpu,
 				level == CORE_LEVEL ? "Core" : "Package");
-		return 1;
+		return;
 	}
-
-	return 0;
 }
 
 static int thresh_event_valid(int level, int event)
@@ -365,10 +358,9 @@ static void intel_thermal_interrupt(void)
 	/* Check for violation of core thermal thresholds*/
 	notify_thresholds(msr_val);
 
-	if (therm_throt_process(msr_val & THERM_STATUS_PROCHOT,
-				THERMAL_THROTTLING_EVENT,
-				CORE_LEVEL) != 0)
-		mce_log_therm_throt_event(msr_val);
+	therm_throt_process(msr_val & THERM_STATUS_PROCHOT,
+			    THERMAL_THROTTLING_EVENT,
+			    CORE_LEVEL);
 
 	if (this_cpu_has(X86_FEATURE_PLN) && int_pln_enable)
 		therm_throt_process(msr_val & THERM_STATUS_POWER_LIMIT,

  reply	other threads:[~2017-01-24  8:48 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-23 18:35 [PATCH 0/9] x86/RAS: Queue for 4.11 Borislav Petkov
2017-01-23 18:35 ` [PATCH 1/9] x86/mce-inject: Make it depend on X86_LOCAL_APIC Borislav Petkov
2017-01-24  8:46   ` [tip:ras/core] x86/ras/inject: Make it depend on X86_LOCAL_APIC=y tip-bot for Borislav Petkov
2017-01-23 18:35 ` [PATCH 2/9] x86/MCE/therm_throt: Do not log a fake MCE for a thermal event Borislav Petkov
2017-01-24  8:47   ` tip-bot for Borislav Petkov [this message]
2017-01-23 18:35 ` [PATCH 3/9] x86/MCE/AMD: Make sysfs names of banks more user-friendly Borislav Petkov
2017-01-24  8:47   ` [tip:ras/core] x86/ras/amd: " tip-bot for Yazen Ghannam
2017-01-23 18:35 ` [PATCH 4/9] x86/MCE: Flip the TSC-adding logic Borislav Petkov
2017-01-24  8:48   ` [tip:ras/core] x86/ras: " tip-bot for Borislav Petkov
2017-01-23 18:35 ` [PATCH 5/9] x86/ras/mce_amd_inj: Change dependency Borislav Petkov
2017-01-24  8:48   ` [tip:ras/core] x86/ras/amd/inj: " tip-bot for Borislav Petkov
2017-01-23 18:35 ` [PATCH 6/9] EDAC/mce_amd: Unexport amd_decode_mce() Borislav Petkov
2017-01-24  8:49   ` [tip:ras/core] EDAC/mce/amd: " tip-bot for Borislav Petkov
2017-01-23 18:35 ` [PATCH 7/9] EDAC/mce_amd: Dump TSC value Borislav Petkov
2017-01-24  8:50   ` [tip:ras/core] EDAC/mce/amd: " tip-bot for Borislav Petkov
2017-01-23 18:35 ` [PATCH 8/9] x86/MCE: Get rid of mce_process_work() Borislav Petkov
2017-01-24  8:50   ` [tip:ras/core] x86/ras: " tip-bot for Borislav Petkov
2017-01-23 18:35 ` [PATCH 9/9] x86/MCE, EDAC, acpi: Assign MCE notifier handlers a priority Borislav Petkov
2017-01-24  8:51   ` [tip:ras/core] x86/ras, " tip-bot for Borislav Petkov
  -- strict thread matches above, loose matches on Subject: below --
2017-01-05  5:00 Dell XPS13: MCE (Hardware Error) reported Daniel J Blueman
2017-01-05 14:05 ` Daniel J Blueman
2017-01-05 20:10   ` Alexander Alemayhu
2017-01-05 20:31     ` Borislav Petkov
2017-01-05 20:43       ` Raj, Ashok
2017-01-05 21:03         ` Pandruvada, Srinivas
2017-01-05 23:23           ` Alexander Alemayhu
2017-01-05 21:38       ` Alexander Alemayhu
2017-01-05 23:28       ` Raj, Ashok
2017-01-05 23:56         ` Borislav Petkov
2017-01-06  1:26           ` Raj, Ashok
2017-01-06 11:16             ` Borislav Petkov
2017-01-06 15:58               ` Raj, Ashok
2017-01-06 16:54                 ` Borislav Petkov
2017-01-06 17:04                   ` Raj, Ashok
2017-01-09 10:55                   ` Paul Menzel
2017-01-09 11:05                     ` Borislav Petkov
2017-01-09 11:11                       ` Paul Menzel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=tip-9b052ea4ced0fa1ad30a2eafe86984a16297e6f1@git.kernel.org \
    --to=tipbot@zytor.com \
    --cc=Yazen.Ghannam@amd.com \
    --cc=ashok.raj@intel.com \
    --cc=bp@suse.de \
    --cc=hpa@zytor.com \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.