linux-edac.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Borislav Petkov <bp@alien8.de>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"bberg@redhat.com" <bberg@redhat.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"hdegoede@redhat.com" <hdegoede@redhat.com>,
	"ckellner@redhat.com" <ckellner@redhat.com>
Subject: Re: [PATCH 1/2] x86, mce, therm_throt: Optimize logging of thermal throttle messages
Date: Thu, 17 Oct 2019 23:44:45 +0200	[thread overview]
Message-ID: <20191017214445.GG14441@zn.tnic> (raw)
In-Reply-To: <3908561D78D1C84285E8C5FCA982C28F7F4A57D0@ORSMSX115.amr.corp.intel.com>

On Thu, Oct 17, 2019 at 09:31:30PM +0000, Luck, Tony wrote:
> That sounds like the right short term action.
> 
> Depending on what we end up with from Srinivas ... we may want
> to reconsider the severity.  The basic premise of Srinivas' patch
> is to avoid printing anything for short excursions above temperature
> threshold. But the effect of that is that when we find the core/package
> staying above temperature for an extended period of time, we are
> in a serious situation where some action may be needed. E.g.
> move the laptop off the soft surface that is blocking the air vents.

I don't think having a critical severity message is nearly enough.
There are cases where the users simply won't see that message, no shell
opened, nothing scanning dmesg, nothing pops up on the desktop to show
KERN_CRIT messages, etc.

If we really wanna handle this case then we must be much more reliable:

* we throttle the machine from within the kernel - whatever that may mean
* if that doesn't help, we stop scheduling !root tasks
* if that doesn't help, we halt
* ...

These are purely hypothetical things to do but I'm pointing them out as
an example that in a high temperature situation we should be actively
doing something and not wait for the user to do that.

Come to think of it, one can apply the same type of logic here and split
the temp severity into action-required events and action-optional events
and then depending on the type, we do things.

Now what those things are, should be determined by the severity of the
events. Which would mean, we'd need to know how severe those events are.
And since this is left in the hands of the OEMs, good luck to us. ;-\

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

  reply	other threads:[~2019-10-17 21:44 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <2c2b65c23be3064504566c5f621c1f37bf7e7326.camel@redhat.com>
2019-10-14 21:21 ` [PATCH 1/2] x86, mce, therm_throt: Optimize logging of thermal throttle messages Srinivas Pandruvada
2019-10-14 21:21   ` [PATCH 2/2] x86, mce: Add additional kernel boot parameter Srinivas Pandruvada
2019-10-14 21:36   ` [PATCH 1/2] x86, mce, therm_throt: Optimize logging of thermal throttle messages Borislav Petkov
2019-10-14 22:27     ` Luck, Tony
2019-10-15  8:36       ` Borislav Petkov
2019-10-15  8:52       ` Peter Zijlstra
2019-10-15 13:43         ` Srinivas Pandruvada
2019-10-14 22:41     ` Srinivas Pandruvada
2019-10-15  8:46       ` Borislav Petkov
2019-10-15 14:01         ` Srinivas Pandruvada
2019-10-15  8:48   ` Peter Zijlstra
2019-10-15 13:31     ` Srinivas Pandruvada
2019-10-16  8:14       ` Peter Zijlstra
2019-10-16 14:00         ` Borislav Petkov
2019-10-17 21:31           ` Luck, Tony
2019-10-17 21:44             ` Borislav Petkov [this message]
2019-10-17 23:53               ` Luck, Tony
2019-10-18  6:46                 ` Borislav Petkov
2019-10-18  7:17               ` Peter Zijlstra
2019-10-18 12:26               ` Srinivas Pandruvada
2019-10-18 13:23                 ` Borislav Petkov
2019-10-18 15:55                   ` Srinivas Pandruvada
2019-10-18 19:40                     ` Borislav Petkov
2019-10-18 18:02                   ` Luck, Tony
2019-10-18 19:45                     ` Borislav Petkov
2019-10-18 20:38                       ` Luck, Tony
2019-10-19  8:10                         ` Borislav Petkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191017214445.GG14441@zn.tnic \
    --to=bp@alien8.de \
    --cc=bberg@redhat.com \
    --cc=ckellner@redhat.com \
    --cc=hdegoede@redhat.com \
    --cc=hpa@zytor.com \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=srinivas.pandruvada@linux.intel.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).