linux-edac.vger.kernel.org archive mirror
From: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>,
	"Luck, Tony" <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>,
	tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com,
	bberg@redhat.com, x86@kernel.org, linux-edac@vger.kernel.org,
	linux-kernel@vger.kernel.org, hdegoede@redhat.com,
	ckellner@redhat.com
Subject: Re: [PATCH 1/2] x86, mce, therm_throt: Optimize logging of thermal throttle messages
Date: Tue, 15 Oct 2019 06:43:07 -0700
Message-ID: <9d1ab837e757375374f2a45655dbe8aba42aeee5.camel@linux.intel.com>
In-Reply-To: <20191015085257.GE2311@hirez.programming.kicks-ass.net>

On Tue, 2019-10-15 at 10:52 +0200, Peter Zijlstra wrote:
> On Mon, Oct 14, 2019 at 03:27:35PM -0700, Luck, Tony wrote:
> > On Mon, Oct 14, 2019 at 11:36:18PM +0200, Borislav Petkov wrote:
> > > This description is already *begging* for this delay value to be
> > > automatically set by the kernel. Putting yet another knob in front of
> > > the user who doesn't have a clue most of the time shows one more time
> > > that we haven't done our job properly by asking her to know what we
> > > already do.
> > > 
> > > IOW, a simple history feedback mechanism which sets the timeout based on
> > > the last couple of values is much smarter. The thing would have a max
> > > value, of course, which, when exceeded should mean an anomaly, etc, but
> > > almost anything else is better than merely asking the user to make an
> > > educated guess.
> > 
> > You need a plausible start point for the "when to worry the user"
> > message.  Maybe that is your "max value"?
> > 
> > So if the system has a couple of excursions above temperature lasting
> > 1 second and then 2 seconds ... would you like to see those ignored
> > (because they are below the initial max)? But now we have a couple
> > of data points pick some new value to be the threshold for reporting?
> > 
> > What value should we pick (based on 1 sec, then 2 sec)?
> > 
> > I would be worried that it would self tune to the point where it
> > does report something that it really didn't need to (e.g. as a result
> > of a few consecutive very short excursions).
> 
> I'm guessing Boris is thinking of a simple IIR like avg filter.
> 
> 	avg = avg + (sample-avg) / 4
> 
> And then only print when sample > 2*avg. If you initialize that with
> some appropriately large value, it should settle down into what is
> 'normal' for that particular piece of hardware.
I will take a shot at an IIR implementation.
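
As a rough illustration of the filter suggested above, here is a minimal
user-space sketch in C. It is not code from therm_throt.c. The struct, the
function names and the millisecond unit are all hypothetical, and it assumes
the sample is compared against the average as it stood before the sample is
folded in (the suggestion does not specify the order).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU state; the name and layout are illustrative only. */
struct throttle_filter {
	int64_t avg_ms;	/* running average of throttle-event durations */
};

/* Seed with an "appropriately large" value so early events stay quiet. */
static void throttle_filter_init(struct throttle_filter *f,
				 int64_t initial_avg_ms)
{
	assert(f && initial_avg_ms > 0);
	f->avg_ms = initial_avg_ms;
}

/*
 * Feed one event duration into the filter.
 * Returns true when the sample exceeds twice the average seen so far
 * (assumption: the comparison uses the average before this update).
 */
static bool throttle_filter_update(struct throttle_filter *f,
				   int64_t sample_ms)
{
	bool report = sample_ms > 2 * f->avg_ms;

	/* avg = avg + (sample - avg) / 4, using integer division */
	f->avg_ms += (sample_ms - f->avg_ms) / 4;

	return report;
}
```

With a seed of 8000 ms, events of 1000 ms and then 2000 ms are not reported
and pull the average down, while a later 20000 ms event would be. How far
the average should be allowed to drop after a run of very short events is
left open here; a lower bound on avg_ms is one possible guard.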

> 
> Still, I'm boggled by the whole idea that hitting critical hard throttle
> is considered 'normal' at all.
As explained in my previous email, this is not the so-called TjMax, where
the system will shut down. If the CPU stays at this temperature for a
longer time, cooling needs to be adjusted.

> 
> > We also need to take into account the "typical sampling interval"
> > for user space thermal control software.
> 
> Why is control of critical thermal crud in userspace? That seems like a
> massive design fail.
TjMax is taken care of by the embedded firmware or the kernel, depending
on how the OEM wants it to be controlled. User space is mostly for
balancing non-CPU parts, which are not urgent. For example, if you run the
CPU at a high temperature for a long duration, the skin will heat up, and
it takes much longer to cool than the CPU itself.

Thanks,
Srinivas 





