From: Chris Tillman <toff.tillman@gmail.com>
To: Guenter Roeck <linux@roeck-us.net>
Cc: linux-hwmon@vger.kernel.org
Subject: Re: Fwd: coretemp seems to reset immediately
Date: Thu, 16 Feb 2017 06:12:40 +1300	[thread overview]
Message-ID: <CAO299FsydTd2cGy768Gfworz1df3X7gCm-1s1bjKiyf6q_WjpQ@mail.gmail.com> (raw)
In-Reply-To: <0f23f434-e7f4-851d-ba45-ff0d6ed988c0@roeck-us.net>

Thank you very much for your research and input as to the underlying
cause. I had also found those references, and when I opened the laptop
up, the fan looked dusty, so I agree that was the real issue for my
machine. However, I also found some references to Linux having
problems with overheating on this machine when Windows did not.

The only reason I sent the mail to this list was that the log output
made it seem like coretemp was reporting a problem and then reporting
no problem about a millisecond later. If coretemp is saying there is no
problem, then the machine check software will remain inactive, and the
machine will continue to get hotter.
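
The gap between each "above threshold" message and the matching
"normal" message can be measured directly from the kernel timestamps.
The following is only a rough sketch: the sample lines are copied from
the log excerpt quoted below (with the mail line wrapping undone), and
the function name and regular expression are made up for illustration,
not taken from any kernel tool.

```python
import re
from collections import defaultdict

# Sample lines taken from the quoted syslog excerpt, unwrapped.
LOG = """\
[137082.856603] CPU3: Core temperature above threshold, cpu clock throttled (total events = 804180)
[137082.856604] CPU2: Core temperature above threshold, cpu clock throttled (total events = 804180)
[137082.857603] CPU3: Core temperature/speed normal
[137082.857604] CPU2: Core temperature/speed normal
"""

# Hypothetical pattern: kernel timestamp, CPU name, scope (Core/Package), rest.
LINE = re.compile(r"\[\s*(\d+)\.(\d{6})\] (CPU\d+): (Core|Package) temperature(.*)")


def throttle_durations(text):
    """Map (cpu, scope) to a list of gaps, in microseconds, between an
    'above threshold' message and the next 'normal' message."""
    started = {}
    durations = defaultdict(list)
    for line in text.splitlines():
        m = LINE.search(line)
        if not m:
            continue
        secs, usecs, cpu, scope, rest = m.groups()
        # Integer microseconds avoid floating-point rounding.
        stamp = int(secs) * 1_000_000 + int(usecs)
        key = (cpu, scope)
        if "above threshold" in rest:
            started[key] = stamp
        elif "normal" in rest and key in started:
            durations[key].append(stamp - started.pop(key))
    return dict(durations)


if __name__ == "__main__":
    for key, gaps in throttle_durations(LOG).items():
        print(key, gaps)
```

For the CPU3 and CPU2 Core pairs in the sample, the gap works out to
1000 microseconds each. Because the pattern is searched for anywhere in
the line, the same function should also accept full syslog lines that
carry the "Feb 12 19:27:25 ctillman kernel:" prefix.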

Do you know which kernel module handles the machine check that would
rely on coretemp? I could forward this report to its maintainers.

On Wed, Feb 15, 2017 at 11:48 PM, Guenter Roeck <linux@roeck-us.net> wrote:
> On 02/15/2017 02:00 AM, Chris Tillman wrote:
>>
>> Hi,
>>
>> I had the awful experience of having my computer fry before my eyes
>> the other day. It was running quite hot (building llvm), and it
>> stopped accepting mouse inputs. I tried to regain control, and after
>> 20 seconds or so it switched me to virtual console 1. But very shortly
>> after that it died, and now it won't even start to boot.
>>
>> Anyway, the reason I'm writing: I retrieved the disk out of it, and on
>> another machine, looked through the syslog. I saw that coretemp was
>> reporting over temps every five minutes, and claiming the cpu was
>> being throttled. But then the very next message in the log says the
>> temperature is normal. I'm wondering, does this mean the throttling
>> was also being cancelled immediately? If so it could explain how the
>> machine got so hot that it died on the spot.
>>
>> I've attached the log. Here's the final refrain of what was occurring
>> every 5 minutes for the fifty minutes previous:
>>
>> Feb 12 19:27:25 ctillman kernel: [137082.856603] CPU3: Core
>> temperature above threshold, cpu clock throttled (total events =
>> 804180)
>> Feb 12 19:27:25 ctillman kernel: [137082.856604] CPU2: Core
>> temperature above threshold, cpu clock throttled (total events =
>> 804180)
>> Feb 12 19:27:25 ctillman kernel: [137082.856607] CPU1: Package
>> temperature above threshold, cpu clock throttled (total events =
>> 862662)
>> Feb 12 19:27:25 ctillman kernel: [137082.856608] CPU0: Package
>> temperature above threshold, cpu clock throttled (total events =
>> 862662)
>> Feb 12 19:27:25 ctillman kernel: [137082.856610] CPU2: Package
>> temperature above threshold, cpu clock throttled (total events =
>> 862662)
>> Feb 12 19:27:25 ctillman kernel: [137082.856621] CPU3: Package
>> temperature above threshold, cpu clock throttled (total events =
>> 862662)
>> Feb 12 19:27:25 ctillman kernel: [137082.857603] CPU3: Core
>> temperature/speed normal
>> Feb 12 19:27:25 ctillman kernel: [137082.857604] CPU2: Core
>> temperature/speed normal
>> Feb 12 19:27:25 ctillman kernel: [137082.857606] CPU0: Package
>> temperature/speed normal
>> Feb 12 19:27:25 ctillman kernel: [137082.857608] CPU1: Package
>> temperature/speed normal
>> Feb 12 19:27:25 ctillman kernel: [137082.857609] CPU2: Package
>> temperature/speed normal
>> Feb 12 19:27:25 ctillman kernel: [137082.857612] CPU3: Package
>> temperature/speed normal
>>
>> Notice how each core gets flagged, and then in the same millisecond
>> gets cleared. For example
>>
>> [137082.856603] CPU3: Core temperature above threshold, cpu clock
>> throttled (total events = 804180)
>> [137082.857603] CPU3: Core temperature/speed normal
>>
>> The machine is an HP Probook 4530s, which I just bought second hand a
>> couple weeks ago. I'd really been enjoying its speed! compared to the
>> older computer I'm writing on now.
>>
>> I'd already had a run-in with overheating, and filed a bug against the
>> gpu because it apparently crashed during the previous event:
>>
>> [Bug 99611] GPU hang after over temperature
>>
>> That log also showed the same pattern.
>>
>
> That has nothing to do with coretemp, which is purely passive.
> Thermal throttling is supported as part of the machine check code.
>
> No idea where you filed the bug (not on bugzilla.kernel.org), but
> I don't really think you can blame software. My guess would be that
> the CPU fan was not operating properly; maybe the thermal paste
> between CPU and heatsink was getting old, or maybe the fan is just
> broken, or maybe there is just enough dust in the machine that it
> no longer cools properly.
>
> There is also mention in some forums that a BIOS update helps with
> overheating issues on this laptop.
>
> Guenter
>



-- 
Chris Tillman
Developer
